Articles by John Mount

Campaign Response Testing no longer published on Udemy

June 8, 2017 | John Mount

Our free video course Campaign Response Testing is no longer published on Udemy. It remains available for free on YouTube with all source code available from GitHub. I’ll try to correct bad links as I find them. Please read on for the reasons. Udemy recently unilaterally instituted a new ... [Read more...]

More on safe substitution in R

June 7, 2017 | John Mount

Let’s worry a bit about substitution in R. Substitution is very powerful, which means it can be both used and misused. However, that does not mean every use is unsafe or a mistake. From Advanced R: substitute: We can confirm the above code performs no substitution: a
[Read more...]

There is usually more than one way in R

June 5, 2017 | John Mount

Python has a fairly famous design principle (from “PEP 20 — The Zen of Python”): There should be one– and preferably only one –obvious way to do it. Frankly in R (especially once you add many packages) there is usually more than one way. As an example we will talk about the ... [Read more...]

R summary() got better!

June 4, 2017 | John Mount

Here is a really nice feature found in the current 3.4.0 version of R: summary() has become a lot more reasonable.

    summary(15555)
    #    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    #   15555   15555   15555   15555   15555   15555

Please read on for some background. In older versions of R (say R 3.3.1) the above code ... [Read more...]

In defense of wrapr::let()

June 1, 2017 | John Mount

Saw this the other day: In defense of wrapr::let() (originally part of replyr, and still re-exported by that package) I would say: let() was deliberately designed for a single real-world use case: working with data when you don’t know the column names when you are writing the code (...
[Read more...]

Summarizing big data in R

May 30, 2017 | John Mount

Our next "R and big data tip" is: summarizing big data. We always say "if you are not looking at the data, you are not doing science", and for big data you are very dependent on summaries (as you can’t actually look at everything). Simple question: is there ... [Read more...]

Managing Spark data handles in R

May 26, 2017 | John Mount

When working with big data with R (say, using Spark and sparklyr) we have found it very convenient to keep data handles in a neat list or data_frame. Please read on for our handy hints on keeping your data handles neat. When using R to work over a big ...
[Read more...]

On indexing operators and composition

May 18, 2017 | John Mount

In this article I will discuss array indexing, operators, and composition in depth. If you work through this article you should end up with a very deep understanding of array indexing and the deep interpretation available when we realize indexing is an instance of function composition (or an example of ...
[Read more...]

dplyr in Context

May 6, 2017 | John Mount

Introduction Beginning R users often come to the false impression that the popular packages dplyr and tidyr are both all of R and sui generis inventions (in that they might be unprecedented and there might be no other reasonable way to get the same effects in R). These packages and their ...
[Read more...]

Why to use wrapr::let()

May 2, 2017 | John Mount

I have written about referential transparency before. In this article I would like to discuss “leaky abstractions” and why wrapr::let() supplies a useful (but leaky) abstraction for R programmers. Abstractions A common definition of an abstraction is (from the OSX dictionary): the process of considering something independently of its ...
[Read more...]

Programming over R

April 21, 2017 | John Mount

R is a very fluid language amenable to meta-programming, or alterations of the language itself. This has allowed the late user-driven introduction of a number of powerful features such as magrittr pipes, the foreach system, futures, data.table, and dplyr. Please read on for some small meta-programming effects we have ...
[Read more...]

Visualizing relational joins

April 4, 2017 | John Mount

I want to discuss a nice series of figures used to teach relational join semantics in R for Data Science by Garrett Grolemund and Hadley Wickham, O’Reilly 2016. Below is an example from their book illustrating an inner join: Please read on for my discussion of this diagram and teaching ...
[Read more...]

Coordinatized Data: A Fluid Data Specification

March 29, 2017 | John Mount

Authors: John Mount and Nina Zumel. Introduction It’s been our experience when teaching the data wrangling part of data science that students often have difficulty understanding the conversion to and from row-oriented and column-oriented data formats (what is commonly called pivoting and un-pivoting). Boris Artzybasheff illustration Real trust and ...
[Read more...]

Datashader is a big deal

March 22, 2017 | John Mount

I recently got back from Strata West 2017 (where I ran a very well received workshop on R and Spark). One thing that really stood out for me at the exhibition hall was Bokeh plus datashader from Continuum Analytics. I had the privilege of having Peter Wang himself demonstrate datashader for ...
[Read more...]

Another R [Non-]Standard Evaluation Idea

March 17, 2017 | John Mount

Jonathan Carroll had an interesting R language idea: to use @-notation to request value substitution in a non-standard evaluation environment (inspired by MySQL User-Defined Variables). He even picked the right image: The idea is kind of reverse from some Lisp ideas ("evaled unless ticked"), but an interesting possibility. We ...
[Read more...]