Using mutate from dplyr inside a function: getting around non-standard evaluation
To edit or add columns to a data.frame, you can use mutate from the dplyr package:
    library(dplyr)
    mtcars %>% mutate(new_column = mpg + wt)
Here, dplyr uses non-standard evaluation to find the contents of mpg and wt, knowing that it needs to look in the context of mtcars. This is nice for interactive use, but not so nice for using mutate inside a function where mpg and wt are inputs to the function.
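To make "looking in the context of mtcars" concrete, here is a rough base-R sketch of the idea. It is not dplyr's actual implementation, only an illustration of the same principle: the expression is captured unevaluated and then evaluated with the data frame's columns in scope. The helper name add_cols_nse is made up for this example.

```r
# Hypothetical helper illustrating non-standard evaluation in base R.
# This is not how dplyr is implemented, only the same general idea.
add_cols_nse <- function(data, expr) {
  captured <- substitute(expr)          # capture `mpg + wt` without evaluating it
  eval(captured, data, parent.frame())  # evaluate with the columns of `data` in scope
}

# mpg and wt are resolved as columns of mtcars, not as ordinary variables.
result <- add_cols_nse(mtcars, mpg + wt)
head(result, 3)
# [1] 23.620 23.875 25.120
```

Because the expression is captured as written, passing a variable that merely holds a column name would not work here either, which is the same difficulty the rest of this post deals with.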
The goal is to write a function f that takes the names of the mtcars columns you want to add up as strings, and calls mutate. Note that we also want to be able to set the new column name. A first naive approach might be:
    f = function(col1, col2, new_col_name) {
      mtcars %>% mutate(new_col_name = col1 + col2)
    }
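Calling this version shows two separate symptoms. The sketch below assumes dplyr is installed; the variant g and the variable naive_result are made up for this illustration and are not part of the original post.

```r
library(dplyr)

# Same body as the naive function above.
f <- function(col1, col2, new_col_name) {
  mtcars %>% mutate(new_col_name = col1 + col2)
}

# The call fails: the strings "wt" and "mpg" end up in the addition as strings.
# tryCatch() captures the error instead of stopping the script.
naive_result <- tryCatch(f("wt", "mpg", "hahaaa"), error = function(e) e)
inherits(naive_result, "error")
# [1] TRUE

# Hypothetical variant g(): the column names are hard-coded, so only the
# naming issue remains.
g <- function(new_col_name) {
  mtcars %>% mutate(new_col_name = mpg + wt)
}
"new_col_name" %in% names(g("hahaaa"))
# [1] TRUE
"hahaaa" %in% names(g("hahaaa"))
# [1] FALSE
```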
The problem is that col1 and col2 are not replaced by the column names they contain; instead, dplyr tries looking for columns called col1 and col2 in mtcars. In addition, the name of the new column will be new_col_name, and not the content of new_col_name. To get around non-standard evaluation, you can use the lazyeval package. The following function does what we expect:
    library(lazyeval)
    f = function(col1, col2, new_col_name) {
      mutate_call = lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2))
      mtcars %>% mutate_(.dots = setNames(list(mutate_call), new_col_name))
    }
    head(f('wt', 'mpg', 'hahaaa'))

       mpg cyl disp  hp drat    wt  qsec vs am gear carb hahaaa
    1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 23.620
    2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 23.875
    3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 25.120
    4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 24.615
    5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 22.140
    6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 21.560
The important parts here are, given the call to f above:
- lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2)): this creates the expression wt + mpg.
- mutate_(mutate_call): mutate_ is the version of mutate that uses standard evaluation (SE).
- setNames(list(mutate_call), new_col_name): sets the output name to the content of new_col_name, i.e. hahaaa.
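For readers on newer dplyr releases, where the underscore verbs such as mutate_ are deprecated, roughly the same function can be sketched with tidy evaluation. This is an alternative to the lazyeval approach rather than a rewrite of it, and it assumes dplyr 0.7 or later; the name f_tidy is made up for illustration.

```r
library(dplyr)

# Hypothetical tidy-evaluation counterpart of f(); column names arrive as strings.
f_tidy <- function(col1, col2, new_col_name) {
  mtcars %>%
    mutate(
      # `.data[[...]]` looks a column up by its string name;
      # `!!new_col_name :=` uses the string's content as the new column name.
      !!new_col_name := .data[[col1]] + .data[[col2]]
    )
}

out <- f_tidy("wt", "mpg", "hahaaa")
head(out$hahaaa, 3)
# [1] 23.620 23.875 25.120
```

The division of labour is the same as in the lazyeval version: one piece turns the input strings into column references, another sets the output name from the content of new_col_name.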