How can we enclose the previous operations inside a function? Simple! Using do_() (the SE version of do()) and the interp() function of the lazyeval package.
lazyeval is an R package written and maintained by Hadley Wickham. It provides a new approach to Non Standard Evaluation (NSE) in R. The difference between the SE and NSE approaches is the quoting of input variable names: NSE functions accept bare (unquoted) names, while SE functions expect them quoted, for example as strings or formulas. NSE is suitable for interactive use (see the previous paragraph) but not for programming, for which the SE approach is recommended.
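The quoting difference can be sketched in base R, without dplyr or lazyeval. The data frame `df` and the helper `filter_above()` below are hypothetical names made up for this illustration; they are not part of the original example.

```r
# Hypothetical toy data, not the ds dataset used in this article
df <- data.frame(x = c(1, 5, 9), group = c("a", "b", "a"))

# NSE: the column name is typed unquoted and captured by subset().
# Handy at the console, awkward when the name is stored in a variable.
nse_result <- subset(df, x > 2)

# SE: the column name is an ordinary string, so it can be a function argument
filter_above <- function(data, var_name, threshold) {
  data[data[[var_name]] > threshold, , drop = FALSE]
}
se_result <- filter_above(df, var_name = "x", threshold = 2)

print(nse_result)
print(se_result)
```

Both calls keep the same rows; only the SE version can be driven by a column name that is known at run time.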
Install and load lazyeval, if you haven’t already done so.

    install.packages("lazyeval")
    library(lazyeval)
interp() helps to build an expression up from a mixture of constants and variables, to be passed to the .dots argument of dplyr verbs. For more details see the Non Standard Evaluation vignette.
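As a minimal sketch of what interp() returns, the snippet below interpolates a column name once as a variable and once as a constant. It assumes the lazyeval package is installed; `var_name` and the use of the built-in mtcars dataset are choices made for this illustration only.

```r
library(lazyeval)

var_name <- "mpg"  # hypothetical column name held in a string

# as.name() turns the string into a symbol, so it is inserted as a variable
as_variable <- interp(~ mean(v), v = as.name(var_name))

# Without as.name() the string is inserted as a constant
as_constant <- interp(~ .[[v]], v = var_name)

print(as_variable)
print(as_constant)

# The right-hand side of the formula is an ordinary expression that can be
# evaluated against a data frame
mean_mpg <- eval(as_variable[[2]], mtcars)
print(mean_mpg)
```

Which form to use depends on how the expression refers to the column: a symbol when the column is named directly, a string when it is extracted with `[[`.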
In the following example interp() is used to build the expression passed to the .dots argument of group_by_() (the SE version of group_by()), which consists of the name of the grouping variable. It is also used to build the expression passed to the .dots argument of do_(). This expression is the function call itself, with its arguments in parentheses.
    fun <- function(data, x_var_name, y_var_name, group_var_name){
      # group_by_() .dots argument: the grouping variable, inserted as a symbol
      group_dots <- interp(~ group_var_name, group_var_name = as.name(group_var_name))
      # do_() .dots argument: the column names, inserted as string constants
      do_dots <- interp(~ my_fun(x = .[[x_var_name]], y = .[[y_var_name]]),
                        x_var_name = x_var_name, y_var_name = y_var_name)
      # Operations
      out <- data %>%
        group_by_(.dots = group_dots) %>%
        do_(.dots = do_dots)
      return(out)
    }
Let us apply the previous function to the ds dataset:
    fun(data = ds, x_var_name = "x", y_var_name = "y", group_var_name = "group")
    Source: local data frame [3 x 3]
    Groups: group [3]

       group    res_x     res_y
      (fctr)    (dbl)     (dbl)
    1      a 5.005825  9.167546
    2      b 5.022282  8.683619
    3      c 5.025586 11.240558
Other Examples
do() is often used to fit models and to display the results. Look at the following functions!
Let us define a function that fits a linear model and returns its coefficients as a data frame, and apply it to ds by group:
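    # Function that fits a linear model and returns its coefficients as a data frame.
    # x and y are unquoted column names of data; the model fitted is x ~ y.
    my_fun_2 <- function(data, x, y){
      # Build the formula from the arguments, so that the function also works
      # for columns that are not literally called x and y
      f <- eval(substitute(x ~ y))
      mod <- lm(formula = f, data = data)
      out <- data.frame(intercept = coef(mod)[[1]], slope = coef(mod)[[2]])
      return(out)
    }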
    # Apply my_fun_2() to ds by group (unnamed argument to do(), NSE version)
    ds %>% group_by(group) %>% do(my_fun_2(x = x, y = y, data = .))
    Source: local data frame [3 x 3]
    Groups: group [3]

       group intercept       slope
      (fctr)     (dbl)       (dbl)
    1      a  2.939123  0.03637955
    2      b  3.149110 -0.07302733
    3      c  3.249187 -0.09946141
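The group-wise fit above follows a split, apply, combine pattern, which can be sketched in base R to make explicit what group_by() and do() are doing. This is only an illustration under stated assumptions: `toy` is a hypothetical simulated stand-in for ds (whose definition is not shown in this section), and `fit_one_group()` is a name invented here, so the numbers will differ from the output above.

```r
set.seed(1)

# Hypothetical stand-in for ds: three groups with numeric columns x and y
toy <- data.frame(
  group = rep(c("a", "b", "c"), each = 20),
  x = rnorm(60, mean = 5),
  y = rnorm(60, mean = 10)
)

# Fit x ~ y on one group and return the coefficients as a one-row data frame
fit_one_group <- function(d) {
  mod <- lm(x ~ y, data = d)
  data.frame(group = d$group[1],
             intercept = coef(mod)[[1]],
             slope = coef(mod)[[2]])
}

# Split by group, fit each piece, then bind the one-row results together
pieces <- split(toy, toy$group)
coefs <- do.call(rbind, lapply(pieces, fit_one_group))
rownames(coefs) <- NULL

print(coefs)
```

The result has one row per group, like the do() output, which is why do() expects the function it calls to return a data frame.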
Let us now enclose the previous operations inside a function and apply it to ds by group:
    # Enclose the previous operations inside a function
    fun_2 <- function(data, x_var_name, y_var_name, group_var_name){
      # group_by_() .dots argument
      group_dots <- interp(~ group_var_name, group_var_name = as.name(group_var_name))
      # do_() .dots argument
      do_dots <- interp(~ my_fun_2(data = ., x = x_var_name, y = y_var_name),
                        x_var_name = as.name(x_var_name), y_var_name = as.name(y_var_name))
      # Operations
      res <- data %>%
        group_by_(.dots = group_dots) %>%
        do_(.dots = do_dots)
      return(res)
    }
    # Apply fun_2() (SE version) to ds by group
    fun_2(data = ds, x_var_name = "x", y_var_name = "y", group_var_name = "group")
    Source: local data frame [3 x 3]
    Groups: group [3]

       group intercept       slope
      (fctr)     (dbl)       (dbl)
    1      a  2.939123  0.03637955
    2      b  3.149110 -0.07302733
    3      c  3.249187 -0.09946141