Building views with R
[Here you can see the Building views with R cheat sheet at a full resolution]
Queries
In database theory, a query is a request for data or information from a database table or combination of tables.
Since the arrival of dplyr, R has had something that conceptually resembles a query quite closely:
library(dplyr)
library(pryr)
mtcars %>% as_tibble() %>% group_by(cyl) %>% summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg))
## # A tibble: 3 × 3
##     cyl mean_mpg   sd_mpg
##   <dbl>    <dbl>    <dbl>
## 1     4 26.66364 4.509828
## 2     6 19.74286 1.453567
## 3     8 15.10000 2.560048
What I particularly appreciate about dplyr is the possibility of building my query as a step-by-step sequence of R statements that I can test progressively, one step at a time.
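As a minimal sketch of this step-by-step workflow, the same query can be split into stages and checked along the way. The intermediate names by_cyl and result are only illustrative and are not used elsewhere in this post.

```r
library(dplyr)

# Step 1: group the rows by number of cylinders
by_cyl <- mtcars %>% group_by(cyl)

# Step 2: check the intermediate result before going further
n_groups(by_cyl)

# Step 3: add the summary once the grouping looks right
result <- by_cyl %>% summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg))
result
```

Each stage is an ordinary object, so it can be printed or inspected before the next verb is appended.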
Views
Again in database theory, a view is the result set of a stored query on the data, which the database users can query just as they would query a table.
I would like to have something similar to a view in R.
As far as I know, I can achieve this goal in three ways:
- Function makeActiveBinding
- Operator %<a-% from package pryr
- My proposed %>>% operator
Function makeActiveBinding()
Function makeActiveBinding(sym, fun, env) installs a function in an environment env so that getting the value of sym calls fun with no arguments.
As a basic example, I can actively bind a function that simulates a die to a symbol named dice:
makeActiveBinding("dice", function() sample(1:6, 1), env = globalenv())
so that:
replicate(5, dice)
## [1] 5 1 6 2 3
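A small sketch of the mechanics, using base R only, may help here. It assumes a private environment e (a name introduced just for this example) so that nothing is added to the global workspace. It checks that the symbol is an active binding and shows that, because the bound function takes no arguments, an attempt to assign to it fails, which makes the binding behave as read-only.

```r
# A private environment keeps the example out of the global workspace
e <- new.env()

# Zero-argument function: it can be read, but it cannot accept a new value
makeActiveBinding("dice", function() sample(1:6, 1), env = e)

# TRUE: 'dice' is backed by a function rather than a stored value
bindingIsActive("dice", e)

# The function runs once per access, so each element is a fresh roll
rolls <- replicate(100, e$dice)

# Assigning calls the function with one argument (the new value);
# this function takes none, so the assignment raises an error
res <- tryCatch({ e$dice <- 3; "assigned" }, error = function(err) "read-only")
res
```

This read-only behaviour is close to what one would expect from a database view, where the data are changed through the underlying table.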
Similarly, I can wrap a dplyr expression in a function:
f <- function() {
  mtcars %>%
    group_by(cyl) %>%
    summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg))
}
and then actively bind it to a symbol:
makeActiveBinding("view", f, env = globalenv())
so that any time I access view, the result of function f() is computed again:
view
## # A tibble: 3 × 3
##     cyl mean_mpg   sd_mpg
##   <dbl>    <dbl>    <dbl>
## 1     4 26.66364 4.509828
## 2     6 19.74286 1.453567
## 3     8 15.10000 2.560048
As a result, if I change any value of mpg within mtcars, view is automatically updated:
mtcars$mpg[c(1,3,5)] <- 0
view
## # A tibble: 3 × 3
##     cyl mean_mpg   sd_mpg
##   <dbl>    <dbl>    <dbl>
## 1     4 24.59091 9.231192
## 2     6 16.74286 7.504189
## 3     8 13.76429 4.601606
Clearly, I have to admit that all of this looks quite unfriendly, at least to me.
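The automatic update comes from the query being run again at every access, not from any change tracking. The following sketch, in base R only, makes this visible with a call counter. The environment e, the private copy cars, the counter n_calls and the binding mean_by_cyl are all names assumed for this illustration; aggregate() stands in for the dplyr chain.

```r
e <- new.env()
e$cars <- mtcars[, c("cyl", "mpg")]    # private copy; the real mtcars is untouched
e$n_calls <- 0

makeActiveBinding("mean_by_cyl", function() {
  e$n_calls <- e$n_calls + 1           # count how often the query runs
  aggregate(mpg ~ cyl, data = e$cars, FUN = mean)
}, env = e)

first  <- e$mean_by_cyl                # runs the query
e$cars$mpg[c(1, 3, 5)] <- 0            # change the underlying data
second <- e$mean_by_cyl                # runs the query again on the new data

e$n_calls                              # 2: one run per access
identical(first, second)               # FALSE: the second result reflects the change
```

Note that first and second are plain data frames: copying the value of an active binding into a regular variable takes a snapshot that no longer follows the data. For an expensive query this is also a way to avoid recomputing it on every access.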
Operator %<a-%
A valid alternative, which hides the complexity of function makeActiveBinding(), is provided by operator %<a-% from package pryr:
view %<a-% {mtcars %>%
group_by(cyl) %>%
summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg))}
Again, if I change any value of mpg within mtcars, the value of view gets automatically updated:
mtcars$mpg[c(1,3,5)] <- 50
view
## # A tibble: 3 × 3
##     cyl mean_mpg    sd_mpg
##   <dbl>    <dbl>     <dbl>
## 1     4 29.13636  8.159568
## 2     6 23.88571 11.593451
## 3     8 17.33571  9.688503
Note that in this case I have to enclose the whole expression within curly brackets.
Moreover, the assignment operator %<a-% goes on the left-hand side, before my chain of dplyr statements, rather than at the end of it.
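The need for curly brackets follows from how R parses custom operators: all %op% operators share the same precedence and associate from left to right, so without brackets %<a-% would be applied to mtcars alone. A small sketch with quote(), which parses code without evaluating it (so pryr does not even need to be loaded), shows the difference. Here head() is just a stand-in for the dplyr chain, and the two variable names are illustrative.

```r
# quote() only parses the code; nothing is evaluated
without_braces <- quote(view %<a-% mtcars %>% head())
with_braces    <- quote(view %<a-% { mtcars %>% head() })

# Without braces the outermost call is the pipe,
# and its left-hand side is the whole binding expression
as.character(without_braces[[1]])   # "%>%"
deparse(without_braces[[2]])        # "view %<a-% mtcars"

# With braces the outermost call is the binding operator,
# and the whole chain is its right-hand side
as.character(with_braces[[1]])      # "%<a-%"
```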
Operator %>>%
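Finally, I would like to propose a third alternative, still based on makeActiveBinding(), that I named %>>%:
`%>>%` <- function(expr, x) {
  # Capture the target name (right-hand side) without evaluating it
  x <- substitute(x)
  # Capture the unevaluated arguments of the call, dropping the function name
  call <- match.call()[-1]
  # Build a zero-argument function whose body is the piped expression
  fun <- function() {NULL}
  body(fun) <- call$expr
  # Bind the function to the target name in the caller's environment
  makeActiveBinding(sym = deparse(x), fun = fun, env = parent.frame())
  invisible(NULL)
}
that can be used as: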
mtcars %>% group_by(cyl) %>% summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg)) %>>% view
And again, if I change the values of mpg:
mtcars$mpg[c(1,3,5)] <- 100
the content of view changes accordingly:
view
## # A tibble: 3 × 3
##     cyl mean_mpg   sd_mpg
##   <dbl>    <dbl>    <dbl>
## 1     4 33.68182 22.41624
## 2     6 31.02857 30.44321
## 3     8 20.90714 22.88454
I believe this operator offers two advantages:
- Avoids the usage of curly brackets around my dplyr expression
- Allows me to actively assign the result of my chain of dplyr statements, in a more natural way at the end of the chain
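One caveat, as far as I can tell: the function built inside %>>% is enclosed by the operator's own call frame rather than by the caller's, so a chain that refers to variables local to another function may not find them. The sketch below is a hypothetical variant, named %view>% here only to keep it apart from the original, that sets the function's environment to the caller explicitly. It uses base R only (split() and sapply() instead of dplyr), and the names demo, local_df and local_view are assumptions made for the example.

```r
# Hypothetical variant of %>>% that evaluates the expression in the caller's environment
`%view>%` <- function(expr, x) {
  env <- parent.frame()
  fun <- function() NULL
  body(fun) <- substitute(expr)   # unevaluated left-hand side becomes the body
  environment(fun) <- env         # look variables up where the operator was called
  makeActiveBinding(deparse(substitute(x)), fun, env)
  invisible(NULL)
}

demo <- function() {
  # Data that exist only inside this function
  local_df <- data.frame(g = c("a", "a", "b"), v = c(1, 3, 10))
  sapply(split(local_df$v, local_df$g), mean) %view>% local_view

  before <- local_view[["a"]]     # mean of 1 and 3
  local_df$v[1] <- 5              # change the local data
  after <- local_view[["a"]]      # mean of 5 and 3

  c(before = before, after = after)
}

demo()
```

At the top level of a script the two versions should behave the same way, since the global environment is searched in both cases.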
The post Building views with R appeared first on Quantide - R training & consulting.
