Exploring NSE: enquo, quos and …
As you get more interested in building your own custom functions, you quickly start realising that unless your functions are tidyverse friendly, standardising your code workflow becomes a problem. So, how do you make your custom functions play well with your favourite tidyverse packages? Our friendly little helpers are going to be enquo and quos. I am going to build a function that calculates the proportion and cumulative proportion of a grouping variable.
    suppressPackageStartupMessages(library(dplyr))

    prop_count <- function(df, vars){
      # capture the column passed as vars without evaluating it
      vars_col <- enquo(vars)
      print(vars_col)

      df %>%
        count(!!vars_col, sort = TRUE) %>%  # unquote the captured column
        mutate(prop_n = prop.table(n)) %>%
        mutate(cumsum_n = cumsum(prop_n))
    }

    dplyr::starwars %>% prop_count(homeworld)
    ## <quosure>
    ## expr: ^homeworld
    ## env: 000000000C4567B8
    ## # A tibble: 49 x 4
    ##    homeworld     n prop_n cumsum_n
    ##    <chr>     <int>  <dbl>    <dbl>
    ##  1 Naboo        11 0.126     0.126
    ##  2 Tatooine     10 0.115     0.241
    ##  3 <NA>         10 0.115     0.356
    ##  4 Alderaan      3 0.0345    0.391
    ##  5 Coruscant     3 0.0345    0.425
    ##  6 Kamino        3 0.0345    0.460
    ##  7 Corellia      2 0.0230    0.483
    ##  8 Kashyyyk      2 0.0230    0.506
    ##  9 Mirial        2 0.0230    0.529
    ## 10 Ryloth        2 0.0230    0.552
    ## # ... with 39 more rows
From the output we can see that a quosure is a quoted expression that keeps track of the environment it was created in. Inside the function we use bang-bang (!!) to unquote the captured column, so that count() sees homeworld rather than vars_col. What happens when we want the proportional count of multiple variables?
    dplyr::starwars %>% prop_count(homeworld, species)
    ## Error in prop_count(., homeworld, species): unused argument (species)
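We get an error because prop_count only accepts df and vars, so species arrives as an extra argument that the function does not know what to do with. We want our function to accommodate multiple grouping variables. This is where quos and ... come in. The ellipsis (...) lets a function accept any number of arguments; quos captures them as a list of quosures, and !!! splices that list back into count() as separate arguments.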
    prop_count <- function(df, ...){
      # capture every argument passed through ... as a list of quosures
      vars_col <- quos(...)
      print(vars_col)

      df %>%
        count(!!!vars_col, sort = TRUE) %>%  # splice the list in as separate columns
        mutate(prop_n = prop.table(n)) %>%
        mutate(cumsum_n = cumsum(prop_n))
    }

    dplyr::starwars %>% prop_count(homeworld, species)
    ## [[1]]
    ## <quosure>
    ## expr: ^homeworld
    ## env: 000000000BFAE918
    ##
    ## [[2]]
    ## <quosure>
    ## expr: ^species
    ## env: 000000000BFAE918
    ## # A tibble: 58 x 5
    ##    homeworld species      n prop_n cumsum_n
    ##    <chr>     <chr>    <int>  <dbl>    <dbl>
    ##  1 Tatooine  Human        8 0.0920   0.0920
    ##  2 Naboo     Human        5 0.0575   0.149
    ##  3 <NA>      Human        5 0.0575   0.207
    ##  4 Alderaan  Human        3 0.0345   0.241
    ##  5 Naboo     Gungan       3 0.0345   0.276
    ##  6 Corellia  Human        2 0.0230   0.299
    ##  7 Coruscant Human        2 0.0230   0.322
    ##  8 Kamino    Kaminoan     2 0.0230   0.345
    ##  9 Kashyyyk  Wookiee      2 0.0230   0.368
    ## 10 Mirial    Mirialan     2 0.0230   0.391
    ## # ... with 48 more rows
Now our function accommodates multiple inputs in the tidyverse fashion! If you feel like reading more about non-standard evaluation, go read the full documentation.
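If you want to poke at quosures a little before diving into the documentation, here is a minimal sketch of what one carries and what !!! does with a list of them. The helper capture_one and the data frame toy are made up for illustration and are not part of the functions above; quo_get_expr, quo_get_env and eval_tidy come from the rlang package, which dplyr builds on.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical helper, not from the post: returns the captured quosure as-is
capture_one <- function(x) enquo(x)

q <- capture_one(height / 100)

# A quosure bundles an unevaluated expression with the environment it came from
expr_part <- rlang::quo_get_expr(q)  # the call height / 100, not yet evaluated
env_part  <- rlang::quo_get_env(q)   # the environment where capture_one() was called

# Evaluating the quosure against a data frame looks up height as a column
toy <- data.frame(height = c(150, 200))  # made-up data for illustration
metres <- rlang::eval_tidy(q, data = toy)

# !!! splices a list of quosures in as separate arguments, so these two
# calls should build the same table
vars <- quos(homeworld, species)
spliced <- starwars %>% count(!!!vars, sort = TRUE)
by_hand <- starwars %>% count(homeworld, species, sort = TRUE)
```

Printing expr_part and metres shows that nothing is computed until the quosure is evaluated against data, which is why the column name can travel through a function untouched.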