wrapr 1.6.2 is now up on CRAN. We have some neat new features for R users to try (in addition to many earlier wrapr goodies).
The first is the %in_block% alternate notation for let().
The wrapr let()-block allows easy replacement of names in name-capturing interfaces (such as transform()), as we show below.
library("wrapr")

column_mapping <- qc(
  AREA_COL = Sepal.Area,
  LENGTH_COL = Sepal.Length,
  WIDTH_COL = Sepal.Width
)

# let-block notation
let(
  alias = column_mapping,
  iris %.>%
    transform(., AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
    subset(., select = qc(Species, AREA_COL)) %.>%
    head(.)
)

##   Species Sepal.Area
## 1  setosa   14.01936
## 2  setosa   11.54535
## 3  setosa   11.81239
## 4  setosa   11.19978
## 5  setosa   14.13717
## 6  setosa   16.54049
The qc() notation allowed us to specify a named vector without quotes. qc(a = b) is equivalent to c("a" = "b").
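As a small illustration of that equivalence, here is a minimal sketch comparing a qc() call with the fully quoted c() form. The variable names quoted and explicit are only for this example.

```r
library("wrapr")

# qc() quotes its arguments, so no quotation marks are needed
quoted <- qc(AREA_COL = Sepal.Area, LENGTH_COL = Sepal.Length)

# the same mapping written out with explicit strings
explicit <- c("AREA_COL" = "Sepal.Area", "LENGTH_COL" = "Sepal.Length")

# compare names and values of the two vectors
print(names(quoted) == names(explicit))
print(unname(quoted) == unname(explicit))

# without names, qc() gives a plain character vector of the quoted symbols
print(qc(mpg, hp, gear))
```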
With the %in_block% operator notation one writes the let()-block as an in-line operator supplying the mapping into a code block. The above example can now be re-written as the following.
# %in_block% notation
column_mapping %in_block% {
  iris %.>%
    transform(., AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
    subset(., select = qc(Species, AREA_COL)) %.>%
    head(.)
}

##   Species Sepal.Area
## 1  setosa   14.01936
## 2  setosa   11.54535
## 3  setosa   11.81239
## 4  setosa   11.19978
## 5  setosa   14.13717
## 6  setosa   16.54049
This notation can be handy for defining functions.
compute_area <- function(
  .data,
  area_col,
  length_col,
  width_col) c(  # End of function argument definition
    AREA_COL = area_col,
    LENGTH_COL = length_col,
    WIDTH_COL = width_col
  ) %in_block% {  # End of argument mapping block
    .data %.>%
      transform(., AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL)
  }  # End of function body block

iris %.>%
  compute_area(., 'Sepal.Area', 'Sepal.Length', 'Sepal.Width') %.>%
  compute_area(., 'Petal.Area', 'Petal.Length', 'Petal.Width') %.>%
  subset(., select = c("Species", "Sepal.Area", "Petal.Area")) %.>%
  head(.)

##   Species Sepal.Area Petal.Area
## 1  setosa   14.01936  0.2199115
## 2  setosa   11.54535  0.2199115
## 3  setosa   11.81239  0.2042035
## 4  setosa   11.19978  0.2356194
## 5  setosa   14.13717  0.2199115
## 6  setosa   16.54049  0.5340708
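Because compute_area() takes its column names as ordinary character values, the names can be computed rather than typed in. The following is a minimal sketch of that idea: it re-declares compute_area() in a compact form (so the block runs on its own) and builds the column names in a for-loop. The loop variable part and the working copy d are names chosen only for this example.

```r
library("wrapr")

# same idea as compute_area() above, written compactly
compute_area <- function(.data, area_col, length_col, width_col) {
  c(AREA_COL = area_col,
    LENGTH_COL = length_col,
    WIDTH_COL = width_col) %in_block% {
    transform(.data, AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL)
  }
}

# column names are now plain values, so we can construct them in a loop
d <- iris
for (part in c("Sepal", "Petal")) {
  d <- compute_area(d,
                    paste0(part, ".Area"),
                    paste0(part, ".Length"),
                    paste0(part, ".Width"))
}

head(d[, c("Species", "Sepal.Area", "Petal.Area")])
```

The same pattern should carry over to lapply() or to any other code that holds the column names in variables.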
We can think of the above function definition notation as having two blocks: the alias-defining block (the portion before "%in_block%") and the templated function body (the portion after "%in_block%"). Notice how easy it is to use this notation to convert a non-standard (name/code-capturing) interface into a value-oriented interface. The point is that value-oriented interfaces are much more reusable and easier to program over (use in for-loops, applies, and functions).
The second new feature is the orderv() function, a value-oriented adapter for base::order(). orderv() lets us use a vector of column names to compute an ordering permutation for a data.frame. We can use it as we show below.
library("wrapr")

sort_columns <- qc(mpg, hp, gear)
ordering <- orderv(mtcars[ , sort_columns, drop = FALSE],
                   decreasing = TRUE,
                   method = "radix")
head(mtcars[ordering, , drop = FALSE])

##                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
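To make the idea of a value-oriented adapter for base::order() concrete, here is a base-R sketch of one way such an adapter could be written. This is not wrapr's implementation, and order_by_columns() is a hypothetical helper name used only for illustration. It passes the selected columns to order() through do.call().

```r
# hypothetical helper: order a data.frame by the columns named in `cols`
order_by_columns <- function(d, cols, ...) {
  # unname() so column names are not mistaken for arguments of order()
  key_columns <- unname(as.list(d[, cols, drop = FALSE]))
  do.call(order, c(key_columns, list(...)))
}

sort_columns <- c("mpg", "hp", "gear")
ordering <- order_by_columns(mtcars, sort_columns,
                             decreasing = TRUE, method = "radix")
head(mtcars[ordering, , drop = FALSE])
```

The design point is the same as in the text: the sort columns arrive as a character vector, so they can come from a configuration value or a function argument instead of being typed into the call.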
Of course, we also have all the steps wrapped in a convenient function: sortv().
mtcars %.>%
  sortv(., sort_columns, decreasing = TRUE, method = "radix") %.>%
  head(.)

##                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
For details on method = "radix" please see our earlier tip here.
A third new feature is mk_formula(). mk_formula() is used to build simple formulas for modeling tasks (which may have a large number of variables) without any string processing or parsing.
Our usual advice for building simple formulas has been to use the paste()-based methods exhibited in "R Tip: How to Pass a formula to lm". This remains good advice. However, mk_formula() is a more concise and more hygienic alternative. An example is given below.
# specifications of how to model,
# coming from somewhere else
outcome <- "mpg"
variables <- c("cyl", "disp", "hp", "carb")

# our modeling effort,
# fully parameterized!
f <- wrapr::mk_formula(outcome, variables)
print(f)

## mpg ~ cyl + disp + hp + carb

model <- lm(f, data = mtcars)
print(model)

## 
## Call:
## lm(formula = f, data = mtcars)
## 
## Coefficients:
## (Intercept)          cyl         disp           hp         carb  
##   34.021595    -1.048523    -0.026906     0.009349    -0.926863
The above notation is good for programming over modeling tasks.
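As a sketch of what programming over modeling tasks can look like, the example below fits one model per candidate variable list. The list variable_sets and its entries are hypothetical specifications made up for this illustration. Only mk_formula() as called above and base R functions are used.

```r
library("wrapr")

# hypothetical model specifications, e.g. read from a config file
variable_sets <- list(
  small = c("cyl", "disp"),
  large = c("cyl", "disp", "hp", "carb")
)

# fit one linear model per specification
models <- lapply(variable_sets, function(variables) {
  f <- mk_formula("mpg", variables)
  lm(f, data = mtcars)
})

# adjusted R-squared for each specification
print(sapply(models, function(m) summary(m)$adj.r.squared))
```

Since the outcome and variable names are plain character values, the same loop works unchanged when the specifications come from a file or from another function.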