Better R Code with wrapr Dot Arrow
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Our R
package wrapr
supplies a “piping operator” that we feel is a real improvement in R code piped-style coding.
The idea is: with wrapr
‘s “dot arrow” pipe “%.>%
” the expression “A %.>% B
” is treated very much like “{. <- A; B}
". In particular this lets users think of "A %.>% B(.)
" as a left-to-right way to write "B(A)
" (i.e. under the convention of writing-out the dot arguments, the pipe looks a bit like left to right function composition, call this explicit dot notation).
This sort of notation becomes useful when we compose many steps. Some consider "A %.>% B(.) %.>% C(.) %.>% D(.)
" to be easier to read and easier to maintain than "D(C(B(A)))
".
In terms of the popular magrittr
pipe: wrapr
dot arrow "A %.>% B
" behaves a lot like "A %>% {B}
" ("{}
" on the right being magrittr
‘s special notation to treat the right-hand side as an expression instead of as a function call). Like magrittr
, wrapr
dot arrow can be used to write "sin(5)
" as "5 %.>% sin
", though the preferred wrapr
notation is the explicit dot notation: "5 %.>% sin(.)
". Unlike magrittr
, wrapr
deliberately does not accept "only parenthesis notation" "5 %>% sin()
" (our view is: if the user goes to all the trouble to tell us there are no arguments, wrapr
dot arrow should take their word for it).
wrapr
dot arrow strives to be regular (have few different operating modes), giving the user a lot options and a lot of expressive power. Please see here for a small study we conducted on this.
Lets show a few examples that all share a secret feature that we will reveal at the end of this article (note, any example prior to here is not part of the secret).
First we can pipe into package qualified functions (with or without the explicit dot notation).
library("wrapr") 5 %.>% base::sin
## [1] -0.9589243
In the function notation (without the dot) argument names are preserved.
library("wrapr") d <- 5 d %.>% substitute
## d
d %.>% base::substitute
## d
Also, piping into functions held as list
items is supported.
library("wrapr") obj <- list(f = sin) 5 %.>% obj$f
## [1] -0.9589243
5 %.>% obj[['f']]
## [1] -0.9589243
And piping into parenthesized expressions is supported, so it is safe to add clarifying parenthesis when one wishes.
library("wrapr") 5 %.>% (1 + .)
## [1] 6
In the dot notation piping into nested functions is handled smoothly.
# From http://piccolboni.info/2015/09/pipe-operator-for-R.html library("wrapr") 4 %.>% sqrt(sqrt(.))
## [1] 1.414214
wrapr
dot arrow is compatible with dplyr
"pronoun" notation (the ".data
" and ".env
" qualifiers below).
# Adapted from https://github.com/tidyverse/dplyr/issues/3286 suppressPackageStartupMessages(library("dplyr")) library("wrapr") cyl <- 4 mtcars %.>% filter(., .data$cyl == .env$cyl) %.>% head(.)
## mpg cyl disp hp drat wt qsec vs am gear carb ## 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 ## 2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 ## 3 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 ## 4 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 ## 5 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 ## 6 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
wrapr
dot arrow can even be used inside dplyr
expressions.
suppressPackageStartupMessages(library("dplyr")) library("wrapr") mtcars %.>% mutate(., someMean = . %.>% select(., cyl, disp, carb) %.>% rowMeans(.)) %.>% head(.)
## mpg cyl disp hp drat wt qsec vs am gear carb someMean ## 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 56.66667 ## 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 56.66667 ## 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 37.66667 ## 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 88.33333 ## 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 123.33333 ## 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 77.33333
Some nested items can be subtle, for in the example below the dot in "rev(.)[2]
" is actually not a top-level argument as the "rev(.)
" expression is technically an argument to the []
operator. Unlike magrittr
, wrapr
does not change its substitution behavior based on the presence or absence of an argument named ".
" at the top-level of expressions, making it easy to reason about expression semantics.
library("wrapr") c(10, 20, 30) %.>% rev(.)[1]
## [1] 30
wrapr
dot arrow is compatible with data.table
, even with the common addition of visibility controlling []
-calls.
library("data.table") library("wrapr") d <- data.table(x = c(2, 1, 3)) d %.>% .[, y := x + 1] %.>% setorder(., x)[]
## x y ## 1: 1 2 ## 2: 2 3 ## 3: 3 4
(Note: if the wrapr
package is attached before data.table
their will be a warning regarding ":=
". This is actually not a problem as data.table
uses the ":=
" symbol in its own parsing and in its own package context where it can not be confused with wrapr
‘s definition.)
Another subtle form of expression nesting are operators such as %in%
. In expressions such as "filter(d, a %in% .)
", the dot is not an argument to filter()
but an argument to %in%
.
# adapted from: https://stackoverflow.com/questions/46728387/piping-with-dot-inside-dplyrfilter suppressPackageStartupMessages(library("dplyr")) library("wrapr") d <- data.frame(a = c(1,2,3), b = c(4,5,6)) c(2,2) %.>% filter(d, a %in% .)
## a b ## 1 2 5
c(2,2) %.>% dplyr::filter(d, a %in% .)
## a b ## 1 2 5
These are all of our examples. We are now ready to share their common secret.
None of the above examples work when translated into
magrittr
pipelines (unless one adds braces and explicit dot-arguments).
When we were not thinking about it deeply all of the above code likely seemed reasonable. In fact it was reasonable, it just doesn’t happen to follow magrittr
conventions.
We call out just a couple to show the issues.
library("magrittr") 5 %>% base::sin
## Error in .::base: unused argument (sin)
library("magrittr") d <- 5 d %>% substitute
## value
The above illustrates a problem we run into in promoting wrapr
dot arrow: the misconception that the magrittr
pipe leaves no room for improvement.
Our impression is magrittr
users eventually learn and internalize what variations work and do not work for magrittr
, and then learn to limit their coding to stay in the safe magrittr
fragment. In our opinion once one learns the one limitation of wrapr
dot arrow (the requirement of explicit dot arguments, due to the expression orientation of the pipe) one can use dot arrow as a much more powerful pipe that supports many more coding patterns (such piping into function references, as we showed above).
For those that are interested we have tutorials and formal documentation.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.