Better R Code with wrapr Dot Arrow

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers.]

Our R package wrapr supplies a “piping operator” that we feel is a real improvement for pipe-style coding in R.

The idea is: with wrapr's "dot arrow" pipe "%.>%" the expression "A %.>% B" is treated very much like "{. <- A; B}". In particular this lets users think of "A %.>% B(.)" as a left-to-right way to write "B(A)" (i.e. under the convention of writing out the dot arguments, the pipe looks a bit like left-to-right function composition; we call this explicit dot notation).
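
To make the "{. <- A; B}" reading concrete, here is a minimal base R sketch. The operator name "%dot%" is hypothetical and this is not wrapr's actual implementation; it only illustrates the idea of binding the left-hand value to "." and then evaluating the right-hand side as an expression.

```r
# Hypothetical toy operator (NOT wrapr's implementation): bind the left-hand
# value to "." in a fresh environment, then evaluate the right-hand expression.
`%dot%` <- function(lhs, rhs) {
  env <- new.env(parent = parent.frame())
  assign(".", lhs, envir = env)
  eval(substitute(rhs), envir = env)
}

piped  <- 5 %dot% sin(.)        # explicit dot notation
block  <- { . <- 5; sin(.) }    # the block form described above
direct <- sin(5)                # ordinary function call

print(c(piped = piped, block = block, direct = direct))
```

All three values should agree. Note that this toy version only covers the explicit dot notation; it makes no attempt to handle a bare function name on the right-hand side.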

This sort of notation becomes useful when we compose many steps. Some consider "A %.>% B(.) %.>% C(.) %.>% D(.)" to be easier to read and easier to maintain than "D(C(B(A)))".
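
As a small illustration of the two styles (the particular functions and the input value 16 are our own choice for this sketch), the same computation can be written as a pipeline and as nested calls:

```r
library("wrapr")

# Left-to-right pipeline with explicit dots
piped <- 16 %.>% sqrt(.) %.>% sqrt(.) %.>% log2(.)

# The same computation written as nested calls (read inside-out)
nested <- log2(sqrt(sqrt(16)))

print(piped)
print(nested)
```

The pipeline lists the steps in the order they are applied, while the nested form has to be read from the innermost call outward.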

In terms of the popular magrittr pipe: wrapr dot arrow "A %.>% B" behaves a lot like "A %>% {B}" ("{}" on the right being magrittr's special notation to treat the right-hand side as an expression instead of as a function call). Like magrittr, wrapr dot arrow can be used to write "sin(5)" as "5 %.>% sin", though the preferred wrapr notation is the explicit dot notation: "5 %.>% sin(.)". Unlike magrittr, wrapr deliberately does not accept the "parentheses only" notation "5 %.>% sin()" (our view is: if the user goes to all the trouble to tell us there are no arguments, wrapr dot arrow should take their word for it).
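
A short side-by-side sketch of this comparison follows, assuming both packages are installed. Because the exact outcome of the "parentheses only" form under wrapr may depend on the installed version, the sketch captures whatever happens rather than assuming a specific message.

```r
library("magrittr")
library("wrapr")

# magrittr: braces make the right-hand side an expression in "."
m <- 5 %>% { sin(.) }

# wrapr: the right-hand side is already treated as an expression in "."
w <- 5 %.>% sin(.)

print(c(magrittr = m, wrapr = w))

# "Parentheses only" form: magrittr inserts the left-hand value as the
# first argument.
m_paren <- 5 %>% sin()
print(m_paren)

# With wrapr we expect this form to be refused, so we capture the error
# message (if any) instead of letting it stop the script.
w_paren <- tryCatch(5 %.>% sin(), error = function(e) conditionMessage(e))
print(w_paren)
```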

wrapr dot arrow strives to be regular (have few different operating modes), giving the user a lot of options and a lot of expressive power. Please see here for a small study we conducted on this.

Let's show a few examples that all share a secret feature that we will reveal at the end of this article (note, any example prior to here is not part of the secret).

First we can pipe into package qualified functions (with or without the explicit dot notation).

library("wrapr")

5 %.>% base::sin
## [1] -0.9589243

In the function notation (without the dot) argument names are preserved.

library("wrapr")

d <- 5

d %.>% substitute
## d
d %.>% base::substitute
## d

Also, piping into functions held as list items is supported.

library("wrapr")

obj <- list(f = sin)

5 %.>% obj$f
## [1] -0.9589243
5 %.>% obj[['f']]
## [1] -0.9589243

And piping into parenthesized expressions is supported, so it is safe to add clarifying parentheses when one wishes.

library("wrapr")

5 %.>% (1 + .)
## [1] 6

In the dot notation piping into nested functions is handled smoothly.

# From http://piccolboni.info/2015/09/pipe-operator-for-R.html
library("wrapr")

4 %.>% sqrt(sqrt(.))
## [1] 1.414214

wrapr dot arrow is compatible with dplyr "pronoun" notation (the ".data" and ".env" qualifiers below).

# Adapted from https://github.com/tidyverse/dplyr/issues/3286
suppressPackageStartupMessages(library("dplyr"))
library("wrapr")

cyl <- 4

mtcars %.>% 
  filter(., .data$cyl == .env$cyl) %.>% 
  head(.)
##    mpg cyl  disp hp drat    wt  qsec vs am gear carb
## 1 22.8   4 108.0 93 3.85 2.320 18.61  1  1    4    1
## 2 24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2
## 3 22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2
## 4 32.4   4  78.7 66 4.08 2.200 19.47  1  1    4    1
## 5 30.4   4  75.7 52 4.93 1.615 18.52  1  1    4    2
## 6 33.9   4  71.1 65 4.22 1.835 19.90  1  1    4    1

wrapr dot arrow can even be used inside dplyr expressions.

suppressPackageStartupMessages(library("dplyr"))
library("wrapr")

mtcars %.>%
  mutate(., 
         someMean = . %.>% 
           select(., cyl, disp, carb) %.>% 
           rowMeans(.)) %.>% 
  head(.)
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb  someMean
## 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  56.66667
## 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  56.66667
## 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  37.66667
## 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  88.33333
## 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 123.33333
## 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  77.33333

Some nested items can be subtle. For example, in the code below the dot in "rev(.)[1]" is actually not a top-level argument, as the "rev(.)" expression is technically an argument to the [] operator. Unlike magrittr, wrapr does not change its substitution behavior based on the presence or absence of an argument named "." at the top level of expressions, making it easy to reason about expression semantics.

library("wrapr")

c(10, 20, 30) %.>% rev(.)[1]
## [1] 30
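
To illustrate that the same rule applies wherever the dot appears, here is a small sketch (the variable names and the third expression are our own) comparing the dot at the top level, nested one level down, and nested two levels down against the corresponding plain R expressions.

```r
library("wrapr")

v <- c(10, 20, 30)

# Dot as a top-level argument
top <- v %.>% rev(.)

# Dot nested inside an argument of `[`
nested <- v %.>% rev(.)[1]

# Dot nested two levels down
deeper <- v %.>% sum(rev(.)[1:2])

print(top)
print(nested)
print(deeper)
```

In each case the result should match what one gets by replacing "." with "v" in the right-hand expression.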

wrapr dot arrow is compatible with data.table, even with the common addition of visibility controlling []-calls.

library("data.table")
library("wrapr")

d <- data.table(x = c(2, 1, 3))

d %.>% 
  .[, y := x + 1] %.>% 
  setorder(., x)[]
##    x y
## 1: 1 2
## 2: 2 3
## 3: 3 4

(Note: if the wrapr package is attached before data.table there will be a warning regarding ":=". This is actually not a problem, as data.table uses the ":=" symbol in its own parsing and in its own package context, where it cannot be confused with wrapr's definition.)

Another subtle form of expression nesting involves operators such as %in%. In expressions such as "filter(d, a %in% .)", the dot is not an argument to filter() but an argument to %in%.

# adapted from: https://stackoverflow.com/questions/46728387/piping-with-dot-inside-dplyrfilter
suppressPackageStartupMessages(library("dplyr"))
library("wrapr")

d <- data.frame(a = c(1,2,3),
                b = c(4,5,6))
c(2,2) %.>% filter(d, a %in% .)
##   a b
## 1 2 5
c(2,2) %.>% dplyr::filter(d, a %in% .)
##   a b
## 1 2 5

These are all of our examples. We are now ready to share their common secret.

None of the above examples work when translated into magrittr pipelines (unless one adds braces and explicit dot-arguments).

When we were not thinking about it deeply, all of the above code likely seemed reasonable. In fact it was reasonable; it just doesn’t happen to follow magrittr conventions.

We call out just a couple to show the issues.

library("magrittr")

5 %>% base::sin
## Error in .::base: unused argument (sin)

library("magrittr")

d <- 5

d %>% substitute
## value

The above illustrates a problem we run into in promoting wrapr dot arrow: the misconception that the magrittr pipe leaves no room for improvement.

Our impression is that magrittr users eventually learn and internalize which variations work and do not work for magrittr, and then learn to limit their coding to stay in the safe magrittr fragment. In our opinion, once one learns the one limitation of wrapr dot arrow (the requirement of explicit dot arguments, due to the expression orientation of the pipe), one can use dot arrow as a much more powerful pipe that supports many more coding patterns (such as piping into function references, as we showed above).

For those who are interested, we have tutorials and formal documentation.
