
Monads in R

[This article was first published on rstats on Irregularly Scheduled Programming, and kindly contributed to R-bloggers.]

In this post I describe a useful programming pattern that I implemented, and hopefully provide a gentle introduction to the idea of monads.

The motivation for all of this was that I had a {dplyr} pipeline as part of a {shiny} app that queries a database and I wanted to “record” what steps were in that pipeline so that I could offer them as a way to ‘reproduce’ the query. Some of the steps might be user-defined via the UI, so it was a little more complicated than just a hardcoded query.

One quick-and-dirty solution that might come to mind would be to make a with_logging() function that takes an expression, writes a text-representation of it to a file or a global, then evaluates the expression. This would probably work, but it means that every step of the pipeline needs to be wrapped in that. Not the worst, but I had a feeling I knew of something more suitable. I’ve been trying to learn Haskell this year, and so far it’s going sort of okay, but I’m taking a detour through Elm which has most of the same syntax but less of the hardcore ‘maths’ constructs.
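To make the quick-and-dirty option concrete, here is a rough sketch of what such a helper might look like. The names with_logging() and log_env are purely illustrative (they are not part of any package), and the log is kept in an environment rather than a file to keep the example self-contained.

```r
# Hypothetical helper: record the text of an expression, then evaluate it.
# `log_env` is an illustrative stand-in for "a file or a global".
log_env <- new.env()
log_env$steps <- character()

with_logging <- function(expr) {
  # capture the unevaluated expression and store its text
  e <- substitute(expr)
  log_env$steps <- c(log_env$steps, paste(deparse(e), collapse = " "))
  # evaluate in the caller's environment so its variables are visible
  eval(e, envir = parent.frame())
}

a <- with_logging(sqrt(16))
b <- with_logging(a + 1)

log_env$steps
```

The drawback is visible straight away: every single step has to be wrapped by hand, and the log lives in shared state outside the values being computed.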

Returning readers may have seen me use the term ‘monadic’ in the context of APL where it means that a function ‘takes one argument’ (as compared to ‘dyadic’ which takes two) and I believe this definition predates the mathematical one I’m going to use for the rest of this post.

‘Monad’ is a term often best avoided in conversation, and is often described in overly mathematical terms, the “meme” definition being the category theory version which states

“a monad is just a monoid in the category of endofunctors”

which is mostly true, but also unnecessary. Nonetheless, it’s an extremely useful pattern that comes up a lot in functional programming.

This blog post does a great job of walking through the more practical definition, and it has “translations” into several programming languages including JavaScript and Python.

Basically, map applies some function to some values. flatMap does the same, but first “reaches inside” a context to extract some inner values, and after applying the function, re-wraps the result in the original context.

One big advantage of this is that the ‘purity’ of the function remains: you always get the same output for the same input. On top of that, you can request that some input/output operation be performed, which is how ‘pure’ languages still manage to communicate with the outside world and not just heat up the CPU for no reason.

The enlightening example for me is a List – if we have some values and want to apply some function to them, we can do that with, e.g.

f <- function(x) x^2
Map(f, c(2, 4, 6))
## [[1]]
## [1] 4
## 
## [[2]]
## [1] 16
## 
## [[3]]
## [1] 36

and if we have a ‘flat’ list, this still works

Map(f, list(2, 4, 6))
## [[1]]
## [1] 4
## 
## [[2]]
## [1] 16
## 
## [[3]]
## [1] 36

but what if we have an ‘outer context’ list?

Map(f, list(c(2, 3), c(4, 5, 6)))
## [[1]]
## [1] 4 9
## 
## [[2]]
## [1] 16 25 36

In this case, because f is vectorised, Map sends each vector to f and returns one result per element of the outer list. What if we have a list in the inner context?

Map(f, list(list(2, 3), list(4, 5, 6)))
## Error in x^2: non-numeric argument to binary operator

This fails because f(list(2, 3)) fails (it doesn’t know how to deal with an argument which is a list).

Instead, we can use a version of ‘map’ that first reaches inside the outer list context, concatenates what’s inside, applies the function, then re-wraps the result in a new, flat list

fmap <- function(x, f) {
  list(f(unlist(x)))
}
fmap(list(list(2, 3), list(4, 5, 6)), f)
## [[1]]
## [1]  4  9 16 25 36

This is the essence of a monad - something that supports such an fmap operation that performs the mapping inside the context (and potentially some other operations, which we’ll get to). There are various patterns which benefit from such a context, and this vignette describes an implementation of several of these via the {monads} package.

The fmap operation is so common that it’s typical to find it presented as an infix function, similar to how pipes work in R

list(list(2, 3), list(4, 5, 6)) |> fmap(f)
## [[1]]
## [1]  4  9 16 25 36

and we can go one step further by defining a new pipe which is just a different syntax for this

x |> fmap(f)

x %>>=% f

This infix function borrows from Haskell’s >>= (pronounced “bind”), which is so fundamental that it forms part of the language’s logo

The Haskell logo
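As a rough illustration of the "different syntax" idea, an infix version of the toy fmap from earlier can be defined in a couple of lines. The operator name %bind% is made up here so that it doesn't mask the package's own %>>=%, and unlike the package's operator (which accepts a call such as timestwo()), this simplified sketch takes a bare function on the right-hand side.

```r
f <- function(x) x^2

# the toy flatMap from earlier: unwrap, apply, re-wrap in a flat list
fmap <- function(x, f) {
  list(f(unlist(x)))
}

# hypothetical infix form; %bind% is an illustrative name only
`%bind%` <- function(x, f) fmap(x, f)

# custom infix operators are left-associative, so steps chain left to right
res <- list(list(2, 3), list(4, 5, 6)) %bind% f %bind% sqrt
res
```

Because each step returns the same kind of wrapped list, the output of one %bind% is always a valid input to the next.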

With all that in mind, here’s how it looks in my (perhaps simplistic) implementation which you can get from GitHub here

library(monads)
{monads} hex logo

Additionally, some toy helper functions are defined in this package for demonstrating application of functions, e.g.

timestwo(4)
## [1] 8
square(5)
## [1] 25
add_n(3, 4)
## [1] 7

List

As per the example above, the List monad wraps values (which may be additional lists) and when flatMapped the results are ‘flattened’ into a single List.

# identical to a regular Map
x <- listM(1, 2, 3) %>>=%
  timestwo()
x
## [[1]]
## [1] 2 4 6
# only possible with the flatMap approach
y <- listM(list(1, 2), list(3, 4, 5)) %>>=% 
  timestwo()
y
## [[1]]
## [1]  2  4  6  8 10

Note that while x and y print as regular lists, they remain List monads; a print method is defined which essentially extracts value(x).

Logger

As I alluded to earlier, additional operations can happen while the context is unwrapped, including IO. What if I just kept a log of the operations and appended each step to it? The wrapping context can include additional components, and a stored ‘log’ of the expressions used at each step is entirely possible.

All that is required is to wrap the value at the start of the pipeline in a Logger context for which there is a constructor helper, loggerM()

library(dplyr, warn.conflicts = FALSE)

result <- loggerM(mtcars) %>>=%
  filter(mpg > 10) %>>=%
  select(mpg, cyl, disp) %>>=%
  arrange(desc(mpg)) %>>=%
  head()

This result is still a Logger instance, not a value. To extract the value from this we can use value(). To extract the log of each step, use logger_log() (to avoid conflict with base::log)

value(result)
##                 mpg cyl  disp
## Toyota Corolla 33.9   4  71.1
## Fiat 128       32.4   4  78.7
## Honda Civic    30.4   4  75.7
## Lotus Europa   30.4   4  95.1
## Fiat X1-9      27.3   4  79.0
## Porsche 914-2  26.0   4 120.3
logger_log(result)
## ✔ Log of 4 operations:
## 
##  mtcars %>%
##    filter(mpg > 10) %>%
##    select(mpg, cyl, disp) %>%
##    arrange(desc(mpg)) %>%
##    head()
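To give a feel for what happens while the context is unwrapped, here is a stripped-down sketch of a Logger-like context built from a plain list. This is not how {monads} implements it (the package uses R6 classes and captures the expression automatically); logger_unit() and logger_bind() are hypothetical names and the step labels are passed in by hand.

```r
# Illustrative only: a Logger-like context as a plain list
logger_unit <- function(value, label) {
  list(value = value, log = label)
}

# bind: unwrap the value, apply f, append the step's label, re-wrap
logger_bind <- function(m, f, label) {
  list(value = f(m$value), log = c(m$log, label))
}

m <- logger_unit(1:3, "1:3")
m <- logger_bind(m, sqrt, "sqrt()")
m <- logger_bind(m, function(v) v + 4, "add 4")

m$value
m$log
```

The log travels with the value, so nothing is written to a global and two pipelines can never interfere with each other's records.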

This works with any data value, so we could just as easily use an in-memory SQLite database (or external)

mem <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
dplyr::copy_to(mem, mtcars)

res <- loggerM(mem) %>>=%
  tbl("mtcars") %>>=%
  filter(mpg > 10) %>>=%
  select(mpg, cyl, disp) %>>=%
  arrange(desc(mpg)) %>>=%
  head()

Again, extracting the components from this

value(res)
## # Source:     SQL [6 x 3]
## # Database:   sqlite 3.46.0 [:memory:]
## # Ordered by: desc(mpg)
##     mpg   cyl  disp
##   <dbl> <dbl> <dbl>
## 1  33.9     4  71.1
## 2  32.4     4  78.7
## 3  30.4     4  75.7
## 4  30.4     4  95.1
## 5  27.3     4  79  
## 6  26       4 120.
logger_log(res)
## ✔ Log of 5 operations:
## 
##  mem %>%
##    tbl("mtcars") %>%
##    filter(mpg > 10) %>%
##    select(mpg, cyl, disp) %>%
##    arrange(desc(mpg)) %>%
##    head()

Since the log captures what operations were performed, we could re-run this expression, and a helper is available for that

rerun(res)
## # Source:     SQL [6 x 3]
## # Database:   sqlite 3.46.0 [:memory:]
## # Ordered by: desc(mpg)
##     mpg   cyl  disp
##   <dbl> <dbl> <dbl>
## 1  33.9     4  71.1
## 2  32.4     4  78.7
## 3  30.4     4  75.7
## 4  30.4     4  95.1
## 5  27.3     4  79  
## 6  26       4 120.

Some similar functionality is present in the {magrittr} package which provides the ‘classic’ R pipe %>%; a ‘functional sequence’ starts with a . and similarly tracks which functions are to be applied to an arbitrary input once evaluated - in this way, this is similar to defining a new function.

library(magrittr)

# define a functional sequence
fs <- . %>%
  tbl("mtcars") %>%
  select(cyl, mpg)

# evaluate the functional sequence with some input data
fs(mem)
## # Source:   SQL [?? x 2]
## # Database: sqlite 3.46.0 [:memory:]
##      cyl   mpg
##    <dbl> <dbl>
##  1     6  21  
##  2     6  21  
##  3     4  22.8
##  4     6  21.4
##  5     8  18.7
##  6     6  18.1
##  7     8  14.3
##  8     4  24.4
##  9     4  22.8
## 10     6  19.2
## # ℹ more rows
# identify the function calls at each step of the pipeline
magrittr::functions(fs)
## [[1]]
## function (.) 
## tbl(., "mtcars")
## 
## [[2]]
## function (.) 
## select(., cyl, mpg)

Since the functional sequence is unevaluated, errors can be present and not triggered

errfs <- . %>%
  sqrt() %>%
  stop("oops") %>%
  add_n(3)

x <- 1:10

errfs(x)
## Error in function_list[[i]](value): 11.41421356237311.7320508075688822.236067977499792.449489742783182.645751311064592.8284271247461933.16227766016838oops
magrittr::functions(errfs)
## [[1]]
## function (.) 
## sqrt(.)
## 
## [[2]]
## function (.) 
## stop(., "oops")
## 
## [[3]]
## function (.) 
## add_n(., 3)

In the monad context, steps which do raise an error nullify the value and a signifier is added to the log to prevent re-running the error

resx <- loggerM(x) %>>=%
  sqrt() %>>=%
  add_n(4)

value(resx)
##  [1] 5.000000 5.414214 5.732051 6.000000 6.236068 6.449490 6.645751 6.828427
##  [9] 7.000000 7.162278
logger_log(resx)
## ✔ Log of 2 operations:
## 
##  x %>%
##    sqrt() %>%
##    add_n(4)
err <- loggerM(x) %>>=%
  sqrt() %>>=%
  stop("oops") %>>=%
  add_n(3)

value(err)
## NULL
logger_log(err)
## ✖ Log of 3 operations: [ERROR]
## 
##  x %>%
##    sqrt() %>%
##    [E] stop("oops") %>%
##    [E] add_n(3)

Aside from an error destroying the value, returning a NULL result will also produce this effect

nullify <- loggerM(x) %>>=%
  sqrt() %>>=%
  ret_null() %>>=%
  add_n(7)

value(nullify)
## NULL
logger_log(nullify)
## ✖ Log of 3 operations: [ERROR]
## 
##  x %>%
##    sqrt() %>%
##    [E] ret_null() %>%
##    [E] add_n(7)

One downside to the functional sequence approach is chaining these - since the first term must be ., that is always the first entry, and chaining multiple sequences is not clean.

a <- . %>% sqrt()
a
## Functional sequence with the following components:
## 
##  1. sqrt(.)
## 
## Use 'functions' to extract the individual functions.
b <- . %>% a %>% add_n(1)
b
## Functional sequence with the following components:
## 
##  1. a(.)
##  2. add_n(., 1)
## 
## Use 'functions' to extract the individual functions.
b(x)
##  [1] 2.000000 2.414214 2.732051 3.000000 3.236068 3.449490 3.645751 3.828427
##  [9] 4.000000 4.162278

Because the monad context is recreated at every step, chaining these is not a problem

a <- loggerM(x) %>>=%
  sqrt()

value(a)
##  [1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
##  [9] 3.000000 3.162278
logger_log(a)
## ✔ Log of 1 operations:
## 
##  x %>%
##    sqrt()
b <- a %>>=%
  add_n(1)

value(b)
##  [1] 2.000000 2.414214 2.732051 3.000000 3.236068 3.449490 3.645751 3.828427
##  [9] 4.000000 4.162278
logger_log(b)
## ✔ Log of 2 operations:
## 
##  x %>%
##    sqrt() %>%
##    add_n(1)

This achieves what I wanted in terms of ‘recording’ the steps of the pipeline, and it only requires wrapping the initial value and using a different pipe.

But there are other monads I could also implement… so I did.

Timer

In addition to capturing the expressions in a log, the Timer monad also captures the evaluation timing for each step, storing these alongside the expressions themselves in a data.frame

x <- timerM(5) %>>=%
  sleep_for(3) %>>=%
  timestwo() %>>=%
  sleep_for(1.3)

value(x)
## [1] 10
times(x)
##             expr  time
## 1              5 0.000
## 2   sleep_for(3) 3.014
## 3     timestwo() 0.000
## 4 sleep_for(1.3) 1.306
y <- timerM(5) %>>=%
  sleep_for(2) %>>=%
  ret_null() %>>=%
  sleep_for(0.3)

value(y)
## NULL
times(y)
##             expr  time
## 1              5 0.000
## 2   sleep_for(2) 2.002
## 3     ret_null() 0.000
## 4 sleep_for(0.3) 0.302
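The same trick works for timings. Below is a minimal sketch, assuming a plain-list context and hand-written labels, of how each step could be timed with base R's system.time() and appended to a data.frame. The names timer_unit() and timer_bind() are made up for this example and are not the package's internals.

```r
# Illustrative only: a Timer-like context as a plain list
timer_unit <- function(value) {
  list(
    value = value,
    times = data.frame(expr = deparse(value), time = 0)
  )
}

# bind: time the application of f and add a row to the timings table
timer_bind <- function(m, f, label) {
  elapsed <- system.time(new_value <- f(m$value))[["elapsed"]]
  list(
    value = new_value,
    times = rbind(m$times, data.frame(expr = label, time = elapsed))
  )
}

t1 <- timer_unit(5)
t1 <- timer_bind(t1, function(v) { Sys.sleep(0.1); v }, "sleep 0.1s")
t1 <- timer_bind(t1, function(v) v * 2, "times two")

t1$value
t1$times
```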

Maybe

In some languages it is preferable to return something rather than raise an error, particularly if you want to ensure that errors are handled. The Maybe pattern consists of either a Nothing (which is empty) or a Just containing some value; the result of any function applied to a Maybe will be one of these.

For testing the result, some helpers is_nothing() and is_just() are defined.

x <- maybeM(9) %>>=% 
  sqrt() %>>=%
  timestwo()

value(x)
## Just:
## [1] 6
is_just(x)
## [1] TRUE
is_nothing(x)
## [1] FALSE
y <- maybeM(Nothing()) %>>=%
  sqrt()

value(y)
## Nothing
is_just(y)
## [1] FALSE
is_nothing(y)
## [1] TRUE
z <- maybeM(10) %>>=%
  timestwo() %>>=%
  add_n(Nothing())

value(z)
## Nothing
is_just(z)
## [1] FALSE
is_nothing(z)
## [1] TRUE
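For intuition, a bare-bones Maybe can be sketched with two S3 classes and a bind function that short-circuits once a Nothing appears. Everything here is hypothetical (the *_sketch names, and the rule that an error or a NULL result becomes a Nothing are assumptions made for this example); the package's own rules for producing a Nothing may differ.

```r
# Illustrative constructors, not the package's Just/Nothing
just_sketch <- function(x) structure(list(value = x), class = "just_sketch")
nothing_sketch <- function() structure(list(), class = "nothing_sketch")
is_nothing_sketch <- function(m) inherits(m, "nothing_sketch")

# bind: a Nothing passes straight through; otherwise apply f and
# treat an error or a NULL result as Nothing (an assumption of this sketch)
maybe_bind <- function(m, f) {
  if (is_nothing_sketch(m)) return(m)
  out <- tryCatch(f(m$value), error = function(e) NULL)
  if (is.null(out)) nothing_sketch() else just_sketch(out)
}

m1 <- maybe_bind(maybe_bind(just_sketch(9), sqrt), function(v) v * 2)
m2 <- maybe_bind(maybe_bind(just_sketch(9), function(v) NULL), sqrt)

m1$value
is_nothing_sketch(m2)
```

Once m2 becomes a Nothing, the later sqrt step is never evaluated, which is the whole point: downstream code only has to check the final result.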

For what is likely a much more robust implementation, see {maybe}.

Result

Similar to a Maybe, a Result can contain either a successful Ok wrapped value or an Err wrapped message, but it will be one of these. This pattern resembles (and internally, uses) the tryCatch() approach where the evaluation will not fail, but requires testing what is produced to determine success, for which is_ok() and is_err() are defined.

x <- resultM(9) %>>=% 
  sqrt() %>>=%
  timestwo()

value(x)
## OK:
## [1] 6
is_err(x)
## [1] FALSE
is_ok(x)
## [1] TRUE

When the evaluation fails, the error is reported, along with the value prior to the error

y <- resultM(9) %>>=%
  sqrt() %>>=%
  ret_err("this threw an error")

value(y)
## Error:
## [1] "this threw an error; previously: 3"
is_err(y)
## [1] TRUE
is_ok(y)
## [1] FALSE
z <- resultM(10) %>>=%
  timestwo() %>>=%
  add_n("banana")

value(z)
## Error:
## [1] "n should be numeric; previously: 20"
is_err(z)
## [1] TRUE
is_ok(z)
## [1] FALSE
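The tryCatch() flavour of this can be sketched in a few lines. This is an illustration under my own assumptions rather than the package's code: ok_sketch(), err_sketch() and result_bind() are hypothetical names, and the "; previously:" suffix simply mimics the message format shown above.

```r
# Illustrative constructors for the two possible states
ok_sketch <- function(x) list(ok = TRUE, value = x)
err_sketch <- function(msg) list(ok = FALSE, value = msg)

# bind: an Err passes through untouched; otherwise apply f and
# convert any error into an Err that also records the previous value
result_bind <- function(m, f) {
  if (!m$ok) return(m)
  tryCatch(
    ok_sketch(f(m$value)),
    error = function(e) {
      err_sketch(
        paste0(conditionMessage(e), "; previously: ", format(m$value))
      )
    }
  )
}

r1 <- result_bind(result_bind(ok_sketch(9), sqrt), function(v) v * 2)
r2 <- result_bind(
  result_bind(ok_sketch(10), function(v) v * 2),
  function(v) stop("n should be numeric")
)
r3 <- result_bind(r2, sqrt) # already an Err, so sqrt is skipped

r1$value
r2$value
```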

Extensions

The flatMap/“bind” operator defined here as %>>=% is applicable to any monad which has a bind() method defined. The monads defined in this package are all R6Class objects exposing such a method of the form m$bind(.call, .quo) which expects a function and a quosure. You can add your own extensions to these by defining such a class (and probably a constructor helper and a print() method)

# a Reporter monad which reports unpiped function calls
Reporter <- R6::R6Class(
  c("ReporterMonad"),
  public = list(
    value = NULL,
    initialize = function(value) {
      if (rlang::is_quosure(value)) {
        self$value <- rlang::eval_tidy(value)
      } else {
        self$value <- value
      }
    },
    bind = function(f, expr) {
      ## 'undo' the pipe and inject the lhs as an argument
      result <- unlist(lapply(unlist(self$value), f))
      args <- as.list(c(self$value, rlang::call_args(expr)))
      fnew <- rlang::call2(rlang::call_name(expr), !!!args)
      cat(" ** Calculating:", rlang::quo_text(fnew), "=", result, "\n")
      Reporter$new(result)
    }
  )
)

reporterM <- function(value) {
  v <- rlang::enquo(value)
  Reporter$new(v)
}

# the S3 method name must match the R6 class name ("ReporterMonad") to dispatch
print.ReporterMonad <- function(x, ...) {
  print(value(x))
}

x <- reporterM(17) %>>=%
  timestwo() %>>=%
  square() %>>=% 
  add_n(2) %>>=%
  `/`(8)
##  ** Calculating: timestwo(17) = 34 
##  ** Calculating: square(34) = 1156 
##  ** Calculating: add_n(1156, 2) = 1158 
##  ** Calculating: 1158/8 = 144.75
value(x)
## [1] 144.75

This is just a toy example; attempting to cat() a data.frame result would not go well.

Other Monads

There are other patterns that I haven’t implemented. One that would have been interesting is Promise - I had a ‘mind-blown’ moment reading this post about some Roc syntax with the throw-away line

Tasks can be chained together using the Task.await function, similarly to how JavaScript Promises can be chained together using a Promise’s then() method. (You might also know functions in other languages similar to Task.await which go by names like andThen, flatMap, or bind.)

because I had never made the connection between monads and async/await, but it’s a lot clearer now. I did try implementing Promise in {monads} using {future} but I couldn’t quite get the unevaluated promise object to pipe correctly.

Prior Art

There are a handful of existing implementations, most of which are more fleshed out than mine.

  • {monads} - a sketched-out implementation that relies on dispatch for flatMap operations. I’m using the same name as this package, but that one hasn’t been touched in quite a while.

  • {rmonad} - archived on CRAN, but offers a sophisticated ‘funnel’ mechanism and various ways to capture steps of a pipeline.

  • {maybe} - a more detailed implementation of Maybe.

  • {chronicler} - a way to post-process the result at each step and capture information, such as the runtime (see Timer) or dimensions. Requires an explicit bind() at each step. Associated blog post.

I also found this post about implementing a Maybe monad, and this one comparing the {foreach} package’s %do% to Haskell.

I’m somewhat surprised that in all of the above examples, none seem to use the Haskell ‘bind’ format of a pipe (>>= or as a valid R infix special, %>>=%) but at least I’m not stepping on other packages’ toes there. One particular benefit of this one is that by deleting the two outermost characters inside the special you get the {magrittr} pipe %>%.

If nothing else, I found it really useful to go through the process of defining these myself - I learned a lot about {R6} classes and quosures in the process, too.

My package comes with no guarantees - it works for the examples I’ve tried, but it’s possible (if not likely) that I’ve not thought of all the edge cases. I’ve certainly relied on R’s vectorisation (rather than explicitly re-mapping individual values) and my quosure skills are somewhat underdeveloped.

If you do take it for a spin I’d love to hear your thoughts on it. As always, I can be found on Mastodon and in the comment section below.


devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.3 (2024-02-29)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2024-10-18
##  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version    date (UTC) lib source
##  bit           4.0.4      2020-08-04 [3] CRAN (R 4.0.2)
##  bit64         4.0.5      2020-08-30 [3] CRAN (R 4.2.0)
##  blob          1.2.4      2023-03-17 [3] CRAN (R 4.2.3)
##  blogdown      1.19       2024-02-01 [1] CRAN (R 4.3.3)
##  bookdown      0.36       2023-10-16 [1] CRAN (R 4.3.2)
##  bslib         0.8.0      2024-07-29 [1] CRAN (R 4.3.3)
##  cachem        1.1.0      2024-05-16 [1] CRAN (R 4.3.3)
##  callr         3.7.3      2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1      2023-03-23 [1] CRAN (R 4.3.3)
##  crayon        1.5.2      2022-09-29 [3] CRAN (R 4.2.1)
##  DBI           1.2.1      2024-01-12 [3] CRAN (R 4.3.2)
##  dbplyr        2.4.0      2023-10-26 [3] CRAN (R 4.3.2)
##  devtools      2.4.5      2022-10-11 [1] CRAN (R 4.3.2)
##  digest        0.6.37     2024-08-19 [1] CRAN (R 4.3.3)
##  dplyr       * 1.1.4      2023-11-17 [3] CRAN (R 4.3.2)
##  ellipsis      0.3.2      2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.24.0     2024-06-10 [1] CRAN (R 4.3.3)
##  fansi         1.0.6      2023-12-08 [1] CRAN (R 4.3.3)
##  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.3.3)
##  fs            1.6.4      2024-04-25 [1] CRAN (R 4.3.3)
##  generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.3)
##  glue          1.7.0      2024-01-09 [1] CRAN (R 4.3.3)
##  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.3.3)
##  htmlwidgets   1.6.2      2023-03-17 [1] CRAN (R 4.3.2)
##  httpuv        1.6.12     2023-10-23 [1] CRAN (R 4.3.2)
##  icecream      0.2.1      2023-09-27 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4      2021-04-26 [1] CRAN (R 4.3.3)
##  jsonlite      1.8.8      2023-12-04 [1] CRAN (R 4.3.3)
##  knitr         1.48       2024-07-07 [1] CRAN (R 4.3.3)
##  later         1.3.1      2023-05-02 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.3.3)
##  magrittr    * 2.0.3      2022-03-30 [1] CRAN (R 4.3.3)
##  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.3.3)
##  mime          0.12       2021-09-28 [1] CRAN (R 4.3.3)
##  miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.3.2)
##  monads      * 0.1.0.9000 2024-10-14 [1] local
##  pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.3)
##  pkgbuild      1.4.2      2023-06-26 [1] CRAN (R 4.3.2)
##  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.3)
##  pkgload       1.3.3      2023-09-22 [1] CRAN (R 4.3.2)
##  prettyunits   1.2.0      2023-09-24 [3] CRAN (R 4.3.1)
##  processx      3.8.3      2023-12-10 [3] CRAN (R 4.3.2)
##  profvis       0.3.8      2023-05-02 [1] CRAN (R 4.3.2)
##  promises      1.2.1      2023-08-10 [1] CRAN (R 4.3.2)
##  ps            1.7.6      2024-01-18 [3] CRAN (R 4.3.2)
##  purrr         1.0.2      2023-08-10 [3] CRAN (R 4.3.1)
##  R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.3)
##  Rcpp          1.0.11     2023-07-06 [1] CRAN (R 4.3.2)
##  remotes       2.4.2.1    2023-07-18 [1] CRAN (R 4.3.2)
##  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.3.3)
##  rmarkdown     2.28       2024-08-17 [1] CRAN (R 4.3.3)
##  RSQLite       2.3.7      2024-05-27 [1] CRAN (R 4.3.3)
##  rstudioapi    0.15.0     2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.9      2024-03-15 [1] CRAN (R 4.3.3)
##  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.2)
##  shiny         1.7.5.1    2023-10-14 [1] CRAN (R 4.3.2)
##  stringi       1.8.4      2024-05-06 [1] CRAN (R 4.3.3)
##  stringr       1.5.1      2023-11-14 [1] CRAN (R 4.3.3)
##  tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.3)
##  tidyselect    1.2.0      2022-10-10 [3] CRAN (R 4.2.1)
##  urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.3.2)
##  usethis       3.0.0      2024-07-29 [1] CRAN (R 4.3.3)
##  utf8          1.2.4      2023-10-22 [1] CRAN (R 4.3.3)
##  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.3.3)
##  withr         3.0.0      2024-01-16 [1] CRAN (R 4.3.3)
##  xfun          0.47       2024-08-17 [1] CRAN (R 4.3.3)
##  xtable        1.8-4      2019-04-21 [1] CRAN (R 4.3.2)
##  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.3.3)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.3
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────

