I (only! but luckily!) recently got introduced to the magic of purrr::reduce(). Thank you, Tobias! I was told about it right as I was unhappily using many for loops in a package, for lack of a better idea. In this post I’ll explain how purrr::reduce() helped me reduce my for loop usage. I also hope that if I’m doing something wrong, someone will come forward and tell me!
Before: many for, much sadness
I was starting from a thing that could be a list, or even a data.frame. Then, for each of a bunch of variables, I tweaked the thing. My initial coding pattern was therefore:
for (var in variables_vector) {
  thing <- do_something(thing, var, other_argument = other_argument)
}
I was iteratively changing the thing along a variables_vector, or sometimes a variables_list.
Silly example
Ugh, finding an example is hard. It feels very contrived, but I promise my real-life adoption of purrr::reduce() was life-changing!
# Some basic movie information
movies <- tibble::tribble(
  ~title, ~color, ~elements,
  "Barbie", "pink", "shoes",
  "Oppenheimer", "red", "history"
)

# More information to add to movies
info_list <- list(
  list(title = "Barbie", info = list(element = "sparkles")),
  list(title = "Barbie", info = list(element = "feminism")),
  list(title = "Oppenheimer", info = list(element = "fire"))
)

# Don't tell me this is weirdly formatted data,
# who never obtains weirdly formatted data?!
info_list
#> [[1]]
#> [[1]]$title
#> [1] "Barbie"
#>
#> [[1]]$info
#> [[1]]$info$element
#> [1] "sparkles"
#>
#>
#>
#> [[2]]
#> [[2]]$title
#> [1] "Barbie"
#>
#> [[2]]$info
#> [[2]]$info$element
#> [1] "feminism"
#>
#>
#>
#> [[3]]
#> [[3]]$title
#> [1] "Oppenheimer"
#>
#> [[3]]$info
#> [[3]]$info$element
#> [1] "fire"

# Append the new element to the `elements` cell of the matching movie
add_element <- function(movies, info) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <- toString(c(
    movies[movies[["title"]] == info[["title"]],][["elements"]],
    info[["info"]][[1]]
  ))
  movies
}
Now how do I add each element of the list to the original table? I could type something like:
for (info in info_list) {
  movies <- add_element(movies, info)
}
movies
#> # A tibble: 2 × 3
#>   title       color elements
#>   <chr>       <chr> <chr>
#> 1 Barbie      pink  shoes, sparkles, feminism
#> 2 Oppenheimer red   history, fire
It’s not too bad, really. But since there’s another way, we can change it.
After
With purrr::reduce(),
for (var in variables_vector) {
  thing <- do_something(thing, var)
}
can become
thing <- purrr::reduce(variables_vector, do_something, .init = thing)
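To make the equivalence concrete, here is a minimal runnable sketch. The `do_something()` below is a hypothetical stand-in (it is not the function from my package): it stores the number of characters of each variable name in a list. The point to notice is that `.init` supplies the starting `thing`; without it, `purrr::reduce()` would start from the first element of the vector instead.

```r
# Hypothetical helper, for illustration only: records nchar(var) in the list
do_something <- function(thing, var) {
  thing[[var]] <- nchar(var)
  thing
}

variables_vector <- c("title", "color", "elements")
thing <- list(id = 1)

# For-loop version: `thing_loop` is overwritten at each iteration
thing_loop <- thing
for (var in variables_vector) {
  thing_loop <- do_something(thing_loop, var)
}

# reduce() version: `.init` is the starting value of the accumulation
thing_reduce <- purrr::reduce(variables_vector, do_something, .init = thing)

# Both approaches should give the same list
identical(thing_loop, thing_reduce)
```

The function passed to `purrr::reduce()` receives the accumulated value as its first argument and the current element as its second, which is why `do_something(thing, var)` fits without any wrapper.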
And (notice the other argument),
for (var in variables_vector) {
  thing <- do_something(thing, var, other_argument = other_argument)
}
can become
thing <- purrr::reduce(
  variables_vector,
  \(thing, x) do_something(thing, x, other_argument = other_argument),
  .init = thing
)
I haven’t completely internalized the pattern above, but the documentation of purrr::reduce() states:
“We now generally recommend against using ... to pass additional (constant) arguments to .f. Instead use a shorthand anonymous function:
# Instead of
x |> map(f, 1, 2, collapse = ",")
# do:
x |> map(\(x) f(x, 1, 2, collapse = ","))
This makes it easier to understand which arguments belong to which function and will tend to yield better error messages.”
It might remind you of how things work for dplyr::across() these days.
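As a small sketch of what that recommendation means for `purrr::reduce()`, the two calls below pass the same constant argument in both styles. The `do_something()` here is again a hypothetical helper, and the values are made up for illustration. To my understanding the `...` style still works, it is simply no longer the recommended one.

```r
# Hypothetical helper: glues a new piece onto a string with a separator
do_something <- function(thing, var, other_argument) {
  paste(thing, var, sep = other_argument)
}

variables_vector <- c("b", "c")

# Recommended style: the anonymous function makes it explicit
# that other_argument belongs to do_something()
with_lambda <- purrr::reduce(
  variables_vector,
  \(thing, x) do_something(thing, x, other_argument = "-"),
  .init = "a"
)

# Older style: the extra argument travels through `...`
with_dots <- purrr::reduce(
  variables_vector,
  do_something,
  other_argument = "-",
  .init = "a"
)

# Both calls should return the same string
identical(with_lambda, with_dots)
```

In the anonymous-function version, a reader can see at a glance that `.init` is an argument of `purrr::reduce()` while `other_argument` is an argument of `do_something()`.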
Back to our silly example!
# Some basic movie information
movies <- tibble::tribble(
  ~title, ~color, ~elements,
  "Barbie", "pink", "shoes",
  "Oppenheimer", "red", "history"
)

# More information to add to movies
info_list <- list(
  list(title = "Barbie", info = list(element = "sparkles")),
  list(title = "Barbie", info = list(element = "feminism")),
  list(title = "Oppenheimer", info = list(element = "fire"))
)

# Append the new element to the `elements` cell of the matching movie
add_element <- function(movies, info) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <- toString(c(
    movies[movies[["title"]] == info[["title"]],][["elements"]],
    info[["info"]][[1]]
  ))
  movies
}

purrr::reduce(info_list, add_element, .init = movies)
#> # A tibble: 2 × 3
#>   title       color elements
#>   <chr>       <chr> <chr>
#> 1 Barbie      pink  shoes, sparkles, feminism
#> 2 Oppenheimer red   history, fire
If we tweak the add_element() function to add a separator argument to it,
# Same as before, but the separator between elements is now configurable
add_element <- function(movies, info, separator) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <- paste(c(
    movies[movies[["title"]] == info[["title"]],][["elements"]],
    info[["info"]][[1]]
  ), collapse = separator)
  movies
}

purrr::reduce(
  info_list,
  \(movies, x) add_element(movies, x, separator = " - "),
  .init = movies
)
#> # A tibble: 2 × 3
#>   title       color elements
#>   <chr>       <chr> <chr>
#> 1 Barbie      pink  shoes - sparkles - feminism
#> 2 Oppenheimer red   history - fire

purrr::reduce(
  info_list,
  \(movies, x) add_element(movies, x, separator = " PLUS "),
  .init = movies
)
#> # A tibble: 2 × 3
#>   title       color elements
#>   <chr>       <chr> <chr>
#> 1 Barbie      pink  shoes PLUS sparkles PLUS feminism
#> 2 Oppenheimer red   history PLUS fire
And voilà!
Conclusion
In this post I presented my approximate understanding of purrr::reduce(), which helped me replace some for loops with more elegant code… or at least helped me understand a pattern that I could use elegantly in the future. I can only hope I purrr::accumulate() more experience, as I very much still feel like a newbie.
For more information, I’d recommend reading the documentation of purrr::reduce() to be aware of other features, the content on the reduce family in Advanced R by Hadley Wickham… and release-watching the purrr repo to keep up to date with the latest recommendations. You can also use GitHub Advanced Search to find examples of usage of the function in, say, CRAN packages.
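Speaking of purrr::accumulate(): as far as I understand it, it follows the same pattern as purrr::reduce() but keeps every intermediate value instead of only the last one, which can help when debugging a reduction. A minimal sketch, with made-up values borrowed from the silly example:

```r
# Keep every intermediate state of the accumulation,
# starting from "shoes" and adding one element at a time
steps <- purrr::accumulate(
  c("sparkles", "feminism"),
  \(elements, x) paste(elements, x, sep = ", "),
  .init = "shoes"
)

# One value per step, including the starting value
length(steps)

# The last step is what purrr::reduce() would return with the same inputs
final <- purrr::reduce(
  c("sparkles", "feminism"),
  \(elements, x) paste(elements, x, sep = ", "),
  .init = "shoes"
)
identical(steps[[length(steps)]], final)
```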
Edit: For another take on / use case of purrr::reduce(), June Choe wrote a nice detailed tutorial, “Collapse repetitive piping with reduce()”.