Reducing my for loop usage with purrr::reduce()

Maëlle's R blog on Maëlle Salmon's personal website

19 hours ago

[This article was first published on Maëlle's R blog on Maëlle Salmon's personal website, and kindly contributed to R-bloggers].

I (only! but luckily!) recently got introduced to the magic of purrr::reduce(). Thank you, Tobias! I was told about it right as I was unhappily using many for loops in a package¹, for lack of a better idea. In this post I’ll explain how purrr::reduce() helped me reduce my for loop usage. I also hope that if I’m doing something wrong, someone will come forward and tell me!

Before: many for, much sadness

I was starting from a thing that could be a list, or even a data.frame. Then, for each of a bunch of variables, I tweaked the thing. My initial coding pattern was therefore:

for (var in variables_vector) {
  thing <- do_something(thing, var, other_argument = other_argument)
}

I was iteratively changing the thing, along a variables_vector, or sometimes a variables_list.
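
To make the pattern above concrete, here is a minimal runnable sketch. The body of do_something() and the values of variables_vector and other_argument are hypothetical stand-ins chosen for illustration; they are not taken from the package.

```r
# Hypothetical stand-in for do_something(): store a prefixed copy of `var`
# in the list under the name `var`, then return the modified list.
do_something <- function(thing, var, other_argument) {
  thing[[var]] <- paste0(other_argument, var)
  thing
}

thing <- list()
variables_vector <- c("a", "b")
other_argument <- "prefix_"

# Each iteration takes the current thing and returns a tweaked version of it
for (var in variables_vector) {
  thing <- do_something(thing, var, other_argument = other_argument)
}

str(thing)
```

The key feature is that every iteration depends on the result of the previous one, which is why a plain map over variables_vector would not be enough here.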

Silly example

Ugh, finding an example is hard. This one feels very contrived, but I promise my real-life adoption of purrr::reduce() was life-changing!

# Some basic movie information
movies <- tibble::tribble(
  ~title, ~color, ~elements,
  "Barbie", "pink", "shoes",
  "Oppenheimer", "red", "history"
)

# More information to add to movies
info_list <- list(
  list(title = "Barbie", info = list(element = "sparkles")),
  list(title = "Barbie", info = list(element = "feminism")),
  list(title = "Oppenheimer", info = list(element = "fire"))
)

# Don't tell me this is weirdly formatted data,
# who never obtains weirdly formatted data?!
info_list
#> [[1]]
#> [[1]]$title
#> [1] "Barbie"
#> 
#> [[1]]$info
#> [[1]]$info$element
#> [1] "sparkles"
#> 
#> 
#> 
#> [[2]]
#> [[2]]$title
#> [1] "Barbie"
#> 
#> [[2]]$info
#> [[2]]$info$element
#> [1] "feminism"
#> 
#> 
#> 
#> [[3]]
#> [[3]]$title
#> [1] "Oppenheimer"
#> 
#> [[3]]$info
#> [[3]]$info$element
#> [1] "fire"

add_element <- function(movies, info) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <-
    toString(c(
      movies[movies[["title"]] == info[["title"]],][["elements"]],
      info[["info"]][[1]]
    ))
  movies
}

Now how do I add each element of the list to the original table? I could type something like:

for (info in info_list) {
  movies <- add_element(movies, info)
}
movies
#> # A tibble: 2 × 3
#>   title       color elements                 
#>   <chr>       <chr> <chr>                    
#> 1 Barbie      pink  shoes, sparkles, feminism
#> 2 Oppenheimer red   history, fire

It’s not too bad, really. But since there’s another way, we can change it.

After

With purrr::reduce()

for (var in variables_vector) {
  thing <- do_something(thing, var)
}

can become

thing <- purrr::reduce(variables_vector, do_something, .init = thing)

And (notice the other argument),

for (var in variables_vector) {
  thing <- do_something(thing, var, other_argument = other_argument)
}

can become

thing <- purrr::reduce(
  variables_vector, 
  \(thing, x) do_something(thing, x, other_argument = other_argument), 
  .init = thing
)

I haven’t completely internalized the pattern above but the documentation of purrr::reduce() states

“We now generally recommend against using ... to pass additional (constant) arguments to .f. Instead use a shorthand anonymous function:

# Instead of
x |> map(f, 1, 2, collapse = ",")
# do:
x |> map(\(x) f(x, 1, 2, collapse = ","))

This makes it easier to understand which arguments belong to which function and will tend to yield better error messages.”

It might remind you of how things work for dplyr::across() these days.
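
As a quick sanity check that the for loop and the purrr::reduce() call with an anonymous function give the same result, here is a small sketch. The do_something() body, the numeric inputs and the starting value 0 are assumptions made up for the illustration.

```r
library(purrr)

# Hypothetical stand-in for do_something(): append a scaled value to a vector
do_something <- function(thing, var, other_argument = 1) {
  c(thing, var * other_argument)
}

variables_vector <- c(1, 2, 3)
other_argument <- 10

# Version 1: the for loop
thing_loop <- 0
for (var in variables_vector) {
  thing_loop <- do_something(thing_loop, var, other_argument = other_argument)
}

# Version 2: purrr::reduce(), with the constant argument passed
# through an anonymous function rather than through ...
thing_reduce <- reduce(
  variables_vector,
  \(thing, x) do_something(thing, x, other_argument = other_argument),
  .init = 0
)

identical(thing_loop, thing_reduce)
#> [1] TRUE
```

In the anonymous function, the first argument is the value accumulated so far and the second is the current element of variables_vector.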

Back to our silly example!

# Some basic movie information
movies <- tibble::tribble(
  ~title, ~color, ~elements,
  "Barbie", "pink", "shoes",
  "Oppenheimer", "red", "history"
)

# More information to add to movies
info_list <- list(
  list(title = "Barbie", info = list(element = "sparkles")),
  list(title = "Barbie", info = list(element = "feminism")),
  list(title = "Oppenheimer", info = list(element = "fire"))
)

add_element <- function(movies, info) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <-
    toString(c(
      movies[movies[["title"]] == info[["title"]],][["elements"]],
      info[["info"]][[1]]
    ))
  movies
}

purrr::reduce(info_list, add_element, .init = movies)
#> # A tibble: 2 × 3
#>   title       color elements                 
#>   <chr>       <chr> <chr>                    
#> 1 Barbie      pink  shoes, sparkles, feminism
#> 2 Oppenheimer red   history, fire
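
One detail worth checking is what happens when there is nothing to add. With .init supplied, purrr::reduce() on an empty input simply returns .init, just like a for loop whose body never runs. The sketch below uses a simplified, hypothetical append_element() helper working on a single string rather than on the movies table.

```r
library(purrr)

# Hypothetical simplified helper: append one element to a comma-separated string
append_element <- function(elements, new) {
  toString(c(elements, new))
}

no_info <- character(0)

# The loop body never runs, so elements_loop keeps its starting value
elements_loop <- "shoes"
for (new in no_info) {
  elements_loop <- append_element(elements_loop, new)
}

# reduce() returns .init unchanged when there is nothing to iterate over
elements_reduce <- reduce(no_info, append_element, .init = "shoes")

identical(elements_loop, elements_reduce)
#> [1] TRUE
```

Without .init, the same call on an empty input would throw an error, so supplying .init is the safer translation of the loop.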

If we tweak the add_element() function to add a separator argument to it,

add_element <- function(movies, info, separator) {
  movies[movies[["title"]] == info[["title"]],][["elements"]] <-
    paste(c(
      movies[movies[["title"]] == info[["title"]],][["elements"]],
      info[["info"]][[1]]
    ), collapse = separator)
  movies
}

purrr::reduce(
  info_list, 
  \(movies, x) add_element(movies, x, separator = " - "), 
  .init = movies
)
#> # A tibble: 2 × 3
#>   title       color elements                   
#>   <chr>       <chr> <chr>                      
#> 1 Barbie      pink  shoes - sparkles - feminism
#> 2 Oppenheimer red   history - fire

purrr::reduce(
  info_list, 
  \(movies, x) add_element(movies, x, separator = " PLUS "), 
  .init = movies
)
#> # A tibble: 2 × 3
#>   title       color elements                         
#>   <chr>       <chr> <chr>                            
#> 1 Barbie      pink  shoes PLUS sparkles PLUS feminism
#> 2 Oppenheimer red   history PLUS fire

And voilà!
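
If you want to see the intermediate states rather than only the final one, purrr::accumulate() takes the same kind of arguments and keeps every step. The sketch below works on a single string instead of the movies table to keep it short; that simplification is mine.

```r
library(purrr)

# Same idea as the separator example, on one string:
# each step pastes the next element onto the result so far.
steps <- accumulate(
  c("sparkles", "feminism"),
  \(acc, x) paste(acc, x, sep = " - "),
  .init = "shoes"
)

# One entry for the starting value, plus one per element added
steps

# The last step matches what reduce() returns
final <- reduce(
  c("sparkles", "feminism"),
  \(acc, x) paste(acc, x, sep = " - "),
  .init = "shoes"
)
identical(steps[[length(steps)]], final)
```

This can be handy for debugging a reduce() call that does not return what you expect.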

Conclusion

In this post I presented my approximate understanding of purrr::reduce(), which helped me replace some for loops with more elegant code… or at least helped me understand a pattern that I could use elegantly in the future. I can only hope I purrr::accumulate() more experience, as I very much still feel like a newbie.

For more information I’d recommend reading the documentation of purrr::reduce() to be aware of its other features, reading the content on the reduce family in Advanced R by Hadley Wickham… and release-watching the purrr repo to keep up to date with the latest recommendations. You can also use GitHub Advanced Search to find examples of usage of the function in, say, CRAN packages.

Edit: For another take on, and another use case of, purrr::reduce(), June Choe wrote a nice detailed tutorial, “Collapse repetitive piping with reduce()”.

¹ The package is glitter, where we store query objects as a list.
