Lesser known purrr tricks
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
purrr
is a package that extends R’s functional programming capabilities. It brings a lot of new stuff to the table and in this post I show you some of the most useful (at least to me) functions included in purrr
.
Getting rid of loops with map()
library(purrr) numbers <- list(11, 12, 13, 14) map_dbl(numbers, sqrt) ## [1] 3.316625 3.464102 3.605551 3.741657
You might wonder why this might be preferred to a for loop? It’s a lot less verbose, and you do not need to initialise any kind of structure to hold the result. If you google “create empty list in R” you will see that this is very common. However, with the map()
family of functions, there is no need for an initial structure. map_dbl()
returns an atomic list of real numbers, but if you use map()
you will get a list back. Try them all out!
Map conditionally
map_if()
# Create a helper function that returns TRUE if a number is even is_even <- function(x){ !as.logical(x %% 2) } map_if(numbers, is_even, sqrt) ## [[1]] ## [1] 11 ## ## [[2]] ## [1] 3.464102 ## ## [[3]] ## [1] 13 ## ## [[4]] ## [1] 3.741657
map_at()
map_at(numbers, c(1,3), sqrt) ## [[1]] ## [1] 3.316625 ## ## [[2]] ## [1] 12 ## ## [[3]] ## [1] 3.605551 ## ## [[4]] ## [1] 14
map_if()
and map_at()
have a further argument than map()
; in the case of map_if()
, a predicate function ( a function that returns TRUE
or FALSE
) and a vector of positions for map_at()
. This allows you to map your function only when certain conditions are met, which is also something that a lot of people google for.
Map a function with multiple arguments
numbers2 <- list(1, 2, 3, 4) map2(numbers, numbers2, `+`) ## [[1]] ## [1] 12 ## ## [[2]] ## [1] 14 ## ## [[3]] ## [1] 16 ## ## [[4]] ## [1] 18
You can map two lists to a function which takes two arguments using map_2()
. You can even map an arbitrary number of lists to any function using pmap()
.
By the way, try this in: `+`(1,3)
and see what happens.
Don’t stop execution of your function if something goes wrong
possible_sqrt <- possibly(sqrt, otherwise = NA_real_) numbers_with_error <- list(1, 2, 3, "spam", 4) map(numbers_with_error, possible_sqrt) ## [[1]] ## [1] 1 ## ## [[2]] ## [1] 1.414214 ## ## [[3]] ## [1] 1.732051 ## ## [[4]] ## [1] NA ## ## [[5]] ## [1] 2
Another very common issue is to keep running your loop even when something goes wrong. In most cases the loop simply stops at the error, but you would like it to continue and see where it failed. Try to google “skip error in a loop” or some variation of it and you’ll see that a lot of people really just want that. This is possible by combining map()
and possibly()
. Most solutions involve the use of tryCatch()
which I personally do not find very easy to use.
Don’t stop execution of your function if something goes wrong and capture the error
safe_sqrt <- safely(sqrt, otherwise = NA_real_) map(numbers_with_error, safe_sqrt) ## [[1]] ## [[1]]$result ## [1] 1 ## ## [[1]]$error ## NULL ## ## ## [[2]] ## [[2]]$result ## [1] 1.414214 ## ## [[2]]$error ## NULL ## ## ## [[3]] ## [[3]]$result ## [1] 1.732051 ## ## [[3]]$error ## NULL ## ## ## [[4]] ## [[4]]$result ## [1] NA ## ## [[4]]$error ## <simpleError in sqrt(x = x): non-numeric argument to mathematical function> ## ## ## [[5]] ## [[5]]$result ## [1] 2 ## ## [[5]]$error ## NULL
safely()
is very similar to possibly()
but it returns a list of lists. An element is thus a list of the result and the accompagnying error message. If there is no error, the error component is NULL
if there is an error, it returns the error message.
Transpose a list
safe_result_list <- map(numbers_with_error, safe_sqrt) transpose(safe_result_list) ## $result ## $result[[1]] ## [1] 1 ## ## $result[[2]] ## [1] 1.414214 ## ## $result[[3]] ## [1] 1.732051 ## ## $result[[4]] ## [1] NA ## ## $result[[5]] ## [1] 2 ## ## ## $error ## $error[[1]] ## NULL ## ## $error[[2]] ## NULL ## ## $error[[3]] ## NULL ## ## $error[[4]] ## <simpleError in sqrt(x = x): non-numeric argument to mathematical function> ## ## $error[[5]] ## NULL
Here we transposed the above list. This means that we still have a list of lists, but where the first list holds all the results (which you can then access with safe_result_list$result
) and the second list holds all the errors (which you can access with safe_result_list$error
). This can be quite useful!
Apply a function to a lower depth of a list
transposed_list <- transpose(safe_result_list) transposed_list %>% at_depth(2, is_null) ## Warning: at_depth() is deprecated, please use `modify_depth()` instead ## $result ## $result[[1]] ## [1] FALSE ## ## $result[[2]] ## [1] FALSE ## ## $result[[3]] ## [1] FALSE ## ## $result[[4]] ## [1] FALSE ## ## $result[[5]] ## [1] FALSE ## ## ## $error ## $error[[1]] ## [1] TRUE ## ## $error[[2]] ## [1] TRUE ## ## $error[[3]] ## [1] TRUE ## ## $error[[4]] ## [1] FALSE ## ## $error[[5]] ## [1] TRUE
Sometimes working with lists of lists can be tricky, especially when we want to apply a function to the sub-lists. This is easily done with at_depth()
!
Set names of list elements
name_element <- c("sqrt()", "ok?") set_names(transposed_list, name_element) ## $`sqrt()` ## $`sqrt()`[[1]] ## [1] 1 ## ## $`sqrt()`[[2]] ## [1] 1.414214 ## ## $`sqrt()`[[3]] ## [1] 1.732051 ## ## $`sqrt()`[[4]] ## [1] NA ## ## $`sqrt()`[[5]] ## [1] 2 ## ## ## $`ok?` ## $`ok?`[[1]] ## NULL ## ## $`ok?`[[2]] ## NULL ## ## $`ok?`[[3]] ## NULL ## ## $`ok?`[[4]] ## <simpleError in sqrt(x = x): non-numeric argument to mathematical function> ## ## $`ok?`[[5]] ## NULL
Reduce a list to a single value
reduce(numbers, `*`) ## [1] 24024
reduce()
applies the function *
iteratively to the list of numbers. There’s also accumulate()
:
accumulate(numbers, `*`) ## [1] 11 132 1716 24024
which keeps the intermediary results.
This function is very general, and you can reduce anything:
Matrices:
mat1 <- matrix(rnorm(10), nrow = 2) mat2 <- matrix(rnorm(10), nrow = 2) mat3 <- matrix(rnorm(10), nrow = 2) list_mat <- list(mat1, mat2, mat3) reduce(list_mat, `+`) ## [,1] [,2] [,3] [,4] [,5] ## [1,] -2.48530177 1.0110049 0.4450388 1.280802 1.3413979 ## [2,] 0.07596679 -0.6872268 -0.6579242 1.615237 0.8231933
even data frames:
df1 <- as.data.frame(mat1) df2 <- as.data.frame(mat2) df3 <- as.data.frame(mat3) list_df <- list(df1, df2, df3) reduce(list_df, dplyr::full_join) ## Joining, by = c("V1", "V2", "V3", "V4", "V5") ## Joining, by = c("V1", "V2", "V3", "V4", "V5") ## V1 V2 V3 V4 V5 ## 1 -0.6264538 -0.8356286 0.32950777 0.48742905 0.5757814 ## 2 0.1836433 1.5952808 -0.82046838 0.73832471 -0.3053884 ## 3 -0.8969145 1.5878453 -0.08025176 0.70795473 1.9844739 ## 4 0.1848492 -1.1303757 0.13242028 -0.23969802 -0.1387870 ## 5 -0.9619334 0.2587882 0.19578283 0.08541773 -1.2188574 ## 6 -0.2925257 -1.1521319 0.03012394 1.11661021 1.2673687
Hope you enjoyed this list of useful functions! If you enjoy the content of my blog, you can follow me on twitter.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.