It’s lists all the way down
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Today, I had the opportunity to help someone over at the R for Data Science Slack group (read more about this group here) and I thought that the question asked could make for an interesting blog post, so here it is!
Disclaimer: the way I’m doing things here is totally not optimal, but I want to illustrate how to map functions over nested lists. But I show the optimal way at the end, so for the people that are familiar with purrr
don’t get mad at me.
Suppose you have to do certain data transformation tasks on a data frame, and you write a nice function that does that for you:
library(tidyverse) data(mtcars) nice_function = function(df, param1, param2){ df = df %>% filter(cyl == param1, am == param2) %>% mutate(result = mpg * param1 * (2 - param2)) return(df) } nice_function(mtcars, 4, 0) ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0
This might seem like a silly function and not a nice function, but it will illustrate the point I want to make (and the question that was asked) very well. This function is completely useless, but bear with me. Now, suppose that you want to do these operations for each value of cyl
and am
(of course you can do that without using nice_function()
…). First, you might want to fix the value of am
to 0, and then loop over the values of cyl
. But as I have explained in this other blog post I prefer using the map()
functions included in purrr
. For example:
values_cyl = c(4, 6, 8) (result = map(values_cyl, nice_function, df = mtcars, param2 = 0)) ## [[1]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## ## [[2]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 2 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 3 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 4 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## ## [[3]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 2 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 3 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 4 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 5 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 6 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 7 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 8 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 9 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 10 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 11 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 12 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2
What you get here is a list for each value in values_cyl
; so one list for 4
, one for 6
and one for 8
. Suppose now that you are feeling adventurous, and want to loop over the values of am
too:
values_am = c(0, 1)
So first, we need to map a function to each element of values_am
. But which function? Well, for given value of am
, our problem is the same as before; we need to map nice_function()
to each value of cyl
. So, that’s what we’re going to do:
(result = map(values_am, ~map(values_cyl, nice_function, df = mtcars, param2 = .))) ## [[1]] ## [[1]][[1]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## ## [[1]][[2]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 2 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 3 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 4 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## ## [[1]][[3]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 2 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 3 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 4 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 5 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 6 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 7 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 8 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 9 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 10 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 11 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 12 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## ## ## [[2]] ## [[2]][[1]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 2 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 3 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 4 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 5 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 6 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 7 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 8 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## ## [[2]][[2]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 3 19.7 6 145 175 3.62 2.770 15.50 0 1 5 6 118.2 ## ## [[2]][[3]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 15.8 8 351 264 4.22 3.17 14.5 0 1 5 4 126.4 ## 2 15.0 8 301 335 3.54 3.57 14.6 0 1 5 8 120.0
We now have a list of size 2 (for each value of am
) where each element is itself a list of size 3 (for each value of cyl
) where each element is a data frame. Are you still with me? Also, notice that the second map is given as a formula (notice the ~
in front of the second map). This creates an anonymous function, where the parameter is given by the .
(think of the .
as being the x
in f(x)
). So the .
is the stand-in for the values contained inside values_am
.
The people that are familiar with the map()
functions must be fuming right now; there is a way to avoid this nested hell. I will talk about it soon, but first I want to play around with this list of lists.
If you have a list of data frames, you can bind their rows together with reduce(list_of_dfs, rbind)
. You would like to this here, but because your lists of data frames are contained inside another list… you guessed it, you have to map over it!
(result2 = map(result, ~reduce(., rbind))) ## [[1]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 5 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 6 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 7 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## 8 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 9 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 10 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 11 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 12 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 13 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 14 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 15 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 16 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 17 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 18 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 19 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## ## [[2]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 2 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 3 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 4 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 5 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 6 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 7 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 8 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## 9 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 10 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 11 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 118.2 ## 12 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 126.4 ## 13 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 120.0
Here again, I pass reduce()
as a formula to map()
to create an anonymous function. Again, the .
is used as the stand-in for each element contained in result
; a list of data frames, where reduce(., rbind)
knows what to do. Now that we have this we can use reduce()
with rbind()
again to get a single data frame:
(result3 = reduce(result2, rbind)) ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 5 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 6 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 7 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## 8 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 9 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 10 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 11 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 12 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 13 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 14 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 15 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 16 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 17 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 18 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 19 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## 20 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 21 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 22 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 23 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 24 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 25 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 26 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 27 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## 28 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 29 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 118.2 ## 31 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 126.4 ## 32 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 120.0
Of course, since reduce(list_of_dfs, rbind)
is such a common operation, you could have simply used dplyr::bind_rows
, which does exactly this:
(result2 = map(result, bind_rows)) ## [[1]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 5 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 6 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 7 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## 8 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 9 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 10 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 11 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 12 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 13 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 14 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 15 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 16 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 17 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 18 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 19 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## ## [[2]] ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 2 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 3 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 4 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 5 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 6 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 7 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 8 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## 9 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 10 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 11 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 118.2 ## 12 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 126.4 ## 13 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 120.0
and then:
(result3 = bind_rows(result2)) ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 5 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 6 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 7 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## 8 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 9 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 10 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 11 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 12 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 13 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 14 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 15 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 16 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 17 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 18 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 19 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## 20 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 21 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 22 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 23 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 24 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 25 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 26 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 27 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## 28 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 29 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 118.2 ## 31 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 126.4 ## 32 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 120.0
Of course, things are even simpler: you can avoid this deeply nested monstrosity by using map_df()
instead of map()
! map_df()
works just like map()
but return a data frame (hence the _df
in the name) instead of a list:
(result_df = map_df(values_am, ~map_df(values_cyl, nice_function, df = mtcars, param2 = .))) ## mpg cyl disp hp drat wt qsec vs am gear carb result ## 1 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 195.2 ## 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 182.4 ## 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 172.0 ## 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 256.8 ## 5 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 217.2 ## 6 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 230.4 ## 7 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 213.6 ## 8 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 299.2 ## 9 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 228.8 ## 10 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 262.4 ## 11 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 276.8 ## 12 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 243.2 ## 13 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 166.4 ## 14 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 166.4 ## 15 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 235.2 ## 16 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 248.0 ## 17 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 243.2 ## 18 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 212.8 ## 19 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 307.2 ## 20 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 91.2 ## 21 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 129.6 ## 22 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 121.6 ## 23 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 135.6 ## 24 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 109.2 ## 25 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 104.0 ## 26 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 121.6 ## 27 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 85.6 ## 28 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 126.0 ## 29 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 126.0 ## 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 118.2 ## 31 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 126.4 ## 32 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 120.0
If you look at the source code of map_df()
you see that dplyr::bind_rows
gets called at the end:
map_df ## function (.x, .f, ..., .id = NULL) ## { ## if (!is_installed("dplyr")) { ## abort("`map_df()` requires dplyr") ## } ## .f <- as_mapper(.f, ...) ## res <- map(.x, .f, ...) ## dplyr::bind_rows(res, .id = .id) ## } ## <environment: namespace:purrr>
So moral of the story? There are a lot of variants of the common purrr::map()
functions (as well as of dplyr
verbs, such as filter_at
, select_if
, etc…) and learning about them can save you from a lot of pain! However, if you need to apply a function to nested lists this is still possible; you just have to think about the structure of the nested list for a bit. There is also another function that you might want to study, modify_depth()
which solves related issues but I will end the blog post here. I might talk about it in a future blog post.
Also, if you want to learn more about R and the tidyverse, do read the link I posted in the introduction of the post and join the R4ds slack group! There are a lot of very nice people there that want to help you get better with your R-fu. Also, this is where I got the inspiration to write this blog post and I am thankful to the people there for the discussions; I feel comfortable with R, but I still learn new tips and tricks every day!
If you enjoy these blog posts, you can follow me on twitter. And happy new yeaR!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.