Simplifying List Filtering in R with purrr’s keep()

This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers.

Introduction

The {purrr} package in R is a powerful tool for working with lists and other data structures. One particularly useful function in the package is keep(), which allows you to filter a list by keeping only the elements that meet certain conditions.

The keep() function takes two main arguments: the list or vector to filter, and a predicate function that returns a logical value indicating whether each element should be kept. The predicate can be an anonymous function or a named function, and it should take a single argument (the current element).

For example, let’s say we have a list of numbers and we want to keep only the even numbers. We could use the keep() function with an anonymous function that checks the remainder of the current element divided by 2:

library(purrr)

numbers <- c(1, 2, 3, 4, 5, 6)
even_numbers <- keep(numbers, function(x) x %% 2 == 0)
even_numbers
[1] 2 4 6

Only the even values 2, 4, and 6 are kept.

The purrr package also provides a convenient shorthand for the predicate: a formula written with ~, in which .x refers to the current element.

even_numbers <- keep(numbers, ~ .x %% 2 == 0)
even_numbers
[1] 2 4 6
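
As a quick sanity check, the sketch below writes the same predicate three ways (named function, anonymous function, formula) and compares the results. The helper name is_even is made up for this illustration and is not part of purrr.

```r
library(purrr)

numbers <- c(1, 2, 3, 4, 5, 6)

# Hypothetical helper for illustration; not a purrr function
is_even <- function(x) x %% 2 == 0

by_name    <- keep(numbers, is_even)                   # named function
by_anon    <- keep(numbers, function(x) x %% 2 == 0)   # anonymous function
by_formula <- keep(numbers, ~ .x %% 2 == 0)            # formula shorthand

# All three forms should select the same elements
identical(by_name, by_anon)
identical(by_anon, by_formula)
```

Which form to use is mostly a matter of taste; a named function tends to read best when the same test is reused in several places.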

You can also use the keep() function to filter a list of other types of objects, such as strings or lists. For example, you could use it to keep only the strings that are longer than a certain length:

words <- c("cat", "dog", "elephant", "bird")
long_words <- keep(words, function(x) nchar(x) > 4)
long_words
[1] "elephant"

We see that this keeps only “elephant”. Note that “bird” has exactly four characters, so it does not pass the nchar(x) > 4 test.
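
The same pattern carries over to a list of lists. Here is a minimal sketch using a made-up people list (the names and ages are invented for illustration), where the predicate looks inside each element:

```r
library(purrr)

# Made-up data for illustration only
people <- list(
  list(name = "Ann", age = 34),
  list(name = "Bob", age = 17),
  list(name = "Cy",  age = 52)
)

# Keep only the elements whose age field is at least 18
adults <- keep(people, function(p) p$age >= 18)

# Pull out the names of the elements that were kept
map_chr(adults, "name")
# [1] "Ann" "Cy"
```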

In summary, the {purrr} package’s keep() function is a powerful tool for filtering lists in R, and the predicate can be written compactly with the ~ formula shorthand. It keeps only the items that meet a user-given condition, and it works with a variety of data types.

Function

Here is the keep() function and its parameters.

keep(.x, .p, ...)

Here is what each parameter expects.

  • .x - A list or vector.
  • .p - A predicate function (i.e. a function that returns either TRUE or FALSE) specified in one of the following ways:
    • A named function, e.g. is.character.
    • An anonymous function, e.g. \(x) all(x < 0) or function(x) all(x < 0).
    • A formula, e.g. ~ all(.x < 0). You must use .x to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
  • ... - Additional arguments passed on to .p.
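
To make the argument list concrete, here is a small sketch that uses a named predicate and then forwards an extra argument through the dots. The mixed list and the longer_than() helper are assumptions made up for this example, not part of purrr.

```r
library(purrr)

# Made-up input: two strings and two numeric vectors
mixed <- list("a", 1:3, "b", c(2.5, 4))

# .p as a named function
only_chr <- keep(mixed, is.character)

# Hypothetical helper that takes a second argument
longer_than <- function(x, n) length(x) > n

# Anything after .p is passed on to the predicate, so n = 1 reaches longer_than()
only_long <- keep(mixed, longer_than, n = 1)

str(only_chr)
str(only_long)
```

An anonymous function such as function(x) longer_than(x, 1) would do the same job if you prefer to keep the extra argument next to the predicate.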

Examples

I recently needed to filter a list that is passed as an argument to a function. My upcoming {tidyAML} package has a function called create_workflow_set() with a parameter .recipe_list, which defaults to list(). The user should only place recipes in this list, so I wrote a quick check using keep() that retains only the elements inheriting from the "recipe" class:

# Checks ----
# only keep() recipes
rec_list <- purrr::keep(rec_list, ~ inherits(.x, "recipe"))

Voila!
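
One thing to be aware of is that keep() drops non-matching elements silently rather than raising an error. The sketch below illustrates this with stand-in objects (only the class attribute matters to inherits(), so the real {recipes} package is not needed here) and shows one possible stricter check. The check_recipes() helper is hypothetical and is not taken from {tidyAML}.

```r
library(purrr)

# Stand-in objects: a bare list tagged with class "recipe"
fake_recipe <- structure(list(), class = "recipe")
rec_list <- list(fake_recipe, "not a recipe", fake_recipe)

# Same check as above: the string is dropped without any warning
kept <- keep(rec_list, ~ inherits(.x, "recipe"))
length(kept)
# [1] 2

# Hypothetical stricter variant: stop if anything would have been dropped
check_recipes <- function(x) {
  n_ok <- length(keep(x, ~ inherits(.x, "recipe")))
  if (n_ok != length(x)) {
    stop("All elements of the list must be recipe objects.", call. = FALSE)
  }
  invisible(x)
}

# check_recipes(rec_list) would raise an error; check_recipes(kept) passes
```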
