How to Use “not in” operator in Filter

This article was first published on Data Science Tutorials and kindly contributed to R-bloggers.

The post How to Use “not in” operator in Filter appeared first on Data Science Tutorials

To filter for rows in a data frame where a column's value is not in a list of values, negate the %in% operator inside filter(), using the following basic syntax in dplyr.


library(dplyr)

df %>%
  filter(!col_name %in% c('value1', 'value2', 'value3', ...))
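
As a minimal sketch of what the negated expression evaluates to, it can be run on a plain vector outside of filter(). The vector x and its values are made up for illustration and do not come from the examples in this post.

```r
# Illustrative vector (hypothetical values, including one missing value)
x <- c('value1', 'other', 'value2', NA)

# %in% returns TRUE or FALSE for each element; NA counts as "not found"
is_in <- x %in% c('value1', 'value2')

# ! flips the result, so TRUE now marks the elements filter() would keep
not_in <- !x %in% c('value1', 'value2')

print(is_in)
print(not_in)
```

Because %in% never returns NA, a row with a missing value in the column is kept by the negated filter rather than dropped.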

The examples below demonstrate how to utilize this syntax in practice.

Example 1: Filter for rows that do not contain a value in one column

Let’s say we have the following R data frame, created with data.frame().

df <- data.frame(team=c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points=c(110, 120, 80, 16, 105, 185, 112, 112),
                 assists=c(133, 128, 131, 139, 134, 55, 66, 135),
                 rebounds=c(18, 18, 14, 13, 12, 15, 17, 12))

Now we can view the data frame

df
  team points assists rebounds
1   P1    110     133       18
2   P2    120     128       18
3   P3     80     131       14
4   P4     16     139       13
5   P5    105     134       12
6   P6    185      55       15
7   P7    112      66       17
8   P8    112     135       12

The following syntax demonstrates how to filter for rows where the team name is neither ‘P1’ nor ‘P2’.

df %>%
  filter(!team %in% c('P1', 'P2'))
  team points assists rebounds
1   P3     80     131       14
2   P4     16     139       13
3   P5    105     134       12
4   P6    185      55       15
5   P7    112      66       17
6   P8    112     135       12
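
The same filter can be written in a couple of equivalent ways, sketched below. The name %notin% is a hypothetical helper defined here for illustration; it is not part of dplyr or base R. The data frame is a trimmed copy of the one above.

```r
library(dplyr)

# Trimmed copy of the Example 1 data frame
df <- data.frame(team = c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points = c(110, 120, 80, 16, 105, 185, 112, 112))

# %in% binds more tightly than !, so the parentheses are optional
res_bare   <- df %>% filter(!team %in% c('P1', 'P2'))
res_parens <- df %>% filter(!(team %in% c('P1', 'P2')))

# Hypothetical helper: a "not in" operator built by negating %in%
`%notin%` <- Negate(`%in%`)
res_helper <- df %>% filter(team %notin% c('P1', 'P2'))

# Compare the three results
identical(res_bare, res_parens)
identical(res_bare, res_helper)
```

The parenthesized form can be easier to read, since it makes clear that the negation applies to the whole %in% test.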

Example 2: Filter for rows that do not contain specific values in more than one column

The following syntax demonstrates how to filter for rows with a team name that does not equal ‘P1’ and a points value that does not equal ‘D’. In this data frame the points column holds letter categories rather than numbers.

df <- data.frame(team=c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points=c('A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'),
                 assists=c(133, 128, 131, 139, 134, 55, 66, 135),
                 rebounds=c(18, 18, 14, 13, 12, 15, 17, 12))
df
  team points assists rebounds
1   P1      A     133       18
2   P2      A     128       18
3   P3      B     131       14
4   P4      B     139       13
5   P5      C     134       12
6   P6      C      55       15
7   P7      C      66       17
8   P8      D     135       12
df %>%
  filter(!team %in% c('P1') & !points %in% c('D'))
  team points assists rebounds
1   P2      A     128       18
2   P3      B     131       14
3   P4      B     139       13
4   P5      C     134       12
5   P6      C      55       15
6   P7      C      66       17
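
As a small sketch of an alternative spelling, filter() also accepts several conditions separated by commas and combines them with a logical AND, so the & can be replaced by a comma. The variable names with_and and with_comma are chosen here for illustration.

```r
library(dplyr)

# Same data frame as in Example 2
df <- data.frame(team = c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8'),
                 points = c('A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'),
                 assists = c(133, 128, 131, 139, 134, 55, 66, 135),
                 rebounds = c(18, 18, 14, 13, 12, 15, 17, 12))

# Conditions joined explicitly with &
with_and <- df %>% filter(!team %in% c('P1') & !points %in% c('D'))

# Conditions passed as separate arguments, which filter() combines with AND
with_comma <- df %>% filter(!team %in% c('P1'), !points %in% c('D'))

# Compare the two results
identical(with_and, with_comma)
```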
