match vs. %in%

Xianjun Dong

10 years ago

[This article was first published on One Tip Per Day, and kindly contributed to R-bloggers].

match and %in% are two very commonly used functions in R. So, what's the difference between them?

First, how to use them (copied from the R manual):

match returns a vector of the positions of (first) matches of its first argument in its second.
%in% is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand.

match(x, table, nomatch = NA_integer_, incomparables = NULL)
x %in% table

Examples:

> a
[1] 1 1 0 1 5 1 2 4
> b
 [1] 10  9  8  7  6  5  4  3  2  1
> match(a,b)
[1] 10 10 NA 10  6 10  9  7
> a %in% b
[1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
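
The two results above are closely related: the R manual defines %in% in terms of match. As a minimal sketch, the code below rebuilds the same vectors a and b and checks that relationship. The variable names pos and in_via_match are only illustrative and do not come from the original post.

```r
# Same vectors as in the example above
a <- c(1, 1, 0, 1, 5, 1, 2, 4)
b <- 10:1

# Positions of each element of a in b; NA where there is no match
pos <- match(a, b)

# %in% is documented as match(x, table, nomatch = 0) > 0
in_via_match <- match(a, b, nomatch = 0) > 0

print(identical(a %in% b, in_via_match))  # TRUE
print(identical(a %in% b, !is.na(pos)))   # TRUE
```

So %in% discards the position and keeps only whether a match was found, while match keeps the position.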

So, suppose two vectors overlap like this:

a —————

b —————————–

To get the overlapping part in the order of a, use a[a %in% b]; this keeps every overlapping element of a, including duplicates. Indexing a with match(b, a) does not do this, because match() returns only the position of the first match in a for each element of b. For example,

> match(b,a)
 [1] NA NA NA NA NA  5  8 NA  7  1
> match(b,a, nomatch=0)
 [1] 0 0 0 0 0 5 8 0 7 1
> a[match(b,a, nomatch=0)]
[1] 5 4 2 1

Even with ‘nomatch=0’, the final command returns only 4 elements (one per matched value, in the order of b), not all the overlapping elements of a.
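
To make the contrast concrete, here is a small sketch that puts the two approaches side by side on the same a and b. The names overlap_in, overlap_match and overlap_fixed are hypothetical, chosen only for this illustration. The last step assumes you still want to use match: it works as long as a is the first argument and you test for NA rather than indexing with the positions.

```r
# Same vectors as in the examples above
a <- c(1, 1, 0, 1, 5, 1, 2, 4)
b <- 10:1

# %in% keeps every element of a that occurs in b, duplicates included, in order of a
overlap_in <- a[a %in% b]
print(overlap_in)     # 1 1 1 5 1 2 4

# match(b, a) gives one position per element of b, so duplicates in a are lost
overlap_match <- a[match(b, a, nomatch = 0)]
print(overlap_match)  # 5 4 2 1

# match with a as the first argument, turned into a logical index
overlap_fixed <- a[!is.na(match(a, b))]
print(identical(overlap_in, overlap_fixed))  # TRUE
```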
