[This article was first published on You Know, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As a toy example, look at what happens when you subset on a column that includes NA values.
df <- data.frame(a=11:15,b=c(3,NA,4,4,NA))
df
df[df$b==4,]
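df[df$b<=4,]
In each case, the result includes an extra row for every NA in the b column, and that row is filled entirely with NA. This can be surprising, and it is easy to miss when the subset is wrapped inside an aggregation such as nrow or sum. For the equality case, a safer way to accomplish this subsetting is the %in% operator, which returns FALSE rather than NA for missing values. Like so: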
df[df$b %in% 4,]
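To make the difference concrete, here is a minimal sketch that counts the rows returned by each approach on the same toy data frame. The variable names (n_naive, n_in, n_le_naive, n_le_which) are only illustrative. Since %in% tests equality only, the sketch assumes base R's which() as one possible way to get the same protection for the <= comparison; it is not part of the original post.

```r
# Same toy data frame as above
df <- data.frame(a = 11:15, b = c(3, NA, 4, 4, NA))

# df$b == 4 evaluates to NA wherever b is NA, and an NA in a
# logical row index produces an all-NA row in the result.
n_naive <- nrow(df[df$b == 4, ])     # 4: two real matches plus two NA rows

# %in% returns FALSE (never NA) for missing values.
n_in <- nrow(df[df$b %in% 4, ])      # 2: only the real matches

# %in% cannot express <=. Assumption: wrapping the comparison in which()
# keeps only the positions that are TRUE, silently dropping the NAs.
n_le_naive <- nrow(df[df$b <= 4, ])         # 5: three matches plus two NA rows
n_le_which <- nrow(df[which(df$b <= 4), ])  # 3: only the real matches

print(c(n_naive, n_in, n_le_naive, n_le_which))
```

An explicit guard such as df[!is.na(df$b) & df$b <= 4, ] should give the same three rows as the which() version, at the cost of being a little more verbose.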
To leave a comment for the author, please follow the link and comment on their blog: You Know.