[This article was first published on RLang.io | R Language Programming, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Long story short, I needed to convert a pretty simple OR search into a non-directional AND keyword search, meaning every keyword must appear but in any order. A directional search is straightforward: just put .*? between the words (or, in SQL, use LIKE '%keyword_1%keyword_2%'). Anyhow, I came up with this little function and thought I would share.
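# Split the comma-separated keywords into a character vector
keywords <- unlist(strsplit("keyword_1,keyword_2", ","))
# Wrap each keyword in a positive lookahead, then concatenate the pieces
keyword_search <- paste0("(?=.*?(", keywords, "))", collapse = "")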
Now this sets keyword_search to a really nice regular expression that can be used with grep.
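NOTE: You will need to pass perl = TRUE (lowercase, since R argument names are case-sensitive) when using the generated regular expression, because lookaheads are a Perl-compatible regex feature.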
(?=.*?(keyword_1))(?=.*?(keyword_2))
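As a rough illustration of the difference between the directional and non-directional patterns, here is a minimal sketch. The `docs` vector and the variable names are made up for this example and are not from the original post.

```r
# Hypothetical sample data, invented for illustration only
docs <- c("keyword_1 then keyword_2",
          "keyword_2 then keyword_1",
          "only keyword_1 here",
          "neither one")

# Directional: keyword_1 must come before keyword_2
directional <- "keyword_1.*?keyword_2"

# Non-directional: one lookahead per keyword, order does not matter
non_directional <- "(?=.*?(keyword_1))(?=.*?(keyword_2))"

# perl = TRUE is required for the lookahead syntax
ordered_hits   <- grepl(directional, docs, perl = TRUE)
any_order_hits <- grepl(non_directional, docs, perl = TRUE)

print(ordered_hits)    # TRUE FALSE FALSE FALSE
print(any_order_hits)  # TRUE  TRUE FALSE FALSE
```

Only the lookahead version picks up the second string, where the keywords appear in reverse order.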
For the curious, regex101 gives the following breakdown:
Positive Lookahead
- (?=.*?(keyword_1))
- Assert that the Regex below matches
- . matches any character (except line terminators)
- *? Quantifier: matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
1st Capturing Group
- (keyword_1)
- keyword_1 matches the characters keyword_1
Positive Lookahead
- (?=.*?(keyword_2))
- Assert that the Regex below matches
- . matches any character (except line terminators)
- *? Quantifier: matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
2nd Capturing Group
- (keyword_2)
- keyword_2 matches the characters keyword_2
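One consequence of the breakdown above is that the whole pattern consists of assertions only, so a successful match consumes no characters. The sketch below, using made-up input strings, checks this with `regexpr`:

```r
pattern <- "(?=.*?(keyword_1))(?=.*?(keyword_2))"

# Both keywords present (in reverse order) versus only one present
hit  <- regexpr(pattern, "keyword_2 and keyword_1", perl = TRUE)
miss <- regexpr(pattern, "keyword_2 only", perl = TRUE)

hit_start  <- as.integer(hit)            # 1: both assertions hold at the first position
hit_length <- attr(hit, "match.length")  # 0: lookaheads consume no characters
miss_start <- as.integer(miss)           # -1: no position satisfies both lookaheads
```

This suggests the pattern is better suited to filtering with `grep` or `grepl` than to extracting the matched text, since the matched text itself is empty.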