3 (actually 4) neat R functions
Time for me to throw away my sticky note after sharing what I wrote on it!
grep(...) not which(grepl(...))
Recently I caught myself using which(grepl(...)),
animals <- c("cat", "bird", "dog", "fish")
which(grepl("i", animals))
#> [1] 2 4
when the simpler alternative is
animals <- c("cat", "bird", "dog", "fish")
grep("i", animals)
#> [1] 2 4
And should I need the values instead of the indices, I know I shouldn’t write
animals <- c("cat", "bird", "dog", "fish")
animals[grepl("i", animals)]
#> [1] "bird" "fish"
but
animals <- c("cat", "bird", "dog", "fish")
grep("i", animals, value = TRUE)
#> [1] "bird" "fish"
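To convince yourself that the pairs above are interchangeable, here is a small sketch using identical(). The no-match pattern "z" is my own addition for illustration and was not on the sticky note.

```r
animals <- c("cat", "bird", "dog", "fish")

# Indices: grep() returns the same integer vector as which(grepl())
identical(grep("i", animals), which(grepl("i", animals)))
#> [1] TRUE

# Values: value = TRUE returns the same strings as logical subsetting
identical(grep("i", animals, value = TRUE), animals[grepl("i", animals)])
#> [1] TRUE

# With no match at all, both index forms return an empty integer vector
identical(grep("z", animals), which(grepl("z", animals)))
#> [1] TRUE
```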
How to remember to use grep()? Re-reading oneself, or having code reviewed, probably helps, but why not automate this? When I shared my note to self on Mastodon, Hugo Gruson explained that detecting usage of which(grepl( was one of the linters from Google's linting suite planned for addition to lintr. This is excellent news!
strrep() and other defence tools against poor usages of paste()
Yihui Xie wrote a blog post inspired by my own series, where one of the three presented functions was one that was on my sticky note! I’ll still present it: strrep().
strrep() means “string repeat”. Instead of writing
paste(rep("bla", 3), collapse = "")
#> [1] "blablabla"
you can, and should, write
strrep("bla", 3)
#> [1] "blablabla"
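Beyond the single-string case above, strrep() is vectorised over both arguments, which paste(rep(), collapse = "") is not. A minimal sketch; the inputs and the name rule are made up for illustration.

```r
# strrep() is vectorised over both of its arguments
strrep("ab", 1:3)
#> [1] "ab"     "abab"   "ababab"

strrep(c("a", "b"), 2)
#> [1] "aa" "bb"

# A handy use: building a separator line (the width 20 is arbitrary)
rule <- strrep("-", 20)
nchar(rule)
#> [1] 20
```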
I discovered this function because Hugo Gruson telling me about lintr inspired me to skim through the lintr reference, where I saw “Raise lints for several common poor usages of paste()”. That linter would also tell you when you use paste(, sep = "") instead of paste0().
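For the paste0() case, a quick check that the two spellings agree. The vectors first and second are hypothetical names of mine, used only for illustration.

```r
first <- c("a", "b")
second <- c("x", "y")

# paste(..., sep = "") and paste0(...) give the same result
identical(paste(first, second, sep = ""), paste0(first, second))
#> [1] TRUE

paste0(first, second)
#> [1] "ax" "by"
```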
startsWith() and endsWith()
I learned about startsWith() and endsWith() by reading the lintr reference, but I also got notified about them when running lintr on a package I was working on. Have you ever tried running all linters on your code? Fun experience. Anyhow, one linter is “Require usage of startsWith() and endsWith() over grepl()/substr() versions”, with an interesting Details section on missing values.
Instead of writing
animals <- c("cat", "cow", "dog", "fish")
grepl("^c", animals)
#> [1]  TRUE  TRUE FALSE FALSE
I should write
animals <- c("cat", "cow", "dog", "fish")
startsWith(animals, "c")
#> [1]  TRUE  TRUE FALSE FALSE
A nice side-effect of the switch, beyond good practice for its own sake and better readability, is that the argument order is more logical in startsWith(): the strings come first, then the prefix.
Similarly, instead of writing
animals <- c("cat", "cow", "dog", "fish")
grepl("t$", animals)
#> [1]  TRUE FALSE FALSE FALSE
I should write
animals <- c("cat", "cow", "dog", "fish")
endsWith(animals, "t")
#> [1]  TRUE FALSE FALSE FALSE
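The swap is not always a pure drop-in, which is presumably why the linter documentation has that Details section on missing values. Here is a minimal sketch of two differences I am aware of in base R: how NA is handled, and the fact that startsWith() and endsWith() match literally rather than as regular expressions. The vector files is made up for illustration.

```r
animals <- c("cat", NA, "dog")

# grepl() reports FALSE for a missing value...
grepl("^c", animals)
#> [1]  TRUE FALSE FALSE

# ...whereas startsWith() propagates the NA
startsWith(animals, "c")
#> [1]  TRUE    NA FALSE

# startsWith()/endsWith() match literally, so "." is just a dot here
files <- c("a.R", "aXR")
endsWith(files, ".R")
#> [1]  TRUE FALSE

# In a regular expression, the unescaped "." matches any character
grepl(".R$", files)
#> [1] TRUE TRUE
```

So if your code relied on grepl() returning FALSE for missing values, check what should happen to NA before switching.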
Conclusion
In this post I wrote about grep() to be used in lieu of which(grepl()), about strrep() (string repetition) to be used in lieu of paste(rep(), collapse = ""), and about startsWith() and endsWith() to be used in lieu of some regular expressions with ^ and $, respectively.