Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Remember my blog post about automatic tools for improving R packages? One of these tools is Jim Hester’s lintr
, a package that performs static code analysis. In my experience it mostly helps identifying too long code lines and missing space, although it’s a bit more involved than that. In any case, lintr
helps you maintain good code style, and as mentioned in that now old post of mine, you can add a lintr
unit test to your package which will ensure you don’t get lazy over time.
Now say your package has a lintr
unit test and lives on GitHub. What happens if someone makes a pull request and writes looong code lines? Continuous integration builds will fail but not only that… The contributor will get to know Lintr Bot, lintr’s Hester (Easter) egg!
Lintr Bot, a lazy little thing?
Lintr Bot has a GitHub profile. I am actually responsible for its avatar, because when I discovered it a while back I offered Jim to design it based on this free image. Now, Lintr Bot has a face but if you look at its GitHub activity, you might wonder, what does it even do? It’s perfectly fine for a human to have nearly no GitHub activity, but remember that Lintr Bot lives on GitHub only, so what does it do all day?
Well Lintr Bot goes around and comments on Pull Requests in R packages, see for instance this comment. As mentioned earlier, a contributor who forgets even a slight code style mistake will be reminded of it by Lintr Bot if the package has a lintr
unit test. This is quite useful!
We might also wonder how often Lintr Bot actually works? After all, since GitHub activity timeline doesn’t show comments on Pull Requests, we’re at a loss when it comes to estimating how active this bot is…
Lintr Bot’s year in review
That’s where my gs
package, presented in my blog post about Bob Rudis’ R packages, will come into play. I’ve been working on a function called spy
which can return either Pull Requests or Issues an user commented on during a given period of time. As creepy as it might sound, it could help you trace back all the desperate comments you wrote when debugging something last week for instance, without your having had to track them in the first place. g::spy
is a work in progress, but let’s see what it can return me for Lintr Bot!
Note that there’s no way to filter for comments between a date range, well as far as I understand the “search” functionalities of GitHub V4 API, so I can only filter Pull Requests updated in a given range.
lintrbot_act <- gs::spy(user = "lintr-bot", type = "PullRequest", updated_after = "2017-03-30", updated_before = "2018-03-30") lintrbot_act <- dplyr::filter(lintrbot_act, as.Date(created_at) >= anytime::anydate("2017-03-30"))
We get 228 pull requests in one year! Lintr Bot is definitely not a lazy little thing!
Here are 5 Pull Requests that got a visit from Lintr Bot.
knitr::kable(lintrbot_act[1:5,])
owner | repo | title | created_at | state | author | url | no_comments | id |
---|---|---|---|---|---|---|---|---|
PredictiveEcology | reproducible | mergeCache | 2018-03-29 18:29:33 | MERGED | eliotmcintire | https://github.com/PredictiveEcology/reproducible/pull/18 | 2 | 18 |
PredictiveEcology | SpaDES.core | change objectSize to objSize – name conflict with R.oo::objectSize | 2018-03-26 23:48:22 | MERGED | eliotmcintire | https://github.com/PredictiveEcology/SpaDES.core/pull/57 | 1 | 57 |
PredictiveEcology | reproducible | memoise – loadFromRepo | 2018-03-26 17:41:25 | MERGED | eliotmcintire | https://github.com/PredictiveEcology/reproducible/pull/17 | 3 | 17 |
PredictiveEcology | quickPlot | thin fixes – bugfixes, SpatialPolygonsDataFrames | 2018-03-26 05:32:12 | MERGED | eliotmcintire | https://github.com/PredictiveEcology/quickPlot/pull/13 | 2 | 13 |
ptl93 | AEDA | added getEps | 2018-03-25 17:12:44 | MERGED | MiGraber | https://github.com/ptl93/AEDA/pull/39 | 5 | 39 |
Let’s have a look at the time series of comments.
library("ggplot2") library("hrbrthemes") ggplot(lintrbot_act) + geom_segment(aes(x = created_at, xend = created_at), y = 0, yend = 1) + theme_ipsum(base_size = 16, strip_text_size = 16) + ggtitle("Lintr Bot's work events", subtitle = "Each vertical line indicates Lintr Bot's commenting on a pull request")
So it seems that Lintr Bot works quite regularly, with some intense periods! I won’t look at the number of repos over time to differentiate activity due to more activity in a few repos or due to several repos, but I’m curious to see who was helped by Lintr Bot.
library("magrittr") dplyr::group_by(lintrbot_act, owner) %>% dplyr::summarize(repos = toString(unique(repo)), n = n()) %>% dplyr::arrange(- n) %>% dplyr::filter(n >= 5) %>% knitr::kable()
owner | repos | n |
---|---|---|
RTMC | tmc-rstudio, tmc-r-tester, tmc-r | 64 |
mlr-org | mlr | 38 |
PredictiveEcology | reproducible, SpaDES.core, quickPlot, SpaDES.tools, SpaDES.shiny | 26 |
Azure | doAzureParallel | 16 |
HealthCatalyst | healthcareai-r | 16 |
neuropsychology | psycho.R | 11 |
ropensci | drake | 11 |
hbc | bcbioRNASeq, bcbioSingleCell | 5 |
I must say that besides the recently onboarded rOpenSci package drake
by Will Landau, I don’t know any of these repos!
Hire Lintr Bot?
What about you? Would you let that cute little bot help you?
Note, a good package to improve code style automatically is styler
by Kirill Müller and Lorenz Walthert, so if Lintr Bot has a lot to tell you, give styler::style_pkg()
a try!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.