
The most prolific package maintainers on CRAN

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

During a discussion with some other members of the R Consortium, the question came up: who maintains the most packages on CRAN? DataCamp maintains a list of the most active maintainers by downloads, but in this case we were interested in the total number of packages per maintainer. Fortunately, this is pretty easy to figure out thanks to the CRAN repository tools now included in R, and a little dplyr (see the code below) gives the answer quickly[*].

And the answer? The most prolific maintainer is Scott Chamberlain from rOpenSci, who is currently the maintainer of 77 packages. Here's a list of the top 20:

                 Maint     n
 1  Scott Chamberlain     77
 2  Dirk Eddelbuettel     53
 3        Jeroen Ooms     40
 4     Hadley Wickham     39
 5       Gábor Csárdi     37
 6           ORPHANED     37
 7   Thomas J. Leeper     29
 8          Bob Rudis     28
 9   Henrik Bengtsson     28
10        Kurt Hornik     28
11       Oliver Keyes     28
12    Martin Maechler     27
13     Richard Cotton     27
14 Robin K. S. Hankin     24
15      Simon Urbanek     24
16      Kirill Müller     23
17    Torsten Hothorn     23
18      Achim Zeileis     22
19       Paul Gilbert     21
20          Yihui Xie     21

(That list of orphaned packages with no current maintainer includes XML, d3heatmap, and flexclust, to name just 3 of the 37.) Here's the R code used to calculate the top 20:
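The embedded listing is not reproduced on this page, so what follows is only a minimal sketch of the same idea and not the author's code. It uses base R rather than dplyr, and it assumes a data frame with a `Maintainer` column in the "Name <email>" form that `tools::CRAN_package_db()` returns. The helper name `top_maintainers` and both regular expressions are assumptions made for illustration.

```r
# Hypothetical helper (not from the original post): count packages per
# maintainer, given a data frame with a `Maintainer` column.
top_maintainers <- function(db, n = 20) {
  maint <- db$Maintainer

  # Drop the trailing "<email>" part, keeping only the display name.
  maint <- sub("[[:space:]]*<[^>]*>.*$", "", maint)

  # Strip surrounding quotes and whitespace, so that "Jane Doe" written
  # with and without quotes is counted as one maintainer.
  maint <- gsub("^[\"'[:space:]]+|[\"'[:space:]]+$", "", maint)

  # Tabulate the names and sort by package count, largest first.
  counts <- sort(table(maint), decreasing = TRUE)

  head(
    data.frame(
      Maint = names(counts),
      n = as.integer(counts),
      stringsAsFactors = FALSE
    ),
    n
  )
}
```

With an internet connection, `top_maintainers(tools::CRAN_package_db())` would produce a table of the same shape as the one above, although the counts will differ from those in this post because CRAN changes daily.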

[*] Well, it would have been quick, until I noticed that some maintainers had two forms of their name in the database, one with surrounding quotes and one without. It seemed like it was going to be trivial to fix with a regular expression, but it took me longer than I hoped to come up with the final regexp on line 4 above, which is now barely distinguishable from line noise. As usual, there's an xkcd for this situation:

 
