Who’s downloading the forecast package?

[This article was first published on R on Rob J Hyndman, and kindly contributed to R-bloggers.]

The GitHub page for the forecast package currently shows the following information:

Note the downloads figure: 264K/month. I know the package is popular, but that seems crazy. Also, the downloads figure on GitHub only counts downloads from the RStudio mirror, and ignores downloads from the other 125 mirrors around the world.

Here are the top ten downloaded packages from the last month:

library(cranlogs)
cran_top_downloads(when='last-month')

 rank   package  count       from         to
    1       zoo 308290 2015-11-09 2015-12-08
    2  forecast 263797 2015-11-09 2015-12-08
    3      Rcpp 260636 2015-11-09 2015-12-08
    4    lmtest 258810 2015-11-09 2015-12-08
    5       fpp 244989 2015-11-09 2015-12-08
    6 expsmooth 244179 2015-11-09 2015-12-08
    7       fma 243556 2015-11-09 2015-12-08
    8   tseries 243172 2015-11-09 2015-12-08
    9   stringi 199384 2015-11-09 2015-12-08
   10   ggplot2 192072 2015-11-09 2015-12-08

OK, that is very weird. Four of those packages are mine (forecast, fpp, expsmooth, and fma), and zoo, Rcpp, lmtest and tseries are all dependencies of forecast. Further, expsmooth, fma and forecast are all dependencies of fpp. So it looks like a lot of people were installing fpp plus all its dependencies.
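One rough way to see the pattern is to express each count relative to the fpp count. The sketch below simply re-enters the numbers from the table above by hand (it does not query CRAN), and the column name `ratio_to_fpp` is my own choice for illustration.

```r
# Counts copied by hand from the cran_top_downloads() output above.
top <- data.frame(
  package = c("zoo", "forecast", "Rcpp", "lmtest", "fpp",
              "expsmooth", "fma", "tseries", "stringi", "ggplot2"),
  count = c(308290, 263797, 260636, 258810, 244989,
            244179, 243556, 243172, 199384, 192072)
)

# Express every count as a multiple of the fpp count.
fpp_count <- top$count[top$package == "fpp"]
top$ratio_to_fpp <- round(top$count / fpp_count, 3)

print(top)
```

Packages whose ratio sits very close to 1 were downloaded about as often as fpp itself, which is what you would expect if most of those downloads came from fpp installations pulling in their dependencies.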

If we check the daily downloads for 2015, we get the following plot.

library(ggplot2)
data <- cran_downloads(packages=c("forecast","fpp"), from="2015-01-01")
qplot(date, count, data=data, geom="line", colour=package, 
  ylab="Downloads", main="Package downloads in past year")

Sure enough, the last few weeks show a very strong correspondence between fpp and forecast downloads, while previously most forecast downloads were not correlated with fpp downloads.
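The visual impression could be backed up with a number. The following is a minimal sketch, assuming a data frame with the `date`, `count` and `package` columns that `cran_downloads()` returns; the function name `cor_by_period` and the cutoff date in the commented call are hypothetical choices of mine, not part of cranlogs.

```r
# Correlation between two packages' daily download counts,
# computed separately before and from a cutoff date onwards.
# `downloads` is assumed to have columns date (Date), count, package.
cor_by_period <- function(downloads, pkg_a, pkg_b, cutoff) {
  a <- downloads[downloads$package == pkg_a, c("date", "count")]
  b <- downloads[downloads$package == pkg_b, c("date", "count")]

  # Align the two series on date; the counts become count_a and count_b.
  both <- merge(a, b, by = "date", suffixes = c("_a", "_b"))

  recent <- both$date >= as.Date(cutoff)
  c(before = cor(both$count_a[!recent], both$count_b[!recent]),
    after  = cor(both$count_a[recent],  both$count_b[recent]))
}

# Hypothetical usage with the `data` object created above
# (the cutoff is an assumed date, chosen only for illustration):
# cor_by_period(data, "forecast", "fpp", cutoff = "2015-11-01")
```

A much higher "after" value than "before" value would be consistent with the plot.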

So the recent spike in forecast package downloads is clearly being driven by fpp installations. But why so many in one month, and most of them in one week? The fpp package is used by people studying forecasting with my textbook (Forecasting: principles and practice, coauthored with George Athanasopoulos), but there wouldn’t be that many people in the world studying forecasting. I wonder if some large organization installed fpp on every computer they own as part of some generic setup. But surely any sys admin who knew what they were doing would only download it once.
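To see how concentrated the downloads are in a single week, the daily counts can be summed by week. This is only a sketch under the same assumption about the `date`, `count` and `package` columns; `weekly_totals` is a hypothetical helper name, and weeks here start on Monday, which is the default for `cut()` on dates.

```r
# Sum daily download counts into weekly totals per package.
# `downloads` is assumed to have columns date (Date), count, package.
weekly_totals <- function(downloads) {
  # Label each day with the first day (Monday) of its week.
  downloads$week <- as.Date(cut(downloads$date, breaks = "week"))
  aggregate(count ~ package + week, data = downloads, FUN = sum)
}

# Hypothetical usage with the `data` object created above:
# weekly_totals(data)
```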

Anyone like to own up?
