R-bloggers

Forecasting Best Practices, from Microsoft

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

Microsoft has released a GitHub repository to share best practices for time series forecasting. From the repo:

Time series forecasting is one of the most important topics in data science. Almost every business needs to predict the future in order to make better decisions and allocate resources more effectively.

This repository provides examples and best practice guidelines for building forecasting solutions. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in forecasting algorithms to build solutions and operationalize them. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud.

The repository includes detailed examples of various time series modeling techniques, as Jupyter Notebooks for Python, and R Markdown documents for R. It also includes Python notebooks to fit time series models in the Azure Machine Learning service, and then operationalize the forecasts as a web service.

The R examples demonstrate several forecasting techniques on a single dataset: refrigerated orange juice sales from 83 stores (sourced from the bayesm package). The forecasting techniques vary (mean forecasting with interpolation, ARIMA, exponential smoothing, and additive models), but all make extensive use of the tidyverts suite of packages, which provides "tidy time series forecasting for R". The forecasting methods themselves are explained in detail in the book Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos (Monash University), which is readable online.
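To make two of the techniques named above concrete, here is a minimal plain-Python sketch of a mean forecast and of simple exponential smoothing. It is not taken from the repository, whose R examples rely on the tidyverts packages. The function names, the `alpha` value, and the `weekly_sales` numbers are all made up for illustration.

```python
from typing import List, Sequence


def mean_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Forecast every future step as the mean of the observed history."""
    if not history:
        raise ValueError("history must not be empty")
    level = sum(history) / len(history)
    return [level] * horizon


def simple_exp_smoothing(
    history: Sequence[float], alpha: float, horizon: int
) -> List[float]:
    """Simple exponential smoothing: a flat forecast at the last smoothed level.

    alpha is the smoothing weight in (0, 1]; larger values weight recent
    observations more heavily. The level is initialised to the first
    observation, which is one common (assumed) choice among several.
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]
    for y in history[1:]:
        # Blend the new observation with the previous smoothed level.
        level = alpha * y + (1.0 - alpha) * level
    return [level] * horizon


# Made-up illustrative data, not the orange juice dataset.
weekly_sales = [10.0, 12.0, 14.0, 16.0]

print(mean_forecast(weekly_sales, horizon=2))              # [13.0, 13.0]
print(simple_exp_smoothing(weekly_sales, 0.5, horizon=2))  # [14.25, 14.25]
```

On this upward-trending toy series, the smoothed forecast sits closer to the latest observations than the overall mean does, which is the basic motivation for weighting recent data more heavily.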

You can try out the examples yourself by cloning the repository and knitting the R Markdown files in R. If you have git installed, a quick and easy way to do this is with RStudio. Choose File > New Project > Version Control > Git, and enter https://github.com/microsoft/forecasting in the Repository URL field. (You might prefer to fork the repository first.)

Open each .Rmd file in turn, accept the prompt to install packages, and click the Knit button to generate the document. The computations can take a while (particularly the Prophet Models example), but on a multi-core machine the notebooks use the parallel package to speed things up. If you don't want to wait, the repository includes rendered HTML versions of the documents. GitHub doesn't render R Markdown files, and the HTML files are hard to read within GitHub, so to make things easier I used the trick of creating a gh-pages branch in my fork so I could link to them directly below:

This repository will be updated over time, and contributions are welcome as pull requests to the repository linked below.

GitHub (Microsoft): Forecasting Best Practices

