Time horizon in forecasting, and rules of thumb
[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I recently received an email about forecasting and rules of thumb. “In the profession […] a rule of thumb is passed down according to which one should use a history twice as long as the forecast horizon: 20 years of data for a 10-year forecast, etc. I would like to know whether this rule might, by any chance, have a theoretical foundation, even if the ratio turns out to be not 2 to 1, but 3 to 1 or 1 to 1, for example.” To summarize briefly, the rule is to use a 2-1 ratio between the period of observation and the forecast horizon. And the interesting question is whether there is any justification for such a rule…
At first, I remembered a rule of thumb from the book by Box and Jenkins, which states that it is meaningless to look at autocorrelations when the lag exceeds the sample size divided by 6. So with 12 years of data, autocorrelations at lags beyond two years are useless. But that is not the rule mentioned here. So I looked at some datasets, and some standard time series models.
- It depends on the series
Consider for instance the airline passengers series, with a forecast horizon equal to half the length of the observed series,
library(forecast)
X = AirPassengers
ETS = ets(X)
# horizon = half of the sample length, i.e. the 2-1 ratio
plot(forecast(ETS, h = length(X)/2))
or some sales in a big store.
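To put numbers on the two rules above, here is a minimal sketch that uses only base R (`stats` and `datasets`), not the `forecast` package. For each series it reports the Box and Jenkins lag cap (sample size over 6), the horizon implied by the 2-1 ratio, and the width of a 95% prediction interval at the first and last forecast step. The helper `half_sample_forecast` is a hypothetical name, the additive Holt-Winters model is only one possible choice, and `co2` is a stand-in second series (the store sales data are not available here).

```r
# Base R only. co2 is a stand-in second series, not the one used in the post.
half_sample_forecast <- function(x) {
  n <- length(x)
  h <- n %/% 2          # 2-1 ratio: horizon is half of the history
  max_lag <- n %/% 6    # Box and Jenkins cap on meaningful autocorrelation lags
  fit <- HoltWinters(x) # additive Holt-Winters, smoothing parameters estimated
  p <- predict(fit, n.ahead = h, prediction.interval = TRUE, level = 0.95)
  width <- as.numeric(p[, "upr"] - p[, "lwr"])  # interval width at each step
  data.frame(
    n = n, horizon = h, max_acf_lag = max_lag,
    width_first = width[1], width_last = width[h],
    rel_width_last = width[h] / abs(as.numeric(p[h, "fit"]))
  )
}

res <- rbind(AirPassengers = half_sample_forecast(AirPassengers),
             co2 = half_sample_forecast(co2))
print(res)
```

With the 144 monthly observations of `AirPassengers`, the lag cap is 24 months and the 2-1 horizon is 72 months. The column `rel_width_last` (interval width divided by the point forecast at the final step) gives a rough way to compare how much the same ratio costs on different series.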
- It depends on the model
With that kind of assumption, we see that the 2-1 ratio is useless, since we can get forecasts up to any horizon. But that does not seem very robust. For instance, if we consider exponential smoothing techniques, we can obtain
Which is rather different. And with the 2-1 ratio, there is obviously a lot of uncertainty at the end! It would be even worse if we assumed that the series is a random walk. And actually a dozen models, at least, could be considered: ARIMA, seasonal ARIMA, Holt-Winters, exponential smoothing, etc.
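The dependence on the model can be sketched by fitting a few of the candidates listed above to the same series and comparing forecast standard errors at the 2-1 horizon. This is an illustration under assumptions that are not in the original post: the series is log-transformed, the seasonal ARIMA is the (0,1,1)(0,1,1) specification with period 12, and everything is fitted with base R (`arima`, `HoltWinters`) rather than the `forecast` package. For a random walk without drift, the h-step forecast variance is h times the innovation variance, so the standard error grows like the square root of h.

```r
x <- log(AirPassengers)   # log scale (an assumption, common for this series)
h <- length(x) %/% 2      # 2-1 ratio: forecast half as far as the history

# Seasonal ARIMA(0,1,1)(0,1,1) with period 12
sarima <- arima(x, order = c(0, 1, 1),
                seasonal = list(order = c(0, 1, 1), period = 12))
# Random walk without drift, written as ARIMA(0,1,0)
rw <- arima(x, order = c(0, 1, 0))
# Additive Holt-Winters on the log scale
hw <- HoltWinters(x)

se_sarima <- as.numeric(predict(sarima, n.ahead = h)$se)
se_rw     <- as.numeric(predict(rw, n.ahead = h)$se)
p_hw      <- predict(hw, n.ahead = h, prediction.interval = TRUE, level = 0.95)
# Back out a standard error from the half-width of the 95% interval
se_hw     <- as.numeric(p_hw[, "upr"] - p_hw[, "lwr"]) / (2 * qnorm(0.975))

steps <- c(1, 12, h)
cmp <- data.frame(horizon = steps,
                  sarima = se_sarima[steps],
                  holt_winters = se_hw[steps],
                  random_walk = se_rw[steps])
print(round(cmp, 3))
```

Reading the table row by row shows how quickly each model loses precision between one step ahead, one year ahead, and the full 2-1 horizon. The same ratio therefore means something different under each model.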
So I do not see any theoretical justification for that rule of thumb. Obviously, the maximum horizon cannot be very far away if the series is non-stationary, with a very irregular pattern and a lot of noise… So we’re back at the beginning. If anyone is willing to share his or her experience, comments are open.