Time series cross-validation 4: forecasting the S&P 500
[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers.]
I finally got around to publishing my time series cross-validation package to GitHub, and I plan to push it to CRAN shortly.
You can clone the repo using GitHub for Mac, GitHub for Windows, or the command line on Linux, and then run the following script to check it out:
This script downloads monthly data for the S&P 500 (adjusted for splits and dividends) and, for each month from 1995 to the present, fits a naive model, an auto.arima() model, and an ets() model to the past 5 years' worth of data, then uses those models to predict S&P 500 prices for the next 12 months (note that the progress bar doesn't update if you register a parallel backend; I can't figure out how to fix this bug). A hand-rolled sketch of a single cross-validation step follows the script:
#Setup
rm(list = ls(all = TRUE))
setwd('path.to/cv.ts')  #Path to your local clone of the cv.ts repo

#Load Packages
require(forecast)
require(doParallel)
source('R/cv.ts.R')
source('R/forecast functions.R')

#Download S&P 500 data and adjust for splits/dividends
library(quantmod)
getSymbols('^GSPC', from='1990-01-01')
GSPC <- adjustOHLC(GSPC, symbol.name='^GSPC')

#Convert to monthly closing prices
GSPC <- to.monthly(GSPC, indexAt='lastof')
GSPC <- Cl(GSPC)

#Convert from xts to ts
GSPC <- ts(GSPC, start=c(1990,1), frequency=12)

#Start a cluster to speed up cross-validation
cl <- makeCluster(4, type='SOCK')
registerDoParallel(cl)

#Define cross-validation parameters
myControl <- tseriesControl(
  minObs=60,        #Train on a 5-year (60-month) window
  stepSize=1,       #Move the forecast origin forward 1 month at a time
  maxHorizon=12,    #Forecast up to 12 months ahead
  fixedWindow=TRUE,
  preProcess=FALSE,
  ppMethod='guerrero',
  summaryFunc=tsSummary
)

#Forecast using several models
result_naive <- cv.ts(GSPC, naiveForecast, myControl)
myControl$preProcess <- TRUE  #Turn on preprocessing (guerrero method) for the arima/ets models
result_autoarima <- cv.ts(GSPC, auto.arimaForecast, myControl, ic='bic')
result_ets <- cv.ts(GSPC, etsForecast, myControl, ic='bic')

#Stop cluster
stopCluster(cl)

#Plot error
require(reshape2)
require(ggplot2)
plotData <- data.frame(
  horizon=1:12,
  naive=result_naive$results$MAPE[1:12],
  arima=result_autoarima$results$MAPE[1:12],
  ets=result_ets$results$MAPE[1:12]
)
plotData <- melt(plotData, id.vars='horizon', value.name='MAPE', variable.name='model')
ggplot(plotData, aes(horizon, MAPE, color=model)) + geom_line()
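For intuition, here is roughly what cv.ts does at each step, hand-rolled for a single, arbitrary origin (December 1999) using the forecast package directly. This is only a sketch: it skips the Box-Cox preprocessing applied to the arima/ets models above and uses one origin instead of every month since 1995, so the numbers won't match the script exactly.

#A single, hand-rolled cross-validation step (cv.ts automates this for every origin)
train <- window(GSPC, start=c(1995,1), end=c(1999,12))  #60 months of training data
test  <- window(GSPC, start=c(2000,1), end=c(2000,12))  #the next 12 months

fc_naive <- naive(train, h=12)
fc_arima <- forecast(auto.arima(train, ic='bic'), h=12)
fc_ets   <- forecast(ets(train, ic='bic'), h=12)

#MAPE at each of the 12 horizons
mape <- function(fc, actual) 100 * abs(actual - fc$mean) / actual
cbind(naive=mape(fc_naive, test), arima=mape(fc_arima, test), ets=mape(fc_ets, test))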
The naive model outperforms the arima and exponential smoothing models, both of which take into account seasonal patterns, trends, and mean-reversion! Furthermore, we're not just using any arima/exponential smoothing model: at each step we're selecting the best model, based on the last 5 years' worth of data. (The ets model slightly outperforms the naive model at the 3-month horizon, but not at the 2-month or 4-month horizons.)
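For context, the naive benchmark is just a random-walk forecast: the last observed price is carried forward at every horizon (the naiveForecast wrapper in the script presumably does the equivalent). A quick check with the forecast package:

#The naive forecast is flat: every horizon gets the last observed price
train <- window(GSPC, end=c(1999,12))
fc <- naive(train, h=12)
last_price <- as.numeric(tail(train, 1))
all(fc$mean == last_price)  #TRUE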
Forecasting equities prices is hard!