Time series cross-validation 5
[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers.]
The caret package for R now supports time series cross-validation! (Look for version 5.15-052 in the news file.) You can use the `createTimeSlices` function to do time series cross-validation with either a fixed window or a growing window. This function generates a list of indexes for the training sets and a list of indexes for the test sets, which you can then pass to `trainControl`.
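As a minimal sketch of the difference between the two window types, the snippet below calls `createTimeSlices` on a toy series of ten points. The toy series and the window sizes are assumptions chosen for illustration, and the last line assumes the `index` and `indexOut` arguments of `trainControl` as the way to hand over precomputed slices.

```r
library(caret)

# Toy series of 10 observations (illustrative only, not the post's data)
y <- 1:10

# Fixed window: every training set has exactly 5 points
fixed <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = TRUE)

# Growing window: every training set starts at observation 1
growing <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = FALSE)

# Compare the second slice of each scheme
fixed$train[[2]]    # 2 3 4 5 6
growing$train[[2]]  # 1 2 3 4 5 6
fixed$test[[2]]     # 7 8

# Pass the precomputed slices to trainControl
ctrl <- trainControl(index = fixed$train, indexOut = fixed$test)
```

With the fixed window the oldest observation drops out as the window slides forward, while the growing window keeps all of the history.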
Caret does not currently support univariate time series models (like `arima`, `auto.arima` and `ets`), but perhaps that functionality is coming in the future? I’d also love to see someone write a `timeSeriesSummary` function for caret that calculates error at each horizon in the test set, and a `createTimeResamples` function, perhaps using the Maximum Entropy Bootstrap.
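To make the "error at each horizon" idea concrete, here is a rough base R sketch. The helper `horizonRMSE` is hypothetical and not part of caret; it assumes you already have the observed and predicted values for each test slice, with every slice covering the same number of steps ahead.

```r
# Hypothetical helper, not part of caret: RMSE at each forecast horizon.
# `obs` and `pred` are lists with one numeric vector per test slice,
# ordered so that element h of each vector is the h-step-ahead value.
horizonRMSE <- function(obs, pred) {
  errors <- Map(function(o, p) o - p, obs, pred)
  errors <- do.call(rbind, errors)   # rows = slices, columns = horizons
  rmse <- sqrt(colMeans(errors^2))
  names(rmse) <- paste0("h", seq_along(rmse))
  rmse
}

# Toy example: two test slices, each with a horizon of 3
obs  <- list(c(1, 2, 3), c(2, 3, 4))
pred <- list(c(1, 1, 1), c(2, 2, 2))
horizonRMSE(obs, pred)   # h1 = 0, h2 = 1, h3 = 2
```

In this toy case the error grows with the horizon, which is the kind of pattern a per-horizon summary would make visible.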
Here’s a quick demo of how you might use this new functionality:
# Load the dataset, adjust, and convert to monthly returns
set.seed(42)
library(quantmod)
getSymbols('^GSPC', from='1990-01-01')
GSPC <- adjustOHLC(GSPC, symbol.name='^GSPC')
GSPC <- to.monthly(GSPC, indexAt='lastof')
Target <- ClCl(GSPC)
# Calculate some technical indicators
periods <- c(3, 6, 9, 12)
Lags <- data.frame(lapply(c(1:2, periods), function(x) Lag(Target, x)))
EMAs <- data.frame(lapply(periods, function(x) {
  out <- EMA(Target, x)
  names(out) <- paste('EMA', x, sep='.')
  return(out)
}))
RSIs <- data.frame(lapply(periods, function(x) {
  out <- RSI(Cl(GSPC), x)
  names(out) <- paste('RSI', x, sep='.')
  return(out)
}))
DVIs <- data.frame(lapply(periods, function(x) {
  out <- DVI(Cl(GSPC), x)
  out <- out$dvi
  names(out) <- paste('DVI', x, sep='.')
  return(out)
}))
dat <- data.frame(Next(Target), Lags, EMAs, RSIs, DVIs)
dat <- na.omit(dat)
# Create a summary function to calculate trade costs and cumulative profit in the test set
mySummary <- function (data, lev = NULL, model = NULL) {
  positions <- sign(data[, "pred"])                 # long (+1) or short (-1)
  trades <- abs(c(1, diff(positions)))              # units traded in each period
  profits <- positions*data[, "obs"] - trades*0.01  # subtract a 1% cost per unit traded
  profit <- prod(1+profits)                         # cumulative return over the test set
  names(profit) <- 'profit'
  return(profit)
}
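To see how the position and trade vectors in the summary function behave, here is a small sketch on made-up numbers (the `pred` and `obs` values are assumptions for illustration, not model output). Opening the first position counts as one trade, and flipping from long to short or back counts as two.

```r
# Toy predictions and realised returns (made-up numbers, for illustration only)
pred <- c(0.02, 0.01, -0.03, -0.01, 0.04)
obs  <- c(0.01, -0.02, -0.01, 0.03, 0.02)

positions <- sign(pred)               # 1  1 -1 -1  1
trades <- abs(c(1, diff(positions)))  # 1  0  2  0  2

# Cumulative return before any trading costs are applied
gross <- prod(1 + positions * obs)
```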
# Build the model!
library(caret)
model <- train(dat[,-1], dat[,1], method='rpart',
               metric='profit', maximize=TRUE,
               trControl=trainControl(
                 method='timeslice',
                 initialWindow=12, fixedWindow=TRUE,
                 horizon=12, summaryFunction=mySummary,
                 verboseIter=TRUE))
model