Time series cross-validation 5

[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers.]

The caret package for R now supports time series cross-validation! (Look for version 5.15-052 in the news file.) You can use the `createTimeSlices` function to do time series cross-validation with either a fixed window or a growing window. This function generates a list of indexes for the training sets and a matching list of indexes for the test sets, which you can then pass to `trainControl` (through its `index` and `indexOut` arguments).
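
To make the two window types concrete, here is a minimal sketch on a toy vector of 10 observations (the data and the window sizes are made up for illustration). `createTimeSlices` returns a list with `train` and `test` elements, each holding one index vector per slice:

```r
library(caret)

# Toy series: only its length matters here
y <- 1:10

# Fixed window: each training slice has exactly 5 observations
fixed <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = TRUE)

# Growing window: each training slice starts at observation 1
growing <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = FALSE)

fixed$train[[4]]    # 4 5 6 7 8
growing$train[[4]]  # 1 2 3 4 5 6 7 8
fixed$test[[4]]     # 9 10
```

Both calls produce the same test slices; only the start of each training slice differs.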



Caret does not currently support univariate time series models (like `arima`, `auto.arima` and `ets`), but perhaps that functionality is coming in the future? I’d also love to see someone write a `timeSeriesSummary` function for caret that calculates error at each horizon in the test set, and a `createTimeResamples` function, perhaps using the Maximum Entropy Bootstrap.
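
In the meantime, you can get part of the way there by looping over the slices yourself. The following is a rough sketch, not a caret feature: it fits `arima` from the stats package on each training slice of a simulated AR(1) series and averages the absolute error at each step ahead. The simulated data, the window sizes, the AR(1) order, and the `mae_by_horizon` name are all placeholders chosen for illustration.

```r
library(caret)

set.seed(1)
# Simulated AR(1) series standing in for real data (an assumption)
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 60))

horizon <- 3
slices <- createTimeSlices(y, initialWindow = 36, horizon = horizon,
                           fixedWindow = TRUE)

# One row per slice, one column per forecast step
abs_err <- t(mapply(function(tr, te) {
  fit <- arima(y[tr], order = c(1, 0, 0), method = 'ML')
  fc <- predict(fit, n.ahead = horizon)$pred
  abs(as.numeric(fc) - y[te])
}, slices$train, slices$test))

# Hypothetical "error at each horizon" summary: mean absolute error per step
mae_by_horizon <- colMeans(abs_err)
names(mae_by_horizon) <- paste0('h', seq_len(horizon))
mae_by_horizon
```

Averaging by column rather than over the whole matrix is what keeps the one-step-ahead error separate from the three-step-ahead error.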

Here’s a quick demo of how you might use this new functionality:
#Load the dataset, adjust, and convert to monthly returns
set.seed(42)
library(quantmod)
getSymbols('^GSPC', from='1990-01-01')  #Creates an object named GSPC
GSPC <- adjustOHLC(GSPC, symbol.name='^GSPC')
GSPC <- to.monthly(GSPC, indexAt='lastof')
Target <- ClCl(GSPC)  #Monthly close-to-close returns

#Calculate some technical indicators
periods <- c(3, 6, 9, 12)
Lags <- data.frame(lapply(c(1:2, periods), function(x) Lag(Target, x)))
EMAs <- data.frame(lapply(periods, function(x) {
  out <- EMA(Target, x)
  names(out) <- paste('EMA', x, sep='.')
  return(out)
}))
RSIs <- data.frame(lapply(periods, function(x) {
  out <- RSI(Cl(GSPC), x)
  names(out) <- paste('RSI', x, sep='.')
  return(out)
}))
DVIs <- data.frame(lapply(periods, function(x) {
  out <- DVI(Cl(GSPC), x)
  out <- out$dvi
  names(out) <- paste('DVI', x, sep='.')
  return(out)
}))
#First column is next month's return (the value to predict)
dat <- data.frame(Next(Target), Lags, EMAs, RSIs, DVIs)
dat <- na.omit(dat)

#Create a summary function to calculate trade costs and cumulative profit in the test set
mySummary <- function (data, lev = NULL, model = NULL) {
  #Go long (+1) or short (-1) according to the sign of the predicted return
  positions <- sign(data[, "pred"])
  #Count position changes, including opening the first position
  trades <- abs(c(1,diff(positions)))
  #Subtract a 1% cost per trade from each period's return
  profits <- positions*data[, "obs"] - trades*0.01
  #Compound the per-period returns
  profit <- prod(1+profits)
  names(profit) <- 'profit'
  return(profit)
}

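
The trade-counting step is the least obvious part of this function, so here is a tiny standalone sketch with made-up predictions (the `pred` values are purely illustrative). A flip from long to short counts as two trades, and the leading 1 accounts for opening the first position:

```r
# Made-up predicted returns for five periods
pred <- c(0.02, 0.01, -0.03, -0.01, 0.04)

positions <- sign(pred)
positions  # 1  1 -1 -1  1

trades <- abs(c(1, diff(positions)))
trades     # 1 0 2 0 2
```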
#Build the model!
library(caret)
model <- train(dat[,-1], dat[,1], method='rpart',
               metric='profit', maximize=TRUE,
               trControl=trainControl(
                 method='timeslice',
                 initialWindow=12, fixedWindow=TRUE,
                 horizon=12, summaryFunction=mySummary,
                 verboseIter=TRUE))
model
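
If you would rather build the slices yourself, for example to inspect them before training, you can hand them to `trainControl` through its `index` and `indexOut` arguments instead of using `method='timeslice'`. Here is a minimal sketch on simulated data with a plain linear model, so it runs without downloading anything; the `toy` data frame, the coefficients, and the window sizes are assumptions for illustration only.

```r
library(caret)

set.seed(1)
n <- 60
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 0.5 * toy$x1 - 0.2 * toy$x2 + rnorm(n, sd = 0.1)

# Growing window: start with 24 observations, test on the next 12
slices <- createTimeSlices(toy$y, initialWindow = 24, horizon = 12,
                           fixedWindow = FALSE)

# Use the hand-built slices as the resampling scheme
ctrl <- trainControl(index = slices$train, indexOut = slices$test)
fit <- train(toy[, c('x1', 'x2')], toy$y, method = 'lm', trControl = ctrl)

# One row of resampled metrics per time slice
nrow(fit$resample)
```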
