Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Given a data frame whose columns contain time series data, let’s say that we are interested in running an automatic forecasting algorithm on several of those columns. Furthermore, we want to train each model on a fixed number of observations and assess how well it forecasts the remaining, held-out values. Based on those tests, we will then estimate a model on the full series. This is a fairly simple undertaking, but let’s walk through it. My preference for such procedures is to loop through each column and append the results to a nested list.
First, let’s create some data.
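set.seed(123)  # fix the seed so the simulated values are reproducible
# 61 daily dates (2010-01-01 to 2010-03-02) and three random, non-negative series
ddat <- data.frame(date = seq(as.Date("2010/01/01"), as.Date("2010/03/02"), by = 1),
                   value1 = abs(round(rnorm(61), 2)),
                   value2 = abs(round(rnorm(61), 2)),
                   value3 = abs(round(rnorm(61), 2)))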
head(ddat)
tail(ddat)
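The train-then-test idea from the introduction can be sketched on a single series before batching it across columns. The example below is only an illustration under stated assumptions: it simulates its own series, uses base R's arima() with a fixed AR(1) order instead of the automatic model selection used later in this post, and scores the forecasts with a hand-computed mean absolute error. The names y, train, test, and mae are made up for this sketch.

```r
# Self-contained sketch: hold out the last 6 of 61 observations of one series.
set.seed(42)                       # arbitrary seed, only for reproducibility
y <- abs(round(rnorm(61), 2))      # simulated stand-in for one value column
n_train <- 55
train <- y[1:n_train]
test  <- y[(n_train + 1):length(y)]

# Fixed AR(1) model from base R; the order is an assumption made for illustration
fit  <- arima(train, order = c(1, 0, 0), method = "ML")
pred <- predict(fit, n.ahead = length(test))$pred

# Mean absolute error of the forecasts on the held-out observations
mae <- mean(abs(test - as.numeric(pred)))
mae
```

The forecast horizon is taken from the length of the test set rather than typed in by hand, so the split can be changed by editing n_train alone.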
We want to forecast future values of the three value columns. Because we want to save the results of these models in a list, let’s begin by creating a list with one named element per column of our data frame (including the date column).
lst.names <- colnames(ddat)
lst <- vector("list", length(lst.names))
names(lst) <- lst.names
lst
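To see how results end up filed under each name, here is a small sketch of filling one slot of such a pre-allocated list. The column names and stored numbers are placeholders invented for the illustration, not output from any model.

```r
# Pre-allocate a named list, one (initially NULL) element per column name
cols <- c("date", "value1", "value2")        # hypothetical column names
res <- vector("list", length(cols))
names(res) <- cols

# File several results under one name by storing a nested list
res[["value1"]] <- list(forecast = c(0.5, 0.7),   # placeholder numbers
                        accuracy = 0.12)

# Further entries can be added to the nested list afterwards
res[["value1"]][["note"]] <- "placeholder entry"

str(res)
```

Elements that have not been filled yet, such as res[["value2"]], simply remain NULL.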
I’ve gone ahead and written a user-defined function that handles the batch forecasting process. It takes two arguments: a data frame, and an argument with a default value that specifies the number of observations used in the training set. The model estimates, forecasts, and diagnostic measures are saved in a nested list and filed under the appropriate variable name.
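library(forecast)  # provides auto.arima(), forecast(), accuracy() and ma()
batch <- function(data, n_train = 55) {
  lst.names <- colnames(data)
  lst <- vector("list", length(lst.names))
  names(lst) <- lst.names
  test_rows <- (n_train + 1):nrow(data)
  n_test <- length(test_rows)  # forecast horizon for the test period
  # The dates live in column 1 and are shared by every series, so store them once
  lst[[1]][["train_dates"]] <- data[1:n_train, 1]
  lst[[1]][["test_dates"]] <- data[test_rows, 1]
  for (i in 2:ncol(data)) {
    # Fit on the training rows, then forecast over the test rows
    est <- auto.arima(data[1:n_train, i])
    fcas <- forecast(est, h = n_test)$mean
    acc <- accuracy(fcas, data[test_rows, i])
    fcas_upd <- data.frame(date = data[test_rows, 1], forecast = fcas, actual = data[test_rows, i])
    lst[[i]][["estimates"]] <- est
    lst[[i]][["forecast"]] <- fcas
    lst[[i]][["forecast_f"]] <- fcas_upd
    lst[[i]][["accuracy"]] <- acc
    # cond1: the forecast is flat (first and last forecast values are equal)
    cond1 <- diff(range(fcas[1], fcas[length(fcas)])) == 0
    # cond2: the test-set mean absolute error is at least 0.025
    cond2 <- acc[, "MAE"] >= 0.025
    if (cond1 || cond2) {
      # Fall back to forecasting a 3-period moving average of the full series
      mfcas <- forecast(ma(data[, i], order = 3), h = 5)
      lst[[i]][["moving_average"]] <- mfcas
    } else {
      # Otherwise refit the ARIMA model on the full series and forecast 5 steps ahead
      est2 <- auto.arima(data[, i])
      fcas2 <- forecast(est2, h = 5)$mean
      lst[[i]][["estimates_full"]] <- est2
      lst[[i]][["forecast_full"]] <- fcas2
    }
  }
  return(lst)
}
batch(ddat)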
This isn’t the prettiest code, but it gets the job done. Note that the lst populated inside the function is local to it and won’t be available in the global environment (the lst created earlier is a separate object and is left untouched). Instead, I chose to simply print out the contents of the list after the function is evaluated.
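If the results are needed later, one option is to capture the return value instead of only printing it. The sketch below uses a toy stand-in function, make_results(), which is hypothetical and returns placeholder numbers; the same pattern would apply to the value returned by batch().

```r
# Toy stand-in for a function that builds a nested list internally
make_results <- function() {
  out <- list(value1 = list(forecast = c(1, 2, 3)))  # placeholder values
  out   # the list is handed back to the caller as the return value
}

# Assigning the return value keeps the nested list in the calling environment
results <- make_results()

# Individual pieces can then be pulled out by name
results[["value1"]][["forecast"]]
```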
