So, before revealing a slight wrinkle on the last strategy I wrote about, I’d like to clear up a bit of confusion regarding Jaekle and Tomasini’s idea of a stable region.
Essentially, the entire idea *is* that similar parameter configurations behave in very similar ways, and so their returns should be highly correlated. A stable region does not rule out the strategy being overfit in other ways; rather, it means that incremental changes to a parameter should produce incremental changes in performance, rather than some sort of lucky spike in a sea of poor performance.
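To make that concrete, here's a minimal sketch of what a stable region implies, assuming the retsList matrix (one return column per SMA parameter) built in the script below: the return streams of neighboring parameterizations should be close to perfectly correlated.

# A minimal sketch of the stable-region intuition, assuming the retsList
# object (one return column per SMA parameter) built in the script below.
# In a stable region, neighboring parameterizations should be highly correlated.
adjacentCors <- sapply(1:(ncol(retsList)-1), function(i) {
  cor(retsList[,i], retsList[,i+1], use="complete.obs")
})
summary(adjacentCors) # values near 1 suggest smooth parameter behavior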
In any case, the one change to the strategy from last week is that rather than entering at the current close (that is, observe the close and execute at that same close), the strategy now enters at the next day's close.
Again, here’s the strategy script:
require(downloader)
require(PerformanceAnalytics)
require(TTR)

# VXV (3-month) and VXMT (6-month) volatility index data from the CBOE
download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv",
         destfile="vxvData.csv")
download("http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/vxmtdailyprices.csv",
         destfile="vxmtData.csv")

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
ratio <- Cl(vxv)/Cl(vxmt)

# long XIV and VXX histories -- requires downloader package
download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT", destfile="longXIV.txt")
download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", destfile="longVXX.txt")

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))

xiv <- merge(xiv, ratio, join='inner')
vxx <- merge(vxx, ratio, join='inner')
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

# one return stream per SMA lookback from 10 through 200
retsList <- list()
for(i in 10:200) {
  ratioSMA <- SMA(ratio, n=i)
  vxxSig <- lag(ratio > 1 & ratio > ratioSMA, 2)
  xivSig <- lag(ratio < 1 & ratio < ratioSMA, 2)
  rets <- vxxSig*vxxRets + xivSig*xivRets
  colnames(rets) <- i
  retsList[[i]] <- rets
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]
The one change I made to the code is that rather than go with the default lag value, I went with a lag of 2, which delays execution by one more close. A lag of zero would induce look-ahead bias, since the signal would be applied to the very return it was computed from. In any case, let's run through the process of analyzing for robustness again.
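For anyone unclear on what the different lag values mean for xts signals, here's a quick toy illustration (hypothetical data, not from the strategy):

# Toy illustration of signal lag in xts (hypothetical data). With daily
# close-to-close returns, a signal lagged by 2 means: observe today's close,
# enter at tomorrow's close, and earn the return realized over the day after.
# A lag of 1 means observe the close and execute at that same close, while
# a lag of 0 would apply a signal to the same day's return it was computed from.
require(xts)
dates <- as.Date("2016-01-04") + 0:4
sig <- xts(c(1, 1, 0, 1, 0), order.by=dates)
shifted <- merge(sig, lag(sig, 1), lag(sig, 2))
colnames(shifted) <- c("raw", "lag1", "lag2")
shifted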
rankComparison <- function(rets, perfAfun="Return.cumulative") {
  fun <- match.fun(perfAfun)

  # rank configurations within each month, then average those monthly ranks
  monthlyFun <- apply.monthly(rets, fun)
  monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank))
  meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean)
  rankMMR <- rank(meanMonthlyRank)

  # rank configurations on the aggregate (full-sample) metric
  aggFun <- fun(rets)
  aggFunRank <- rank(aggFun)

  # side-by-side table of both ranks, ordered by their sum, highest first
  bothRanks <- data.frame(cbind(aggFunRank, rankMMR, names(rankMMR)), stringsAsFactors=FALSE)
  names(bothRanks) <- c("aggregateRank", "averageMonthlyRank", "configName")
  bothRanks$aggregateRank <- as.numeric(bothRanks$aggregateRank)
  bothRanks$averageMonthlyRank <- as.numeric(bothRanks$averageMonthlyRank)
  bothRanks$sum <- bothRanks[,1] + bothRanks[,2]
  bothRanks <- bothRanks[order(bothRanks$sum, decreasing=TRUE),]

  plot(aggFunRank~rankMMR, main=perfAfun)
  print(cor(aggFunRank, meanMonthlyRank))
  return(bothRanks)
}

retRank <- rankComparison(retsList)
sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")
In this case, I added some functionality: beyond the plotting and the correlation, the function now spits out a table comparing the rank on the aggregate metric with the rank of the average monthly rank (again, the dual ranking layer), ordered by the sum of the two ranks, starting with the highest.
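As a toy illustration of the dual ranking layer (made-up numbers, not strategy output), here's the average-monthly-rank computation on a tiny matrix:

# Toy illustration of the dual ranking layer: three configurations and
# three months of made-up monthly returns (rows = months, cols = configs).
toy <- matrix(c( 0.02, 0.01, -0.01,
                 0.03, 0.04,  0.00,
                -0.01, 0.02,  0.01), nrow=3, byrow=TRUE)
monthlyRank <- t(apply(toy, 1, rank))      # rank configs within each month
meanMonthlyRank <- colMeans(monthlyRank)   # average monthly rank per config
rank(meanMonthlyRank)                      # final rank of the average ranks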
For instance, here’s the output from the returns comparison:
> retRank <- rankComparison(retsList)
[1] 0.736377
> head(retRank, 20)
    aggregateRank averageMonthlyRank configName sum
62            190                191         62 381
63            189                187         63 376
60            185                189         60 374
66            191                182         66 373
65            187                183         65 370
59            184                185         59 369
56            181                186         56 367
64            188                179         64 367
152           174                190        152 364
61            183                178         61 361
67            186                167         67 353
151           165                188        151 353
57            179                173         57 352
153           167                184        153 351
58            182                164         58 346
154           170                175        154 345
53            164                180         53 344
155           166                176        155 342
158           163                177        158 340
150           157                181        150 338
So, for this configuration, the correlation went down from above .8 to around .74, which is still strong and lends credence to the idea that the strategy configurations have validity outside of a few lucky months. The new feature I added is the data frame of the two ranks side by side, along with the configuration name (in this case, my names were simply the SMA parameter, but the names could be anything, such as, say, SMA_60_lag_2), and the sum of the two rankings, which orders the configurations. As there were 191 configurations (SMA lookbacks ranging from 10 to 200), the best score that could be achieved was 382. Furthermore, note that although there seems to be a strong region from SMA 53 to SMA 67, there also seems to be another region, at least when it comes to absolute return, at SMA parameters of 150 and above.
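As a quick check on those two regions (a hypothetical follow-up using the retRank table from above, with band boundaries eyeballed from the output), one can count how many of the top 20 combined-rank configurations fall into each band of SMA values:

# Count how many of the top 20 configurations by combined rank fall into
# each band of SMA lookbacks (bands chosen by eyeballing the table above).
top20 <- head(retRank, 20)
smaVals <- as.numeric(top20$configName)
table(cut(smaVals, breaks=c(9, 52, 67, 149, 200),
          labels=c("10-52", "53-67", "68-149", "150-200")))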
Here’s the same table for annualized Sharpe (this variation takes a bit longer to compute due to the monthly annualized Sharpes).
> sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.5590881
> head(sharpeRank, 20)
    aggregateRank averageMonthlyRank configName   sum
62            190              191.0         62 381.0
59            185              190.0         59 375.0
61            183              186.5         61 369.5
60            186              181.0         60 367.0
63            189              175.0         63 364.0
66            191              164.0         66 355.0
152           166              173.0        152 339.0
58            182              155.0         58 337.0
56            181              151.0         56 332.0
53            174              153.0         53 327.0
57            179              148.0         57 327.0
151           159              162.0        151 321.0
76            177              143.0         76 320.0
150           152              163.0        150 315.0
54            173              140.0         54 313.0
77            178              131.0         77 309.0
65            187              119.0         65 306.0
143           146              156.0        143 302.0
74            167              132.0         74 299.0
153           161              138.0        153 299.0
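Before commenting, here's a quick sanity check one might run with the retRank and sharpeRank tables from above: how much do the two top-20 lists overlap?

# Configurations appearing in the top 20 of both the return-based and
# Sharpe-based combined rankings.
intersect(head(retRank$configName, 20), head(sharpeRank$configName, 20))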
So, largely the same sort of results as we see with the annualized returns. A correlation of around .56, however, gives some cause for concern, which will hopefully show up in the line plot of the ranks of the four metrics (returns, Sharpe, drawdowns, and return over drawdown), which will reveal the regions with strong and not-so-strong performance.
Here’s the ranking line plot.
aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l",
     ylab="annualized returns rank", xlab="SMA",
     main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), type="l",
      ylab="annualized Sharpe rank", xlab="SMA", col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), type="l",
      ylab="Max return over max drawdown", xlab="SMA", col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), type="l",
      ylab="max DD", xlab="SMA", col="green")
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"),
       pch=0, col=c("black", "blue", "red", "green"))
And the resulting plot:
There are several regions that show similar, strong metrics for similar choices of the SMA parameter when we use a “delayed” entry. Namely, the regions around the 60-day, 125-day, and 150-day SMAs.
Let’s look at those configurations.
# columns 51, 116, and 141 correspond to SMA lookbacks of 60, 125, and 150
# (column 1 is SMA 10)
truncRets <- retsList[,c(51, 116, 141)]
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
                          t(SharpeRatio.annualized(truncRets)),
                          t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)
And the results:
> stats
    A.Return A.Sharpe Worst_Drawdown   MAR
60     1.103    2.490          0.330 3.342
125    0.988    2.220          0.368 2.683
150    0.983    2.189          0.404 2.435
And the resulting performance, on both a regular, and log scale:
charts.PerformanceSummary(truncRets)
logRets <- log(cumprod(1+truncRets))
chart.TimeSeries(logRets)
Perfect strategies? There's probably room for improvement. As good as, if not better than, the volatility strategies posted elsewhere on the internet? Probably. Is there more investigation to be done regarding the differences in signal delay? Yes.
So, in conclusion for this post, I’m hoping that the rank comparison heuristic and its new output gives people another tool to consider, along with another vol strategy to consider as well.
Thanks for reading.
NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.