Universal portfolio, part 8
[This article was first published on logopt: a journey in R, finance and open source, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We extend the analysis of part 7 by calculating the final wealth for all tuples of 3 and 4 stocks, this is a simple extension but it also shows the most important problem of the Universal portfolio algorithm, its exponential complexity in the number of stocks.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The first part shows the cumulative distribution of the absolute final wealth for all possible tuples of 2, 3 and 4 stocks. The results are qualitatively the same as before, a thin tail of good results. It also a shows increase in absolute wealth as the number of stocks increase.
The code remains quite compact, R is simply wonderful for this type of analysis.
library(logopt)
x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x)
nDays <- dim(x)[1]
nStocks <- dim(x)[2]
Days <- 1:nDays
iWin <- 1 ; plot(1:10)
FinalWealth <- function(cols) {
xij <- x[,cols]
uc <- universal.cover(xij, 20)
return(uc[length(uc)])
}
TupleSizes <- c(2,3,4)
if (exists(“lws”) == FALSE) {
lws <- list()
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
ws <- combn(1:nStocks, TupleSize, FinalWealth)
lines(ecdf(ws), pch=”.”, col=Colors[i])
lws[[i]] <- ws
}
}
# show the ecdf
if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(ecdf(lws[[3]]), pch=”.”, col=”red”, main=”Cumulative probability of final wealth”,
xlab=”Final wealth”, ylab=”Cumulative probability”)
lines(ecdf(lws[[2]]), pch=”.”, col=”green”)
lines(ecdf(lws[[1]]), pch=”.”, col=”blue”)
legend(“bottomright”, legend=c(“2 stocks”,”3 stocks”,”4 stocks”), fill=c(“blue”,”green”,”red”))
grid()
# show best absolute wealth and its composition
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
BestTuple <- which.max(lws[[i]])
BestStocks <- combn(1:nStocks, TupleSize)[,BestTuple]
cat(sprintf(“Max final wealth %.4f for stocks: “, lws[[i]][BestTuple]))
cat(colnames(x)[BestStocks]) ; cat(“\n”)
}
This gives the following output and graph
Max final wealth 78.4742 for stocks: comme kinar
Max final wealth 111.6039 for stocks: comme kinar meico
Max final wealth 138.5075 for stocks: comme espey kinar meico
The highlighted line of code is there for a reason, calculating the final wealth for all 4-tuples takes about one full day on my machine. So the code is only run once, then the magic of automatically saving the environment takes care of avoiding recalculating the result.
But why is the running time so long? The main problem of the Universal Portfolio algorithm is that it needs to evaluate an integral on the simplex, and the curse of dimensionality rears its ugly head. The general approach for a numerical integration in n dimensions is exponential in the number of dimensions. Furthermore we calculate the wealth for all possible n-tuples and this increases also exponentially in n when n is much smaller than the total number of stocks concerned.
As usual R code can provide a more vivid representation, the following code generates a plot showing the complexity evolution as the number of stocks increase for 36 stocks and a grid spacing of 0.05. The code also produces a table for small the small values of n. The code uses the ::: operator, it was new to me and allows to call package functions that are not part of the exposed interface.
library(logopt)
x <- coredata(nyse.cover.1962.1984)
nStocks <- dim(x)[2]
Sizes <- 1:nStocks
nCs <- 0 * Sizes
nSs <- 0 * Sizes
for (i in 1:length(Sizes)) {
n <- Sizes[i]
nCs[i] <- choose(nStocks, n)
nSs[i] <- logopt:::count.grid.points(n, 20)
}
plot(nCs * nSs, type=”l”, col=”blue”, log=”y”,ylim=range(nCs,nSs,nCs*nSs))
lines(nCs, col=”red”)
lines(nSs, col=”green”)
grid()
cat(sprintf(“n C(36,n) S(n,20) C(36,n)*S(n,20) Ratios to previous value\n”))
for(i in 2:6) {
n <- Sizes[i]
cat(sprintf(“%d %10d %10d %15.0f”,n,nCs[i],nSs[i], nCs[i] * nSs[i]))
if(i > 2) {
cat(sprintf(“%7.1f %7.1f %7.1f”, nCs[i]/nCs[i-1], nSs[i]/nSs[i-1],
(nCs[i] * nSs[i]) / (nCs[i-1] * nSs[i-1])))
}
cat(“\n”)
}
n C(36,n) S(n,20) C(36,n)*S(n,20) Ratios to previous value
2 630 21 13230
3 7140 231 1649340 11.3 11.0 124.7
4 58905 1771 104320755 8.2 7.7 63.2
5 376992 10626 4005916992 6.4 6.0 38.4
6 1947792 53130 103486188960 5.2 5.0 25.8
The exponential complexity makes the Universal Porfolio algorithm impractical for large portfolio, a serious limitation. Some authors have devised more efficient approaches, that either try to approximate Universal Portfolio at reduced complexity or are inspired by the ideas of Universal but with less complexity and general either no bounds or worse bounds.
To leave a comment for the author, please follow the link and comment on their blog: logopt: a journey in R, finance and open source.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.