Universal portfolio, part 11
[This article was first published on logopt: a journey in R, finance and open source, and kindly contributed to R-bloggers].
First, an apology: the links to the Universal Portfolio paper have stopped working. The personal webpage of Thomas Cover at Stanford has been taken down, but fortunately the content has moved elsewhere. The new link is Universal Portfolio, and hopefully this one will be stable.
Note that there are many copies available on the web, but most (like this one) seem to be a slightly reworked version dated October 23, 1996. The text appears mostly identical to the published version, but it does not include the figures.
In the rest of this post, I discuss the data used by Cover. That data is included in logopt as nyse.cover.1962.1984. It contains the relative prices for 36 NYSE stocks between 1962 and 1984.
> range(index(nyse.cover.1962.1984))
[1] "1962-07-03" "1984-12-31"
> colnames(nyse.cover.1962.1984)
[1] "ahp" "alcoa" "amerb" "arco" "coke" "comme" "dow" "dupont" "espey" "exxon" "fisch" "ford" "ge" "gm" "gte" "gulf" "hp"
[18] "ibm" "inger" "iroqu" "jnj" "kimbc" "kinar" "kodak" "luken" "meico" "merck" "mmm" "mobil" "morris" "pandg" "pills" "schlum" "sears"
[35] "sherw" "tex"
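Because the series stores price relatives rather than prices, it may help to see how the two relate. The sketch below uses made-up prices for a single hypothetical stock (not values from the data set) and assumes the usual definition, where the price relative for a day is that day's price divided by the previous day's price.

```r
# Hypothetical daily closing prices for one stock (illustrative values only)
prices <- c(20, 21, 19.95, 19.95, 21.945)

# Price relative for day t: x_t = p_t / p_(t-1)
relatives <- prices[-1] / prices[-length(prices)]

# The cumulative product of the relatives gives the wealth obtained
# from one unit invested at the first price
wealth <- cumprod(relatives)

print(relatives)
print(wealth)
```

The last wealth value equals the last price divided by the first one, so the relatives carry the same information as the prices up to a scale factor.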
The names are not tickers. Some guessing, plus help from another person using the series, gives the table below. If anybody knows the companies behind the abbreviations that are not expanded yet, please post a comment.
Abbreviation | Company name | Current ticker |
---|---|---|
ahp | ? | ? |
alcoa | Alcoa | AA |
amerb | American Brands aka Fortune Brands | – |
arco | ? | ? |
coke | Coca-Cola | KO |
comme | Commercial Metals | CMC |
dow | Dow Chemicals | DOW |
dupont | DuPont | DD |
espey | Espey Manufacturing | ESP |
exxon | Exxon Mobil | XOM |
fisch | Fischbach Corp | – |
ford | Ford | F |
ge | General Electric | GE |
gm | General Motors | GM* |
gte | GTE Corporation | – |
gulf | Gulf Oil (now Chevron) | CVX |
hp | Hewlett-Packard | HPQ |
ibm | IBM | IBM |
inger | Ingersoll-Rand | IR |
iroqu | Iroquois Brands | – |
jnj | Johnson & Johnson | JNJ |
kimbc | Kimberly-Clark | KMB |
kinar | Kinark? | – |
kodak | Eastman Kodak | EKDKQ |
luken | Lukens? | – |
meico | ? | ? |
merck | Merck | MRK |
mmm | 3M | MMM |
mobil | Exxon Mobil | XOM |
morris | Philip Morris | PM |
pandg | Procter & Gamble | PG |
pills | Pillsbury, now part of General Mills | – |
schlum | Schlumberger | SLB |
sears | Sears Holdings | SHLD |
sherw | Sherwin-Williams | SHW |
tex | Texaco, now Chevron | CVX |
There is a lot of diversity across the different stocks, which can be seen in two ways:
- by plotting the evolution of all stocks over time
- by plotting the growth between two dates separated by N market days (shown as the price relative between the two dates).
# Some statistics on the NYSE series
library(logopt)

x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x)
nDays <- dim(x)[1]
nStocks <- dim(x)[2]
Days <- 1:nDays
iWin <- 1 ; plot(1:10)
Time <- index(nyse.cover.1962.1984)

# for each stock calculate:
# - min, max
# - average geometric return
MaxFinal <- max(w[nDays,])
MinFinal <- min(w[nDays,])
MaxAll <- max(w)
MinAll <- min(w)

if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(Time, w[,1], col="gray", ylim=range(w), log="y", type="l")
for (i in 1:nStocks) {
  lines(Time, w[,i], col="gray")
  if (w[nDays,i] == MaxFinal) { cat(sprintf("Stock with best final value: %s finishing at %.2f\n", colnames(w)[i], MaxFinal)) ; iMax <- i }
  if (w[nDays,i] == MinFinal) { cat(sprintf("Stock with worst final value: %s finishing at %.2f\n", colnames(w)[i], MinFinal)) ; iMin <- i }
  if (max(w[,i]) == MaxAll) { cat(sprintf("Stock with best peak value: %s at %.2f\n", colnames(w)[i], MaxAll)) }
  if (min(w[,i]) == MinAll) { cat(sprintf("Stock with worst valley value: %s at %.2f\n", colnames(w)[i], MinAll)) }
}
lines(Time, w[,iMax], col="green")
lines(Time, w[,iMin], col="red")
lines(Time, apply(w,1,mean), col="blue")
grid()

# do a summary across n quotes
nDelta <- 1200
wD <- w[(nDelta+1):nDays,] / w[1:(nDays-nDelta),]
Time <- Time[1:(nDays-nDelta)]
MaxDAll <- max(wD)
MinDAll <- min(wD)

if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(Time, wD[,1], col="gray", ylim=range(wD), log ="y", type="l")
for (i in 1:nStocks) {
  lines(Time, wD[,i], col="gray")
  if (max(wD[,i]) == MaxDAll) { cat(sprintf("Stock with best gain on %s days: %s at %.2f\n", nDelta, colnames(w)[i], MaxDAll)) }
  if (min(wD[,i]) == MinDAll) { cat(sprintf("Stock with worst lost on %s days: %s at %.2f\n", nDelta, colnames(w)[i], MinDAll)) }
}
lines(Time, apply(wD,1,mean), col="blue")
grid()
This produces the following text output and graphs. Note that there are many alternative ways to present this information, in particular with the PerformanceAnalytics package.
Stock with worst final value: dupont finishing at 3.07
Stock with worst valley value: meico at 0.26
Stock with best final value: morris finishing at 54.14
Stock with best peak value: schlum at 90.12
Stock with best gain on 1200 days: espey at 15.84
Stock with worst lost on 1200 days: meico at 0.07
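For readers without the data at hand, the N-day summary can be sketched on synthetic data in base R. Everything here is an assumption made for illustration: the stock names, the random price relatives, and the window length are invented. The wealth is built with a plain cumulative product, which I assume is close to what `logopt:::x2w` computes. The extremes are located with `which(..., arr.ind = TRUE)` instead of a loop.

```r
set.seed(42)  # synthetic data only, not the NYSE series

nDays <- 500
nStocks <- 4
nDelta <- 100  # window length in market days

# Synthetic daily price relatives, one column per hypothetical stock
x <- matrix(exp(rnorm(nDays * nStocks, mean = 0.0003, sd = 0.01)),
            nrow = nDays, ncol = nStocks,
            dimnames = list(NULL, c("aaa", "bbb", "ccc", "ddd")))

# Wealth from one unit invested in each stock
w <- apply(x, 2, cumprod)

# Price relative between day t and day t + nDelta, for every start day t
wD <- w[(nDelta + 1):nDays, ] / w[1:(nDays - nDelta), ]

# Locate the extremes without an explicit loop
best <- which(wD == max(wD), arr.ind = TRUE)[1, ]
worst <- which(wD == min(wD), arr.ind = TRUE)[1, ]

cat(sprintf("Best gain on %d days: %s at %.2f\n",
            nDelta, colnames(wD)[best["col"]], max(wD)))
cat(sprintf("Worst loss on %d days: %s at %.2f\n",
            nDelta, colnames(wD)[worst["col"]], min(wD)))
```

Each row of `wD` is the growth factor over a window of `nDelta` days starting at that row, so the row index in `best` or `worst` also tells when the extreme window started.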
This sequence forms a nice reference covering a long period of time, and has been used in many studies of portfolio selection algorithms. But the series has a number of serious problems:
- Survivorship bias
- The series covers a period when quotes were not yet decimal.
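To give a feeling for why non-decimal quotes matter, here is a small worked example. It assumes a minimum price increment of 1/8 of a dollar and a few hypothetical price levels. Both are my assumptions for illustration; the data set itself only contains price relatives, not prices or tick sizes.

```r
# Assumed minimum price increment of 1/8 dollar (illustrative assumption)
tick <- 1 / 8

# Hypothetical price levels, from a low-priced to a high-priced stock
price <- c(1, 2, 5, 20, 100)

# Smallest upward price relative that can be observed at each price level
min_up_relative <- (price + tick) / price

print(data.frame(price = price, min_up_relative = min_up_relative))
```

Under these assumptions, a stock trading near one dollar cannot move up by less than 12.5% in one step, so the price relatives of low-priced stocks are coarse and can look much more volatile than those of high-priced stocks.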