
Universal portfolio, part 11

[This article was first published on logopt: a journey in R, finance and open source, and kindly contributed to R-bloggers.]
First, an apology: the links to the Universal Portfolio paper have stopped working.  The personal webpage of Thomas Cover at Stanford has been taken down, but fortunately the content has moved elsewhere.  The new link is Universal Portfolio, and hopefully this one will be stable.

Note that there are many copies available on the web, but most (like this one) appear to be a slightly reworked version dated October 23, 1996.  The text appears mostly identical to the published version, but it does not include the figures.

In the rest of this post, I discuss the data used by Cover.  That data is included in logopt as nyse.cover.1962.1984.  It contains the relative prices for 36 NYSE stocks between 1962 and 1984.
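As a small illustration of what relative prices represent, here is a sketch on made-up numbers (not taken from nyse.cover.1962.1984). It assumes that a relative price is the ratio of one day's price to the previous day's price, so that the running product of a column gives the wealth obtained from one unit invested in that stock. The matrix and the stock names a and b are hypothetical.

```r
# Toy price relatives for two hypothetical stocks over four days
# (made-up numbers, not taken from nyse.cover.1962.1984).
x <- cbind(a = c(1.10, 0.90, 1.05, 1.00),
           b = c(1.00, 1.02, 0.98, 1.03))

# Cumulative wealth of one unit invested in each stock:
# the running product of the price relatives, column by column.
w <- apply(x, 2, cumprod)

print(w)
```

The last row of w is the final wealth per stock, which is the quantity compared across stocks later in this post.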


> range(index(nyse.cover.1962.1984))
[1] "1962-07-03" "1984-12-31"
> colnames(nyse.cover.1962.1984)
 [1] "ahp"    "alcoa"  "amerb"  "arco"   "coke"   "comme"  "dow"    "dupont" "espey"  "exxon"  "fisch"  "ford"   "ge"     "gm"     "gte"    "gulf"   "hp"    
[18] "ibm"    "inger"  "iroqu"  "jnj"    "kimbc"  "kinar"  "kodak"  "luken"  "meico"  "merck"  "mmm"    "mobil"  "morris" "pandg"  "pills"  "schlum" "sears" 
[35] "sherw"  "tex" 

The names are not tickers.  Some guessing, with help from another person using the series, gives the table below (if anybody knows the ones without an expansion yet, please post a comment).

Abbreviation  Company name                            Current ticker
ahp           ?                                       ?
alcoa         Alcoa                                   AA
amerb         American Brands, aka Fortune Brands
arco          ?                                       ?
coke          Coca-Cola                               KO
comme         Commercial Metals                       CMC
dow           Dow Chemicals                           DOW
dupont        DuPont                                  DD
espey         Espey Manufacturing                     ESP
exxon         Exxon Mobil                             XOM
fisch         Fischbach Corp
ford          Ford                                    F
ge            General Electric                        GE
gm            General Motors                          GM*
gte           GTE Corporation
gulf          Gulf Oil (now Chevron)                  CVX
hp            Hewlett-Packard                         HPQ
ibm           IBM                                     IBM
inger         Ingersoll-Rand                          IR
iroqu         Iroquois Brands
jnj           Johnson & Johnson                       JNJ
kimbc         Kimberly-Clark                          KMB
kinar         Kinark                                  ?
kodak         Eastman Kodak                           EKDKQ
luken         Lukens                                  ?
meico         ?                                       ?
merck         Merck                                   MRK
mmm           3M                                      MMM
mobil         Exxon Mobil                             XOM
morris        Philip Morris                           PM
pandg         Procter & Gamble                        PG
pills         Pillsbury, now part of General Mills
schlum        Schlumberger                            SLB
sears         Sears Holdings                          SHLD
sherw         Sherwin-Williams                        SHW
tex           Texaco, now Chevron                     CVX

There is a lot of diversity across the different stocks.  We can see that in two ways:

  • by showing the time evolution of all stocks over the whole period
  • by showing the growth over N market days (shown as a price relative between two dates N market days apart).
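The second view can be sketched on a toy series. The example below uses made-up wealth values for a single hypothetical stock a and divides the wealth N days later by the wealth today. The names w, nDays and nDelta are chosen for this illustration only, and N is set to 2 to keep the arithmetic easy to check.

```r
# Toy cumulative wealth for one hypothetical stock (made-up numbers):
# the stock gains 10% every day.
w <- matrix(c(1.0, 1.1, 1.21, 1.331, 1.4641), ncol = 1,
            dimnames = list(NULL, "a"))
nDays  <- nrow(w)
nDelta <- 2   # horizon N, in market days

# Price relative over nDelta days: wealth nDelta days later
# divided by wealth today, for every possible starting day.
wD <- w[(nDelta + 1):nDays, , drop = FALSE] /
      w[1:(nDays - nDelta), , drop = FALSE]

print(wD)
```

There are nDays - nDelta starting days, so the result has nDelta fewer rows than the wealth series. Because the toy stock grows at a constant rate, every row here is the same two-day growth factor.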

# Some statistics on the NYSE series

library(logopt)
x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x)
nDays <- dim(x)[1]
nStocks <- dim(x)[2] 
Days <- 1:nDays
iWin <- 1 ; plot(1:10)
Time <- index(nyse.cover.1962.1984)

# for each stock calculate:
# - min, max
# - average geometric return

MaxFinal <- max(w[nDays,])
MinFinal <- min(w[nDays,])
MaxAll <- max(w)
MinAll <- min(w)

if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(Time, w[,1], col="gray", ylim=range(w), log="y", type="l")
for (i in 1:nStocks) {
 lines(Time, w[,i], col="gray")
 if (w[nDays,i] == MaxFinal) { cat(sprintf("Stock with best final value: %s finishing at %.2f\n", colnames(w)[i], MaxFinal)) ; iMax <- i }
 if (w[nDays,i] == MinFinal) { cat(sprintf("Stock with worst final value: %s finishing at %.2f\n", colnames(w)[i], MinFinal)) ; iMin <- i }
 if (max(w[,i]) == MaxAll) { cat(sprintf("Stock with best peak value: %s at %.2f\n", colnames(w)[i], MaxAll)) }
 if (min(w[,i]) == MinAll) { cat(sprintf("Stock with worst valley value: %s at %.2f\n", colnames(w)[i], MinAll)) } 
}

lines(Time, w[,iMax], col="green")
lines(Time, w[,iMin], col="red")
lines(Time, apply(w,1,mean), col="blue")
grid()

# do a summary across n quotes
nDelta <- 1200
wD <- w[(nDelta+1):nDays,] / w[1:(nDays-nDelta),]
Time <- Time[1:(nDays-nDelta)]
MaxDAll <- max(wD)
MinDAll <- min(wD)
if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(Time, wD[,1], col="gray", ylim=range(wD), log ="y", type="l")
for (i in 1:nStocks) {
 lines(Time, wD[,i], col="gray")
 if (max(wD[,i]) == MaxDAll) { cat(sprintf("Stock with best gain on %s days: %s at %.2f\n", nDelta, colnames(w)[i], MaxDAll)) }
 if (min(wD[,i]) == MinDAll) { cat(sprintf("Stock with worst lost on %s days: %s at %.2f\n", nDelta, colnames(w)[i], MinDAll)) } 
} 
lines(Time, apply(wD,1,mean), col="blue")

grid()

This gives the following textual answer and graphs.  Note that there are many alternative ways to present this information, in particular with the PerformanceAnalytics package.


Stock with worst final value: dupont finishing at 3.07
Stock with worst valley value: meico at 0.26
Stock with best final value: morris finishing at 54.14
Stock with best peak value: schlum at 90.12
Stock with best gain on 1200 days: espey at 15.84
Stock with worst lost on 1200 days: meico at 0.07
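The same kind of summary can also be obtained without a loop. The sketch below is an alternative formulation rather than the code used to produce the output above. It runs on a small made-up wealth matrix with hypothetical stocks a, b and c, and relies only on base R (which.max, which.min and apply).

```r
# Toy cumulative wealth, one column per hypothetical stock (made-up numbers).
w <- cbind(a = c(1.1, 0.9, 2.0),
           b = c(0.8, 0.5, 0.7),
           c = c(1.2, 3.0, 1.5))

# Wealth on the last day, as a named vector.
final <- w[nrow(w), ]

best_final   <- names(which.max(final))               # highest final value
worst_final  <- names(which.min(final))               # lowest final value
best_peak    <- names(which.max(apply(w, 2, max)))    # highest value ever reached
worst_valley <- names(which.min(apply(w, 2, min)))    # lowest value ever reached

cat(sprintf("Stock with best final value: %s finishing at %.2f\n",
            best_final, final[best_final]))
cat(sprintf("Stock with worst final value: %s finishing at %.2f\n",
            worst_final, final[worst_final]))
```

Note that which.max and which.min return only the first match, so ties are reported once, whereas the loop version prints a line for every stock that reaches the extreme.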



This sequence forms a nice reference covering a long period of time, and has been used in many studies of portfolio selection algorithms.  But the series has a number of serious problems:
  • Survivorship bias
  • The time range corresponds to a period when quotes were not yet decimal.
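To give a feel for why non-decimal quotes can matter, here is a rough sketch. It assumes a tick size of 1/8 of a dollar, which is an assumption made for this illustration and not something recorded in the data set, and it uses hypothetical share prices. It shows the smallest upward price relative that a one-tick move can produce at each price level.

```r
tick   <- 1 / 8            # assumed tick size in dollars (illustration only)
prices <- c(2, 20, 100)    # hypothetical share prices in dollars

# Price relative produced by a move of exactly one tick upward.
one_tick_up <- (prices + tick) / prices

print(round(one_tick_up, 5))
```

Under this assumption, the lower the share price, the coarser the possible daily price relatives, so low-priced stocks in the series would show larger discrete jumps.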

