48 Industries (Dendrogram Ordered) Over 50 Years
[This article was first published on Timely Portfolio, and kindly contributed to R-bloggers].
Thanks to reader AHWest for this comment on the post 48 Industries Since 1963:
“I think it would be interesting to see the industries ordered by some sort of similarity of returns.”
I think this is a great suggestion, and I wanted to see it too. I first tried the dendrogram plot technique from the post Inspirational Stack Overflow Dendrogram Applied to Currencies, but then I spotted dendrogramGrob in the latticeExtra documentation. This was much easier: in a couple of lines, we can order the 48 industries by the similarity of their returns and connect them with a dendrogram.
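To make "ordered by similarity of returns" concrete, here is a minimal base R sketch on made-up data. The six simulated series, the correlation distance, and the complete linkage are my assumptions for illustration only; the distance that assetsDendrogramPlot uses is not shown in this post and may differ.

```r
# Simulated daily returns for six hypothetical series (made-up data, not the
# French industry file): A1-A3 share one common factor, B1-B3 another.
set.seed(42)
n <- 500
factor_a <- rnorm(n, sd = 0.01)
factor_b <- rnorm(n, sd = 0.01)
noise <- function() rnorm(n, sd = 0.003)

# Columns are deliberately interleaved so the original order hides the groups
returns <- cbind(A1 = factor_a + noise(), B1 = factor_b + noise(),
                 A2 = factor_a + noise(), B2 = factor_b + noise(),
                 A3 = factor_a + noise(), B3 = factor_b + noise())

# Correlation distance: highly correlated series are "close" (an assumption;
# other distances are equally valid choices)
d <- as.dist(1 - cor(returns))

# Hierarchical clustering, then the left-to-right leaf order of the dendrogram
hc <- hclust(d, method = "complete")
dd <- as.dendrogram(hc)
ord <- order.dendrogram(dd)   # leaf order, expressed as column indices

ordered_names <- colnames(returns)[ord]
print(ordered_names)
```

Indexing the columns with `ord` places series from the same group next to each other, which is the same role `row.ord` plays when it reorders the columns of the rate of change data before plotting.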
library(fAssets)              #assetsDendrogramPlot
library(latticeExtra)         #panel.horizonplot, dendrogramGrob, theEconomist.theme
library(quantmod)             #loads xts and TTR (ROC)
library(PerformanceAnalytics)
#my.url is the location of the zip file with the data
my.url <- "http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/48_Industry_Portfolios_daily.zip"
#my.tempfile is the temp file that will hold the downloaded zip
#file.path keeps the path portable instead of hardcoding a Windows separator
my.tempfile <- file.path(tempdir(), "frenchindustry.zip")
#my.usefile is the name of the txt file with the data
my.usefile <- file.path(tempdir(), "48_Industry_Portfolios_daily.txt")
download.file(my.url, my.tempfile, method = "auto",
              quiet = FALSE, mode = "wb", cacheOK = TRUE)
#junkpaths = TRUE drops any directory structure stored inside the zip
unzip(my.tempfile, exdir = tempdir(), junkpaths = TRUE)
#read space delimited text file extracted from zip
#skip and nrows are hardcoded for the file as downloaded for this post;
#adjust them if the file has changed
french_industry <- read.table(file = my.usefile,
                              header = TRUE, sep = "",
                              as.is = TRUE,
                              skip = 9, nrows = 12211)
#get dates ready for xts index (row names are in yyyymmdd form)
datestoformat <- rownames(french_industry)
datestoformat <- paste(substr(datestoformat, 1, 4),
                       substr(datestoformat, 5, 6),
                       substr(datestoformat, 7, 8), sep = "-")
#get xts for analysis
french_industry_xts <- as.xts(french_industry,
                              order.by = as.Date(datestoformat))
#divide by 100 to convert percent returns to decimal returns
french_industry_xts <- french_industry_xts / 100
#missing data is denoted by -0.9999 after dividing by 100
#set only the flagged cells to 0; indexing by a vector of rows and a vector of
#columns would also zero valid data wherever those rows and columns intersect
french_industry_xts[coredata(french_industry_xts) < -0.99] <- 0
#get price series or cumulative growth of 1
french_industry_price <- cumprod(french_industry_xts + 1)
#get 250 day rate of change or feel free to change to something other than 250
roc <- french_industry_price
#split into groups of 12 columns so we do not run out of memory
for (i in seq(12, 48, by = 12)) {
  roc[, (i - 11):i] <- ROC(french_industry_price[, (i - 11):i], n = 250, type = "discrete")
}
#the first 250 rows have no 250 day history, so set them to 0
roc[1:250, ] <- 0
# tried to do http://stackoverflow.com/questions/9747426/how-can-i-produce-plots-like-this
# but it was much easier to use latticeExtra
# get dendrogram data from hclust
# backward and repetitive but it works
# (named dendro rather than t so it does not mask base::t)
dendro <- assetsDendrogramPlot(as.timeSeries(french_industry_xts))
# thanks to the latticeExtra example
dd.row <- as.dendrogram(dendro$hclust)
row.ord <- order.dendrogram(dd.row)
xyplot(roc[, row.ord],
       layout = c(1, 48), ylim = c(0, 0.25),
       scales = list(tck = c(1, 0), y = list(draw = FALSE, relation = "same")),
       horizonscale = 0.25,
       origin = 0,
       colorkey = TRUE,
       #since so many industries, we will comment out grid
       panel = function(x, y, ...) {
         panel.horizonplot(x, y, ...) #feel free to change to whatever you would like
         # panel.grid(h = 3, v = 0, col = "white", lwd = 1, lty = 3)
       },
       ylab = list(rev(colnames(roc[, row.ord])), rot = 0, cex = 0.7, pos = 3),
       xlab = NULL,
       par.settings = theEconomist.theme(box = "gray70"),
       #use ylab above for labelling so we can specify FALSE for strip and strip.left
       strip = FALSE,
       strip.left = FALSE,
       main = "French Daily 48 Industry (Dendrogram Ordered) 1963-2011\n source: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french",
       legend =
         list(right =
                list(fun = dendrogramGrob,
                     args =
                       list(x = dd.row, ord = row.ord,
                            side = "right",
                            size = 10))))
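For anyone who wants to check the data preparation without downloading the file, the steps from percent returns to the rate of change series can be traced on a tiny made-up example. This is only a sketch in base R: the two series, their values, and the 2-period window are hypothetical stand-ins for the 48 industries and the 250 day window, and the loop-free rate of change below is a hand-written equivalent of the discrete calculation, price[t] / price[t - n] - 1.

```r
# Toy percent returns for two hypothetical series; -99.99 marks missing data
pct <- matrix(c( 1.0, -99.99,
                 2.0,   1.0,
                -1.0,   3.0,
                 0.5,  -2.0),
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("X", "Y")))

ret <- pct / 100          # percent -> decimal returns
ret[ret < -0.99] <- 0     # logical mask: only the flagged cell is zeroed

# Cumulative growth of 1, column by column
price <- apply(ret + 1, 2, cumprod)

# n-period discrete rate of change: price[t] / price[t - n] - 1
n <- 2
roc <- matrix(0, nrow(price), ncol(price), dimnames = dimnames(price))
idx <- (n + 1):nrow(price)
roc[idx, ] <- price[idx, ] / price[idx - n, ] - 1
# Rows 1..n have no n-period history and stay at 0, as in the script

print(round(roc, 4))
```

Note that the mask touches only the missing cell in Y and leaves the first return of X alone, so X still compounds from its real first observation.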