The Financial Crisis on Tape Part I

[This article was first published on Joe's Data Diner, and kindly contributed to R-bloggers.]

Hello and welcome to Joe’s Data Diner’s first ever post!

Today, I will touch on both R and finance, but I'll try to make it accessible for those with an interest in either, and not just quants like myself!

Almost everyone is now aware that asset correlation increases in times of stress. This topic has made its way into the most esteemed journals and even the popular press. It was a defining feature of the '07-'08 financial crisis and, for those working in the credit markets, it remains a topic of paramount importance.

My idea of graphing the time evolution of the correlation structure between assets is not revolutionary. However, my aim here is to show how easy it is to reproduce such graphics using only the free R software.

In Part I, I will produce a static graph showing quarterly measurements of the correlation structure and, in Part II, I will create a video showing the daily evolution in the correlation.

As some of you may be more interested in the results than the R code, let’s look at them straight away!

[Figure: quarterly correlation matrices for the eight assets - the final graph produced below]
The assets shown are as follows:
SPY: S&P 500
QQQ: Nasdaq 100
EEM: Emerging Markets
IWM: Russell 2000
EFA: EAFE (Europe, Australasia and Far East)
TLT: 20 Year Treasury
IYR: U.S. Real Estate
GLD: Gold

Even from the small thumbnail, the increase in correlation during the crisis, and the decrease by 2013, are immediately evident. I'm certain a detailed study could yield many more observations - please feel free to add comments if you observe anything!

Now, let's jump into the code! I'm new to blogging in general and R-blogging in particular, so I've decided that well-commented code is probably the clearest way to explain the majority of the process, returning to prose to discuss some of the more interesting coding decisions.


First, let's get the data. Thanks to Systematic Investor for teaching me (via his blog) how to use quantmod:
#install the superb quantmod library
#we will use it to download the data and compute returns
library(quantmod)
# load historical prices from Yahoo Finance
# I use a set of ETFs that I first saw used by Systematic Investor
symbols = c('SPY','QQQ','EEM','IWM','EFA','TLT','IYR','GLD')
symbols.names = c('S&P 500', 'Nasdaq 100', 'Emerging Markets', 'Russell 2000',
                  'EAFE', '20 Year Treasury', 'U.S. Real Estate', 'Gold')
#download the data from Yahoo (the default choice of getSymbols)
#getSymbols loads each series directly into the environment, which, depending
#on your R background, may seem surprising: it doesn't return a result.
#We use the get function below to deal with this programmatically.
getSymbols(symbols, src = 'yahoo', from = '2005-01-01')
#inspect the data
head(SPY)
class(SPY)
str(SPY)
range(index(SPY))
#obtain the daily adjusted closing price for each and bind into one object
hist.returns =
  do.call(cbind,
          lapply(symbols, function(symbol){
            symbol.data = get(symbol) #get from the environment
            symbol.data.adj = Ad(symbol.data) #extract the adjusted price
            #compute continuously compounded (log) returns
            symbol.data.adjret = ROC(symbol.data.adj, n = 1,
                                     type = "continuous", na.pad = TRUE)
            symbol.data.adjret
          })
  )
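As a quick sanity check (a minimal base-R sketch of my own, not from the original post), note that ROC with type = "continuous" simply computes the difference of log prices, so you can verify the returns calculation on a toy price vector without downloading anything:

```r
# toy price series: ROC(x, n = 1, type = "continuous") is equivalent to
# diff(log(x)), the continuously compounded (log) return
prices = c(100, 102, 101, 105)
log.returns = diff(log(prices))
round(log.returns, 4)
# the first return is log(102/100), roughly 0.0198
```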

Next, let's compute some correlations! Later we'll use a rolling window to produce daily updates for the video but, for now, let's look at the correlation matrix for each quarter:
#now use Hadley's great plyr library and a simple function to compute the
#correlation matrix in long (melted) format
library(plyr)
DataFrameCorOutput <- function(hist.returns){
  require(reshape2)
  correls = melt(cor(as.matrix(na.omit(hist.returns))))
  colnames(correls) = c("Var1", "Var2", "Correl")
  correls
}
#convert the xts object to a data frame and add year and quarter
#columns so that plyr can split on them
hist.returns.df = data.frame(date = index(hist.returns),
                             coredata(hist.returns))
hist.returns.df$year = format(hist.returns.df$date, "%Y")
hist.returns.df$quarter = quarters(hist.returns.df$date)
rolling.correlmatrix =
  ddply(hist.returns.df,
        .(year, quarter),
        .fun = function(x){
          DataFrameCorOutput(
            x[, grep("Adjusted", colnames(x))]
          )}
  )
#rename the series to make obtaining a pretty plot a little easier!
rolling.correlmatrix$Var1 = factor(gsub(".Adjusted", "", rolling.correlmatrix$Var1), symbols)
rolling.correlmatrix$Var2 = factor(gsub(".Adjusted", "", rolling.correlmatrix$Var2), symbols)
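To see the shape of output that DataFrameCorOutput produces without downloading any data, here is a small self-contained sketch on simulated returns (the asset names are purely illustrative; base R's as.data.frame(as.table(...)) yields the same long format as reshape2::melt does for a matrix):

```r
# simulated two-asset daily returns
set.seed(1)
toy.returns = cbind(SPY = rnorm(100), TLT = rnorm(100))
# long-format correlation matrix, mirroring melt(cor(...))
correls = as.data.frame(as.table(cor(toy.returns)))
colnames(correls) = c("Var1", "Var2", "Correl")
correls
# four rows: one per (Var1, Var2) pair, with Correl = 1 on the diagonal
```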
And now we're ready to use another of Hadley's packages, the famous ggplot2, to create the first iteration of the graph:
library(ggplot2)
library(scales)
library(RColorBrewer)
#this is a slightly bespoke scale - basically I wanted a blue to red scale
#that gave extra variation over the interesting 0.6-1.0 range.
mycolvec_1 = c(rev(brewer.pal(8, "Blues")), rep("white", 3), brewer.pal(8, "YlOrRd"))
ggplot(rolling.correlmatrix, aes(x = Var1, y = Var2, fill = Correl)) +
  geom_tile() +
  scale_fill_gradientn(colours = mycolvec_1, limits = c(-1, 1)) +
  facet_grid(quarter ~ year) + theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_text(angle = 90))
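A note on the factor re-ordering used below (a minimal base-R sketch of my own): factor(x, levels = ...) changes only the display order while preserving each value's identity, whereas assigning to levels() would silently relabel the underlying data - hence the "careful not to just rename levels" warnings in the comments:

```r
x = factor(c("TLT", "SPY", "GLD"))
# re-order the levels: values keep their identity, only the ordering changes
x.reordered = factor(x, levels = c("SPY", "GLD", "TLT"))
as.character(x.reordered)  # still "TLT" "SPY" "GLD"
# by contrast, overwriting levels() RENAMES the underlying values
x.renamed = x
levels(x.renamed) = c("SPY", "GLD", "TLT")
as.character(x.renamed)  # data silently corrupted
```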
I make two small changes to improve the visual impact. First, the series are re-ordered to put the ones which tend to be negatively correlated to one side, making for a less cluttered-looking graph. Secondly, I feel that the important range of correlations from +0.5 to +1.0 is somewhat difficult to analyse, as the colours are very similar. Inspired by a topographical map of Lanzarote (where I've just been cycling), I thought I'd change the very top of the correlation range to purple, increasing the definition in this region:
#Personally, I find the TLT anti-correlation rather obscures the rest of
#the picture, so put it to the right
PrefOrder = c("SPY","QQQ","EEM","IWM","EFA","IYR","GLD","TLT")
rolling.correlmatrix$Var1 = factor(rolling.correlmatrix$Var1, PrefOrder) #careful not to just rename levels here!
rolling.correlmatrix$Var2 = factor(rolling.correlmatrix$Var2, PrefOrder) #careful not to just rename levels here!
#introduce purple at the top of the range to better highlight correlation changes
mycolvec_2 = c(
  rev(brewer.pal(8, "Blues")),
  colorRampPalette(c(
    colorRampPalette(c("white",
                       brewer.pal(9, "Set3")[2],
                       brewer.pal(9, "Set1")[c(6, 5, 1)]))(10),
    brewer.pal(9, "PRGn")[1]
  ))(9)
)
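The idea behind mycolvec_2 can be seen in a simpler, self-contained form using only base grDevices (the colour choices here are illustrative stand-ins, not the brewer palettes used above): build two ramps separately and stitch them together, so the positive half of the scale gets its own, more finely resolved gradient ending in purple:

```r
# negative correlations: a dark-blue-to-white ramp covering -1..0
low.half = colorRampPalette(c("darkblue", "white"))(8)
# positive correlations: white through red to purple covering 0..+1,
# so the top of the range stands out
high.half = colorRampPalette(c("white", "red", "purple"))(9)
mycolvec.sketch = c(low.half, high.half)
length(mycolvec.sketch)  # 17 colours, the same length as mycolvec_2
```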
#The final graph shown at the start of the blog:
ggplot(rolling.correlmatrix, aes(x = Var1, y = Var2, fill = Correl)) +
  geom_tile() +
  scale_fill_gradientn(colours = mycolvec_2, limits = c(-1, 1)) +
  facet_grid(quarter ~ year) + theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_text(angle = 90))
And, finally, we have the graph shown at the top! However, I don't like how we've sacrificed the pleasant visuals of the mainly red palette to obtain this greater definition. If anyone has any suggestions on how to improve this, please let me know!

I hope you enjoyed the opening night at Joe's Data Diner; tips (non-monetary) are always welcome, and I hope you'll return to see the film in "The Financial Crisis on Tape Part II" in the next couple of weeks!
