Here is a spot of code to create a series of small multiples for comparing return distributions. You may have spotted this in a presentation I posted about earlier, but I’ve been using it here and there and am finally satisfied that it is a generally useful view, so I functionalized it.
require(PerformanceAnalytics)
data(edhec)
page.Distributions(edhec[, c("Convertible Arbitrage", "Equity Market Neutral",
                             "Fixed Income Arbitrage", "Event Driven",
                             "CTA Global", "Global Macro", "Long/Short Equity")])
When visually comparing distributions, there are a few characteristics to get right across the graphs. For example, each histogram’s bin sizes should match and the min and the max of each chart should line up.
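To make the alignment concrete, here is a minimal base-R sketch (not the function itself) that derives one x-range and one set of 1%-wide bins from all series and reuses them for every histogram. The simulated data and the names `returns`, `common.xlim`, and `common.breaks` are assumptions for illustration only.

```r
# Simulated monthly returns for two hypothetical series (illustration only)
set.seed(42)
returns <- cbind(A = rnorm(120, mean = 0.005, sd = 0.02),
                 B = rnorm(120, mean = 0.002, sd = 0.05))

# One x-range and one set of 1%-wide bins, computed across ALL columns
common.xlim   <- range(returns, na.rm = TRUE)
common.breaks <- seq(round(common.xlim[1], digits = 2) - 0.01,
                     round(common.xlim[2], digits = 2) + 0.01,
                     by = 0.01)

# Draw one histogram per series, stacked vertically, on identical axes
op <- par(mfrow = c(ncol(returns), 1), mar = c(2, 4, 2, 1))
h <- lapply(colnames(returns), function(nm) {
  res <- hist(returns[, nm], breaks = common.breaks, xlim = common.xlim,
              main = nm, xlab = "")
  abline(v = 0, col = "darkgray", lty = 2)  # mark zero return
  res
})
par(op)
```

Because every panel shares the same breaks and limits, bar widths and horizontal positions can be compared directly from one row to the next.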
I prefer all three views (histogram, QQ plot, and ECDF) together. The histogram is a more typical view of the distribution, improved when overplotted with a normal distribution and with the zero bin marked, both being important references. The QQ plot is more important, again improved with confidence bands for a normal distribution.
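For readers curious about how a confidence band on a normal QQ plot can be built, the following is a rough base-R sketch of one common pointwise construction. It is not necessarily what chart.QQPlot computes internally; the helper name `qq.envelope` and the simulated input are hypothetical.

```r
# Hypothetical helper: normal QQ plot with a pointwise confidence envelope
qq.envelope <- function(x, conf = 0.95, ...) {
  x <- sort(as.numeric(x[!is.na(x)]))
  n <- length(x)
  p <- ppoints(n)
  z <- qnorm(p)                              # theoretical normal quantiles

  # Robust reference line through the first and third quartiles
  qx <- quantile(x, c(0.25, 0.75), names = FALSE)
  qz <- qnorm(c(0.25, 0.75))
  slope     <- diff(qx) / diff(qz)
  intercept <- qx[1] - slope * qz[1]
  fit       <- intercept + slope * z

  # Approximate standard error of each order statistic under normality
  se   <- (slope / dnorm(z)) * sqrt(p * (1 - p) / n)
  crit <- qnorm(1 - (1 - conf) / 2)

  plot(z, x, pch = 20, xlab = "Normal quantiles", ylab = "Sample quantiles", ...)
  abline(intercept, slope, col = "#005AFF")
  lines(z, fit + crit * se, lty = 2, col = "#005AFF")
  lines(z, fit - crit * se, lty = 2, col = "#005AFF")

  invisible(data.frame(z = z, x = x,
                       lower = fit - crit * se,
                       upper = fit + crit * se))
}

# Simulated returns, for illustration only
set.seed(1)
env <- qq.envelope(rnorm(200, sd = 0.02))
```

Points that fall outside the dashed lines, typically in the tails, are where the sample departs most visibly from the normal reference.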
This is just a first cut. There’s no reason that the normal distribution has to be the reference for these charts, but I’ll have to do some more parameterization. There is also a balance between the number of rows in the device and readability. Maybe I’ll insert some “pagination” like charts.BarVaR uses… What else?
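One possible shape for the pagination idea, sketched under assumptions: split the column indices into groups of a fixed size and draw each group on its own page. The helper `paginate.columns` and its `rows.per.page` argument are hypothetical and not part of PerformanceAnalytics.

```r
# Hypothetical helper: split column indices into pages of at most
# `rows.per.page` columns each
paginate.columns <- function(n.cols, rows.per.page = 4) {
  idx <- seq_len(n.cols)
  unname(split(idx, ceiling(idx / rows.per.page)))
}

# Seven columns at three rows per page gives three pages
pages <- paginate.columns(7, rows.per.page = 3)

# Each page could then be drawn separately, assuming page.Distributions
# and a returns object R are available:
# for (p in pages) page.Distributions(R[, p, drop = FALSE])
```

One caveat with this approach: the function as written computes its axis limits from the columns it is given, so limits would need to be computed once across all columns and passed in for the pages to stay comparable with each other.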
This is checked into PApages on r-forge right now, in /sandbox as page.Distributions.R. I’m contemplating including it in PerformanceAnalytics, but I’m interested in your feedback before I do. Here’s the code:
# Histogram, QQPlot and ECDF plots aligned by scale for comparison
page.Distributions <- function (R, ...) {
  require(PerformanceAnalytics)
  op <- par(no.readonly = TRUE)
  # c(bottom, left, top, right)
  par(oma = c(5,0,2,1), mar=c(0,0,0,3))
  layout(matrix(1:(4*NCOL(R)), ncol=4, byrow=TRUE), widths=rep(c(.6,1,1,1),NCOL(R)))
  # layout.show(n=21)
  chart.mins=min(R, na.rm=TRUE)
  chart.maxs=max(R, na.rm=TRUE)
  row.names = sapply(colnames(R), function(x) paste(strwrap(x,10), collapse = "\n"), USE.NAMES=FALSE)
  for(i in 1:NCOL(R)){
    if(i==NCOL(R)){
      plot.new()
      text(x=1, y=0.5, adj=c(1,0.5), labels=row.names[i], cex=1.1)
      chart.Histogram(R[,i], main="", xlim=c(chart.mins, chart.maxs),
        breaks=seq(round(chart.mins, digits=2)-0.01, round(chart.maxs, digits=2)+0.01, by=0.01),
        show.outliers=TRUE, methods=c("add.normal"),
        colorset = c("black", "#00008F", "#005AFF", "#23FFDC", "#ECFF13", "#FF4A00", "#800000"))
      abline(v=0, col="darkgray", lty=2)
      chart.QQPlot(R[,i], main="", pch=20, envelope=0.95, col=c(1,"#005AFF"),
        ylim=c(chart.mins, chart.maxs))
      abline(v=0, col="darkgray", lty=2)
      chart.ECDF(R[,i], main="", xlim=c(chart.mins, chart.maxs), lwd=2)
      abline(v=0, col="darkgray", lty=2)
    }
    else{
      plot.new()
      text(x=1, y=0.5, adj=c(1,0.5), labels=row.names[i], cex=1.1)
      chart.Histogram(R[,i], main="", xlim=c(chart.mins, chart.maxs),
        breaks=seq(round(chart.mins, digits=2)-0.01, round(chart.maxs, digits=2)+0.01, by=0.01),
        xaxis=FALSE, yaxis=FALSE, show.outliers=TRUE, methods=c("add.normal"),
        colorset = c("black", "#00008F", "#005AFF", "#23FFDC", "#ECFF13", "#FF4A00", "#800000"))
      abline(v=0, col="darkgray", lty=2)
      chart.QQPlot(R[,i], main="", xaxis=FALSE, yaxis=FALSE, pch=20, envelope=0.95,
        col=c(1,"#005AFF"), ylim=c(chart.mins, chart.maxs))
      abline(v=0, col="darkgray", lty=2)
      chart.ECDF(R[,i], main="", xlim=c(chart.mins, chart.maxs), xaxis=FALSE, yaxis=FALSE, lwd=2)
      abline(v=0, col="darkgray", lty=2)
    }
  }
  par(op)
}