Site icon R-bloggers

Using R: correlation heatmap, take 2

[This article was first published on There is grandeur in this view of life » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Apparently, this turned out to be my most popular post ever.  Of course there are lots of things to say about the heatmap (or quilt, tile, guilt plot etc), but what I wrote was literally just a quick celebratory post to commemorate that I’d finally grasped how to combine reshape2 and ggplot2 to quickly make this colourful picture of a correlation matrix.

However, I realised there is one more thing that is really needed, even if just for the first quick plot one makes for oneself: a better scale. The default scale is not the best for correlations, which range from -1 to 1, because it’s hard to tell where zero is. We use the airquality dataset for illustration as it actually has some negative correlations. In ggplot2, it’s very easy to get a scale that has a midpoint and a different colour in each direction. It’s called scale_colour_gradient2, and we just need to add it. I also set the limits to -1 and 1, which doesn’t change the colour but fills out the legend for completeness. Done!

data <- airquality[,1:4]
library(ggplot2)
library(reshape2)
qplot(x=Var1, y=Var2, data=melt(cor(data, use="p")), fill=value, geom="tile") +
   scale_fill_gradient2(limits=c(-1, 1))


Postat i:computer stuff, data analysis, english Tagged: #blogg100, ggplot2, R, reshape2

To leave a comment for the author, please follow the link and comment on their blog: There is grandeur in this view of life » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.