Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
My disdain for pie charts is fairly well-known, but I do concede that there are times one needs to communicate parts of a whole graphically versus using just words or a table. When that need arises, I’m partial to “waffle charts” or “square pie charts”. @eagereyes did a great post a while ago on them (make sure to read the ‘debate’ between Robert and @hadleywickham in the comments, too), so head there for the low-down on them. Rather than have every waffle chart I make be a one-off creation, I made an R package for them.
There is currently one function in the package, waffle, and said function doesn’t mimic all the goodness of these charts as described in Robert’s post (yet). It does, however, do a pretty decent job covering the basics. Let’s take the oft-cited New York Times “debt” graphic:
We can replicate that pretty closely in R. To make it as simple as possible, the waffle function takes a named numeric vector. If no names are specified, or you leave some names out, LETTERS will be used to fill in the gaps. The function takes your data quite literally, so if you give it a vector that sums up to, say, 10,000, then the function will try to create a ggplot object with 10,000 geom_rect elements. Needless to say, that’s a bad idea. So, I suggest keeping the raw numbers in the vector and passing a scaled version of the vector to the function. That way, you can play with the values to get the desired look. Here’s the R version of the NYT graphic:
library(waffle)

# raw dollar amounts; the "\n" in a name wraps that legend label onto two lines
savings <- c(`Mortgage ($84,911)`=84911, `Auto and\ntuition loans ($14,414)`=14414,
             `Home equity loans ($10,062)`=10062, `Credit Cards ($8,565)`=8565)

# scale the values so that one square represents $392
waffle(savings/392, rows=7, size=0.5,
       colors=c("#c7d4b6", "#a3aabd", "#a0d0de", "#97b5cf"),
       title="Average Household Savings Each Year",
       xlab="1 square == $392")
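To get a feel for how the scaling factor drives the size of the chart, here is a minimal base-R sketch that counts the squares each category would get at a given unit. The helper name `squares_for` is made up for this illustration and is not part of the package, and rounding to whole squares is my assumption here; how the function itself treats fractional values isn't covered above.

```r
# Hypothetical helper (not part of the waffle package): number of squares
# per category when one square stands for `unit` dollars.
# Rounding to whole squares is an assumption made for this illustration.
squares_for <- function(values, unit) round(values / unit)

savings <- c(`Mortgage` = 84911,
             `Auto and tuition loans` = 14414,
             `Home equity loans` = 10062,
             `Credit Cards` = 8565)

squares <- squares_for(savings, 392)

print(squares)       # squares per category at $392 per square
print(sum(squares))  # total number of squares in the chart
print(sum(savings))  # what the total would be if the raw values were passed in
```

Trying a few different units this way is a cheap check that the total stays in the low hundreds before handing the scaled vector to waffle.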
This package evolved from a teensy gist I made earlier this year to help communicate the scope of the Anthem data breach in the US. Since then, a recent breach at Premera occurred and added to the tally. Here’s two views of that data, one with one square equalling one million people and another with one square equalling ten million people (using the blue shade from each of the company’s logos):
# values are in millions of people
parts <- c(`Un-breached\nUS Population`=(318-11-79), `Premera`=11, `Anthem`=79)

waffle(parts, rows=8, size=1,
       colors=c("#969696", "#1879bf", "#009bda"),
       title="Health records breaches as fraction of US Population",
       xlab="One square == 1m ppl")
waffle(parts/10, rows=3,
       colors=c("#969696", "#1879bf", "#009bda"),
       title="Health records breaches as fraction of US Population",
       xlab="One square == 10m ppl")
I’m betting that gets a lot bluer by the end of the year.
The function returns a ggplot object, so sizes and other plot elements can all be customized, and the source is up on GitHub for all to play with and contribute to.
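Because the result is a ggplot object, the usual ggplot2 layering should apply. A minimal sketch, assuming both the waffle and ggplot2 packages are installed (the block is skipped if they aren't); the particular theme settings are just illustrative choices of mine, not package defaults:

```r
# Same breach data as above, in millions of people.
parts <- c(`Un-breached\nUS Population` = 318 - 11 - 79,
           `Premera` = 11, `Anthem` = 79)

# Assumes waffle and ggplot2 are installed; does nothing otherwise.
if (requireNamespace("waffle", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {

  p <- waffle::waffle(parts, rows = 8,
                      colors = c("#969696", "#1879bf", "#009bda"),
                      title = "Health records breaches as fraction of US Population",
                      xlab = "One square == 1m ppl")

  # Add ordinary ggplot2 theme tweaks on top of the returned object.
  p <- p + ggplot2::theme(
    plot.title  = ggplot2::element_text(size = 16, face = "bold"),
    legend.text = ggplot2::element_text(size = 10)
  )

  print(p)
}
```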
Along with adding support for filling in the chart as shown in the @eagereyes post, there will be an htmlwidget version coming as well. Standard drill applies: issues/enhancements to GitHub issues, feedback and your own examples in the comments.
UPDATE
Thanks to a PR by @timelyportfolio, there is now a widget option in the package.