ggplot2: Crayola Crayon Colours
The Statistical Algorithms blog attempted to recreate, in ggplot2, a graph depicting the growing colour selection of Crayola crayons (original graph below, via FlowingData).
He also asked the following questions: Is there an easier way to do this? How can I make the axes more like the original? What about the white lines between boxes and the gradual change between years? The sort order is also different.
I will present my version in this post, trying to address some of these questions.
[Figure: original Crayola colour chart, via FlowingData (crayons_small.png)]
Data Import
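The list of Crayola crayon colours is available on Wikipedia. It contains one duplicated hex code (#FF1DCE); the duplicate is dropped to make further processing easier. Colours with an empty retirement year are treated as still in production and given a retirement year of 2010.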
> library(XML)
> library(plyr)
> library(ggplot2)
> theurl <- "http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors"
> html <- htmlParse(theurl)
> crayola <- readHTMLTable(html, stringsAsFactors = FALSE)[[2]]
> crayola <- crayola[, c("Hex Code", "Issued", "Retired")]
> names(crayola) <- c("colour", "issued", "retired")
> crayola <- crayola[!duplicated(crayola$colour), ]
> crayola$retired[crayola$retired == ""] <- 2010
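The cleaning steps can be tried without a network connection. The following is a minimal sketch on a made-up three-row table; the data frame raw, its hex codes and its years are hypothetical stand-ins for the scraped Wikipedia table, and the final conversion to integers is an extra step that the original code does not perform.

```r
# Hypothetical stand-in for the scraped table (values are illustrative only)
raw <- data.frame(
  "Hex Code" = c("#AAAAAA", "#BBBBBB", "#AAAAAA"),
  Issued     = c("1903", "1949", "1958"),
  Retired    = c("1990", "", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

crayola_demo <- raw[, c("Hex Code", "Issued", "Retired")]
names(crayola_demo) <- c("colour", "issued", "retired")

# Keep only the first row for each hex code
crayola_demo <- crayola_demo[!duplicated(crayola_demo$colour), ]

# Colours without a retirement year are treated as in production until 2010
crayola_demo$retired[crayola_demo$retired == ""] <- 2010

# Assumption: convert the character columns so later arithmetic is explicit
crayola_demo$issued  <- as.integer(crayola_demo$issued)
crayola_demo$retired <- as.integer(crayola_demo$retired)
```

After these steps the demo table has two rows, one per distinct hex code, with retirement years 1990 and 2010.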
Plotting
Instead of geom_rect(), I will show two ways of plotting the same data, using geom_bar() and geom_area(). Both need one row per colour for every year in which that colour was (or still is) in production.
> colours <- ddply(crayola, .(colour), transform,
+     year = issued:retired)
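To see what this expansion produces, here is a small base R sketch of the same idea. It is not the plyr call used in the post; the data frame demo and its values are hypothetical, and lapply() with rbind() is swapped in only so that the example has no package dependencies.

```r
# Hypothetical two-colour input (hex codes and years are illustrative only)
demo <- data.frame(
  colour  = c("#AAAAAA", "#BBBBBB"),
  issued  = c(1903L, 1905L),
  retired = c(1905L, 1906L),
  stringsAsFactors = FALSE
)

# Build one small data frame per colour, covering issued..retired, then stack them
expanded <- do.call(rbind, lapply(seq_len(nrow(demo)), function(i) {
  data.frame(
    colour = demo$colour[i],
    year   = seq(demo$issued[i], demo$retired[i]),
    stringsAsFactors = FALSE
  )
}))
```

The first colour contributes three rows (1903 to 1905) and the second two rows (1905 to 1906), so expanded has five rows in total.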
The fill colours are taken directly from the hex codes in the colour column by using scale_fill_identity().
> p <- ggplot(colours, aes(year, 1, fill = colour)) +
+     geom_bar(width = 1, position = "fill", binwidth = 1) +
+     theme_bw() + scale_fill_identity()
[Figure: crayola_colours-006.png]
And now the geom_area() version:
> p1 <- ggplot(colours, aes(year, 1, fill = colour)) +
+     geom_area(position = "fill", colour = "white") +
+     theme_bw() + scale_fill_identity()
[Figure: crayola_colours-008.png]
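The two calls above were written for the ggplot2 release available at the time. As far as I know, recent releases of geom_bar() count rows by default and no longer accept a binwidth argument, so the bar version is usually written with geom_col() instead. The sketch below assumes a recent ggplot2 and uses a small hypothetical data frame (demo) in place of the real colours data.

```r
library(ggplot2)

# Hypothetical expanded data: one row per colour per year (values illustrative only)
demo <- data.frame(
  year   = c(1903, 1904, 1905, 1903, 1904, 1905, 1904, 1905),
  colour = c(rep("#AAAAAA", 3), rep("#BBBBBB", 3), rep("#CCCCCC", 2)),
  stringsAsFactors = FALSE
)

# Bar version: geom_col() draws the supplied y value (1 per row), and
# position = "fill" rescales each year's stack to the full plot height
p_bar <- ggplot(demo, aes(year, 1, fill = colour)) +
  geom_col(width = 1, position = "fill") +
  theme_bw() +
  scale_fill_identity()

# Area version: same data and mapping, with white outlines between colours
p_area <- ggplot(demo, aes(year, 1, fill = colour)) +
  geom_area(position = "fill", colour = "white") +
  theme_bw() +
  scale_fill_identity()
```

Printing p_bar or p_area draws the corresponding plot.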
Final Formatting
Next, the x-axis labels suggested by ggplot2 are manually overridden. A small trick keeps the labels properly aligned: each break is placed one year before the year shown in its label.
> labels <- c(1903, 1949, 1958, 1972, 1990, 1998, 2010)
> breaks <- labels - 1
> x <- scale_x_continuous("", breaks = breaks, labels = labels,
+     expand = c(0, 0))
> y <- scale_y_continuous("", expand = c(0, 0))
> ops <- opts(axis.text.y = theme_blank(), axis.ticks = theme_blank())
> p + x + y + ops
[Figure: crayola_colours-011.png]
> p1 + x + y + ops
[Figure: crayola_colours-013.png]
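In later ggplot2 releases opts() and theme_blank() were replaced by theme() and element_blank(). A sketch of the same formatting objects, assuming a recent ggplot2, might look like this; the scale definitions are unchanged from the post.

```r
library(ggplot2)

labels <- c(1903, 1949, 1958, 1972, 1990, 1998, 2010)
breaks <- labels - 1  # same one-year offset as in the post

x <- scale_x_continuous("", breaks = breaks, labels = labels, expand = c(0, 0))
y <- scale_y_continuous("", expand = c(0, 0))

# theme() and element_blank() stand in for opts() and theme_blank()
ops <- theme(axis.text.y = element_blank(), axis.ticks = element_blank())
```

These objects are added to a plot in the same way as before, for example p + x + y + ops.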
The order of colours could be changed by sorting them by some common feature; unfortunately, I did not find an automated way of doing this.
Sorting by Colour
Thanks to Baptiste, who showed a way to sort the colours (by hue, then saturation, then value), the final version of the area plot resembles the original even more closely.
> library(colorspace)
> sort.colours <- function(col) {
+     c.rgb = col2rgb(col)
+     c.RGB = RGB(t(c.rgb) %*% diag(rep(1/255, 3)))
+     c.HSV = as(c.RGB, "HSV")@coords
+     order(c.HSV[, 1], c.HSV[, 2], c.HSV[, 3])
+ }
> colours = ddply(colours, .(year), function(d) d[rev(sort.colours(d$colour)), ])
> last_plot() %+% colours
[Figure: crayola_colours-017.png]
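The same hue, saturation, value ordering can also be sketched with base R alone. This is an alternative to the colorspace approach above, not Baptiste's code: the function name sort_colours_hsv is made up for this example, and rgb2hsv() from grDevices is swapped in for the colorspace conversion, so the resulting order is not guaranteed to match in every case.

```r
# Order colours by hue, then saturation, then value, using grDevices only
sort_colours_hsv <- function(col) {
  hsv_coords <- rgb2hsv(col2rgb(col))  # 3 x n matrix with rows "h", "s", "v"
  order(hsv_coords["h", ], hsv_coords["s", ], hsv_coords["v", ])
}

cols <- c("#0000FF", "#FF0000", "#00FF00")
sorted <- cols[sort_colours_hsv(cols)]
sorted
# [1] "#FF0000" "#00FF00" "#0000FF"
```

Red has hue 0, green 1/3 and blue 2/3, which gives the order shown.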