Chernoff faces in ggplot2
I am pleased to introduce my first R package for general consumption, ggChernoff.
Inspiration stems from Leland Wilkinson’s book, The Grammar of Graphics, whose principles were later implemented as the layered grammar of graphics in Hadley Wickham’s popular R package, ggplot2.
Wilkinson’s grammar generalises data visualisation. Rather than choosing a type of graph from a list (for example a bar plot, pie chart or line graph), we break graphics down into algebra, scales, statistics, geometry, coordinates and aesthetics.
What does that actually mean? Suppose we want to transform the following data,
| name | value |
|---|---|
| a | 37 |
| b | 21 |
| c | 42 |
into the following simple visualisation.
In Grammar of Graphics terms, we produce the plot using these semantics.
- Variables
  - name, value
- Algebra
  - name × value → (a, 37), (b, 21), (c, 42).
- Scales
  - (37, 21, 42) → (0.88, 0.50, 1.00)
  - (a, b, c) → (3, 2, 1)
- Geometry
  - For each tuple, draw a line between two points.
- Coordinates
  - Use Cartesian (x, y) coordinates in 2-dimensional space.
- Aesthetics
  - (0, name) → (x1, y1)
  - (value, name) → (x2, y2)
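The specification above can be sketched in ggplot2 roughly as follows. This is an illustration rather than the code behind the original figure: the data frame name `df`, the use of `geom_segment()`, and the reading of the value scale as "divide by the maximum" (which reproduces the numbers in the list) are all assumptions.

```r
library(ggplot2)

# Variables: name, value (the table from the post)
df <- data.frame(name = c("a", "b", "c"), value = c(37, 21, 42))

# Scales, worked by hand. Assumption: the value scale divides by the maximum,
# which reproduces (0.88, 0.50, 1.00) from the specification.
scaled_value <- round(df$value / max(df$value), 2)

# Ordering the factor levels in reverse places a at position 3, b at 2, c at 1.
df$name <- factor(df$name, levels = rev(df$name))

# Geometry: one line per tuple, from (0, name) to (value, name).
# Coordinates: Cartesian, which is the ggplot2 default made explicit here.
p <- ggplot(df, aes(x = 0, y = name, xend = value, yend = name)) +
  geom_segment() +
  coord_cartesian()

p
```

In practice ggplot2 trains and applies the scales itself; the hand-computed `scaled_value` is only there to show where the numbers in the specification come from.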
Our line plot above is grammatically equivalent to a bar chart—just make the lines thicker! Now, what if we swap Cartesian coordinates for polar coordinates, (r, θ)? The resulting graphic is still grammatically correct, though perhaps not very useful.
We can even transpose it to (θ, r) coordinates, for another grammatically-valid data visualisation.
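In ggplot2, swapping the coordinate system is a one-line change. The sketch below assumes the line plot is built with `geom_segment()` and that (r, θ) means value is mapped to the radius and name to the angle; the object names are hypothetical.

```r
library(ggplot2)

df <- data.frame(
  name  = factor(c("a", "b", "c"), levels = c("c", "b", "a")),
  value = c(37, 21, 42)
)

# The same grammar as the Cartesian line plot, minus the coordinate system.
base_plot <- ggplot(df, aes(x = 0, y = name, xend = value, yend = name)) +
  geom_segment()

# (r, theta): value becomes the radius and name the angle.
polar_plot <- base_plot + coord_polar(theta = "y")

# (theta, r): value becomes the angle and name the radius.
transposed_plot <- base_plot + coord_polar(theta = "x")

polar_plot
transposed_plot
```

Nothing else in the specification changes: the variables, scales, geometry and aesthetics are identical across all three plots.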
These plots don’t necessarily have names, so you don’t have to hunt through a list to find the right tool to create them. It is just a matter of knowing how the Grammar of Graphics works.
Look at the specifications for our plot again. Geometry and aesthetics don’t have to be as conventional as lines and boxes of different sizes. Let’s try something completely different!
A couple of amendments to the specifications…
- Geometry
  - For each tuple, draw a face.
- Aesthetics
  - name → x
  - 0 → y
  - value → smile
…and we produce this graph.
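As a rough sketch, the amended specification might look like this in code. It assumes the ggChernoff package is installed and that it exposes a `geom_chernoff()` layer with a `smile` aesthetic; check the package documentation for the exact interface. The plotting step is skipped if the package is not available.

```r
library(ggplot2)

df <- data.frame(name = c("a", "b", "c"), value = c(37, 21, 42))

face_plot <- NULL
# Assumption: ggChernoff provides geom_chernoff() and a `smile` aesthetic.
if (requireNamespace("ggChernoff", quietly = TRUE)) {
  library(ggChernoff)
  # Aesthetics: name -> x, 0 -> y, value -> smile
  face_plot <- ggplot(df, aes(x = name, y = 0, smile = value)) +
    geom_chernoff()
  print(face_plot)
}
```

Compared with the line plot, only the geom and the aesthetic mappings differ; the data and the Cartesian coordinates stay the same.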
Using faces for multivariate data visualisation is an idea credited to Herman Chernoff, which Wilkinson discusses in some detail in The Grammar of Graphics. In principle, you can represent data using all kinds of weird and wonderful things, so long as you can clearly define geometry, aesthetics, scales and so on to integrate them into a grammatical workflow.
My R package, ggChernoff, extends ggplot2 with a new face geom and aesthetics for its mouth and eyebrows. By mapping variables to the right aesthetic you can make your data happy or sad, angry or expressionless.
Here is another set of aesthetic mappings.
- Aesthetics
  - name → x
  - 0 → y
  - 1 → smile
  - value → eyebrows
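A sketch of these mappings follows, with the same caveats as any code written against a package interface from memory. It assumes the eyebrow aesthetic is named `brow` and that a constant `smile = 1` can be set as a fixed parameter of `geom_chernoff()`; both should be checked against the ggChernoff documentation.

```r
library(ggplot2)

df <- data.frame(name = c("a", "b", "c"), value = c(37, 21, 42))

brow_plot <- NULL
# Assumptions: the eyebrow aesthetic is called `brow`, and `smile` accepts a
# constant set outside aes().
if (requireNamespace("ggChernoff", quietly = TRUE)) {
  library(ggChernoff)
  # Aesthetics: name -> x, 0 -> y, 1 -> smile (constant), value -> eyebrows
  brow_plot <- ggplot(df, aes(x = name, y = 0, brow = value)) +
    geom_chernoff(smile = 1)
  print(brow_plot)
}
```

Setting `smile` outside `aes()` follows the usual ggplot2 distinction between mapping a variable to an aesthetic and fixing that aesthetic to a constant.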
This is a nice way to add a bit of personality to your data visualisations. Unlike the package emoGG, which involves selecting pre-drawn icons from a list, the faces in ggChernoff are procedurally generated, in the spirit of the grammar of graphics.
More demonstrations are coming in a future post. The package ggChernoff is now available on GitHub.