Diverging Dot Plot and Lollipop Charts – Plotting Variance with ggplot2
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Creating the Dot Plot Variance chart
The data preparation was covered in the previous post, Diverging Bar Charts – Plotting Variance with ggplot2.
Refer to that post if you need to know how to create the data used in this tutorial.
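For readers who do not have the earlier post to hand, the following is a minimal sketch of the kind of preparation this tutorial relies on. The column names CarBrand, mpg_z_score and mpg_type come from the plotting code in this post; the exact steps (rounding to two decimals, sorting by z score) are assumptions and may differ from the earlier post.

```r
# Sketch of the assumed data preparation (base R only).
mtcars <- datasets::mtcars  # working copy of the built-in dataset

# Car names are stored as row names; copy them into a column.
mtcars$CarBrand <- rownames(mtcars)

# Z score of mpg: distance from the mean, in standard deviations.
# Rounding to 2 decimals is an assumption, so the labels fit inside the points.
mtcars$mpg_z_score <- round((mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg), 2)

# Flag each car as above or below the average. These values must match
# the names used in scale_color_manual().
mtcars$mpg_type <- ifelse(mtcars$mpg_z_score < 0, "below", "above")

# Sort the rows and fix the factor levels so the chart is ordered by z score.
mtcars <- mtcars[order(mtcars$mpg_z_score), ]
mtcars$CarBrand <- factor(mtcars$CarBrand, levels = mtcars$CarBrand)
```

Turning CarBrand into a factor with levels in sorted order is what keeps the cars from being plotted alphabetically.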
Setting up the Dot Plot Variance chart
library(ggplot2)

ggplot(mtcars, aes(x = CarBrand, y = mpg_z_score, label = mpg_z_score)) +
  # One point per car, coloured by whether it is above or below the average
  geom_point(stat = "identity", aes(col = mpg_type), size = 6) +
  scale_color_manual(name = "Mileage (deviation)",
                     labels = c("Above Average", "Below Average"),
                     values = c("above" = "#00ba38", "below" = "#0b8fd3")) +
  # Print the z score inside each point
  geom_text(color = "white", size = 2) +
  labs(title = "Diverging Dot Plot (ggplot2)",
       subtitle = "Z score showing Normalised mileage",
       caption = "Produced by Gary Hutson") +
  # Fix the value axis to +/- 2.5 standard deviations, then flip the axes
  ylim(-2.5, 2.5) +
  coord_flip()
This is very similar to the plot we created in the previous post, with a few differences. The main difference is that we use the geom_point() geometry and set the colour of each point according to whether it falls above or below the average. In addition, we use geom_text() to print each z score inside its point, setting the text colour to white and specifying the text size. The final difference is that I have added a Y limit (ylim) ranging from -2.5 to 2.5 standard deviations.
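One thing worth checking before fixing the axis: ylim() sets the scale limits, and ggplot2 drops observations that fall outside them (with a warning), so the limits need to cover every z score. A quick check, sketched here in base R on the raw built-in mtcars data rather than on the prepared data frame, might look like this (the names z, z_range, limits and all_inside are only for illustration):

```r
# Recompute the z scores straight from the built-in dataset (base R only).
mpg <- datasets::mtcars$mpg
z <- (mpg - mean(mpg)) / sd(mpg)

# Smallest and largest z score.
z_range <- range(z)
print(round(z_range, 2))

# TRUE when every point falls inside the limits passed to ylim().
limits <- c(-2.5, 2.5)
all_inside <- z_range[1] >= limits[1] && z_range[2] <= limits[2]
print(all_inside)
```

If the check came back FALSE for a different dataset, widening the limits would avoid silently losing points.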
Running this block of code, along with the data preparation code, will give you a chart that looks as below:
Creating the Diverging Lollipop Chart
The code below shows how to build the diverging lollipop chart in R and ggplot2:
ggplot(mtcars, aes(x = CarBrand, y = mpg_z_score, label = mpg_z_score)) +
  geom_point(stat = "identity", aes(col = mpg_type), size = 6) +
  scale_color_manual(name = "Mileage (deviation)",
                     labels = c("Above Average", "Below Average"),
                     values = c("above" = "#00ba38", "below" = "#0b8fd3")) +
  # The "stick" of the lollipop: from the zero baseline to each car's z score
  geom_segment(aes(y = 0, x = CarBrand, yend = mpg_z_score, xend = CarBrand),
               color = "black") +
  geom_text(color = "white", size = 2) +
  labs(title = "Diverging Lollipop Chart",
       subtitle = "Z score for normalised mileage",
       caption = "Produced by Gary Hutson") +
  ylim(-2.5, 2.5) +
  coord_flip() +
  # Turn off the major and minor grid lines
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())
Similar geometries are used here. The addition is geom_segment(), which draws one line segment per car. Each segment starts at 0 on the Y scale (y = 0) and ends at the car's z score (yend = mpg_z_score). Both x and xend are set to CarBrand, so the segment stays on that car's row and runs from the zero baseline to the point.
The only other difference is a theme setting at the end of the code that turns off the major and minor grid lines. This is achieved by setting panel.grid.major and panel.grid.minor equal to element_blank().
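To see the segment logic in isolation, here is a minimal sketch on a small made-up data frame (the names toy, item and score are hypothetical and not part of this post's data). It also adds geom_segment() before geom_point(): ggplot2 draws layers in the order they are added, so this variation keeps the stick behind the dot. It is a design variation, not the code used for the chart in this post.

```r
library(ggplot2)

# Hypothetical toy data: three categories with a signed score.
toy <- data.frame(
  item  = factor(c("A", "B", "C"), levels = c("A", "B", "C")),
  score = c(-1.2, 0.4, 1.8)
)

p <- ggplot(toy, aes(x = item, y = score)) +
  # The stick: from the zero baseline (y = 0) to the score (yend),
  # staying on the same category (x and xend are both item).
  geom_segment(aes(x = item, xend = item, y = 0, yend = score),
               color = "black") +
  # The head: added after the segment, so it is drawn on top of it.
  geom_point(size = 6, color = "#0b8fd3") +
  ylim(-2.5, 2.5) +
  coord_flip()

print(p)
```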
The completed chart is shown below:
There – we now have some lovely looking charts that can be put into a report to show how each category deviates from the average.
This post appears on R-Bloggers – please check out all the other cool blogs featured on this site.