Six Things I Always Google When Using ggplot2
I often use {ggplot2} to create graphs, but there are certain things I always have to Google. I figured I’d create a post for my own quick reference, but I’d love to hear what you always have to look up!
- Remove the Legend
- Change Legend Title and Labels
- Manually Change Colors
- Remove X Axis Labels
- Start the Y Axis at a Specific Number
- Use Scales on the Y Axis
library(tidyverse)
knitr::opts_chunk$set(out.width = '100%')
To showcase what’s happening, I am going to use a TidyTuesday dataset: Spotify songs! Let’s start by creating a simple graph.
# Load Data
spotify_songs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')

spotify_songs %>%
  ggplot(aes(x = playlist_genre)) +
  geom_histogram(stat = "count")
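As a side note, geom_histogram(stat = "count") is doing the job of a bar chart here, and geom_bar() counts rows per category by default. Here is a minimal sketch of that, assuming a small made-up data frame (toy_songs is hypothetical and only stands in for spotify_songs so the snippet runs without a download):

```r
library(ggplot2)

# Hypothetical toy data standing in for spotify_songs
toy_songs <- data.frame(
  playlist_genre = c("pop", "pop", "rap", "rock", "rock", "rock")
)

# geom_bar() uses stat = "count" by default, so no extra argument is needed
p <- ggplot(toy_songs, aes(x = playlist_genre)) +
  geom_bar()

# Inspect the counts the stat computed (genres are ordered alphabetically)
bar_counts <- layer_data(p)$count
bar_counts
```

I would expect the same bars either way; geom_bar() just skips the warning about ignored histogram parameters.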
Remove the Legend
theme(legend.position = "none")
Ahh… this one always gets me. Sometimes when your color is mostly just for aesthetics, it doesn’t make sense to also have a color legend. This removes the legend and makes the graph look cleaner.
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  theme(legend.position = "none")
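There are a couple of other ways to get the same result that may be handy when you only want to hide some of the legends. This is a sketch under the assumption of a small made-up data frame (toy_songs is hypothetical, not part of the Spotify data):

```r
library(ggplot2)

# Hypothetical toy data standing in for spotify_songs
toy_songs <- data.frame(
  playlist_genre = c("pop", "pop", "rap", "rock", "rock", "rock")
)

# Option 1: hide the legend for this one layer only
p_layer <- ggplot(toy_songs, aes(x = playlist_genre, fill = playlist_genre)) +
  geom_bar(show.legend = FALSE)

# Option 2: hide only the fill legend, leaving any other legends in place
p_guide <- ggplot(toy_songs, aes(x = playlist_genre, fill = playlist_genre)) +
  geom_bar() +
  guides(fill = "none")
```

theme(legend.position = "none") removes every legend on the plot, so these two are mostly useful when a plot has more than one.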
Change Legend Title and Labels
scale_fill_discrete(name = "New Legend Title", labels = c("lab1" = "Label 1", "lab2" = "Label 2"))
Alright, say I do want the legend. How do I make it something readable?
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  scale_fill_discrete(name = "Playlist Genre",
                      labels = c("edm" = "EDM", "latin" = "Latin", "pop" = "Pop",
                                 "r&b" = "R&B", "rap" = "Rap", "rock" = "Rock"))
Manually Change Colors
scale_fill_manual(name = "New Legend Title", values = c("lab1" = "#000000", "lab2" = "#FFFFFF"))
This is a bit trickier: you cannot use scale_fill_manual and scale_fill_discrete as separate calls on the same plot, because the second fill scale replaces the first. However, if you want to change the labels and the colors together, you can pass both to scale_fill_manual like below.
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  scale_fill_manual(name = "Playlist Genre",
                    labels = c("edm" = "EDM", "latin" = "Latin", "pop" = "Pop",
                               "r&b" = "R&B", "rap" = "Rap", "rock" = "Rock"),
                    values = c("edm" = "#68B39B", "latin" = "#F6C7FF", "pop" = "#ADFFE5",
                               "r&b" = "#CCB576", "rap" = "#B3A070", "rock" = "#d3d3d3"))
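To see the overriding behavior for yourself, here is a small sketch. It assumes a made-up data frame (toy_songs is hypothetical) and peeks at the plot object's scales, which is an internal detail that may differ between ggplot2 versions:

```r
library(ggplot2)

# Hypothetical toy data standing in for spotify_songs
toy_songs <- data.frame(
  playlist_genre = c("pop", "pop", "rap", "rock", "rock", "rock")
)

p <- ggplot(toy_songs, aes(x = playlist_genre, fill = playlist_genre)) +
  geom_bar() +
  scale_fill_discrete(name = "First Title") +
  # Adding a second fill scale makes ggplot2 print a message that the
  # existing fill scale is being replaced
  scale_fill_manual(
    name = "Second Title",
    values = c("pop" = "#ADFFE5", "rap" = "#B3A070", "rock" = "#d3d3d3")
  )

# Only the most recently added fill scale is kept on the plot
fill_scale <- p$scales$get_scales("fill")
fill_scale$name
```

The title set in scale_fill_discrete is gone, which is why the labels and the colors both need to live in the one scale_fill_manual call.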
Remove X Axis Labels
theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())
In this case, since we have a legend, we don’t need any x axis labels. Sometimes I use this if there’s redundant information or if it otherwise makes the graph look cleaner.
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  scale_fill_manual(name = "Playlist Genre",
                    labels = c("edm" = "EDM", "latin" = "Latin", "pop" = "Pop",
                               "r&b" = "R&B", "rap" = "Rap", "rock" = "Rock"),
                    values = c("edm" = "#68B39B", "latin" = "#F6C7FF", "pop" = "#ADFFE5",
                               "r&b" = "#CCB576", "rap" = "#B3A070", "rock" = "#d3d3d3")) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank())
Start the Y Axis at a Specific Number
scale_y_continuous(name = "New Y Axis Title", limits = c(0, 1000000))
Often, we want our graph’s y axis to start at 0. In this example it already does, but the limits argument lets us set the exact lower and upper bounds of the y axis.
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  scale_fill_manual(name = "Playlist Genre",
                    labels = c("edm" = "EDM", "latin" = "Latin", "pop" = "Pop",
                               "r&b" = "R&B", "rap" = "Rap", "rock" = "Rock"),
                    values = c("edm" = "#68B39B", "latin" = "#F6C7FF", "pop" = "#ADFFE5",
                               "r&b" = "#CCB576", "rap" = "#B3A070", "rock" = "#d3d3d3")) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(name = "Count", limits = c(0, 10000))
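One caveat worth keeping in mind: limits set on the scale treat anything outside the range as missing, so bars (which start at 0) can vanish if the lower limit is above 0. If the goal is just to zoom in, coord_cartesian() changes the visible window without dropping data. A minimal sketch, again assuming a made-up data frame (toy_songs is hypothetical):

```r
library(ggplot2)

# Hypothetical toy data standing in for spotify_songs
toy_songs <- data.frame(
  playlist_genre = c("pop", "pop", "rap", "rock", "rock", "rock")
)

# Zoom the y axis to the range 1 to 4; the bars are clipped by the
# plotting window rather than removed from the data
p_zoom <- ggplot(toy_songs, aes(x = playlist_genre)) +
  geom_bar() +
  coord_cartesian(ylim = c(1, 4))

# The requested window is stored on the coordinate system
y_window <- p_zoom$coordinates$limits$y
y_window
```

For a bar chart that starts at 0 anyway, as in this post, scale_y_continuous(limits = ...) works fine; the difference only shows up once the limits cut into the data.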
Use Scales on the Y Axis
scale_y_continuous(labels = scales::comma)  # or scales::dollar, scales::percent, etc.
Depending on our data, we may want the y axis to be formatted a certain way (using dollar signs, commas, percentage signs, etc.). The handy {scales} package allows us to do that easily.
spotify_songs %>%
  ggplot(aes(x = playlist_genre, fill = playlist_genre)) +
  geom_histogram(stat = "count") +
  scale_fill_manual(name = "Playlist Genre",
                    labels = c("edm" = "EDM", "latin" = "Latin", "pop" = "Pop",
                               "r&b" = "R&B", "rap" = "Rap", "rock" = "Rock"),
                    values = c("edm" = "#68B39B", "latin" = "#F6C7FF", "pop" = "#ADFFE5",
                               "r&b" = "#CCB576", "rap" = "#B3A070", "rock" = "#d3d3d3")) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(name = "Count", limits = c(0, 10000), labels = scales::comma)
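If you want to check what a formatter will do before putting it on a plot, you can call it directly on a few numbers. This sketch uses the label_*() helpers from {scales}, which return formatting functions; the input values are made up for illustration:

```r
library(scales)

# Each label_*() call returns a function that turns numbers into strings
comma_label   <- label_comma()(10000)     # thousands separator
dollar_label  <- label_dollar()(1234.5)   # currency with cents
percent_label <- label_percent()(0.25)    # proportion shown as a percentage

comma_label    # "10,000"
dollar_label   # "$1,234.50"
percent_label  # "25%"
```

Any of these can be passed to the labels argument, for example scale_y_continuous(labels = scales::label_comma()).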
There we have it! Six things I always eventually end up Googling when I am making plots using {ggplot2}. Hopefully now I can just look at this page instead of searching each and every time!