How to Make Stunning Scatter Plots in R: A Complete Guide with ggplot2
Scatter Plots with R
Do you want to make stunning visualizations, but they always end up looking like a potato? It’s a tough place to be. Luckily, R makes it easy to produce great-looking visuals. Today you’ll learn how to create impressive scatter plots with R and the ggplot2 package.
This article demonstrates how to make a scatter plot for any occasion and how to make it look extraordinary at the same time. After reading, visualizing relationships between any continuous variables shouldn’t be a problem. Today you’ll learn how to:
- Make your first scatter plot
- Change size, color, shapes, and themes
- Add text and labels
- Add titles, subtitles, captions, and axis labels
- Add other useful things
Make Your First Scatter Plot
R has many built-in datasets, and one of them is mtcars. It’s one of the most popular datasets, and today you’ll use it to make a lot of scatter plots. Package-wise, you’ll only need ggplot2.
Here’s how to import the packages and take a look at the first couple of rows:
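A minimal sketch of this step, assuming ggplot2 is already installed (mtcars ships with base R, so it needs no import). The variable name first_rows is only for illustration:

```r
# Load the plotting package (assumes ggplot2 is installed)
library(ggplot2)

# mtcars ships with base R; head() returns the first six rows by default
first_rows <- head(mtcars)
first_rows
```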
The most widely used R package for data visualization is ggplot2. It’s a straightforward package based on the layering principle. The first layer is used to specify the data, and the layers after are used to make and tweak the visualization.
The geom_point() layer is used to draw scatter plots. Your first chart will show the relationship between the mpg attribute on the x-axis and the hp column on the y-axis:
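One way to write this chart is sketched below. The object name p is an illustrative choice, not something the layering principle requires:

```r
library(ggplot2)

# First layer: the data and the aesthetic mapping (mpg on x, hp on y)
# Second layer: geom_point() draws one point per car
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point()
p
```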
We agree with you – it’s not the prettiest visualization. You’ll learn how to deal with that in the following sections.
Change Size, Color, Shape, and Theme
You can’t make stunning visuals with default stylings. You can change a couple of things in the geom_point() layer, such as shape, color, size, and so on.
Here’s how to make the points blue and a bit larger:
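A sketch of the idea. The size of 5 and the hex color "#0099f9" are assumed values picked for illustration; any valid size and color will do:

```r
library(ggplot2)

# Fixed (hardcoded) values go outside aes(), directly in geom_point()
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 5, color = "#0099f9")
p
```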
Better, but what if you don’t want to hardcode color and size values? You can map them to variables inside aes() instead. Here’s how to change the color based on the cyl variable and size by qsec:
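A minimal sketch, assuming the mappings are placed in the geom_point() layer (placing them in the top-level aes() call would work as well):

```r
library(ggplot2)

# Variable-driven styling goes inside aes(): color follows cyl, size follows qsec.
# cyl is numeric, so the legend is a continuous gradient; wrapping it in
# factor() would give one discrete color per cylinder count instead.
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl))
p
```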
Changing shapes is also straightforward. Dots aren’t appropriate for every use case, and you’re free to change the shape with the shape argument. You can find the list of all available shapes here.
The following code snippet replaces dots with triangles:
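A sketch of that change. Shape code 17 is the filled triangle in R's standard point shapes; the size and color are the same illustrative values as before:

```r
library(ggplot2)

# shape = 17 selects the filled triangle from R's standard point shapes
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 5, color = "#0099f9", shape = 17)
p
```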
And finally, let’s talk about themes. The default one isn’t for everyone because it’s a bit too harsh with the background. With R, you can change the theme with a single line of code:
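For example, the sketch below uses theme_classic(), which is only one of several built-in ggplot2 themes; theme_minimal() or theme_bw() could be swapped in the same way:

```r
library(ggplot2)

# theme_classic() swaps the gray panel for a white background with axis lines
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl)) +
  theme_classic()
p
```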
Now that’s progress. But it’s still not quite there yet. Let’s see how to add text and labels next.
Add Text and Labels
You can use text and labels to add additional information to your visualizations. The only difference between the two is that labels have a box around them, which makes them easier to read.
Here’s how to add text to represent car names:
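A hedged sketch of this step. It assumes the ggrepel package may or may not be installed, so it falls back to plain geom_text() with overlapping labels dropped; the data frame name cars and the column name car are illustrative:

```r
library(ggplot2)

# Copy the row names into a regular column so they can be mapped to labels
cars <- transform(mtcars, car = rownames(mtcars))

# Use ggrepel when it is installed; otherwise fall back to plain geom_text(),
# skipping labels that would overlap (check_overlap = TRUE)
text_layer <- if (requireNamespace("ggrepel", quietly = TRUE)) {
  ggrepel::geom_text_repel(aes(label = car))
} else {
  geom_text(aes(label = car), check_overlap = TRUE)
}

p <- ggplot(cars, aes(x = mpg, y = hp)) +
  geom_point(size = 3, color = "#0099f9") +
  text_layer
p
```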
The ggrepel package is here to prevent the overlap between text. You can add text with the plain geom_text layer, but it would be impossible to read the text for the points that are close.
If this still isn’t as readable as you would want, use labels instead of text. Here’s how:
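The same idea with labels, again as a sketch: it uses geom_label_repel() from ggrepel when that package is available and plain geom_label() otherwise. The names cars and car are illustrative:

```r
library(ggplot2)

# Copy the row names into a regular column so they can be mapped to labels
cars <- transform(mtcars, car = rownames(mtcars))

# Labels are text drawn inside a box; the ggrepel variant also pushes them apart
label_layer <- if (requireNamespace("ggrepel", quietly = TRUE)) {
  ggrepel::geom_label_repel(aes(label = car))
} else {
  geom_label(aes(label = car))
}

p <- ggplot(cars, aes(x = mpg, y = hp)) +
  geom_point(size = 3, color = "#0099f9") +
  label_layer
p
```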
A visualization isn’t complete without a title and axis labels. Let’s see how to add and style these next.
Add Titles, Subtitles, Captions, and Axis Labels
First, you’ll learn how to add titles, subtitles, and captions to the chart. A title is mandatory for any decent visualization. A subtitle can clarify what the chart shows, and a caption is a good place to cite the data source.
The most convenient way to add these is through a labs() layer. It takes in values for title, subtitle, and caption:
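A minimal sketch; the title, subtitle, and caption strings are placeholders made up for this example:

```r
library(ggplot2)

# labs() sets the three text elements in a single layer
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl)) +
  labs(
    title = "Miles per gallon vs. horsepower",
    subtitle = "Color shows cylinders, size shows quarter-mile time",
    caption = "Source: mtcars dataset"
  )
p
```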
By default, these don’t look so great. You can change color, size, alignment, and emphasize/italicize the text in the theme() layer. Here’s how:
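One possible styling, sketched with assumed colors and sizes. Each text element is styled through element_text(), where hjust = 0.5 centers the text and face controls bold or italic:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl)) +
  labs(
    title = "Miles per gallon vs. horsepower",
    subtitle = "Color shows cylinders, size shows quarter-mile time",
    caption = "Source: mtcars dataset"
  ) +
  theme(
    # Centered, bold, blue title
    plot.title = element_text(color = "#0099f9", size = 20, face = "bold", hjust = 0.5),
    # Centered subtitle in bold italics
    plot.subtitle = element_text(size = 13, face = "bold.italic", hjust = 0.5),
    # Italic caption
    plot.caption = element_text(face = "italic")
  )
p
```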
Let’s talk about axis labels next. You can change and style them the same way you did with titles, subtitles, and captions: in the labs() and theme() layers.
The code snippet below adds labels for both X and Y axes and styles them a bit:
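A sketch under the same assumptions as before: the label texts, colors, and sizes are illustrative choices.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl)) +
  # x and y in labs() replace the default axis titles (the column names)
  labs(x = "Miles per gallon", y = "Horsepower") +
  theme(
    # Bold blue title for the x-axis, italic for the y-axis
    axis.title.x = element_text(color = "#0099f9", size = 15, face = "bold"),
    axis.title.y = element_text(size = 15, face = "italic")
  )
p
```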
To wrap things up, let’s take a look at a couple of useful tweaks you can do to scatter plots that don’t fall into any of the discussed sections.
Add Other Useful Things
Let’s start by changing the legend position. The default position on the right might not be the best for some use cases. You can put the legend on the top by adding the legend.position argument to the theme() layer and specifying the position.
Here’s an example:
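A minimal sketch. The color and size mappings are there only so that the plot has a legend to move:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(size = qsec, color = cyl)) +
  # Other accepted values include "bottom", "left", "right", and "none"
  theme(legend.position = "top")
p
```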
The other potentially useful layer you can use is geom_rug(). It shows the variable distribution on the edges of both X and Y axes for the specified variables.
Here’s how to use it:
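A sketch of the layer in use. With no arguments, geom_rug() inherits the x and y mappings from the plot and draws tick marks along the bottom and left edges:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 3, color = "#0099f9") +
  # One tick per observation on each axis; dense ticks mean many observations
  geom_rug()
p
```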
With this layer, you can get a rough idea of how your variables are distributed and on which point(s) most of the observations are located.
Conclusion
Today you’ve learned how to make scatter plots with R and ggplot2 and how to make them aesthetically pleasing. You’ve learned how to change colors, marker types, size, titles, subtitles, captions, axis labels, and a couple of other useful things.
This alone will be enough to make almost any scatter plot you can imagine. It’s up to you now to choose an appropriate theme, color, and title.
Article How to Make Stunning Scatter Plots in R: A Complete Guide with ggplot2 comes from Appsilon | End to End Data Science Solutions.