Create a line graph with ggplot

[This article was first published on Quantargo Blog and kindly contributed to R-bloggers.]

Use geom_line() to draw line graphs and customize their styling with the color parameter. Specify which observations belong to each line with the group parameter.

  • Create your first line graph using geom_line()
  • Define how different lines are connected using the group parameter
  • Change the line color of a line graph using the color parameter

ggplot(___) + 
  geom_line(
    mapping = aes(x = ___, y = ___, 
                  group = ___, 
                  color = ___)
)

Introduction to line graphs

Line graphs visualize the trajectory of one numeric variable against another. Unlike in scatter plots, the x- and y-coordinates are not drawn as points but are connected by lines. Line graphs are typically used when one variable changes continuously against another numeric variable, which is the case for most time series (e.g. prices, customers, CO2 concentration, temperature over time), continuous functions (e.g. the sine function sin(x)) and other near-continuous relationships (e.g. real-world supply/demand curves).
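As a small illustration of the continuous-function case, the sketch below evaluates sin(x) on a fine grid and connects the points with geom_line(). The data frame name sine_data, the range from 0 to 2π and the grid of 200 points are arbitrary choices made for this example and are not part of the course material.

```r
library(ggplot2)

# Evaluate sin(x) on a fine grid so the connected segments look smooth.
# The range and resolution are arbitrary choices for illustration.
sine_data <- data.frame(x = seq(0, 2 * pi, length.out = 200))
sine_data$y <- sin(sine_data$x)

# Connect consecutive (x, y) pairs with line segments
sine_plot <- ggplot(sine_data) +
  geom_line(mapping = aes(x = x, y = y))

sine_plot
```

With fewer grid points the curve would look visibly angular, because geom_line() simply joins neighboring observations with straight segments.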

Quiz: Line Graphs

Which of the following statements about line graphs are correct?
  • Line graphs are typically used to plot the relationship between categorical and numeric variables.
  • Line graphs are typically used to plot variables of type numeric.
  • For line graphs it is not necessary that the relationship between two variables shows continuity.
  • Line graphs can be used to plot time series.

Creating a simple line graph

Japan is among the countries with the highest life expectancy. Using the gapminder_japan dataset, we look at how life expectancy in Japan has developed over time. We need to:

  1. Specify the dataset within ggplot()
  2. Define the geom_line() plot layer
  3. Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function

Note that the ggplot2 library needs to be loaded first with library(ggplot2).

library(ggplot2)
ggplot(gapminder_japan) + 
  geom_line(
    mapping = aes(x = year, y = lifeExp)
)
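The gapminder_japan data frame is supplied by the course environment. To run the example elsewhere, a comparable data frame can probably be derived as sketched below. This assumes that the gapminder and dplyr packages are installed and that the course dataset is simply the Japan rows of the full gapminder data, which the article does not confirm.

```r
library(ggplot2)
library(dplyr)      # provides filter()
library(gapminder)  # assumption: source of the full gapminder data frame

# Keep only the rows for Japan; assumed to match the course's gapminder_japan
gapminder_japan <- filter(gapminder, country == "Japan")

japan_plot <- ggplot(gapminder_japan) +
  geom_line(mapping = aes(x = year, y = lifeExp))

japan_plot
```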

Exercise: Plot life expectancy of Brazil

Create your first line graph showing the life expectancy of people from Brazil over time.

  1. Use the ggplot() function and specify the gapminder_brazil dataset as input
  2. Add a geom_line() layer to the plot
  3. Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function

Adding more lines

So far we have focused on a single line, but what if the dataset contains multiple countries and we want to tell them apart?

Line graphs are often extended and used for the comparison of two or more lines. Multiple line graphs show the absolute differences between observations but also how the specific trajectories relate to each other. For example, let’s answer the question: How has life expectancy changed in the countries Austria and Hungary over time?

We first filter the dataset for the two countries of interest. Then we map the variable country to the group aesthetic. The group aesthetic tells ggplot which observations belong together and should be connected by a line, so mapping it to country creates a separate line for each country. To make the lines easier to distinguish, we also map color to country so that each line gets a different color.

library(dplyr)      # provides filter()
library(gapminder)  # assumed source of the gapminder data frame

gapminder_comparison <- 
  filter(gapminder, country %in% c("Austria", "Hungary"))

ggplot(data = gapminder_comparison) + 
  geom_line(mapping = aes(x = year, y = lifeExp, 
                          group = country, 
                          color = country))

Note that ggplot also separates the lines correctly if only the color mapping is specified: because country is a discrete variable, the grouping is set implicitly.
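The implicit grouping can be checked with a short sketch that omits group and inspects the groups ggplot2 computes for the layer. It assumes the gapminder and dplyr packages are installed. The name color_only_plot is introduced here for illustration only.

```r
library(ggplot2)
library(dplyr)      # provides filter()
library(gapminder)  # assumption: source of the gapminder data frame

gapminder_comparison <-
  filter(gapminder, country %in% c("Austria", "Hungary"))

# No explicit group: the discrete color mapping defines the groups
color_only_plot <- ggplot(data = gapminder_comparison) +
  geom_line(mapping = aes(x = year, y = lifeExp, color = country))

# layer_data() exposes the group ggplot2 computed for each row
computed_groups <- unique(layer_data(color_only_plot)$group)
length(computed_groups)  # one group per country in the filtered data

color_only_plot
```

This shortcut only applies to discrete variables. If color were mapped to a continuous variable, no grouping would be derived from it and group would have to be set explicitly.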

Exercise: Compare life expectancy

Create a line graph to compare the life expectancy lifeExp in the countries Japan, Brazil and India.

  1. Use the dataset gapminder_comparison in your ggplot() function, which contains only data for the countries Japan, Brazil and India.
  2. Create a line graph with the geom_line() function
  3. Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function
  4. Map the group and the color parameter to the country variable.

Exercise: Compare populations

Compare the population growth over the last decades in the countries Austria, Hungary and Serbia.

  1. Use the dataset gapminder_comparison in your ggplot() function, which contains only data for the countries in question.
  2. Create a line graph with geom_line()
  3. Map the year to the x-axis and the population pop to the y-axis with aes()
  4. Map the group and the color parameter to the country variable.

Quiz: Malformed Plot

gapminder_comparison <- filter(gapminder, country %in% c("Brazil", "China", "India"))
ggplot(data = gapminder_comparison) + 
  geom_line(mapping = aes(x = year, y = pop))

What has gone wrong in this plot?
  • The population numbers are scaled differently in the plotted countries
  • The group aesthetic should be used to map the population pop variable.
  • The color aesthetic should be used to map the life expectancy lifeExp variable.
  • The group aesthetic should be used to map the year variable.
  • The group aesthetic should be used to map the country variable.

Create a line graph with ggplot is an excerpt from the course Introduction to R, which is available for free at quantargo.com
