Create a line graph with ggplot
Use the geom_line() function to draw line graphs and customize their styling using the color parameter. Specify which observations belong to each line with the group parameter.
- Create your first line graph using geom_line()
- Define how different lines are connected using the group parameter
- Change the line color of a line graph using the color parameter
ggplot(___) + geom_line( mapping = aes(x = ___, y = ___, group = ___, color = ___) )
Introduction to line graphs
Line graphs are used to visualize the trajectory of one numeric variable against another. Unlike scatter plots, the x- and y-coordinates are not drawn as points but are connected by lines. Line graphs are typically used when one variable changes continuously against another numeric variable, which is the case for most time series charts (e.g. prices, customers, CO2 concentration, temperature over time), continuous functions (e.g. the sine function sin(x)) or other near-continuous relationships (e.g. real-world supply/demand curves).
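As a small illustration of the continuous-function case, the sketch below samples sin(x) and connects the samples with geom_line(). The names sine_data and sine_plot and the choice of 200 sample points are assumptions made for this example only.

```r
library(ggplot2)

# Sample the sine function at 200 evenly spaced points between 0 and 2*pi
sine_data <- data.frame(x = seq(0, 2 * pi, length.out = 200))
sine_data$y <- sin(sine_data$x)

# Connect the sampled points with a line instead of drawing them as points
sine_plot <- ggplot(sine_data) +
  geom_line(mapping = aes(x = x, y = y))

sine_plot
```

With enough sample points the connected segments look like a smooth curve, which is why line graphs suit continuous relationships.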
Quiz: Line Graphs
Which of the following statements about line graphs are correct?
- Line graphs are typically used to plot the relationship between categorical and numeric variables.
- Line graphs are typically used to plot variables of type numeric.
- For line graphs it is not necessary that the relationship between two variables shows continuity.
- Line graphs can be used to plot time series.
Creating a simple line graph
Japan is among the countries with the highest life expectancy. Using the gapminder_japan dataset we determine how the life expectancy in Japan has developed over time. We need to:
- Specify the dataset within ggplot()
- Define the geom_line() plot layer
- Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function
Note that the ggplot2 package needs to be loaded first with library(ggplot2).
library(ggplot2)
ggplot(gapminder_japan) +
  geom_line(mapping = aes(x = year, y = lifeExp))
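The text does not show how gapminder_japan is created. The following is a minimal sketch, assuming it is simply the Japan subset of the gapminder data frame from the gapminder package, filtered with dplyr. The name japan_plot is introduced here for illustration.

```r
library(ggplot2)
library(dplyr)      # provides filter()
library(gapminder)  # assumption: the gapminder data frame comes from this package

# Assumed construction of gapminder_japan: keep only the rows for Japan
gapminder_japan <- filter(gapminder, country == "Japan")

# Same plot as above, stored in a variable so it can be reused or saved
japan_plot <- ggplot(gapminder_japan) +
  geom_line(mapping = aes(x = year, y = lifeExp))

japan_plot
```

The same pattern, with a different country name in the filter, would produce single-country datasets such as gapminder_brazil.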
Exercise: Plot life expectancy of Brazil
Create your first line graph showing the life expectancy of people from Brazil over time.
- Use the ggplot() function and specify the gapminder_brazil dataset as input
- Add a geom_line() layer to the plot
- Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function
Adding more lines
So far we only focused on single lines, but what if we have multiple countries in the dataset and want to somehow differentiate them?
Line graphs are often extended and used for the comparison of two or more lines. Multiple line graphs show the absolute differences between observations but also how the specific trajectories relate to each other. For example, let’s answer the question: How has life expectancy changed in the countries Austria and Hungary over time?
We first filter the dataset for both countries of interest. Then, we set the variable country as the group argument for the aesthetic mapping. The group argument tells ggplot which observations belong together and should be connected through lines. By specifying the country variable, ggplot creates a separate line for each country. To make the lines easier to distinguish we also map color to the country so that each country line has a different color.
library(dplyr)      # provides filter()
library(gapminder)  # assumed source of the gapminder data frame
gapminder_comparison <- filter(gapminder, country %in% c("Austria", "Hungary"))
ggplot(data = gapminder_comparison) +
  geom_line(mapping = aes(x = year, y = lifeExp, group = country, color = country))
Note that ggplot also separates the lines correctly if only the color mapping is specified (the group parameter is implicitly set).
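One way to check this implicit grouping is to inspect the data ggplot2 computes for the layer with layer_data(). The sketch below maps only color and then counts the distinct group ids. It assumes the gapminder data frame comes from the gapminder package. The names color_only_plot and computed are introduced for illustration.

```r
library(ggplot2)
library(dplyr)      # provides filter()
library(gapminder)  # assumption: the gapminder data frame comes from this package

gapminder_comparison <- filter(gapminder, country %in% c("Austria", "Hungary"))

# Only color is mapped to country; no explicit group aesthetic is given
color_only_plot <- ggplot(data = gapminder_comparison) +
  geom_line(mapping = aes(x = year, y = lifeExp, color = country))

# Inspect the per-layer data that ggplot2 computed, including the group column
computed <- layer_data(color_only_plot)
length(unique(computed$group))
```

The count should equal the number of countries in the filtered data, because ggplot2 derives the default grouping from the discrete variables used in the plot. This only applies when color is mapped to a discrete variable such as country. With a continuous color variable, an explicit group mapping is still needed.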
Exercise: Compare life expectancy
Create a line graph to compare the life expectancy lifeExp in the countries Japan, Brazil and India.
- Use the data set gapminder_comparison in your ggplot() function which contains only data for the countries Japan, Brazil and India.
- Create a line graph with the geom_line() function
- Map the year to the x-axis and the life expectancy lifeExp to the y-axis with the aes() function
- Map the group and the color parameter to the country variable.
Exercise: Compare populations
Compare the population growth over the last decades in the countries Austria, Hungary and Serbia.
- Use the data set gapminder_comparison in your ggplot() function which contains only data for the countries in question.
- Create a line graph with geom_line()
- Map the year to the x-axis and the population pop to the y-axis with aes()
- Map the group and the color parameter to the country variable.
Quiz: Malformed Plot
gapminder_comparison <- filter(gapminder, country %in% c("Brazil", "China", "India"))
ggplot(data = gapminder_comparison) +
  geom_line(mapping = aes(x = year, y = pop))
What has gone wrong in this plot?
- The population numbers are scaled differently in the plotted countries
- The group aesthetic should be used to map the population pop variable.
- The color aesthetic should be used to map the life expectancy lifeExp variable.
- The group aesthetic should be used to map the year variable.
- The group aesthetic should be used to map the country variable.
Create a line graph with ggplot is an excerpt from the course Introduction to R, which is available for free at quantargo.com