Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
- Create your first bar chart using geom_col()
- Fill bars with color using the fill aesthetic
ggplot(___) + geom_col( mapping = aes(x = ___, y = ___, fill = ___) )
Introduction to bar charts
Bar charts visualize numeric values grouped by categories. Each category is represented by one bar whose height is defined by its numeric value.
Bar charts are well suited to comparing values across groups, e.g. number of votes by party, population by country or GDP per capita by country. They take up a fair amount of space and work best when the number of groups to compare is small.
Below you can find an example showing the number of people (in millions) in the five biggest countries by population in 2007:
Creating a simple bar chart
In ggplot2, bar charts are created using the geom_col() geometric layer. The geom_col() layer requires the x aesthetic mapping, which defines the different bars to be plotted. The height of each bar is defined by the variable specified in the y aesthetic mapping. Both mappings, x and y, are required for geom_col().
Let’s create our first bar chart with the gapminder_top5 dataset. It contains population (in millions) and life expectancy data for the five biggest countries by population in 2007.
ggplot(gapminder_top5) + geom_col(aes(x = country, y = pop))
We see that the resulting bars are sorted by the country names in alphabetical order by default.
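The gapminder_top5 dataset is not defined in this excerpt. As a minimal sketch, assuming it can be derived from the gapminder data frame in the gapminder package (an assumption; the course may prepare it differently), the chart above could be reproduced like this:

```r
library(ggplot2)
library(gapminder)  # assumption: source data for gapminder_top5

# Keep the 2007 rows, then the five most populous countries
g2007 <- gapminder[gapminder$year == 2007, ]
gapminder_top5 <- head(g2007[order(g2007$pop, decreasing = TRUE), ], 5)

# Express population in millions, as described in the text
gapminder_top5$pop <- gapminder_top5$pop / 1e6

# One bar per country, bar height given by population
p <- ggplot(gapminder_top5) +
  geom_col(aes(x = country, y = pop))
print(p)
```

The bars appear in alphabetical order regardless of the row order in the data frame, because the position on the x axis follows the country names and not the order of the rows.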
Exercise: Plot life expectancy by country
Create a bar chart showing the life expectancy of the five biggest countries by population in 2007.
- Use the ggplot() function and specify the gapminder_top5 dataset as input
- Add a geom_col() layer to the plot
- Plot one bar for each country (x aesthetic)
- Use life expectancy lifeExp as bar height (y aesthetic)
Filling bars with color
Like other geoms, geom_col() allows users to map additional dataset variables to the color attributes of the bars. The fill aesthetic fills the entire bar with color. A common source of confusion is the color aesthetic, which specifies the line color of each bar’s border rather than the fill color.
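The difference between the two aesthetics is easiest to see side by side. Below is a minimal sketch using a small made-up data frame; toy, group, value and kind are hypothetical names for illustration and do not come from the course:

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  group = c("a", "b", "c"),
  value = c(3, 5, 2),
  kind  = c("x", "x", "y")
)

# fill: the whole bar is colored by kind
p_fill <- ggplot(toy) +
  geom_col(aes(x = group, y = value, fill = kind))

# color: only the bar border is colored by kind;
# the interior is set to a constant white so the border stands out
p_color <- ggplot(toy) +
  geom_col(aes(x = group, y = value, color = kind), fill = "white")

print(p_fill)
print(p_color)
```

Note that fill = "white" sits outside aes() in the second plot: it sets one constant value for all bars instead of mapping a variable.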
Based on the gapminder_top5 dataset, we plot the population (in millions) of the biggest countries and use the continent variable to color each bar:
ggplot(gapminder_top5) + geom_col(aes(x = country, y = pop, fill = continent))
Since the continent variable is a categorical variable, the bars have a clear color scheme for each continent. Let’s see what happens if we use a numeric variable like life expectancy lifeExp instead:
ggplot(gapminder_top5) + geom_col(aes(x = country, y = pop, fill = lifeExp))
The bar colors now follow the continuous legend on the right. We see that numeric variables can also be used to fill bars.
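The switch between the two legend types depends only on the type of the variable mapped to fill. A short sketch with hypothetical toy data (scores, team, league and rating are made-up names) shows both cases on the same bars:

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
scores <- data.frame(
  team   = c("red", "green", "blue"),
  points = c(10, 14, 7),
  league = c("north", "south", "north"),  # categorical
  rating = c(1.5, 3.0, 4.5)               # numeric
)

# Categorical fill: one distinct color per league, discrete legend
p_discrete <- ggplot(scores) +
  geom_col(aes(x = team, y = points, fill = league))

# Numeric fill: colors taken from a continuous gradient, color bar legend
p_continuous <- ggplot(scores) +
  geom_col(aes(x = team, y = points, fill = rating))

print(p_discrete)
print(p_continuous)
```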
Exercise: Plot population size by country
Create a bar chart showing the population (in millions) of the five biggest countries by population in 2007.
- Use the ggplot() function and specify the gapminder_top5 dataset as input
- Add a geom_col() layer to the plot
- Plot one bar for each country (x aesthetic)
- Use population pop as bar height (y aesthetic)
- Use the GDP per capita gdpPercap as fill aesthetic
Stacked bar charts
In some circumstances it is useful to plot multiple numeric values within each bar. A typical example is a numeric value describing one specific entity (e.g. customers) that is split among various categories (customer segments), so that the bar height represents the total (all customers).
The plot below shows the number of phones (in thousands) by continent from 1956 to 1961 as a stacked bar chart:
ggplot(world_phones) + geom_col(aes(x = year, y = phones, fill = region))
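The world_phones dataset is not defined in this excerpt. As a sketch, assuming it is a long-format version of the WorldPhones matrix that ships with base R (an assumption; the column names year, region and phones are taken from the plot call above), the stacked chart could be reproduced like this:

```r
library(ggplot2)

# Assumption: world_phones is derived from base R's WorldPhones matrix
# (rows are years, columns are regions, values are phones in thousands)
wp <- as.data.frame(as.table(WorldPhones), stringsAsFactors = FALSE)
names(wp) <- c("year", "region", "phones")
wp$year <- as.integer(wp$year)

# Keep the years 1956 to 1961 mentioned in the text
world_phones <- wp[wp$year >= 1956, ]

# Several rows share the same x value, so geom_col() stacks them;
# fill distinguishes the segments within each bar
p <- ggplot(world_phones) +
  geom_col(aes(x = year, y = phones, fill = region))
print(p)
```

Stacking happens because geom_col() places rows with the same x value on top of each other by default, so the total height of each bar is the sum of phones over all regions for that year.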
Exercise: Plot number of crimes by US states
Create a bar chart showing the number of crimes by US state per 100,000 residents in 1973.
- Use the ggplot() function and specify the us_arrests dataset as input
- Add a geom_col() layer to the plot
- Plot one bar for each state (x aesthetic)
- Use the number of cases as bar height (y aesthetic)
- Use the crime type as fill aesthetic
Create your first bar chart is an excerpt from the course Introduction to R, which is available for free at quantargo.com