
How to make a line chart with ggplot2

[This article was first published on r-bloggers – SHARP SIGHT LABS, and kindly contributed to R-bloggers.]



Last week’s blog post about Amazon’s search for a location for a second headquarters left me thinking about the company’s growth.

After looking at the long-term growth of the stock price, it occurred to me that visualizing the stock price data would be a great example of how to create a line chart in R using `ggplot2`.

So in this blog post, I’ll show you how to make a line chart with `ggplot2`, step by step.

Let’s jump in.

First, we’ll load several packages: `readr`, which we’ll use to read in the data; `tidyverse`, which includes `ggplot2`, `dplyr`, and several other important packages; and `stringr`, which will let us do some string manipulation. (Recent versions of `tidyverse` also attach `readr` and `stringr`, so loading them explicitly is harmless but redundant.)

#===============
# LOAD PACKAGES
#===============

library(readr)
library(tidyverse)
library(stringr)

Now that we’ve loaded the packages that we need, we’ll read in the data.

The data are contained in a `.csv` file that I’ve uploaded to the Sharp Sight webpage.

We’ll use `readr::read_csv()` to read in the file. This is a very straightforward use of `read_csv()`: we pass it the file’s URL and rely on the defaults.

#==========
# READ DATA
#==========

stock_amzn <- read_csv("http://sharpsightlabs.com/wp-content/uploads/2017/09/AMZN_stock.csv")

Now we’ll quickly inspect the data by looking at the column names and printing out the first few rows of data.

#========
# INSPECT
#========

stock_amzn %>% names()
stock_amzn %>% head()
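
Beyond the names and the first few rows, it can help to confirm the column types, since a line chart over time needs the date column to be a real date. The following is a minimal sketch on a toy stand-in for the file: the `Date` and `Close` column names are assumed from the rest of this post, the values are invented, and `toy_csv` and `toy_stock` are hypothetical names used only for illustration. Wrapping the text in `I()` to mark it as literal data assumes a reasonably recent `readr`.

```r
library(readr)

# Toy stand-in for the downloaded file. The values are invented for
# illustration and are NOT real Amazon prices.
toy_csv <- I("Date,Close\n2017-01-03,10.5\n2017-01-04,11.0\n2017-01-05,10.8\n")

# Declaring col_types makes the expected column types explicit
toy_stock <- read_csv(
  toy_csv,
  col_types = cols(Date = col_date(), Close = col_double())
)

# Column names, then the first class of each column
names(toy_stock)
vapply(toy_stock, function(col) class(col)[1], character(1))
```

If the date column came back as character instead of a date, the x-axis would not behave like a time axis, so this is worth a quick look before plotting.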

Upon inspection, you can see that the column names are capitalized. This is a minor problem, but ideally you want your variable names to be lower case; this makes them easier to type.

To convert the variable names to all lower case, we’ll use the `str_to_lower()` function from the `stringr` package.

#=========================================================
# CHANGE COLUMN NAMES: lower case
# - in the raw form (as read in) the first letter of 
#   each variable is capitalized. 
# - This makes them harder to type!  Not ideal.
# - we'll use stringr::str_to_lower() to change the column
#   names to lower case 
#=========================================================

colnames(stock_amzn) <- colnames(stock_amzn) %>% str_to_lower()

# inspect
stock_amzn %>% names()

Here, on the right-hand side of the assignment operator, we’re using `colnames(stock_amzn)` to retrieve the column names. Then we pipe the column names into `str_to_lower()`, which converts the names to lower case.

The resulting output is then re-assigned to the column names of the dataframe. We do this with `colnames(stock_amzn) <-`. Essentially, we’re taking the result from the right-hand side and assigning it to the column names using `colnames()`. To be clear, `colnames()` can both retrieve and set the column names.
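
For comparison, here is a small sketch of the same renaming done two ways on a toy data frame. The first is the assignment approach described above. The second uses `dplyr::rename_with()`, which is not what this post uses and assumes dplyr 1.0.0 or later. The data frame and its values are made up for illustration.

```r
library(dplyr)
library(stringr)

# Toy data frame with capitalized names; the values are invented
toy_stock <- data.frame(
  Date  = as.Date(c("2017-01-03", "2017-01-04")),
  Close = c(10.5, 11.0)
)

# Approach from the post: retrieve the names, lower-case them, assign them back
by_assignment <- toy_stock
colnames(by_assignment) <- colnames(by_assignment) %>% str_to_lower()

# Alternative (an assumption, not the post's method): rename_with() applies
# a function to the column names and returns the renamed data frame
by_rename <- toy_stock %>% rename_with(str_to_lower)

# Both approaches should produce the same column names
identical(names(by_assignment), names(by_rename))
```

The assignment form modifies the data frame in place, while `rename_with()` fits more naturally inside a longer pipeline. Which one to use is mostly a matter of taste.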

Now that the data are in the right form, let’s make a simple line chart.

#======
# PLOT 
#======

#--------------------------------------
# FIRST ITERATION
# - this is the quick-and-dirty version
#--------------------------------------

ggplot(data = stock_amzn, aes(x = date, y = close)) +
  geom_line()



This is about as simple as it gets in `ggplot2`, but let’s break it down.

The `ggplot()` function indicates that we’re going to use `ggplot2` to make a plot.

The `data =` parameter specifies that we’re going to be plotting data in the `stock_amzn` dataframe.

Then, the `aes()` function allows us to specify our variable mappings. With the statement `x = date`, we are mapping the `date` variable to the x-axis. Similarly, with the statement `y = close`, we are mapping the `close` variable to the y-axis.

Finally, `geom_line()` specifies that we want to draw lines.
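
To make the breakdown concrete, here is a hedged sketch that builds the same kind of plot in two steps on a toy data frame, so that each piece can be seen on its own. The data are invented, and the names `toy_stock`, `base_plot`, and `line_plot` are hypothetical.

```r
library(ggplot2)

# Toy data: five consecutive dates with made-up closing values
toy_stock <- data.frame(
  date  = as.Date("2017-01-02") + 0:4,
  close = c(10, 12, 11, 14, 13)
)

# ggplot() with data = and aes() sets up the plot and the variable mappings,
# but draws no geometry yet
base_plot <- ggplot(data = toy_stock, aes(x = date, y = close))

# Adding geom_line() tells ggplot2 to draw the mapped data as a line
line_plot <- base_plot + geom_line()

# layer_data() returns the data ggplot2 computed for the line layer,
# including the x and y positions that came from the aes() mappings
line_layer <- layer_data(line_plot)
head(line_layer[, c("x", "y")])
```

Printing `base_plot` on its own would show empty axes, which is a handy way to see that the mappings and the geometry are separate pieces.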

Again, this is just about as simple as it gets.

Once you know more about how `ggplot2` works, you can format the plot.
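
As a rough sketch of what formatting one step at a time can look like, the snippet below starts from a basic line chart on toy data and adds labels and a single theme setting. The data, the label text, and the object names are invented for illustration.

```r
library(ggplot2)

# Toy data with made-up values
toy_stock <- data.frame(
  date  = as.Date("2017-01-02") + 0:4,
  close = c(10, 12, 11, 14, 13)
)

# Start from the quick-and-dirty version
basic <- ggplot(toy_stock, aes(x = date, y = close)) +
  geom_line()

# Step 1: add readable axis labels and a title
labelled <- basic +
  labs(x = "Date", y = "Closing price", title = "Toy closing prices")

# Step 2: change one theme element at a time, checking the result after each
styled <- labelled +
  theme(plot.title = element_text(size = 18))
```

Working this way keeps each change small, so it is easy to see which line of code produced which visual change.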

Having said that, let’s take a look at a ‘polished’ version of the plot … a version that’s been heavily formatted:

#--------------------------------------
# POLISHED VERSION
# - this is the 'finalized' version
# - we arrive at this after a lot of
#   iteration ....
#--------------------------------------

ggplot(stock_amzn, aes(x = date, y = close)) +
  geom_line(color = 'cyan') +
  geom_area(fill = 'cyan', alpha = .1) +
  labs(x = 'Date'
       , y = 'Closing\nPrice'
       , title = "Amazon's stock price has increased dramatically\nover the last 20 years") +
  theme(text = element_text(family = 'Gill Sans', color = "#444444")
        ,panel.background = element_rect(fill = '#444B5A')
        ,panel.grid.minor = element_line(color = '#4d5566')
        ,panel.grid.major = element_line(color = '#586174')
        ,plot.title = element_text(size = 28)
        ,axis.title = element_text(size = 18, color = '#555555')
        ,axis.title.y = element_text(vjust = 1, angle = 0)
        ,axis.title.x = element_text(hjust = 0)
        ) 

And here’s the final chart:



If you’re a beginner, don’t be intimidated: this finalized chart is not hard to do.

Really. With a little practice, you should be able to learn to create a well-formatted chart like this very quickly. It should take you only a few hours to learn how the code works, and you should be able to memorize this syntax within a week or two.

… and when I say memorize, I mean that you should be able to write all of this code from memory.

Ideally, if you’re fluent in R and `ggplot2`, it should only take you 10 or 15 minutes to write all of this code, start to finish.

Sign up now, and discover how to become fluent in R

Are you still struggling with R and `ggplot2`?

Becoming fluent in R is really straightforward, if you know how to practice.

If you’re ready to master R, sign up for our email list.

Not only will you receive tutorials (delivered to your inbox) …

… but you’ll also get lessons on how to practice so that you can master R as quickly as possible.

And if you sign up right now, you’ll also get access to our “Data Science Crash Course” for free.


The post How to make a line chart with ggplot2 appeared first on SHARP SIGHT LABS.
