Line Plots in R-Time Series Data Visualization
This tutorial shows you how to create simple line plots in R, adjust the axis labels and colors of plots, and create multiple-line graphs.
Line plots aid in the visualization of time series data. Throughout this lesson, you will see examples of line plots created with ggplot2 from the “EuStockMarkets” time series dataset, which ships with base R's datasets package.
Data Frame
Enter ?EuStockMarkets to see more information about the data.
This dataset has 1860 rows and 4 variables, so only the first six rows are displayed.
To preview the dataset, use the head() function.
EuStockDF <- as.data.frame(EuStockMarkets)
EuStockDF$Date <- as.numeric(time(EuStockMarkets))
head(EuStockDF)
This dataset is a “mts,” which stands for multivariate time series object.
Because ggplot() expects a data frame rather than a time series object, you must first convert the object to a data frame and then use the time() function to retrieve the date information, which is stored as decimal years.
There is now a “Date” column in the dataset.
      DAX    SMI    CAC   FTSE     Date
1 1628.75 1678.1 1772.8 2443.6 1991.496
2 1613.63 1688.5 1750.5 2460.2 1991.500
3 1606.51 1678.6 1718.0 2448.2 1991.504
4 1621.04 1684.1 1708.1 2470.4 1991.508
5 1618.16 1686.6 1723.1 2484.7 1991.512
6 1610.61 1671.6 1714.3 2466.8 1991.515
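To confirm what kind of object you are starting from, a quick check along these lines may help. It uses only base R, since EuStockMarkets ships with the datasets package. The variable names stock_dims and date_range are illustrative and not part of the original tutorial.

```r
# EuStockMarkets is part of base R's datasets package, so no library() call is needed.
class(EuStockMarkets)            # the class vector includes "mts" and "ts"

# Number of rows (observations) and columns (stock indexes).
stock_dims <- dim(EuStockMarkets)
stock_dims

# Names of the four stock index columns.
colnames(EuStockMarkets)

# time() returns the observation times as decimal years (for example 1991.496).
date_range <- range(as.numeric(time(EuStockMarkets)))
date_range
```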
Line Plots in R
One of the main functions to create line plots is geom_line().
To make a line plot, you must first provide variables for the x and y axes using the ggplot() function.
In this example with the EuStockMarkets dataset, the x axis is “Date” and the y axis is “SMI,” the daily closing price of the Swiss stock market index.
The geom_line() function is then used to draw the line.
By default, the x- and y-axis labels are the names of the variables you selected when you created the plot.
library(ggplot2)
ggplot(EuStockDF,aes(x=Date, y=SMI))+geom_line()
However, variable names are frequently insufficiently descriptive.
For example, if someone were to glance at this graph, they might not know what “SMI” stands for.
To modify the axis labels, use the labs() function and give the y axis a new, more descriptive name.
ggplot(EuStockDF,aes(x=Date, y=SMI))+geom_line()+labs(y="Closing Price Of Switzerland(SMI) stock index")
It is now evident what this graph is telling us thanks to more detailed labeling.
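The same labs() call can set the x-axis label and a title as well. The following is a minimal sketch, assuming ggplot2 is installed. The label text and the object name smi_plot are illustrative choices, not part of the original post.

```r
library(ggplot2)

# Rebuild the data frame so this example runs on its own.
EuStockDF <- as.data.frame(EuStockMarkets)
EuStockDF$Date <- as.numeric(time(EuStockMarkets))

# labs() accepts one argument per label: x, y, title, and others.
smi_plot <- ggplot(EuStockDF, aes(x = Date, y = SMI)) +
  geom_line() +
  labs(
    x = "Year",
    y = "SMI closing price",
    title = "Swiss Market Index (SMI), daily closing prices"
  )

smi_plot
```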
Multiple lines on the same plot are often useful for comparing multiple variables across time.
To add another line to the plot, add another geom_line() call.
ggplot(EuStockDF,aes(x=Date))+
  geom_line(aes(y=DAX),color="blue")+
  geom_line(aes(y=SMI),color="red")+
  geom_line(aes(y=CAC),color="pink")+
  geom_line(aes(y=FTSE),color="green")
In this graphic, for example, there is a line for each stock index.
However, there are a number of disadvantages to employing this strategy.
For some variables, the y axis label and scaling may not make sense.
In addition, for each new line, you must manually add a geom_line() call and specify the color.
For the plot to make sense, you may need to perform a lot of tweaking.
Also, the y axis is labeled with the name of the y variable from the first geom_line() call (DAX), even though the plot shows all four indexes.
Let’s have a look at another technique to make a multiple-line plot using the same data.
If you look at the original EuStockMarkets dataset, you’ll see that the stock market indexes are organized by columns rather than rows.
This is a typical issue that can be resolved by cleaning the data so that it is aligned in rows rather than columns.
This simplifies the process of creating several line charts.
You can use the tidy() function from the “broom” package to “tidy” the time series object.
If you have installed the tidyverse, broom is already installed with it, but library(tidyverse) does not attach it, so you need to load broom yourself.
library(broom)
library(dplyr)
tidy_stocks <- tidy(EuStockMarkets) %>%
  rename(Date=index, Stock_Index=series, Price=value)
By default, the column names are index, series, and value.
To modify the names of the columns, use rename().
As you can see, the dataset has been reduced to three columns.
head(data.frame(tidy_stocks))
      Date Stock_Index   Price
1 1991.496         DAX 1628.75
2 1991.496         SMI 1678.10
3 1991.496         CAC 1772.80
4 1991.496        FTSE 2443.60
5 1991.500         DAX 1613.63
6 1991.500         SMI 1688.50
There is a Date column, which was extracted automatically from the time series object.
The Stock_Index column contains the names of the columns in the original dataset.
Finally, there’s the Price column, which contains the values from the original dataset.
For example, the first row of the original data frame corresponds to the first four rows of the reshaped dataset.
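The same long layout can also be produced from the EuStockDF data frame built earlier. The sketch below swaps in pivot_longer() from the tidyr package in place of broom's tidy(). It is an alternative rather than the method used in this post, and long_stocks is an illustrative name.

```r
library(tidyr)

# Rebuild the wide data frame so this example runs on its own.
EuStockDF <- as.data.frame(EuStockMarkets)
EuStockDF$Date <- as.numeric(time(EuStockMarkets))

# Stack every column except Date into name/value pairs:
# the old column names go to Stock_Index, the values go to Price.
long_stocks <- pivot_longer(
  EuStockDF,
  cols = -Date,
  names_to = "Stock_Index",
  values_to = "Price"
)

head(long_stocks)
```

Each of the 1860 original rows becomes four rows, one per stock index, so the result has 7440 rows.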
Because you have a clean dataset, it is simple to make a multiple-line visualization.
You don’t have to add geom_line() for each new line you want to plot.
To make the line plot, map x to Date, y to Price, and color to the variable that holds the series names, which is Stock_Index in this case.
All of the line colors and the legend are automatically set, so you don’t have to do anything else.
ggplot(tidy_stocks,aes(x=Date,y=Price))+ geom_line(aes(color=Stock_Index))
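If you would rather choose the colors yourself than accept the defaults, one option is scale_color_manual() with a named vector. This is a sketch assuming ggplot2, broom, and dplyr are installed. The colors reuse those from the manual example above, and index_colors and tidy_plot are illustrative names.

```r
library(ggplot2)
library(broom)
library(dplyr)

# Rebuild the tidy dataset so this example runs on its own.
tidy_stocks <- tidy(EuStockMarkets) %>%
  rename(Date = index, Stock_Index = series, Price = value)

# Named vector: each name must match a value in the Stock_Index column.
index_colors <- c(DAX = "blue", SMI = "red", CAC = "pink", FTSE = "green")

tidy_plot <- ggplot(tidy_stocks, aes(x = Date, y = Price)) +
  geom_line(aes(color = Stock_Index)) +
  scale_color_manual(values = index_colors) +
  labs(x = "Year", y = "Closing price", color = "Stock index")

tidy_plot
```

Because the colors are tied to the series names, the legend stays in sync with the lines without any further adjustment.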
Line plots are used to depict time series data, as you now know.
You also learned how to make a line plot by defining the x and y axes in ggplot() and adding geom_line(), and how to change the axis labels and line colors.
Finally, you learned two methods for making multiple-line plots.
The post Line Plots in R-Time Series Data Visualization appeared first on finnstats.