Plotly for R – Multi-Layer Plots
If you are new to plotly, consider first reading our introductory post: Introduction to Interactive Graphics in R with plotly
Often when analyzing data, it is necessary to produce a complex plot that requires multiple graphical layers. In plotly, multi-layer plots can be specified as a pipeline of data manipulations (using dplyr verbs) and visual mappings. This is possible because dplyr verbs can be used on a plotly object to modify the underlying data. In programming, mutability refers to the ability of an object to be modified after its creation. The mutability of plotly objects allows for a pipeline where you can add a graphical layer based on one version of the data, modify the data with dplyr, and then add a second layer based on the modified data. This design choice provides great flexibility in developing complex plots while remaining intuitive. The resulting code is easy to read and understand, and it fits naturally into a tidyverse workflow.
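As a rough illustration of that pipeline, here is a minimal sketch using the mpg dataset from ggplot2. The trace names ("All cars", "Pickups"), the alpha value, and the object name two_layers are illustrative choices, not taken from the post. The first layer is drawn from the full data, the data is then filtered with dplyr, and the second layer is drawn from the filtered data.

```r
library(dplyr)    # filter() and the pipe
library(ggplot2)  # provides the mpg dataset
library(plotly)

two_layers <- mpg %>%
  plot_ly(x = ~cty, y = ~hwy) %>%
  # Layer 1: every car in the dataset, drawn faintly
  add_markers(name = "All cars", alpha = 0.3) %>%
  # Modify the data carried by the plotly object
  filter(class == "pickup") %>%
  # Layer 2: only the rows that survived the filter
  add_markers(name = "Pickups")

# The data currently attached to the object is the filtered version
current_data <- plotly_data(two_layers)
```

Printing two_layers at the console renders the plot with both layers, while current_data holds only the pickup rows, since that is the version of the data the object carries after the filter.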
Mutability
To demonstrate the ability to manipulate the underlying data of a plotly object, we’ll use a simple example using the mpg dataset.
library(tidyverse)
library(plotly)

mpg_plotly <- mpg %>%
  plot_ly()
plot_ly() maps the R objects we pass into it into a JavaScript plotly object.
In a simple case, we can then pass the plotly object into an add_*() function to specify how we’d like the data to be mapped to a graphical layer.
mpg_plotly %>% add_markers(x = ~cty, y = ~hwy)
As opposed to other plot objects (from base, ggplot2, etc.), plotly objects are mutable. The data underlying the object can be manipulated using dplyr commands. A useful function to inspect the current data of the object is plotly_data().
mpg
## # A tibble: 234 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 audi         a4      1.8  1999     4 auto~ f        18    29 p     comp~
##  2 audi         a4      1.8  1999     4 manu~ f        21    29 p     comp~
##  3 audi         a4      2    2008     4 manu~ f        20    31 p     comp~
##  4 audi         a4      2    2008     4 auto~ f        21    30 p     comp~
##  5 audi         a4      2.8  1999     6 auto~ f        16    26 p     comp~
##  6 audi         a4      2.8  1999     6 manu~ f        18    26 p     comp~
##  7 audi         a4      3.1  2008     6 auto~ f        18    27 p     comp~
##  8 audi         a4 q~   1.8  1999     4 manu~ 4        18    26 p     comp~
##  9 audi         a4 q~   1.8  1999     4 auto~ 4        16    25 p     comp~
## 10 audi         a4 q~   2    2008     4 manu~ 4        20    28 p     comp~
## # ... with 224 more rows

mpg_plotly %>%
  plotly_data()
## # A tibble: 234 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 audi         a4      1.8  1999     4 auto~ f        18    29 p     comp~
##  2 audi         a4      1.8  1999     4 manu~ f        21    29 p     comp~
##  3 audi         a4      2    2008     4 manu~ f        20    31 p     comp~
##  4 audi         a4      2    2008     4 auto~ f        21    30 p     comp~
##  5 audi         a4      2.8  1999     6 auto~ f        16    26 p     comp~
##  6 audi         a4      2.8  1999     6 manu~ f        18    26 p     comp~
##  7 audi         a4      3.1  2008     6 auto~ f        18    27 p     comp~
##  8 audi         a4 q~   1.8  1999     4 manu~ 4        18    26 p     comp~
##  9 audi         a4 q~   1.8  1999     4 auto~ 4        16    25 p     comp~
## 10 audi         a4 q~   2    2008     4 manu~ 4        20    28 p     comp~
## # ... with 224 more rows
Since we haven’t manipulated the object in any way, plotly_data() returns the data that we passed in.
Let’s say that we only want to plot the miles-per-gallon data for pickup trucks.
pickup_plotly <- mpg_plotly %>%
  filter(class == "pickup") %>%
  add_markers(x = ~cty, y = ~hwy)

pickup_plotly
pickup_plotly %>%
  plotly_data()
## # A tibble: 33 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 dodge        dako~   3.7  2008     6 manu~ 4        15    19 r     pick~
##  2 dodge        dako~   3.7  2008     6 auto~ 4        14    18 r     pick~
##  3 dodge        dako~   3.9  1999     6 auto~ 4        13    17 r     pick~
##  4 dodge        dako~   3.9  1999     6 manu~ 4        14    17 r     pick~
##  5 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  6 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  7 dodge        dako~   4.7  2008     8 auto~ 4         9    12 e     pick~
##  8 dodge        dako~   5.2  1999     8 manu~ 4        11    17 r     pick~
##  9 dodge        dako~   5.2  1999     8 auto~ 4        11    15 r     pick~
## 10 dodge        ram ~   4.7  2008     8 manu~ 4        12    16 r     pick~
## # ... with 23 more rows
An equivalent plotly object can also be obtained by filtering the data prior to passing it into plot_ly(). However, the ability to modify the object will prove to be useful when creating more complex multi-layer plots.
plotly_pickup_1 <- mpg %>%
  filter(class == "pickup") %>%
  plot_ly()

plotly_pickup_2 <- mpg %>%
  plot_ly() %>%
  filter(class == "pickup")

plotly_pickup_1 %>%
  plotly_data()
## # A tibble: 33 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 dodge        dako~   3.7  2008     6 manu~ 4        15    19 r     pick~
##  2 dodge        dako~   3.7  2008     6 auto~ 4        14    18 r     pick~
##  3 dodge        dako~   3.9  1999     6 auto~ 4        13    17 r     pick~
##  4 dodge        dako~   3.9  1999     6 manu~ 4        14    17 r     pick~
##  5 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  6 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  7 dodge        dako~   4.7  2008     8 auto~ 4         9    12 e     pick~
##  8 dodge        dako~   5.2  1999     8 manu~ 4        11    17 r     pick~
##  9 dodge        dako~   5.2  1999     8 auto~ 4        11    15 r     pick~
## 10 dodge        ram ~   4.7  2008     8 manu~ 4        12    16 r     pick~
## # ... with 23 more rows

plotly_pickup_2 %>%
  plotly_data()
## # A tibble: 33 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 dodge        dako~   3.7  2008     6 manu~ 4        15    19 r     pick~
##  2 dodge        dako~   3.7  2008     6 auto~ 4        14    18 r     pick~
##  3 dodge        dako~   3.9  1999     6 auto~ 4        13    17 r     pick~
##  4 dodge        dako~   3.9  1999     6 manu~ 4        14    17 r     pick~
##  5 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  6 dodge        dako~   4.7  2008     8 auto~ 4        14    19 r     pick~
##  7 dodge        dako~   4.7  2008     8 auto~ 4         9    12 e     pick~
##  8 dodge        dako~   5.2  1999     8 manu~ 4        11    17 r     pick~
##  9 dodge        dako~   5.2  1999     8 auto~ 4        11    15 r     pick~
## 10 dodge        ram ~   4.7  2008     8 manu~ 4        12    16 r     pick~
## # ... with 23 more rows
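Rather than comparing the two printouts by eye, the equivalence can also be checked programmatically. The following is a minimal sketch, assuming that converting both results to plain data frames and comparing them with all.equal() is an adequate test here. The names data_1, data_2, and same_data are illustrative.

```r
library(dplyr)    # filter() and the pipe
library(ggplot2)  # provides the mpg dataset
library(plotly)

# Filter first, then create the plotly object
plotly_pickup_1 <- mpg %>%
  filter(class == "pickup") %>%
  plot_ly()

# Create the plotly object first, then filter it
plotly_pickup_2 <- mpg %>%
  plot_ly() %>%
  filter(class == "pickup")

# Pull the data out of each object as a plain data frame
data_1 <- as.data.frame(plotly_data(plotly_pickup_1))
data_2 <- as.data.frame(plotly_data(plotly_pickup_2))

# TRUE when both orderings of filter() and plot_ly() carry the same rows
same_data <- isTRUE(all.equal(data_1, data_2))
```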
Multi-layer Example
Now that we’ve set the foundation, we can look at a more complicated example.
We’ll be using the txhousing dataset from ggplot2, which tracks housing prices for cities in Texas over time. Let’s start by plotting the trend in median sale price over time for each city.
txhousing
## # A tibble: 8,602 x 9
##    city     year month sales   volume median listings inventory  date
##    <chr>   <int> <int> <dbl>    <dbl>  <dbl>    <dbl>     <dbl> <dbl>
##  1 Abilene  2000     1    72  5380000  71400      701       6.3 2000
##  2 Abilene  2000     2    98  6505000  58700      746       6.6 2000.
##  3 Abilene  2000     3   130  9285000  58100      784       6.8 2000.
##  4 Abilene  2000     4    98  9730000  68600      785       6.9 2000.
##  5 Abilene  2000     5   141 10590000  67300      794       6.8 2000.
##  6 Abilene  2000     6   156 13910000  66900      780       6.6 2000.
##  7 Abilene  2000     7   152 12635000  73500      742       6.2 2000.
##  8 Abilene  2000     8   131 10710000  75000      765       6.4 2001.
##  9 Abilene  2000     9   104  7615000  64500      771       6.5 2001.
## 10 Abilene  2000    10   101  7040000  59300      764       6.6 2001.
## # ... with 8,592 more rows

all_cities <- txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(
    name = "Texan Cities",
    line = list(width = 1.33),
    alpha = 0.2,
    hoverinfo = "none"
  ) %>%
  ungroup()

all_cities
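Because all_cities is ungrouped at the end of the pipeline, it is ready to have its data narrowed down for a second layer. One possible next step is sketched below, under the assumption that we want to highlight a single city on top of the faint lines. The choice of Houston, the trace name, and the object name houston_highlight are illustrative.

```r
library(dplyr)    # group_by(), filter(), ungroup() and the pipe
library(ggplot2)  # provides the txhousing dataset
library(plotly)

# Layer 1: one faint line per city (same code as above)
all_cities <- txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(
    name = "Texan Cities",
    line = list(width = 1.33),
    alpha = 0.2,
    hoverinfo = "none"
  ) %>%
  ungroup()

# Layer 2: narrow the underlying data to one city and draw it on top.
# x and y are inherited from the plot_ly() call.
houston_highlight <- all_cities %>%
  filter(city == "Houston") %>%
  add_lines(name = "Houston")

# The object now carries only the rows for the highlighted city
highlight_data <- plotly_data(houston_highlight)
```

Printing houston_highlight renders both layers together. The first layer was built from the full dataset before the filter was applied, so it still shows every city.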