Interactive visualizations with R – a minireview
Interactive visualization allows deeper exploration of data than static plots. JavaScript libraries such as D3 have made possible wonderful new ways to show data. Luckily, the R community has been active in developing R interfaces to some popular JavaScript libraries, so that R users can create interactive visualizations without knowing any JavaScript.
In this post I have reviewed some of the most common interactive visualization packages in R with simple example plots along with some comments and experiences. Here are the packages included:
- ggplot2 – one of the best static visualization packages in R
- ggvis – interactive plots from the makers of ggplot2
- rCharts – R interface to multiple javascript charting libraries
- plotly – convert ggplot2 figures to interactive plots easily
- googleVis – use Google Chart Tools from R
You can either jump straight to the example visualizations or read my comments first. The R markdown source code for this blog post with embedded visualizations can be found on GitHub. I have probably missed some important features and documentation, and I may well have made outright mistakes. Please point those out in the comments, and I’ll fix them. It is also important to note that I am a heavy ggplot2 user, and hence my comments may be biased!
Other libraries for creating interactive visualizations from R also exist, such as clickme, RIGHT, ggobi, iplots, gg2v, rVega, cranvas and r2d3. Some of these are no longer under active development. I might include some of them in the comparison in the future. The d3Network package is also worth checking out if you need cool interactive network visualizations.
Technical features
All four interactive packages use JavaScript for the interactive visualizations, and are capable of producing most of the standard plot types. The syntax varies somewhat: ggvis uses the pipe operator %>% (familiar to dplyr users) in place of the + in ggplot2. rCharts wraps several JavaScript libraries, and its syntax also varies between different types of charts.
All packages other than googleVis are clearly in an early development phase, which shows in their limited features and documentation. As an experienced ggplot2 user I often found it hard to adapt to the much narrower range of features in ggvis. For example, faceting is a very important feature that will hopefully be implemented soon.
Documentation-wise ggvis and googleVis seem to be the most advanced. rCharts especially suffers from the combination of multiple plot types (named rather uninformatively as rPlot, nPlot and so on) with practically no documentation. Producing anything other than what the existing examples provide was therefore very hard.
googleVis sets itself apart by requiring the data in a different format than the other packages. In Hadley Wickham’s terms, it assumes the data is in the messy format, in contrast to the other packages, which assume tidy data. This makes it somewhat hard to use, at least when one is used to working with tidy data frames. See the examples below for more details.
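To make the difference between the two layouts concrete, here is a minimal base R sketch. It is not taken from the original examples. It assumes that the "messy" layout meant here is a wide table with one column per data series, and it uses only the built-in mtcars data with aggregate() and reshape(), so no extra packages are needed.

```r
# Tidy (long) layout: one row per (cyl, am) combination,
# with the group mean of mpg in a single value column.
tidy <- aggregate(mpg ~ cyl + am, data = mtcars, FUN = mean)

# Wide layout: one row per cyl and one mpg column per am level.
# This is the one-column-per-series shape assumed here.
wide <- reshape(tidy, idvar = "cyl", timevar = "am", direction = "wide")

print(tidy)
print(wide)
```

The tidy table has six rows and the wide one three, with the value columns named mpg.0 and mpg.1 after the levels of am. With tidyr the same step would typically be done with spread().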
Plotly is an interesting alternative to the other packages in that it simply takes a ggplot2 object as input and transforms it into an interactive chart that can then be embedded into websites. Using the service requires authentication, which is a clear limitation. By default all plots are publicly visible, but there apparently is a way to produce private plots as well, though the free account limits their number.
ggvis is currently the only one of these packages that cannot produce map visualizations, but I assume this feature will be added in the future. plotly can use maps created with ggplot2, but not yet with the handy ggmap extension.
Sharing the visualizations
Interactive visualizations are typically meant to be shared with a larger audience. Common ways to share interactive visualizations from R are as standalone html files, embedded in R markdown documents, and embedded in Shiny applications. All the studied packages can produce standalone html files, though possibly with some loss of interactivity.
R markdown documents are a very nice way of sharing reproducible analyses, using the knitr and rmarkdown packages. Outputs from all the studied visualization packages can be embedded in .Rmd documents, though I had some problems (see the Issues section below). All packages are also compatible with Shiny documents and applications, and have good tutorials for this.
Issues
I encountered several problems when preparing this blog post. Specifically, I had issues embedding the plots into R markdown documents. This is complicated by the various ways of turning .Rmd files into html: the manual knit() and knit2html() functions, the Knit HTML button in RStudio, and a Jekyll-powered blog with its own requirements. Here I have listed the most important issues, with solutions when found. Some things are still unsolved, and I hope someone can help me with those!
- ggvis showed up nicely with Knit HTML, as it creates a standalone file with the necessary JavaScript libraries included. However, this was not the case with my blog setup. My solution was to include the set of scripts (taken from the source of this page) in the header of all my blog posts (see here). I am not sure if this is an optimal solution.
- rCharts: Embedding rCharts into R markdown did not quite work either as shown e.g. here. With the Knit HTML button the line that worked was rchars.object$print(include_assets=TRUE), whereas with the blog the line was rchars.object$show('iframesrc', cdn=TRUE).
- plotly: Embedding plotly charts into R markdown documents did not work as shown here, but adding session="knit" to the ggplotly() call solved the issue (thanks to Scott Chamberlain and Marianne Corvellec for help!). Note that in this post I embedded existing plotly charts manually.
- There are still two charts that do not show up in this post. I have quite a limited understanding of how knitr, Jekyll and the JavaScript tools work together, and could not get these to work. Perhaps the scripts somehow conflict with each other?
I also noticed some minor issues:
- googleVis was missing axis labels by default
- rCharts is missing legend titles, and behaves strangely in scatter plots: the legend shows partially incorrect information, and the plot area is too tight
Summary
In general, being able to produce valid interactive html charts from R markdown without knowing any JavaScript is great! All of the packages create sensible outputs, but there are also a lot of differences. I love ggplot2, and hence I also like ggvis, as it pays attention to graphical details following the grammar of graphics principles. However, the package is still missing a lot of important features, such as faceting. In many cases rCharts can do what ggvis cannot (yet), and so it is a good alternative. However, the missing documentation makes it hard to create customized plots. Plotly has a really nice idea and implementation, but the requirement for authentication and the limited number of private plots reduce its usability a lot. Google’s Motion charts are cool and useful, but otherwise the input data format, which differs from that of the other packages, makes googleVis too hard to use in practice.
Example visualizations
Here I have made example plots with the interactive tools: histograms, scatter plots and line plots. Source code is available on GitHub. First we need to install and load the necessary R packages:
## Install necessary packages
install.packages("devtools")
library("devtools")
install.packages("ggplot2")
install.packages("ggvis")
install.packages("googleVis")
install_github("ramnathv/rCharts")
install_github("ropensci/plotly")
install.packages("dplyr")
install.packages("tidyr")
install.packages("knitr")
# Load packages (ggplot2 is loaded explicitly because ggplot() is used below)
library("ggplot2")
library("ggvis")
library("googleVis")
library("rCharts")
library("plotly")
library("dplyr")
library("tidyr")
library("knitr")
# Define image sizes
img.width <- 450
img.height <- 300
options(RCHART_HEIGHT = img.height, RCHART_WIDTH = img.width)
opts_chunk$set(fig.width=6, fig.height=4)
Plotly needs some setting up (using the credentials from here).
# Plotly requires authentication
py <- plotly("RgraphingAPI", "ektgzomjbx")
Prepare the mtcars data set a bit.
# Use mtcars data
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am)
# Compute mean mpg per cyl and am
mtcars.mean <- mtcars %>%
  group_by(cyl, am) %>%
  summarise(mpg_mean=mean(mpg)) %>%
  select(cyl, am, mpg_mean) %>%
  ungroup()
Histograms
ggplot
hist.ggplot <- ggplot(mtcars, aes(x=mpg)) + geom_histogram(binwidth=1)
hist.ggplot
ggvis
hist.ggvis <- mtcars %>%
  ggvis(x = ~mpg) %>%
  layer_histograms(width=1) %>%
  set_options(width = img.width, height = img.height)
hist.ggvis
rCharts
# rCharts histogram needs manual binning and counting!
hist.rcharts <- rPlot(x="bin(mpg,1)", y="count(id)", data=mtcars, type="bar")
# Use this with 'Knit HTML' button
# hist.rcharts$print(include_assets=TRUE)
# Use this with jekyll blog
hist.rcharts$show('iframesrc', cdn=TRUE)
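To illustrate what "manual binning and counting" amounts to, here is a small base R sketch that bins mpg into intervals of width 1 and counts the cars in each bin. It only illustrates the idea and is not what rCharts runs internally. The names bin_width, bin_start and hist_counts are made up for this example, and the bin edges used by the charting library may well differ from the floor-based edges assumed here.

```r
# Assign each mpg value to a bin of width 1, identified by its lower edge.
bin_width <- 1
bin_start <- floor(mtcars$mpg / bin_width) * bin_width

# Count the observations in each bin; empty bins do not appear in the result.
hist_counts <- as.data.frame(table(bin_start), stringsAsFactors = FALSE)
names(hist_counts) <- c("bin_start", "count")
hist_counts$bin_start <- as.numeric(hist_counts$bin_start)

print(hist_counts)
```

The resulting data frame has one row per non-empty bin, so it could be passed to any bar chart function that expects precomputed counts.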