
Salaries by alma mater – an interactive visualization with R and plotly

[This article was first published on Alexej's blog, and kindly contributed to R-bloggers.]

Based on an interesting dataset from the Wall Street Journal, I made the above visualization of the median starting salary for US college graduates from different undergraduate institutions (I have also looked at the mid-career salaries and the salary increase, but more on that later). However, I thought it would be a lot more informative if it were interactive. At the very least, I wanted to be able to see the school names when hovering over or clicking on the points with the mouse.

Luckily, this kind of interactivity is easy to achieve in R with the plotly package, especially thanks to its excellent integration with ggplot2, which I used to produce the above figure. In the following I describe exactly how this can be done.

Before I show you the interactive visualizations, a few words on the data preprocessing, and on how the map and the points are plotted with ggplot2:
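As a rough illustration of this step, here is a minimal sketch, not the exact script behind the figure. It assumes a hypothetical data frame `salaries` with made-up schools, coordinates, and salaries, and it needs the maps package installed so that `map_data()` can return the state outlines. The column names are assumptions as well.

```r
library(ggplot2)  # map_data() additionally requires the 'maps' package

# Hypothetical stand-in for the preprocessed data: one row per school,
# with coordinates and a made-up median starting salary in US dollars.
salaries <- data.frame(
  school   = c("School A", "School B", "School C"),
  lon      = c(-71.1, -122.2, -87.6),
  lat      = c(42.4, 37.4, 41.8),
  starting = c(60000, 65000, 55000)
)

# Outlines of the contiguous US states as a data frame of polygon vertices
states <- map_data("state")

p <- ggplot() +
  # Layer 1: the map itself
  geom_polygon(data = states,
               aes(x = long, y = lat, group = group),
               fill = "grey90", colour = "white") +
  # Layer 2: one point per school, coloured by starting salary.
  # 'text' is not an official ggplot2 aesthetic (ggplot2 warns that it is
  # ignored); it is carried along so ggplotly() can use it in tooltips.
  geom_point(data = salaries,
             aes(x = lon, y = lat, colour = starting, text = school),
             size = 3) +
  coord_fixed(ratio = 1.3) +
  labs(colour = "Median starting salary (USD)") +
  theme_void()
```

The school name is mapped to `text` purely with the later tooltip in mind; a static ggplot2 figure does not display it.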

Now, entering p into the R console will generate the figure shown at the top of this post.

However, we want to…

…make it interactive

The function ggplotly immediately generates a plotly interactive visualization from a ggplot object. It’s that simple! 😃 (Though I must admit that, more often than I would like, some elements of the ggplot visualization disappear or don’t look as expected. 😨)

The function argument tooltip can be used to specify which aesthetic mappings from the ggplot call should be shown in the tooltip. So, the code

ggplotly(p, tooltip = c("text", "starting"),
         width = 800, height = 500)

generates the following interactive visualization.
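To make the role of tooltip more concrete, here is a small self-contained sketch on made-up data (the data frame `df` and its columns are assumptions, not the WSJ dataset). It builds a label in advance, maps it to the unofficial text aesthetic, and asks ggplotly to show only that mapping on hover.

```r
library(ggplot2)
library(plotly)

# Made-up example data (not from the WSJ dataset)
df <- data.frame(
  school   = c("School A", "School B", "School C"),
  starting = c(60000, 65000, 55000),
  mid      = c(110000, 120000, 95000)
)

# Build the tooltip label up front; plotly renders <br> as a line break
df$label <- paste0(df$school, "<br>Starting: $",
                   format(df$starting, big.mark = ","))

q <- ggplot(df, aes(x = starting, y = mid, text = label)) +
  geom_point()

# Only the 'text' mapping appears on hover; x and y are left out
w <- ggplotly(q, tooltip = "text")
```

Printing `w` in an interactive session opens the widget. Leaving tooltip at its default shows all aesthetic mappings instead.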

Now, if you want to publish a plotly visualization to https://plot.ly/, you first need to communicate your account info to the plotly R package:

Sys.setenv("plotly_username" = "??????")
Sys.setenv("plotly_api_key" = "????????????")

and after that, posting the visualization to your account at https://plot.ly/ is as simple as:

plotly_POST(filename = "Starting", sharing = "public")
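One way to avoid a confusing failure when the credentials are missing is to check the environment variables first. The following is only a sketch: the helper `has_plotly_credentials` is a hypothetical name introduced here, while the upload call is the same one as above. Note that more recent plotly releases mark `plotly_POST()` as deprecated in favour of `api_create()`, so the call may emit a warning depending on the installed version.

```r
# Hypothetical helper: TRUE only if both credentials are set and non-empty
has_plotly_credentials <- function() {
  all(nzchar(Sys.getenv(c("plotly_username", "plotly_api_key"))))
}

if (has_plotly_credentials()) {
  # Same call as in the post; uploads the most recently created plot
  plotly::plotly_POST(filename = "Starting", sharing = "public")
} else {
  message("plotly credentials not set; skipping upload")
}
```

Setting the two variables in a `.Renviron` file rather than in the script keeps the API key out of shared code.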

More visualizations

Finally, based on the same dataset I have generated an interactive visualization of the median mid-career salaries by undergraduate alma mater (the R script is almost identical to the one described above). The resulting interactive visualization is embedded below.

Additionally, it is quite informative to look at a visualization of the salary increase from starting to mid-career.
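How the increase is measured is a choice; a minimal sketch on made-up numbers (the data frame `df` and its columns are assumptions) computes both the absolute difference and the percentage relative to the starting salary, either of which could be mapped to the point colour.

```r
# Made-up example data (not from the WSJ dataset)
df <- data.frame(
  school   = c("School A", "School B", "School C"),
  starting = c(60000, 65000, 55000),
  mid      = c(110000, 120000, 95000)
)

# Absolute increase in dollars and relative increase in percent
df$increase_abs <- df$mid - df$starting
df$increase_pct <- 100 * df$increase_abs / df$starting

# Schools ordered from largest to smallest relative increase
ranked <- df[order(-df$increase_pct), ]
```

The two measures can rank schools differently, so it is worth stating in the legend which one is shown.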

To leave a comment for the author, please follow the link and comment on their blog: Alexej's blog.
