The Hitchhiker’s Guide to Ggplot2 in R
Published: 2016-11-30
Updated: 2017-08-14
César A. Hidalgo
About the book
You can find the book here. In the latest update we changed many parts of the code, making a necessary transition from old-school base R to modern Tidyverse tools. We did this to bring the code in line with the latest versions of the R packages we use, to simplify it and make it more readable, and to incorporate feedback from our students during the last academic term.
This book may look complete, but changes in R packages constantly demand changes to the examples it contains. This is why the electronic format is perfect for this work. Trapping it inside a dead-tree book would ultimately be a waste of time and resources, in my own view.
Aside from being my first book, this is also my first collaborative work. I wrote it in a 50-50 collaboration with Jodie Burchell. Jodie is an amazing data scientist. I highly recommend reading her blog Standard Error where you can find really good material on Reproducible Research and more.
This is a technical book. It aims to go straight to the point, and the writing style is similar to a recipe with detailed instructions. It assumes that you know the basics of R and that you want to learn how to create beautiful plots.
Each chapter explains how to create a different type of plot and takes you step by step from a basic plot to a highly customised graph. The chapters are ordered by degree of difficulty.
Every chapter is independent of the others. You can read the whole book or go straight to a section of interest, and we are confident that you will be able to follow the instructions and reproduce our examples without reading the earlier chapters.
In total, this book contains 237 pages (letter paper size) of recipes for obtaining an aesthetically acceptable result. You can download the book for free (yes, really!) from Leanpub.
How did the book start?
Almost a year ago I finished writing the eleventh tutorial in a series on using ggplot2 I created with Jodie Burchell.
I asked Jodie to co-author some blog entries when I found her blog and realised that it reflected my own interest in Data Science. The book grew out of those entries on our blogs.
A few weeks later those tutorials evolved into an ebook, because what we had started to write was unexpectedly successful. We even got retweets from important people in the R community, such as Hadley Wickham. Finally, the book was released on Leanpub.
We also included a pack containing the Rmd files that we used to generate every chart displayed in the book.
Why Leanpub?
Leanpub is a platform where you can easily write your book using MS Word, among other writing software, and it even has GitHub and Dropbox integration. We went for R Markdown with LaTeX output, which shows that Leanpub is both easy to use and flexible.
What's more, Leanpub lets your audience download your books for free, if you allow it, or you can define a price range with a suggested price. The website pays authors a royalty of 90% minus 50 cents per sale, which is favourable to authors compared to other platforms. You can also sell your books with additional exercises, video lessons, etc.
Readers can download newer versions of your books for free, whether or not they paid for them. For example, last year I updated all the examples in the book just a few days after ggplot2 2.2 was released, and my readers received a notification email right after I uploaded the new version.
If that’s not enough, Leanpub allows you to create bundles and sell your books as a set, or you can charge a different price for your book plus additional material such as R Markdown notebooks, instructional videos and more.
What did I learn from my first book?
At the moment I am teaching Data Visualization, and from my students I learned that good visualizations come after they learn the visualization concepts. Coding clearly helps, but coding comes after the fundamentals.
It would be better to teach visualization fundamentals first rather than in parallel with coding, and this applies especially when part of your audience has never written code before.
I got a lot of feedback from my students last term. It was really helpful for improving the book and for dividing some steps into smaller pieces to make the Grammar of Graphics easier to understand.
The interested reader may find some remarkable books that can be read before mine. I highly recommend:
- Data Visualisation: A Handbook for Data Driven Design
- Storytelling with Data: A Data Visualization Guide for Business Professionals
- The Functional Art: An Introduction to Information Graphics and Visualization
- The Grammar of Graphics
Those are really good books that show the fundamentals of Data Visualisation and provide the key concepts and rules needed to communicate effectively with data.