A Year of #TidyTuesday
One of my goals for 2021 was to participate in the #TidyTuesday challenge on a regular basis. This blog post reflects on the past year of data visualisations.
For those of you who don’t know, #TidyTuesday is a weekly data challenge aimed at the R community. Every week a new dataset is posted alongside a chart or article related to that dataset, and participants are asked to explore the data. You can access the data and find out more here.
This tweet from @kustav_sen shows everyone who has participated in TidyTuesday at least 10 times this year.
Another year of #TidyTuesday now complete! Here is a collage of all the participants with more than 10 posts over 2021. Big thanks to @thomas_mock for leading the initiative.
— Kaustav Sen (@kustav_sen) December 27, 2021
Interactive version: https://t.co/g6MqsVVPrJ #RStats #d3js #dataviz
1/3 pic.twitter.com/yb2Iz2HZn7
The first thing I want to reflect on is what I got out of taking part in TidyTuesday.
- I got better at processing data. The data sets for TidyTuesday are usually relatively clean, so this was more about getting comfortable using {tidyverse} packages like {dplyr} and {tidyr} to get data into the format I needed.
- I got faster at doing basic things in {ggplot2}. I don’t need to look up syntax to change plot elements quite as much, although Google still likes to remind me how often I’ve looked up multi-column legends…
- I tried out a lot of different (sometimes new, sometimes old) packages in R, and discovered a lot of new ways of plotting data. Sometimes it worked, sometimes it didn’t.
- I found lots of cool people on Twitter who also think playing with data in their spare time is fun.
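As a rough illustration of the first two points in the list above, here is a minimal sketch that reshapes a small wide table with {tidyr} and then draws a {ggplot2} plot with a two-column legend. The data frame `wide` and its column names are made up for the example and do not come from any TidyTuesday dataset.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical wide data: one row per year, one column per category
wide <- data.frame(
  year   = 2019:2021,
  coffee = c(10, 12, 15),
  tea    = c(8, 9, 7),
  juice  = c(3, 4, 6),
  water  = c(20, 18, 22)
)

# Reshape to one row per year/category pair, the layout ggplot2 works best with
long <- wide %>%
  pivot_longer(cols = -year, names_to = "category", values_to = "amount")

p <- ggplot(long, aes(x = year, y = amount, colour = category)) +
  geom_line() +
  # Lay the four legend keys out in two columns
  guides(colour = guide_legend(ncol = 2)) +
  theme(legend.position = "bottom")
```

Printing `p` draws the plot. The `ncol` argument of `guide_legend()` is the part that controls the number of legend columns.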
I saw this tweet from @aabattani reflecting on a year of Tableau vizzes, and I wanted to do the same for my #TidyTuesday contributions.
In the spirit of @josh_tapley's year in review, I wanted to do a little more reflecting on my @tableaupublic vizzes. So here is:
— autumn (@aabattani) December 27, 2021
– my first viz of this year
– my most recent viz of the year
– my favorite viz I posted this year
– and one I'd like to re-do
what are yours? pic.twitter.com/MUuWNlimRF
First visualisation
My first contribution to #TidyTuesday back in January 2021 visualised the cost of different infrastructure projects. I still quite like this plot, although there are a few things I’d do differently. I’d probably colour and sort these by continent because the order of countries doesn’t make much sense. Different colours for pre- and post-2021 expenditure may also be more helpful than a dashed line. I would also add a more detailed subtitle or summary to explain what’s going on.
In terms of R code, I reordered the countries manually by start date using factor(df$country_name[order(df$start_year)], levels = df$country_name[order(df$start_year)]), and there are definitely better ways of doing this, e.g. with fct_reorder().
Overall, I think this was a pretty good first attempt.
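To make the comparison concrete, here is a small sketch of the manual reordering described above next to fct_reorder() from {forcats}. The data frame is a made-up stand-in with the same column names (`country_name`, `start_year`), not the actual infrastructure data.

```r
library(forcats)

# Hypothetical toy data standing in for the infrastructure projects
df <- data.frame(
  country_name = c("Canada", "Japan", "Spain", "India"),
  start_year   = c(2012, 1998, 2005, 2020)
)

# Manual approach: sort the names by start year and reuse that order as the levels
manual <- factor(
  df$country_name[order(df$start_year)],
  levels = df$country_name[order(df$start_year)]
)

# fct_reorder() leaves the rows where they are and only changes the level order
df$country_name <- fct_reorder(df$country_name, df$start_year)

levels(df$country_name)
```

Both versions end up with the same level order. One difference is that the manual version also sorts the values themselves, so assigning it back to a data frame that is not already sorted by start year would put names against the wrong rows, whereas fct_reorder() keeps each value in its original row.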
My PhD project analyses data obtained from railway networks and so does this week's #TidyTuesday!
— Nicola Rennie (@nrennie35) January 6, 2021
Code on GitHub: https://t.co/XCjc1qsQIj pic.twitter.com/T6LsNKGKbZ
Most recent visualisation
I had the idea for this one in my head when I saw the data set, so it was more about trying to create the idea I had rather than playing around with different ideas.
The shades of green aren’t quite different enough for the legend to be useful. Instead, I’d add the values as text underneath each icon, and add an arrow on the right-hand side to indicate increasing caffeine levels.
I really like the clean, minimalist design of this one.
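The two changes suggested above (value labels under each icon and an arrow showing the direction of increasing caffeine) could be sketched roughly as follows. This is not the original plot: the drinks and caffeine values are invented, and plain points stand in for the icons.

```r
library(ggplot2)

# Hypothetical caffeine values in mg; not the Starbucks data
drinks <- data.frame(
  drink    = c("Tea", "Latte", "Espresso", "Cold brew"),
  caffeine = c(40, 75, 150, 205)
)

# Order the drinks by caffeine so the arrow direction matches the data
drinks$drink <- factor(drinks$drink, levels = drinks$drink[order(drinks$caffeine)])

p <- ggplot(drinks, aes(x = 1, y = drink)) +
  # Points as a stand-in for the icons
  geom_point(size = 8, colour = "darkgreen") +
  # Value printed just below each icon
  geom_text(aes(label = paste(caffeine, "mg")), nudge_y = -0.3, size = 3) +
  # Arrow on the right-hand side pointing towards higher caffeine
  annotate(
    "segment",
    x = 1.4, xend = 1.4, y = 1, yend = 4,
    arrow = arrow(length = unit(0.3, "cm"))
  ) +
  xlim(0.5, 1.5) +
  theme_minimal()
```

With the values written next to each icon, the colour legend becomes optional, which would suit the minimalist look.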
Looking at how much caffeine there is in different #Starbucks drinks for #TidyTuesday this week!
— Nicola Rennie (@nrennie35) December 21, 2021
Thanks @StarTrek_Lt for the data!
Code: https://t.co/f6VyY6vaSC #DataVisualization #DataViz #DataScience #RStats pic.twitter.com/3kRCY7aN6v
Favourite visualisation
My favourite data for #TidyTuesday in 2021 came from the Duke Lemur Center. One of the nice features was the personal data on each lemur, so you could track individual lemurs and their offspring over time. This was the point in the year where I started to experiment more with the graphic design side of visualisation, using packages like {cowplot} and {patchwork} to combine plots and overlay images (like this one of an adorable lemur).
I also ended up following the Duke Lemur Center on Twitter after this, so my Twitter feed regularly features lemur photos, which is no bad thing.
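For readers who have not combined plots before, here is a minimal sketch of the general idea using {patchwork} only. It is not the code behind the lemur graphic: the data are invented, and a tiny placeholder raster stands in for a photo. For image files, draw_image() from {cowplot} is another option.

```r
library(ggplot2)
library(patchwork)

# Hypothetical toy data; not the Duke Lemur Center data
lemurs <- data.frame(
  age    = c(1, 3, 5, 8, 12, 20),
  weight = c(0.8, 1.6, 2.1, 2.3, 2.2, 2.0),
  sex    = c("F", "M", "F", "M", "F", "M")
)

p_scatter <- ggplot(lemurs, aes(age, weight)) + geom_point()
p_bar     <- ggplot(lemurs, aes(sex)) + geom_bar()

# Placeholder 2x2 checkerboard standing in for a photo
img <- as.raster(matrix(c("grey20", "grey80", "grey80", "grey20"), nrow = 2))
img_grob <- grid::rasterGrob(img, interpolate = FALSE)

# `+` places the plots side by side; inset_element() overlays the image
# in the top-right corner of the plot that precedes it (here p_bar)
combined <- p_scatter + p_bar +
  inset_element(img_grob, left = 0.7, bottom = 0.7, right = 1, top = 1)
```

Printing `combined` draws both panels with the placeholder image on top of the second one.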
Looking at collared brown lemurs at @DukeLemurCenter for #TidyTuesday this week. The oldest lemur, Yvette, born in 1959, lived to 32.6 years.
— Nicola Rennie (@nrennie35) August 24, 2021
Code: https://t.co/IEBznAlkGk #DataVisualization #DataViz #DataScience #RStats pic.twitter.com/Pv5NxqOuKF
Visualisation I’d like to redo
I remember having a bit of an idea for this plot, but not quite knowing how to create it. I ended up using the {ggbump} package to create the sigmoids. Several weeks later, I discovered that this was actually called a Sankey chart and that there was a {ggplot2}-compatible R package for creating them called {ggalluvial}. I’ve used it several times since, and I think it would definitely improve the execution of this idea.
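As a sketch of what the {ggalluvial} version might look like, the example below draws flows between drought categories in two years. The categories and percentages are invented for illustration and are not the California drought figures. Note that {ggalluvial} itself calls these alluvial plots, a close relative of the Sankey chart.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical flows between drought categories in two years (made-up numbers)
flows <- data.frame(
  level_2001 = c("None", "None", "Moderate", "Moderate", "Severe"),
  level_2021 = c("None", "Moderate", "Moderate", "Severe", "Severe"),
  pct        = c(30, 20, 15, 20, 15)
)

p <- ggplot(flows, aes(axis1 = level_2001, axis2 = level_2021, y = pct)) +
  # Ribbons linking categories across the two years
  geom_alluvium(aes(fill = level_2001)) +
  # Stacked boxes for the categories at each year
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("2001", "2021"))
```

Each row of `flows` becomes one ribbon whose height is its `pct` value, so the curves do not have to be built by hand.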
A topical one for this week's #TidyTuesday (as I currently bake in 32°C heat), looking at the change in the percentage of California's population affected by droughts in the last 20 years.
— Nicola Rennie (@nrennie35) July 20, 2021
Code: https://t.co/d8o9unYSB8 #DataVisualization #DataViz #DataScience #RStats pic.twitter.com/abpTgOyQ8p
Final thoughts
Overall, #TidyTuesday has been one of the best and most fun work-related things I decided to do in 2021. During my PhD, although I had a lot of data, it all looked quite similar so there wasn’t a lot of opportunity to experiment with different plots. #TidyTuesday is a nice environment to experiment with different plots and packages because you don’t have to get it right. For me at least, it’s about trying something new rather than making publication-ready plots. I’m definitely looking forward to participating in #TidyTuesday in 2022, and beyond.
Huge thanks to @thomas_mock for putting #TidyTuesday together every week, and thanks to everyone who contributed data. I might add some data of my own next year.