Teaching targets with Penguins

[This article was first published on rOpenSci - open tools for open science, and kindly contributed to R-bloggers].

Many researchers are becoming more aware of the importance of reproducibility. Although reproducibility involves a diverse array of topics and tools, one rOpenSci package has gained considerable attention for enabling reproducible analysis pipelines in R: targets, by Will Landau.

Why a targets workshop?

Despite its popularity, targets can be challenging to learn because it requires code to be organized in a way that is different from what most people are used to.

One of the authors (Joel) has used targets since the days of its predecessor, drake, and had taught several smaller workshops to help others learn it. Although these workshops were useful, they were produced informally and didn’t demonstrate the full range of targets’ capabilities.

The idea for a full, one-day workshop came out of a fortuitous set of events. Joel was planning to visit Oslo, Norway, for one week to teach a completely separate workshop on Spatial Phylogenetics. He reached out to his colleague Mo, another active member of rOpenSci, to ask about getting together for a coffee, whereupon Mo asked if Joel would be up for teaching another workshop during his visit. Mo had seen Joel on an rOpenSci community call about targets just a couple of weeks prior, and really wanted to take advantage of his experience to start digging into targets. Joel, on his side, realized this was just the motivation he needed to fully flesh out the workshop materials, so he happily agreed.

In this post, we will describe the process of designing the curriculum, share highlights from the first (and so far, only) workshop, and end with a look toward the future.

Curriculum design

At the workbench

We wrote the curriculum with The Carpentries Workbench, a set of R packages developed by Zhian Kamvar that makes it easy to render beautifully formatted Carpentries workshop curricula from Markdown or R Markdown. This allows the author to focus on the content rather than the formatting. Furthermore, Workbench provides lesson templates with sections like Questions, Objectives, and Key Points that help guide curriculum development. While Workbench is geared towards Carpentries lessons, it is an excellent tool for designing online curricula in general, so we highly recommend it for anyone interested in developing (technical) teaching materials.

From tar_script() to palmerpenguins

Joel’s previous workshops used the default targets template generated by tar_script() as a starting point and demonstrated various concepts with short, ad-hoc code examples. The result was a nice step-wise walk through targets, with a different small dataset to highlight each feature. While this approach worked for short demos, we realized that for a full-day workshop, it would be better to have a well-defined analysis goal to work towards.

Taking inspiration from a recently developed targets workshop by Matt Brousil, we decided to switch to the palmerpenguins dataset. palmerpenguins has some excellent properties for teaching data analysis. It includes the raw data in a CSV file, so we can teach data loading and cleaning, a nearly ubiquitous part of any workflow. The data are organized in a way that is intuitive, allowing the participants to quickly move into analysis without having to tease apart data structure. Finally, there is an interesting pattern that is eventually revealed when analyzing the penguins’ bill length and depth: although the data seem to indicate a negative relationship overall, the slope actually becomes positive when species are taken into account, demonstrating a statistical phenomenon known as “Simpson’s Paradox”.
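The reversal described above can be checked in a few lines of R. The following is a minimal sketch rather than the workshop's own code: it assumes the palmerpenguins package is installed, and the object names (pooled, adjusted, species_slopes) are ours, chosen for illustration.

```r
# Assumption: the palmerpenguins package is installed.
# Refer to the data with an explicit namespace to avoid any other
# object called `penguins` on the search path.
penguins <- palmerpenguins::penguins

# Pooled model: bill depth as a function of bill length, ignoring species.
# lm() drops rows with missing measurements by default.
pooled <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)

# Adjusted model: one common slope, but a separate intercept per species.
adjusted <- lm(bill_depth_mm ~ bill_length_mm + species, data = penguins)

pooled_slope <- unname(coef(pooled)["bill_length_mm"])
adjusted_slope <- unname(coef(adjusted)["bill_length_mm"])

# Separate model for each species, keeping only the slope.
species_slopes <- sapply(
  split(penguins, penguins$species),
  function(d) {
    fit <- lm(bill_depth_mm ~ bill_length_mm, data = d)
    unname(coef(fit)["bill_length_mm"])
  }
)

# The pooled slope is negative; the species-aware slopes are positive.
print(c(pooled = pooled_slope, adjusted = adjusted_slope))
print(species_slopes)
```

Comparing the sign of the pooled slope with the species-aware slopes is enough to see the paradox; a scatter plot coloured by species makes the same point visually.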

The second iteration of the workshop still followed Joel’s original “path” through targets, but now kept to a single dataset throughout (we did still switch to something else a couple of times to prove a point).

However, using a single dataset for the whole workshop does come with some challenges. There are still times when it may be easier to demonstrate a certain concept with a different example, such as the speed gains from parallelization. Since the dataset is rather small, the models all finish in a negligible amount of time, and it is difficult to appreciate the difference when running in parallel. Although bringing up a different example may seem trivial to someone who is familiar with targets, we realized that this context switching may add considerable cognitive load for learners.

The first workshop

Setup

In the days before the workshop, we (Mo and Joel) communicated and revised the lesson plan together through a series of GitHub Pull Requests. As Mo had no experience with targets, but lots of experience using and teaching R, she provided feedback on the general flow of the lesson. Joel, with more experience in using and teaching targets, provided context and objectives for each lesson. This made for a really nice dynamic between the two of us, and made the whole experience even more enjoyable.

The workshop was hosted by the Digital Scholarship Center (DSC) at the University of Oslo, which also hosts the local Carpentries community (Carpentry@UiO). The DSC took care of the important administrative tasks required to host a workshop: setting up a website to announce the workshop, handling the sign-ups (including a waiting list), booking an accessible and suitable room, etc. Additionally, they provided sticky-notes and extension cords for the room on the day. In short, organising the workshop was made very smooth by the expert and friendly staff of the DSC.

Participants

We had 19 participants sign up, and more than half of those who signed up came (about the normal no-show rate for DSC workshops). The participants covered a wide range of disciplines, from psychology to biology to humanities, and were at every academic stage. This diverse group was perfect to test the workshop on!

Feedback

Overall, feedback was quite positive, so we feel that we are definitely on the right track with this implementation of the workshop. Participants said that they felt more confident about using targets for their own analyses. The biggest challenge seems to be writing functions, which may be unfamiliar to many researchers but is absolutely necessary for effective use of targets. This is something that probably needs to be tailored to each workshop: the instructor should understand the participants’ skill level with writing functions and provide more instruction on this topic as needed.
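To illustrate why functions matter so much, here is a minimal sketch of a function-based pipeline. It is not taken from the workshop materials: the helper functions (clean_penguins, fit_bill_model) and target names are hypothetical, it assumes the targets and palmerpenguins packages are installed, and for brevity it starts from the packaged data frame rather than the raw CSV file.

```r
library(targets)

# Work in a throwaway directory so the sketch leaves the current project alone.
workdir <- tempfile("penguin_pipeline_")
dir.create(workdir)
old_wd <- setwd(workdir)

# tar_script() writes the code in braces to _targets.R in the working directory.
tar_script({
  library(targets)

  # Hypothetical helper: keep the columns we need and drop incomplete rows.
  clean_penguins <- function(raw) {
    cols <- c("species", "bill_length_mm", "bill_depth_mm")
    raw[stats::complete.cases(raw[, cols]), cols]
  }

  # Hypothetical helper: model bill depth by bill length, accounting for species.
  fit_bill_model <- function(data) {
    stats::lm(bill_depth_mm ~ bill_length_mm + species, data = data)
  }

  # Each step of the analysis is a target that calls one function.
  list(
    tar_target(penguins_raw, as.data.frame(palmerpenguins::penguins)),
    tar_target(penguins_clean, clean_penguins(penguins_raw)),
    tar_target(bill_model, fit_bill_model(penguins_clean))
  )
}, ask = FALSE)

# Run the pipeline, then read results back from the targets data store.
tar_make()
penguins_clean <- tar_read(penguins_clean)
bill_model <- tar_read(bill_model)

# Return to the original working directory.
setwd(old_wd)
```

Because every step is a function call wrapped in tar_target(), targets can track which steps depend on which. Running tar_make() a second time without changes skips everything, and editing one function reruns only the targets downstream of it.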

Next steps

We are very happy with the success of the first (unofficial) Carpentries targets workshop. Our biggest hope is that more people will teach the workshop and contribute back to it. If you have any suggestions about the material, please feel free to submit a PR or file an issue, or just reach out to us by email.

Happy workflowing!

[Figure: Joel and Mo after the workshop, posing with grins in front of a large projector screen showing the rOpenSci website.]
