Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last week RStudio hosted their conference, rstudio::conf, in Austin and a whole lot of members of the R community came to see what’s new, where the community and the field might be heading and to enjoy tacos.
R in production
A major theme from this conference was R in production. Joe Cheng kicked this one off in his keynote about Shiny in production, presenting how the answer to “Can Shiny be used for production?” changed from “Yes, it’s quite possible” (in 2015) to “Yes, it’s quite easy” (in 2019). Of course, it isn’t all easy yet. There are still cultural challenges that arise from Shiny apps being developed by R user who aren’t necessarily software engineers, i.e., they may not have all the background knowledge about what is necessary to put their app into production. Organisational challenges may arise when entities like IT and management meet the idea of Shiny apps in production with scepticism. While Shiny makes it easy to create a web app, it doesn’t exactly make automated testing, load testing, profiling and deployment easy. However, the Shiny team at RStudio addresses these aspects and Joe presented
- shinyloadtest – for load testing of Shiny apps
- profvis – a profiler for R (“Use a profiler. Don’t use your intuition.”)
- plot caching – to speed up repeated plots
Our own Mark Sellors then continued the conversation around the cultural and organisational challenges of getting Shiny, and R in general, into production. One way is “the path of magic” where you get people on board with a fantastic Shiny app like Jacqueline Nolis and Heather Nolis have done at T-Mobile. Another path is via building the confidence the business has in the work of the data scientists. This involves a lot of building bridges between different teams so here is a list of things to tackle head on, or coincidentally, a list of “CS/software engineering concepts data scientists might need to learn more about to get things into production” as Caitlin Hudon put it. Mark also shared an R production readiness checklist. For an example that had people buzzing at the conference check out what Jacqueline Nolis and Heather Nolis did after their magic Shiny app: presentations, blog posts and code can all be found here while we wait for the conference videos.
Other talks on this topic included more recent RStudio work:
- James Blair on Democratizing R with Plumber APIs
- Sean Lopp on the RStudio Package Manager
- Jeff Allen on RStudio Connect: Past, present, and future (but really on hats for cats)
Thinking beyond the code
Felienne reminded everyone in her keynote that spreadsheets are code – functional, reactive programming at that – and advocated for a pedagogical debate on how we teach programming. A Dutch programming book aimed at children includes sentences like
- “The best part of programming is finding mistakes.”
- “Programmers only learn from making mistakes.”
- “You will fail often, and it will be frustrating.”
illustrating the “let people discover things on their own” approach. An approach of explaining ideas and then letting people practice works a lot better – in other fields and in teaching programming. Or as Felienne put it: You don’t become an expert by doing expert things.
Angela Bassa gave an excellent talk on Data Science as a team sport highlighting how to grow a data science team by adding specialisation, process and resilience.
Hilary Parker spoke about Using Data Effectively: Beyond Art and Science and casually tidied the tidy workflow.
Caitlin Hudon shared her learnings on which data science mistakes to avoid, covering analysis mistakes, how to work with developers and how to communicate with business stakeholders.
But also think about the code
It wouldn’t quite be an R conference without talks that highlight fantastic things to do with R:
- Tyler Morgan-Wall and Thomas Lin Pedersen brought a lot of movement to their visualisation talks on the rayshader and gganimatepackages.
- Jenny Bryan and Lionel Henry spoke about the why and the how of tidy-eval.
- Earo Wang covered the tsibble and fable packages for tidy time series analysis.
- Max Kuhn and Alex Hayes spoke about modelling in the tidyverse with the parship and broom packages.
- Jim Hester introduced the itdepends package with tools for making decisions about your package dependencies.
It’s all about sharing
David Robinson spoke about The Unreasonable Effectiveness of Public Work in his keynote, illustrating how helpful various forms of public work can be for a data scientist to advance their career through different stages. He recommends tweeting, blogging and contributing to open source for all stages from junior to senior. With increasing experience he also recommends giving talks, recording screencasts and writing books. If you are inspired to get out there talking, look for a meetup and/or R-Ladies group near you. If you want to speak at a bigger R conference before the next edition of rstudio::conf, check out useR! and EARL.
All throughout his keynote, David did a wonderful job of highlighting other people’s work (all while plugging his own books two slides in a row, master of the game that he is), in particular talks and tweets from the conference. His keynote basically was the first conference review, while the conference was still underway. Other great conference reviews have since been put out there by
- Athanasia Monika Mowinckel: Why RStudio::conf is the best conference experience I have had
- Jacqueline Nolis: rstudio::conf 2019 takeaways
- Julia Silge: Feeling the rstudio::conf love
- Brooke Watson: Rstudio::conf 2019: lessons learned
So we leave you with this enjoyable “further reading” list and hope to see you soon at LondonR, EARL or on Twitter!