Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The useR!2022 event was hosted by the Department of Biostatistics at Vanderbilt University Medical Center. The organizers and volunteers did an outstanding job of running the event smoothly, and I am sure all the presenters and participants felt their dedication and professionalism just as I did.
Here are the most memorable presentations related to Shiny and containers, as is fitting for the Hosting Data Apps website.
Best Practices for Shiny Apps with Docker
I presented this talk in the Containerization and Metaprogramming session. I summarized a couple of posts from this website, focusing on why learning Docker is beneficial for Shiny hosting and how to improve the developer experience and security of the resulting Docker image. Here are the 5 points I highlighted with links to posts where I detailed these practices:
- Choose your base images wisely
- Pay attention to dependencies
- Order layers based on how often they change
- Set a non-privileged (non-root) user
- Use caching with CICD and save the Planet
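As a rough illustration of three of these points (a pinned base image, layers ordered by how often they change, and a non-root user), here is a minimal sketch that writes a Dockerfile from the shell. The base image tag, the `app/` directory, the `app` user name, and port 3838 are illustrative assumptions, not details from the talk.

```shell
# Write a hypothetical Dockerfile; the base image tag, paths, and user name
# are illustrative assumptions.
cat > Dockerfile <<'EOF'
# A versioned base image keeps builds reproducible
FROM rocker/r-ver:4.2.1

# Dependencies change rarely, so install them in an early, cacheable layer
RUN R -q -e "install.packages('shiny')"

# Create a non-privileged user for running the app
RUN groupadd -r app && useradd -r -m -g app app

# App code changes most often, so copy it last
COPY --chown=app:app app/ /home/app/

USER app
WORKDIR /home/app
EXPOSE 3838
CMD ["R", "-q", "-e", "shiny::runApp('.', host = '0.0.0.0', port = 3838)"]
EOF
```

With a Shiny app in `app/`, running `docker build -t example/shiny-app .` would build the image. After a code-only change, the package installation layer should be reused from cache.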
Here are the slides if you want to flip through them:
Here is a downloadable PDF version:
I received lots of interesting and important questions, which tells me there is continued interest in containerized Shiny app development. I include the questions and answers here for those who could not participate in the live event.
How do you find packages, system libraries, and dependencies?
Installing R packages and reading the error messages is the lazy option I often fall back to. But there are other, more sophisticated ways of querying the dependencies. Here are a couple of links to help you make dependency management easier:
- rstudio/r-system-requirements: contains a catalog of "rules" that can be used to systematically identify these dependencies and generate commands to install them
- r-hub/sysreqsdb: provides a database with an API to quickly find out which Homebrew, Debian, Ubuntu, RHEL/CentOS, etc. packages or other software need to be available to build and use R packages – try e.g. dockerfiler::get_sysreqs() to see this in action
- maketools to determine run-time and build-time dependencies
However, the emergence of apt-based dependency resolution might just make this an issue of the past, at least on Debian/Ubuntu Linux.
How to prevent the Docker container from exceeding the 2GB container memory limit?
This is mostly an issue on Mac and Windows, where by default, Docker Desktop sets the runtime memory to 2 GB, allocated from the total available memory on your machine. You can change this under the Advanced settings tab.
On Linux, you would see the host's available memory as the limit. Runtime limits for memory, swap, CPUs, and GPUs can be controlled by command-line arguments.
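A minimal sketch of such command-line limits, assuming a hypothetical image called `example/shiny-app:latest` that serves on port 3838. The flags `--memory`, `--memory-swap`, and `--cpus` are standard `docker run` options; the values are only examples.

```shell
# Hypothetical image name; replace with your own.
IMAGE="example/shiny-app:latest"

# 1 GB of RAM, no additional swap (swap total equals memory), at most 2 CPUs.
LIMITS="--memory=1g --memory-swap=1g --cpus=2"

run_limited() {
  # $LIMITS is left unquoted on purpose so it splits into separate flags.
  docker run --rm -p 3838:3838 $LIMITS "$IMAGE"
}
```

Calling `run_limited` starts the container with these caps. GPU access is granted in a similar way with the `--gpus` flag, for example `--gpus all`.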
How do you manage ports when you have many Shiny containers?
The two most common options are:
- ShinyProxy can proxy traffic to multiple instances of multiple Shiny apps or any other containerized application
- Docker Compose can also list many Shiny apps as services, and traffic can be proxied to these services using Nginx or Caddy Server
Why would ShinyProxy have a medium cost? Isn't it free?
ShinyProxy is free (no licensing fee), but you need a server to run it. That hosting cost is why I did not list it as a free option, although the server can be small, around $100 USD/year depending on the cloud provider.
Can you comment on using Shiny with other containerization tools, e.g. Singularity? My IT department has been reluctant to allow Docker on our servers.
If Docker is not installed, Singularity can run Docker images.
Another interesting tool that was mentioned by a participant is Singularity Python, which allows you to convert a Dockerfile to a Singularity file using a shell command or Python.
What's your recommended option for including credentials in a container?
Include these in a .gitignore-d text file (e.g. .Renviron) and mount a Docker volume where the app can read this text file.
If you are using a platform-as-a-service (PaaS) option, you can provide environment variables to be used at runtime within the user interface.
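The mounted-file approach might look like the sketch below. It uses a read-only bind mount via `-v` rather than a named volume. The variable `API_KEY`, the image name, and the `/home/app` home directory are placeholders chosen for illustration.

```shell
# Store secrets in .Renviron and keep the file out of version control.
# API_KEY and its value are placeholders.
printf 'API_KEY=replace-me\n' > .Renviron
grep -qxF '.Renviron' .gitignore 2>/dev/null || printf '.Renviron\n' >> .gitignore

run_with_secrets() {
  # Bind-mount the file read-only into the home directory of the container
  # user, where R looks for .Renviron at startup. The image name and the
  # /home/app path are assumptions.
  docker run --rm -p 3838:3838 \
    -v "$(pwd)/.Renviron:/home/app/.Renviron:ro" \
    example/shiny-app:latest
}
```

Inside the app, `Sys.getenv("API_KEY")` would then return the value without the secret ever being baked into an image layer.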
Do you have tips or good practices to reduce docker image size?
A larger image provides a larger attack surface, so if you are worried about that, check out slim.ai. However, if you are worried about multiple apps taking up more space, this is not necessarily an issue. When those images share a base image, its layers are stored only once rather than copied for each app. Image pull time on a cold start can still be an issue in an autoscaling context.
Shiny apps can involve large input data sets even after compressing the input files. Do you have any suggestions for decoupling the input data from the docker image?
Building multiple apps or multi-page apps with brochure, or splitting the Shiny app into a leaner app calling a Plumber API, might be your options. This decoupling currently needs quite a bit of work because API-based usage of Shiny is not officially supported.
Tutorial
Rami Krispin and Rahul Sangole put together a very useful tutorial titled "Docker for R users". They walked us through the foundations of Docker, setting up an R development environment, and deploying R code on GitHub Actions with a container. The delivery showed that the instructors live what they teach. The examples reflected know-how gained from years of fine-tuning.
The GitHub repository for the tutorial is a treasure trove. If you missed the tutorial, dive into the files – I am sure that you will find something new and useful.
Presentations
In the Containerization and Metaprogramming session, another interesting talk that touched upon Docker was Docker for data science by Alex Gold. He is also writing a book about DevOps for Data Science, so check it out.
The Shiny Applications session hosted 3 talks:
- What happens next? The day after deployment by Andrew Patterson
- Developing R Shiny apps to enhance speech ultrasound visualization and analysis by Simon Gonzalez
- Using R Shiny and Neo4j to build the CatMapper prototype application by Robert Bischoff
There were interesting Shiny apps in other sessions as well:
- Calculating CO2 equivalent (CO2e) emissions in R by Lily Clements
- Learning ggplot2 with generative art by Nicola Rennie
- Interactive dashboards without Shiny by Agustin Calatroni
- An R Shiny app for tracking COVID-19 in low- and middle-income countries by Crystal Wai
The poster sessions had some gems too, like The state of ShinyProxy – 2022 by Tobia De Koninck.
That's all for now. See you all at useR!2023.