
Automating Dockerfile creation for Shiny apps

[This article was first published on The Jumping Rivers Blog, and kindly contributed to R-bloggers].

Introduction

When creating a production deployment of a {shiny} application, it is often useful to provide a Docker image that contains all of that application's dependencies. Here we explore how one might automate the creation of a Dockerfile from which such an image can be built.

What is Docker?

Docker is an open source platform that enables developers to build, deploy and run containers, standardised executable components that combine application source code with the operating system libraries and dependencies required to run that code.

A general introduction to Docker for R users can be found in this blog post by Colin Fay, and the Docker website also has some excellent documentation.




Obtaining system dependencies

When installing R packages, you will occasionally need additional system dependencies. When building a Docker image, we want the Dockerfile to install those system dependencies too. If we are to automate writing a Dockerfile for an image that runs a {shiny} application, we need a programmatic way to determine which system dependencies are required.

It turns out that the RStudio Package Manager (RSPM) product has an API that can be queried to obtain the system requirements of a collection of R packages. The lovely folk at Posit also provide an instance of RSPM that anyone can use, so it is trivial to obtain this information even if you do not have RSPM yourself. For example, if we wanted to inspect the system dependencies of a package like {shiny} on Ubuntu 22.04, then a request to

https://packagemanager.rstudio.com/__api__/repos/1/sysreqs?all=false&pkgname=shiny&distribution=ubuntu&release=22.04

would do the trick.
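As a minimal sketch of how that request URL is put together, the helper below builds it from its parts in base R. The name sysreqs_url() and its argument names are hypothetical, chosen only for illustration; the URL structure is the one shown above.

```r
# Hypothetical helper: build the sysreqs query URL for one or more packages.
sysreqs_url = function(pkgs,
                       distribution = "ubuntu",
                       release = "22.04",
                       root = "https://packagemanager.rstudio.com",
                       repo_id = 1) {
  # Each package becomes its own pkgname= query parameter
  pkg_query = paste0("pkgname=", unique(pkgs), collapse = "&")
  paste0(
    root, "/__api__/repos/", repo_id, "/sysreqs?all=false&",
    pkg_query, "&distribution=", distribution, "&release=", release
  )
}

url = sysreqs_url("shiny")
url

# With network access, the JSON response could then be read with, e.g.
# jsonlite::fromJSON(url, simplifyVector = FALSE)
```

Passing a vector such as c("shiny", "dplyr") simply repeats the pkgname parameter, which is how several packages can be queried in one request.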

In fact, the {vetiver} package has a non-exported function, glue_sys_reqs(), that builds the command string for installing these system requirements.

glue_sys_reqs = function(pkgs) {
  rlang::check_installed("curl")
  rspm = Sys.getenv("RSPM_ROOT", "https://packagemanager.rstudio.com")
  rspm_repo_id = Sys.getenv("RSPM_REPO_ID", 1)
  rspm_repo_url = glue::glue("{rspm}/__api__/repos/{rspm_repo_id}")
  
  pkgnames = glue::glue_collapse(unique(pkgs), sep = "&pkgname=")
  
  req_url = glue::glue(
    "{rspm_repo_url}/sysreqs?all=false",
    "&pkgname={pkgnames}&distribution=ubuntu&release=22.04"
  )
  res = curl::curl_fetch_memory(req_url)
  sys_reqs = jsonlite::fromJSON(rawToChar(res$content), simplifyVector = FALSE)
  if (!is.null(sys_reqs$error)) rlang::abort(sys_reqs$error)

  sys_reqs = purrr::map(sys_reqs$requirements, purrr::pluck, "requirements", "packages")
  sys_reqs = sort(unique(unlist(sys_reqs)))
  sys_reqs = glue::glue_collapse(sys_reqs, sep = " \\\n    ")
  # Assemble a single RUN instruction; "\\\n" emits a backslash line continuation
  glue::glue(
    "RUN apt-get update -qq && \\\n",
    "  apt-get install -y --no-install-recommends \\\n    ",
    sys_reqs,
    " && \\\n",
    "  apt-get clean && \\\n",
    "  rm -rf /var/lib/apt/lists/*",
    .trim = FALSE
  )
}

Trying that out on a vector of packages we get something like

glue_sys_reqs(c("shiny", "dplyr"))
#> RUN apt-get update -qq && \ 
#>   apt-get install -y --no-install-recommends \
#>     make \
#>     zlib1g-dev && \
#>   apt-get clean && \ 
#>   rm -rf /var/lib/apt/lists/*
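To make the middle of that function a little less opaque, here is a small offline sketch of the extraction step. The mock_response list is hand-written and only mimics the nesting that the pluck() call above relies on (requirements, then requirements, then packages); the entries in it are illustrative rather than a real API response.

```r
# Hand-written stand-in for the parsed JSON (an assumption, not real API output)
mock_response = list(
  requirements = list(
    list(name = "pkgA", requirements = list(packages = list("make", "zlib1g-dev"))),
    list(name = "pkgB", requirements = list(packages = list("make")))
  )
)

# Base R equivalent of purrr::map(..., purrr::pluck, "requirements", "packages")
pkgs_per_entry = lapply(
  mock_response$requirements,
  function(entry) entry$requirements$packages
)

# Flatten, de-duplicate and sort, as glue_sys_reqs() does
sys_reqs = sort(unique(unlist(pkgs_per_entry)))
sys_reqs
#> [1] "make"       "zlib1g-dev"
```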

To grab the system requirements for all packages used by a {shiny} app, we could use renv::dependencies() to scan our code and list the packages it uses, then feed them to this function.

appdir = "app/"
pkgs = renv::dependencies(appdir)$Package
sys_reqs = glue_sys_reqs(pkgs)
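To see what renv::dependencies() picks up, the sketch below scans a throwaway app directory. The directory and its two-line app.R are made up for illustration, and the example assumes {renv} is installed.

```r
# Create a temporary app directory with a minimal app.R (illustrative only)
demo_appdir = file.path(tempdir(), "demo_app")
dir.create(demo_appdir, showWarnings = FALSE)
writeLines(
  c("library(shiny)", "dplyr::filter(mtcars, cyl == 4)"),
  file.path(demo_appdir, "app.R")
)

# One row is returned per detected usage, so de-duplicate the package names
deps = renv::dependencies(demo_appdir, quiet = TRUE, progress = FALSE)
demo_pkgs = sort(unique(deps$Package))
demo_pkgs
```

Both the library() call and the dplyr:: prefix should be detected, so the result can be passed straight to glue_sys_reqs().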

Building out the rest of the Dockerfile

To reproduce an application that works on our system, with a particular version of R and particular package versions, we want to build a Docker image with that same version of R and those same packages. The rocker project provides a collection of Docker images for different purposes, tagged by R version, which makes this substantially easier; it is really a case of ensuring that we match everything up.

We can write the line that will give us the rocker/shiny image for our R version fairly easily

(from_shiny_version = glue::glue("FROM rocker/shiny:{getRversion()}"))
#> FROM rocker/shiny:4.2.1

and {renv} makes it trivial to snapshot the versions of the packages that we have installed and that our project requires.

appdir = "app"
lockfile = "shiny_renv.lock"
renv::snapshot(
  project = appdir,
  lockfile = lockfile,
  prompt = FALSE,
  force = TRUE
)
#> Warning: could not retrieve available packages for url 'https://
#> oLSrtCuiY6IoiX2i:rteY5lTffjX6wB4v@internal.jumpingrivers.cloud/package-manager/
#> release/latest/src/contrib'
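Since an renv lockfile also records the R version it was created with, the FROM line could alternatively be derived from the lockfile rather than from the running session. This is only a sketch of that variation, not what the post does: it writes a cut-down, hand-made lockfile to a temporary file and assumes {jsonlite} is installed.

```r
# A cut-down lockfile, written to a temp file purely for illustration
lock_path = tempfile(fileext = ".lock")
writeLines('{"R": {"Version": "4.2.1"}, "Packages": {}}', lock_path)

# Read the recorded R version and build the FROM line from it
lock = jsonlite::fromJSON(lock_path, simplifyVector = FALSE)
from_line = paste0("FROM rocker/shiny:", lock$R$Version)
from_line
#> [1] "FROM rocker/shiny:4.2.1"
```

This keeps the image tag tied to the lockfile even if the Dockerfile is regenerated from a session running a different version of R.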

We then want to get this lock file into the Docker image and renv::restore() the state of the library.

copy_renv = glue::glue("COPY {lockfile} renv.lock")
renv_install = 'RUN Rscript -e "install.packages(\'renv\')"'
renv_restore  = 'RUN Rscript -e "renv::restore()"'

Finally, we want to include the app in the image, let others know which port the application will communicate on (shiny-server defaults to 3838), and launch the {shiny} server when the image is run.

port = 3838
expose = TRUE
copy_app = glue::glue("COPY {appdir} /srv/shiny-server/")
# Only emit an EXPOSE line when requested
expose = if (expose) glue::glue("EXPOSE {port}") else NULL
cmd = 'CMD ["/usr/bin/shiny-server"]'

Combining all those steps into a single list and writing to file gives us a final Dockerfile. We can wrap this in a function to make it nicer to use:

shiny_write_docker = function(
  path = ".", appdir = "app", lockfile = "shiny_renv.lock",
  port = 3838, expose = TRUE, rspm = TRUE
) {
  # NULL entries are dropped by purrr::compact() below; "" would be kept
  rspm_env = if (rspm) {
    "ENV RENV_CONFIG_REPOS_OVERRIDE https://packagemanager.rstudio.com/cran/latest\n"
  } else {
    NULL
  }
  from_shiny_version = glue::glue("FROM rocker/shiny:{getRversion()}")
  renv::snapshot(
    project = path,
    lockfile = lockfile,
    prompt = FALSE,
    force = TRUE
  )
  pkgs = renv::dependencies(appdir)$Package
  sys_reqs = glue_sys_reqs(pkgs)
  copy_renv = glue::glue("COPY {lockfile} renv.lock")
  renv_install = 'RUN Rscript -e "install.packages(\'renv\')"'
  renv_restore  = 'RUN Rscript -e "renv::restore()"'
  
  copy_app = glue::glue("COPY {appdir} /srv/shiny-server/")
  expose = if (expose) glue::glue("EXPOSE {port}") else NULL
  cmd = 'CMD ["/usr/bin/shiny-server"]'
  
  ret = purrr::compact(list(
    from_shiny_version,
    rspm_env,
    sys_reqs,
    copy_renv,
    renv_install,
    renv_restore,
    copy_app,
    expose,
    cmd
  ))
  readr::write_lines(ret, file = file.path(path, "Dockerfile"))
}

Taking the Old Faithful example {shiny} app template as our app, in a directory called app, the call

shiny_write_docker(path = ".", appdir = "app")
#> * Lockfile written to 'shiny_renv.lock'.
#> Finding R package dependencies ... Done!

produces the following Dockerfile

FROM rocker/shiny:4.2.0
ENV RENV_CONFIG_REPOS_OVERRIDE https://packagemanager.rstudio.com/cran/latest

RUN apt-get update -qq && \
  apt-get install -y --no-install-recommends \
    make \
    zlib1g-dev && \
  apt-get clean && \
  rm -rf /var/lib/apt/lists/*
COPY shiny_renv.lock renv.lock
RUN Rscript -e "install.packages('renv')"
RUN Rscript -e "renv::restore()"
COPY app /srv/shiny-server/
EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
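Before building, it can be worth a quick sanity check on a generated file. The sketch below is one way to do that in base R: the hypothetical helper dockerfile_instructions() lists the instruction keywords, treating a line as a continuation when the previous line ends in a backslash. It is run here on an abridged, inlined Dockerfile; in practice the lines would come from readLines("Dockerfile"). It does not handle comment lines.

```r
# Hypothetical helper: list the instruction keyword of each logical line
dockerfile_instructions = function(lines) {
  lines = trimws(lines)
  lines = lines[nzchar(lines)]
  ends_with_backslash = grepl("\\\\$", lines)
  # A line starts a new instruction unless the previous line ended with "\"
  is_start = c(TRUE, !head(ends_with_backslash, -1))
  sub("[[:space:]].*$", "", lines[is_start])
}

# Abridged example input (not the full file from the post)
dockerfile = c(
  "FROM rocker/shiny:4.2.0",
  "RUN apt-get update -qq && \\",
  "  apt-get install -y --no-install-recommends make",
  "COPY app /srv/shiny-server/",
  "EXPOSE 3838",
  'CMD ["/usr/bin/shiny-server"]'
)

instructions = dockerfile_instructions(dockerfile)

# The file should start from a base image and define a launch command
stopifnot(instructions[1] == "FROM", "CMD" %in% instructions)
```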

Running the app

From our Dockerfile we can build the image

docker build --tag auto_shiny_docker .

and run a container using that image, mapping the shiny server port to the same port on localhost.

docker run --rm --publish 3838:3838 auto_shiny_docker

If we navigate in our browser to

http://localhost:3838

we should see the running application.
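If you would rather drive the build and run steps from R, the same commands can be passed to system2(). This is a sketch only: the helper names docker_build_args() and docker_run_args() are hypothetical, and the calls are guarded by a run_docker flag so that nothing is executed unless you opt in and docker is on the PATH.

```r
# Hypothetical helpers returning the argument vectors for the docker CLI
docker_build_args = function(tag, context = ".") {
  c("build", "--tag", tag, context)
}
docker_run_args = function(tag, port = 3838) {
  c("run", "--rm", "--publish", paste0(port, ":", port), tag)
}

tag = "auto_shiny_docker"
run_docker = FALSE  # set to TRUE to actually build and run

# Only call docker when asked to and when the executable can be found
if (run_docker && nzchar(Sys.which("docker"))) {
  system2("docker", docker_build_args(tag))
  system2("docker", docker_run_args(tag))
}
```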

See also

For updates and revisions to this article, see the original post
