
Enhancing Shiny Application Performance: Implementing Efficient Coding Practices


A website or application’s performance and loading speed play a crucial role in determining its user experience (UX) and success. Slow performance is one of the main reasons users lose interest in an application. It is therefore important to keep an eye on your Shiny app’s performance from the early stages of development and to follow efficient coding practices throughout, so you don’t end up with a complex app that has accumulated performance issues over time.

Ready to tackle performance bottlenecks in your R Shiny apps? Start profiling your code today for faster, smoother results!

In this blog post, I’ll present some coding practices that can significantly enhance the performance of Shiny applications. We’ll explore data loading optimization techniques, caching strategies, asynchronous operations, and other suggestions on how to minimize loading and processing times for a smoother user experience.

Optimization Loop

Performance optimization in software applications should be a cyclical process composed of phases such as benchmarking, profiling, estimating, and optimizing. This systematic approach ensures efficient and effective performance enhancements.

1. Benchmarking

This very first stage serves as the diagnostic phase, where you assess the current performance of your Shiny application to determine whether there are performance issues that need addressing, or to measure the impact of optimizations made previously. You need to gather data and establish a baseline for performance metrics:

  • Simulate user interaction – this involves timing the app’s performance under typical usage conditions to identify any slowdowns. Tools like {microbenchmark} or {shiny.benchmark} in R can be used for this purpose (see the sketch after this list). {shinyloadtest} can simulate multiple users interacting with your application, providing insights into how performance scales under load.
  • Establish a Baseline – document the current performance metrics of your application. These metrics will serve as a baseline to measure the effectiveness of any optimizations you implement.
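
As a quick illustration, here is a minimal sketch of benchmarking two ways of reading the same file with {microbenchmark}; the generated file and the compared functions are only placeholders for the code paths you actually care about.

library(microbenchmark)

# Write a sample file so the comparison is self-contained
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = rnorm(1e5), y = rnorm(1e5)), path, row.names = FALSE)

# Time both readers on the same file, 20 repetitions each
microbenchmark(
  base_r     = read.csv(path),
  data_table = data.table::fread(path),
  times      = 20
)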

These articles can help you with using {shiny.benchmark}: shiny.benchmark – How to Measure Performance Improvements in R Shiny Apps, Lessons Learned with shiny.benchmark: Improving the Performance of a Shiny Dashboard

2. Profiling

Once a performance issue is confirmed, the next step is profiling, which helps identify the specific parts of the app that are causing slowdowns. Profiling tools, such as {Rprof}, {profvis} (visualizing Rprof’s output), allow developers to see how much time is spent in different parts of the app, pinpointing bottlenecks. This phase focuses on understanding why the app is slow and determining which components should be optimized first for the greatest impact. Optimizing the slowest part gives the highest payoff.
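
As a minimal sketch, {profvis} can wrap any block of code (here a stand-in computation, not code from this app) and produce an interactive flame graph showing where time is spent; you can also wrap shiny::runApp() to profile a whole interactive session.

library(profvis)

profvis({
  # Stand-in for a slow piece of app logic
  df <- data.frame(x = rnorm(1e6), g = sample(letters, 1e6, replace = TRUE))
  agg <- aggregate(x ~ g, data = df, FUN = mean)  # likely hotspot
  plot(agg$x)
})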

Want to dig deeper into improving your R Shiny app’s performance? Try these profiling tools to uncover and fix bottlenecks fast!

3. Estimating

After identifying the bottlenecks, the next phase involves estimating the time and resources required to address the performance issues. This includes considering the complexity of the optimization, the potential performance gains, and whether the improvement justifies the effort. Estimating helps in prioritizing optimization tasks by balancing the expected benefits against the costs.

4. Optimizing

The final phase is the actual optimization, where specific actions are taken to improve the performance of the identified bottlenecks. In the following sections, we’ll explore some useful techniques to enhance the performance of R Shiny applications.

After optimizing, the cycle repeats, starting with benchmarking to assess the impact of the changes.

Looking to speed up your R Shiny apps? Follow this guide for practical tips to boost performance and efficiency.

Data Loading

Improving data loading in R and Shiny applications can significantly enhance performance, especially for apps that handle large datasets. Here are strategies focusing on different data formats and packages, data preprocessing and leveraging different scope levels of Shiny apps.

Use Faster Solutions to Load the Data

Here are some solutions for reading the delimited data, like .csv or .tsv files:

  • read.csv (base R) is a relatively slow solution for reading delimited files and tends to consume a lot of memory, but it’s straightforward to use and requires no additional packages
  • read_csv ({readr} package) offers a significant speed improvement over read.csv, with efficient memory usage, and is part of the tidyverse, making it appealing for tidyverse users
  • fread ({data.table} package) is highly optimized for speed and efficiency, especially with very large datasets, though it may require some familiarity with {data.table} syntax.
  • vroom ({vroom} package) stands out for its almost instantaneous data loading through lazy loading and memory mapping, making it exceptionally fast for initial reads with very efficient memory usage.

Overall, fread and vroom are the best choices for large datasets, with vroom offering the best initial load time and fread excelling in overall speed and efficiency.
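
For a quick, self-contained comparison (the generated file is a placeholder for your real data):

path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:1e5, y = runif(1e5)), path, row.names = FALSE)

df_readr <- readr::read_csv(path)    # tidyverse-friendly, fast
df_dt    <- data.table::fread(path)  # returns a data.table, very fast
df_vroom <- vroom::vroom(path)       # lazy, memory-mapped read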

Use Efficient Data Formats

Binary column-oriented data formats like Parquet, Feather, and fst provide substantial performance benefits over traditional text formats, particularly for large datasets. They tend to be faster for both reading and writing because they minimize serialization overhead and are optimized for storage and retrieval. Here are some solutions to consider (a short read/write sketch follows the list):

  • rds – R’s native serialization format – is significantly less performant than the other options below, although it offers a straightforward way to serialize and deserialize R objects
  • fst – very fast for storing and retrieving data frames in R, and allows you to adjust compression to manage file size
  • Parquet – highly performant for analytics on large datasets due to its efficient columnar storage and compression, making it ideal for big data scenarios. It’s supported in R through the {arrow} package
  • Feather – provides extremely fast read and write speeds for data frames, optimized for quick data exchange between R and Python. Its performance is enhanced by memory mapping, accessible in R via the {arrow} package
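
A minimal read/write sketch for these formats (the data frame and file names are illustrative):

df <- data.frame(x = 1:1e5, y = runif(1e5))

# rds (base R serialization)
saveRDS(df, "data.rds")
df_rds <- readRDS("data.rds")

# fst
fst::write_fst(df, "data.fst")
df_fst <- fst::read_fst("data.fst")

# Parquet and Feather via {arrow}
arrow::write_parquet(df, "data.parquet")
df_parquet <- arrow::read_parquet("data.parquet")

arrow::write_feather(df, "data.feather")
df_feather <- arrow::read_feather("data.feather")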

For more information, check out these articles: Fast Data Loading From Files To R, How To Speed Up Your Shiny Apps When Working With Large Datasets, and Apache Arrow In R – Supercharge Your R Shiny Dashboards With 10X Faster Data Loading.

Preprocess Large Data

Preprocess and clean data before using it in the application. This includes filtering, selecting relevant columns, and aggregating data. Doing this outside the app (e.g., during data preparation) can reduce the size of the data loaded into the app, improving performance.
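
A minimal sketch of such a preprocessing script, run outside the app (file names and column names are hypothetical):

library(dplyr)

raw <- vroom::vroom("raw_events.csv")  # large source file

prepared <- raw |>
  filter(year >= 2020) |>                        # keep only relevant rows
  select(year, month, region, value) |>          # keep only relevant columns
  group_by(year, month, region) |>
  summarise(value = sum(value), .groups = "drop")

arrow::write_parquet(prepared, "prepared.parquet")  # small file the app loads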

Leverage Object Scoping Rules in Shiny

In Shiny, using the global or application scope for loading data files can also improve performance. Data loaded in these scopes, as opposed to session- and function/module-level scopes, is accessible to multiple users (sessions) served by a single R process, eliminating the need to reload the data for each new user (see the sketch after the list):

  • global-level scope – the data object needs to be declared in the global.R file, so it’s loaded into R’s global environment, where it can persist even after an app stops.
  • application-level scope – the data must be declared in app.R outside the server function. Such an object’s lifetime is the same as the app’s.
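
A minimal sketch of application-level scoping in a single-file app (the Parquet file is a placeholder):

# app.R
library(shiny)

# Loaded once per R process and shared by all sessions it serves
app_data <- arrow::read_parquet("prepared.parquet")

ui <- fluidPage(
  selectInput("region", "Region", choices = unique(app_data$region)),
  tableOutput("table")
)

server <- function(input, output, session) {
  # Only per-user filtering happens inside the session
  output$table <- renderTable(subset(app_data, region == input$region))
}

shinyApp(ui, server)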

Use Database Connections

For applications that require real-time data access or handle extremely large datasets, consider using a database connection instead of loading data directly into R.
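
A minimal sketch using {DBI} with a SQLite file standing in for your database; the query pushes filtering and aggregation to the database, so only the rows the app needs ever reach R (table and column names are hypothetical):

library(DBI)

con <- dbConnect(RSQLite::SQLite(), "app_data.sqlite")  # placeholder database

result <- dbGetQuery(
  con,
  "SELECT year, region, SUM(value) AS value
   FROM events
   WHERE year >= 2020
   GROUP BY year, region"
)

dbDisconnect(con)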

Caching

Caching is the process of storing frequently accessed data or computed results to reduce retrieval time and improve performance on subsequent identical requests. By using caching carefully, you can significantly improve the performance of your Shiny application by avoiding unnecessary recalculations.

Using bindCache()

When you pass a reactive() or render function (like renderPlot(), renderText(), renderUI(), plotly::renderPlotly()) as the first argument to bindCache(), followed by a cache key, Shiny caches the result of the function based on the value of the cache key.

data <- reactive({
	fetchData(input$year)
}) |>
	bindCache(input$year)

If the cache key’s value (in our case the value of input$year) is identical in later executions, bindCache retrieves the function’s result from the cache instead of executing it again, saving time and resources.

bindCache allows you to specify multiple arguments as cache keys. This is useful when the output of your reactive() or render function depends on multiple inputs. By passing each relevant input as a cache key, you ensure that the cached result is used only when all of the specified inputs match their previous values. If any of the cache keys change, bindCache treats it as a cache miss and re-executes the function to generate and cache a new result, ensuring that the output always reflects the current state of the input values.

data <- reactive({
	fetchData(input$year, input$month, input$day)
}) |>
	bindCache(input$year, input$month, input$day)

Make sure to use all necessary reactive dependencies as cache keys. Otherwise, you can end up with a cache collision, where updating an input that is excluded from bindCache (but that the output depends on) does not affect the cached value, causing incorrect data to be displayed.

output$text <- renderText({
	paste(input$a, input$b)
}) |>
	bindCache(input$a)

In the example above, changing input$b will not trigger an update of the cached text value, so an outdated value is displayed.
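
The fix is simply to include every input the output depends on as a cache key:

output$text <- renderText({
	paste(input$a, input$b)
}) |>
	bindCache(input$a, input$b)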

Using renderPlot()

Prior to Shiny version 1.6.0, caching plots was done with renderCachedPlot():

output$plot <- renderCachedPlot({
    plot(fetchData(input$year))
  },
  cacheKeyExpr = input$year
)

Even though this code would still work with newer versions of Shiny, it’s recommended to use renderPlot() with bindCache() instead.

output$plot <- renderPlot({
    plot(fetchData(input$year))
  }) |>
  bindCache(input$year)

Plot Sizing

Plot sizing for cached plots in Shiny differs from that of regular plots. Regular plots match the parent div element’s dimensions exactly, while cached plots are rendered at predetermined sizes larger than the parent div and then scaled by the browser to fit the available space. This optimizes cache efficiency by reducing the need to re-render the cached plot at every slight dimension change.

Using bindCache() Together with bindEvent()

output$plot <- renderPlot({
    plot(fetchData(input$year))
  }) |>
  bindCache(input$year) |>
  bindEvent(input$button)

This setup caches the plot based on input$year and ensures the plot is only re-rendered when input$button is clicked. When the button is clicked, bindCache(input$year) checks whether a cached plot already exists for the current key; if it does, the cached plot is reused, and if not, the renderPlot() code is executed to generate and cache a new plot.

Cache Scoping

Application-Level Cache

By default, bindCache() uses an application-level cache (cache = "app"), meaning the cache is shared across all user sessions within the same R process. This approach can significantly improve performance, as computations done for one user can benefit others.

# Application-level cache shared across user sessions
bindCache(..., cache = "app")

While this method maximizes resource utilization and speed, it might lead to potential data leakage among sessions if not carefully managed, especially if the cache key doesn’t fully capture the inputs. Application-level cache lasts until the application is stopped.

Session-Level Cache

For scenarios where data privacy between sessions is a concern, a session-level cache (cache = "session") can be used. This scopes the cache to individual user sessions, preventing data from being shared across sessions.

# Session-level cache unique to each user session
bindCache(..., cache = "session")

This method ensures data privacy but doesn’t allow sharing computed data across sessions. Session-level cache expires with the end of the user session.

Persistent Caches Across R Processes

For caches that need to persist across multiple R processes or even system reboots, you can use cachem::cache_disk(). This creates a disk-based cache that can be shared among concurrent R processes.

shinyOptions(cache = cachem::cache_disk(file.path(dirname(tempdir()), "cache")))

This cache will persist across multiple starts and stops of the R process, but will expire when you reboot your machine (the temp folder usually gets removed on reboot). To avoid that, create a cache file outside the temp directory:

shinyOptions(cache = cachem::cache_disk("./cache"))

For more information on adjusting cache sizes or having more detailed control over caching in Shiny, check the official documentation or this article.

Using memoise()

memoise() is provided by the {memoise} package. Using it is very easy: you just pass a function to memoise() to get a version of that function that caches its results.

library(memoise)

function_to_memoise <- function(n) {
  Sys.sleep(2)  # stand-in for a slow operation whose result depends only on n
  n^2
}
memoised_function <- memoise(function_to_memoise)
memoised_function(10)  # slow on the first call, instant on repeated calls with the same argument

Unlike bindCache() used with reactive() in Shiny, the memoise() mechanism has nothing to do with reactivity and automatically determines cache keys from the function’s arguments. Hence, the memoized function must be a pure function, meaning its output depends only on its inputs, without being affected by or altering global state. Otherwise, the memoized and non-memoized versions may behave inconsistently.

In Shiny applications, it’s useful to specify an application- or session-level cache when creating memoized functions, as memoise() is not integrated with Shiny’s cache scoping mechanism by default (the package is independent of Shiny), as opposed to bindCache().

Here’s how to memoize a function with an application-level cache:

# Memoize using application-level cache
function_m <- memoise(
  function_to_memoise,
  cache = getShinyOption("cache")
)

And with a session-level cache:

function(input, output, session) {
  # Call memoise() in server function and use session-level cache
  function_m <- memoise(function_to_memoise, cache = session$cache)
}

Asynchronous Programming

In this section, we’ll delve into another important aspect of improving Shiny app performance – async programming. When you need to perform a long-running operation (e.g., a heavy computation or loading a large dataset), executing it in a reactive context will block the application until the operation finishes.

This is related to Shiny’s limitation of single-threaded, sequential task processing and can be frustrating for users, negatively affecting the user experience (UX). It’s even worse when multiple users share the same R process, because one user’s long-running operation blocks the other users’ sessions, and those users wouldn’t even know why the slowdown happens. Async programming in Shiny addresses these limitations.

Use Cases for Async Programming

One of the most common scenarios is using asynchronous programming for I/O (Input/Output) operations (which include operations like database queries, file reading, or API calls) that inherently involve waiting times. Async programming allows these tasks to proceed without blocking the user interface, ensuring that the app remains responsive even when waiting for data to load or external services to respond.

Similarly, computation-intensive tasks demanding substantial computational power (e.g. running complex algorithms or building machine learning models) can significantly benefit from async programming. While multithreading might be a good solution for maximizing CPU utilization, async programming offers a way to efficiently manage these resource-intensive tasks.

Using the promises and future Packages

The {promises} and {future} packages have usually been the first choice of R Shiny developers for handling asynchronous programming. However, this kind of asynchronicity was never designed to keep the user’s current session responsive to other interactions while a long-running operation runs in that session. Instead, it unblocks the other sessions (users) within the same R process. This can significantly improve the scalability and efficiency of a Shiny app by handling multiple users more effectively, but it does not directly solve the issue of UI blocking caused by heavy computations within a single session.
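
A minimal sketch of this pattern (fetchData() is a hypothetical slow function); note that reactive values are read before the work is handed off to the background process:

library(promises)
library(future)
plan(multisession)  # run futures in separate R processes

server <- function(input, output, session) {
  output$table <- renderTable({
    year <- input$year  # read reactives before entering the future
    future_promise({
      fetchData(year)   # hypothetical slow query or computation
    }) %...>%
      head(10)
  })
}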

Using the ExtendedTask Feature

Since Shiny version 1.8.1, the ExtendedTask feature has been available to handle asynchronous programming more effectively. The ExtendedTask class lets you keep interacting freely with the UI of your Shiny app by completely unblocking the current session during long-running operations. You can read more about truly non-blocking operations.
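
A minimal sketch of the ExtendedTask pattern (the slow computation is a placeholder):

library(shiny)
library(future)
library(promises)
plan(multisession)

ui <- fluidPage(
  numericInput("year", "Year", 2024),
  actionButton("go", "Fetch"),
  tableOutput("result")
)

server <- function(input, output, session) {
  # The work runs in a separate process, so this session's UI stays responsive
  fetch_task <- ExtendedTask$new(function(year) {
    future_promise({
      Sys.sleep(5)  # placeholder for a slow query or computation
      data.frame(year = year, value = rnorm(3))
    })
  })

  observeEvent(input$go, fetch_task$invoke(input$year))

  output$result <- renderTable(fetch_task$result())
}

shinyApp(ui, server)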

Curious about boosting Shiny app concurrency? Check out Joe Cheng’s talk on ExtendedTask at ShinyConf2024 for handling long-running tasks seamlessly!

Using shiny.worker

Another solution is the {shiny.worker} package created by Appsilon, which delegates heavy computation tasks to a separate process (exposed via a REST API), so they do not freeze your Shiny app. While the job is running and the heavy calculation is being processed by the external worker, the user can still interact with the app and the UI remains responsive. Take a look at the simple demo and read more about how to offload heavy calculations to an external service with shiny.worker.

Other Strategies

There are many other techniques and strategies for improving the performance of Shiny apps, but I’m not going to go into detail here. Some of them are:

  • utilizing faster functions (not necessarily from base R)
  • using tabs to structure the app and taking advantage of the lazy loading of tab content for faster initial loads
  • being careful not to overuse Shiny’s reactivity, which can cause headaches and redundant updates in the application
  • avoiding extensive use of renderUI() in server functions in favor of update functions like updateTextInput() (see the sketch after this list)
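
A minimal sketch of the update-function approach (the inputs and datasets are illustrative): instead of regenerating the second select input with renderUI(), the existing input is updated in place.

library(shiny)

ui <- fluidPage(
  selectInput("dataset", "Dataset", choices = c("mtcars", "iris")),
  selectInput("column", "Column", choices = character(0))
)

server <- function(input, output, session) {
  observeEvent(input$dataset, {
    # Update the existing input in place instead of re-rendering it via renderUI()
    updateSelectInput(session, "column", choices = names(get(input$dataset)))
  })
}

shinyApp(ui, server)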

Feel free to read more about those in these articles: Make R Shiny Dashboards Faster with updateInput, CSS, and JavaScript, Speeding Up R Shiny – The Definitive Guide

Summing Up Enhancing Shiny Application Performance

In conclusion, enhancing Shiny app performance is essential for a smooth user experience. By optimizing data loading, utilizing caching, and implementing asynchronous programming with features like ExtendedTask, you can significantly reduce latency and improve efficiency.

These techniques ensure your app handles large datasets effectively, minimizes redundant computations, and keeps the UI responsive during long-running operations. Adopting these strategies will help you build robust and high-performing Shiny applications.

Want to take your R/Shiny skills even further? Download our ebook Level Up Your R/Shiny Team Skills for more in-depth strategies and tips.

Have questions or insights?

Engage with experts, share ideas and take your data journey to the next level!