Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In health research, a flowchart is a standard way to show the flow of participants through a study when reporting results. But flowcharts can be tedious to prepare and can get on your nerves.
Fortunately, there are several packages in R for drawing flowcharts using different approaches. The problem is that the programming is generally quite complex, and the numbers have to be entered manually or parameterized beforehand. These flowcharts can have reproducibility problems: if the data change, we have to change the parameters by hand again.
To make our lives easier, there’s a new {flowchart} package that follows the tidyverse workflow and lets you create many different types of flowcharts in just a few steps.
The package provides a set of functions designed to be combined with a pipe operator (%>% or |>) to create different flowchart designs directly from the study database. These functions are highly customizable and allow the user to create reproducible flowcharts in an easier and tidier way. Now we don’t need to manually set flowchart parameters such as the box coordinates or the numbers to display, because the flowchart adapts automatically to the data we have.
For example, we can create a flowchart of the entire participant study flow with this simple tidy workflow:
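A sketch of that workflow, assembled from the steps described in the rest of this post (it assumes the {flowchart} package is installed and uses the safo dataset shipped with it; the object name fc_full is only illustrative):

```r
library(flowchart)

# Full participant flow built from the safo dataset shipped with the package
fc_full <- safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |>
  fc_split(group) |>
  fc_filter(itt == "Yes", label = "Included in intention-to-treat\n population") |>
  fc_filter(pp == "Yes", label = "Included in per-protocol\n population")

# Draw the resulting flowchart
fc_full |> fc_draw()
```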
Here we describe the steps involved in creating the flowchart in this example. We will use the safo dataset that comes with the package, a randomly generated dataset from the SAFO clinical trial. For more information and other examples, you can visit the vignette of the package.
Installing and loading the package
As of March 2024, the package is available on CRAN:
install.packages("flowchart")
You can also install the development version from GitHub:
remotes::install_github("bruigtp/flowchart")
Initialize the flowchart
The first step is to initialize the flowchart with the as_fc() function:
library(flowchart)

x <- safo |>
  as_fc(label = "Patients assessed for eligibility")
This will create an object of class fc, the class created for this package. Objects of this class consist of a list containing the dataset together with the information related to the flowchart being generated. Let’s see it for our example:
str(x, max.level = 1)
List of 2
 $ data: tibble [925 × 21] (S3: tbl_df/tbl/data.frame)
 $ fc  : tibble [1 × 17] (S3: tbl_df/tbl/data.frame)
 - attr(*, "class")= chr "fc"
The data tibble contains the entire SAFO dataset, as we haven’t done any further operations:
x$data
# A tibble: 925 × 21
      id inclusion_crit exclusion_crit chronic_heart_failure expected_death_24h
   <int> <fct>          <fct>          <fct>                 <fct>
 1     1 Yes            No             No                    No
 2     2 No             No             No                    No
 3     3 No             No             No                    No
 4     4 No             Yes            No                    No
 5     5 No             No             No                    No
 6     6 No             Yes            No                    No
 7     7 No             No             No                    No
 8     8 No             Yes            No                    Yes
 9     9 No             No             No                    No
10    10 No             No             No                    No
# ℹ 915 more rows
# ℹ 16 more variables: polymicrobial_bacteremia <fct>,
#   conditions_affect_adhrence <fct>, susp_prosthetic_valve_endocard <fct>,
#   severe_liver_cirrhosis <fct>, acute_sars_cov2 <fct>,
#   blactam_fosfomycin_hypersens <fct>, other_clinical_trial <fct>,
#   pregnancy_or_breastfeeding <fct>, previous_participation <fct>,
#   myasthenia_gravis <fct>, decline_part <fct>, group <fct>, itt <fct>, …
The fc tibble holds the information on the generated flowchart, which for now contains only the initial box indicating the total number of patients assessed for eligibility in the SAFO trial:
x$fc
# A tibble: 1 × 17
     id     x     y     n     N perc  text   type  group just  text_color text_fs
  <dbl> <dbl> <dbl> <int> <int> <chr> <chr>  <chr> <lgl> <chr> <chr>        <dbl>
1     1   0.5   0.5   925   925 100   "Pat… init  NA    cent… black            8
# ℹ 5 more variables: text_fface <dbl>, text_ffamily <lgl>, text_padding <dbl>,
#   bg_fill <chr>, border_color <chr>
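As a small sketch, the columns that describe what a box displays can be pulled out of x$fc with base R indexing. The column names are taken from the printed tibble; reading n as the count shown in the box and N as its denominator is an assumption based on that output.

```r
library(flowchart)

x <- safo |>
  as_fc(label = "Patients assessed for eligibility")

# Keep only the columns describing the box content (names as printed by x$fc)
x$fc[, c("id", "n", "N", "perc", "text", "type")]

# The full, untruncated label of the initial box
x$fc$text
```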
Drawing the flowchart
We can always use the fc_draw() function to draw the flowchart associated with an fc object:
x |> fc_draw()
Building the flowchart
To build the entire flowchart, we combine the initialized fc object with the desired functions until we obtain the final flowchart.
The second step, which keeps the randomized patients and shows those excluded from randomization, can be obtained using the fc_filter() function:
safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |>
  fc_draw()
Here, show_exc = TRUE also displays the box with the excluded subjects. Now $data contains the database filtered to the randomized subjects only, while $fc contains the information for these new boxes.
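One way to check this, as a minimal sketch: compare the number of rows left in $data with a manual count of randomized patients, then look at the rows added to $fc. The object name fc_rand is only illustrative.

```r
library(flowchart)

fc_rand <- safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE)

# Rows kept in $data should match a manual count of randomized patients
nrow(fc_rand$data)
sum(!is.na(safo$group))

# $fc holds one row per box drawn so far
fc_rand$fc[, c("id", "n", "N", "perc", "type")]
```

If the two counts ever differ, the filter condition is the first thing to inspect, since the numbers in the boxes are derived from the same data.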
Now we can split the flowchart by study group using the fc_split() function:
safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |>
  fc_split(group) |>
  fc_draw()
Now $data contains the previously filtered database, grouped by the group variable.
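The grouping can be inspected with a short sketch like the following. It assumes the grouping is stored as a regular dplyr grouping on $data, which is how the description above reads; dplyr::group_vars() is a dplyr function, not part of {flowchart}, and fc_grp is an illustrative name.

```r
library(flowchart)

fc_grp <- safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |>
  fc_split(group)

# Grouping variables attached to the data (expected to include "group")
dplyr::group_vars(fc_grp$data)

# Number of randomized patients per study arm
table(fc_grp$data$group)
```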
Finally, we can apply the fc_filter() function twice more to generate the complete flowchart we want:
safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |>
  fc_split(group) |>
  fc_filter(itt == "Yes", label = "Included in intention-to-treat\n population") |>
  fc_filter(pp == "Yes", label = "Included in per-protocol\n population") |>
  fc_draw()
The idea is to combine these basic functions, fc_filter() and fc_split(), in any way we want to create the desired flowchart. The resulting flowchart can be further customized and enhanced using the fc_modify() function, or combined with other flowcharts either horizontally or vertically using the fc_merge() and fc_stack() functions, respectively. Finally, once the final flowchart is drawn, it can be exported to the desired image format using the fc_export() function.
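A rough sketch of how these pieces might fit together is shown below. The two small flowcharts, the object names, the fill colour and the output file name are all illustrative choices. Passing a plain function to fc_modify() assumes it accepts any function that takes the $fc tibble and returns it.

```r
library(flowchart)

# Two small flowcharts built from the same dataset
fc_itt <- safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(itt == "Yes", label = "Intention to treat (ITT)")

fc_pp <- safo |>
  as_fc(label = "Patients assessed for eligibility") |>
  fc_filter(pp == "Yes", label = "Per protocol (PP)")

# Side by side
list(fc_itt, fc_pp) |> fc_merge() |> fc_draw()

# One on top of the other
list(fc_itt, fc_pp) |> fc_stack() |> fc_draw()

# Tweak the underlying $fc tibble, draw, then export the drawn flowchart
fc_itt |>
  fc_modify(function(fc) {
    fc$bg_fill[fc$id == 1] <- "grey90"  # illustrative: shade the first box
    fc
  }) |>
  fc_draw() |>
  fc_export("flowchart_itt.png")  # written to the working directory
```

Note that fc_export() comes after fc_draw() in the pipeline, matching the description above that the flowchart is exported once it has been drawn.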
More information about these features and other examples can be found on the package website: https://bruigtp.github.io/flowchart/.
Flowcharts made easy with the package {flowchart} was first posted on January 17, 2025 at 6:16 pm.