An example of using {box}
Introduction
Today I am going to make a short post on the R package {box}, which was showcased to me quite nicely by Michael Miles. It was informative, and I was able to immediately see the usefulness of the {box} library.
So what is ‘box’? Well here is the description straight from their site:
‘box’ allows organising R code in a more modular way, via two mechanisms:
- It enables writing modular code by treating files and folders of R code as independent (potentially nested) modules, without requiring the user to wrap reusable code into packages.
- It provides a new syntax to import reusable code (both from packages and from modules) which is more powerful and less error-prone than library or require, by limiting the number of names that are made available.
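To make the second point concrete, here is a minimal sketch of the import syntax for packages. It assumes {box} is installed; the stats and tools packages are only stand-ins chosen because they ship with R, not something the original example uses.

```r
# Assumes the {box} package is installed; stats and tools ship with R.

# Import a package as a module object: nothing is attached to the search path,
# and its functions are reached with `$`.
box::use(stats)
med <- stats$median(c(1, 3, 10))

# Attach only the names that are needed, instead of everything
# as library() would.
box::use(tools[file_ext])
ext <- file_ext("report.csv")

print(med)
print(ext)
```

The same two forms appear throughout the rest of this post: whole-module imports used with `$`, and explicit, limited name lists.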
So let’s see how it all works.
Function
The main portion of the script looks like this:
# Main script

# Script setup --------------------------------------

# Load box modules
box::use(./box/global_options/global_options)
box::use(./box/io/imports)
box::use(./box/io/exports)
box::use(./box/mod/mod)

# Load global options
global_options$set_global_options()

# Main script ---------------------------------------

# Load data, process it, and export results
all_data <- getOption('data_dir') |>
  # Load all data
  imports$load_all() |>
  # Modify dataset
  mod$modify_data() |>
  # Export data
  exports$export_data()
So what does this do? It grabs data from a predefined location, modifies it, and then re-exports it. Now let’s look at the code behind it, which is what allows us to do these things, and then you will see the power of using box.
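The module paths in the main script also imply a folder layout. As a rough sketch, assuming each module is a single file with an .R extension sitting in a box/ folder next to the main script (the extension is my assumption; the post does not show the files on disk), the paths map to files like this:

```r
# Module paths taken from the box::use() calls in the main script.
module_paths <- c(
  "./box/global_options/global_options",
  "./box/io/imports",
  "./box/io/exports",
  "./box/mod/mod"
)

# Assumption: each module is stored as a single .R file.
module_files <- paste0(module_paths, ".R")
print(module_files)

# Run from the project root, this reports which module files are present.
print(file.exists(module_files))
```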
Example
Let’s take a look at the global options settings.
# Set global options
#' @export
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}
OK, six lines boxed down to one call: global_options$set_global_options().
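This is also where the main script gets its starting point: getOption('data_dir') only returns a value once the options have been set. A small sketch of that hand-off, with the function defined inline rather than inside a module so that it runs on its own:

```r
# Same function as in the module, defined inline for illustration.
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}

# Register the options for the current R session.
set_global_options()

# Read one back; this is the value the pipeline starts from.
data_dir <- getOption('data_dir')
print(data_dir)
```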
Now the import function.
# Function for importing data
#' @export
load_all <- function(file_path) {
  box::use(purrr)
  box::use(vroom)

  file_path |>
    # Get all csv files from folder
    list.files(full.names = TRUE) |>
    # Set list names
    purrr$set_names(\(file) basename(file)) |>
    # Load all csvs into list
    purrr$map(\(file) vroom$vroom(file))
}
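To see the shape of what load_all() returns, here is a base-R stand-in for the same idea: a list of data frames named after their files. It swaps purrr and vroom for setNames(), lapply(), and read.csv(), writes two made-up CSV files to a temporary folder, and adds a pattern filter that the original does not have, so treat it as an illustration rather than the module itself.

```r
# Create a temporary input folder with two small, made-up CSV files.
tmp_dir <- file.path(tempdir(), "box_demo_input")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(tmp_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(tmp_dir, "b.csv"), row.names = FALSE)

# List the files with full paths (the pattern filter is an addition).
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each path by its file name, then read every file into a list.
data_list <- lapply(setNames(files, basename(files)), read.csv)

print(names(data_list))
```

Keeping the file names as list names matters later, because the export step reuses them as output file names.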
Now the modify_data function.
# Function for modifying data
#' @export
modify_data <- function(df_list) {
  box::use(dplyr)
  box::use(purrr)

  map_fun <- function(df) {
    df |>
      dplyr$select(name:mass) |>
      dplyr$mutate(lol = height * mass) |>
      dplyr$filter(lol > 1500)
  }

  # Apply mapping function to list
  purrr$map(df_list, map_fun)
}
OK, again a big saving here. Instead of the above, we simply call mod$modify_data(), which makes things cleaner and also modular, in that we can go to a very specific spot in our project to fix an error or add/subtract functionality.
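As a quick check of what the mapping function does, here is a sketch that applies the same three dplyr steps to a one-element list. The toy data frame and its values are made up for illustration, and the example assumes {box}, {dplyr}, and {purrr} are installed.

```r
# Assumes {box}, {dplyr}, and {purrr} are installed.
box::use(dplyr)
box::use(purrr)

# Hypothetical toy data; only the column names follow the original code.
toy <- data.frame(
  name   = c("a", "b", "c"),
  height = c(10, 100, 200),
  mass   = c(100, 20, NA),
  extra  = 1:3
)
df_list <- list(`toy.csv` = toy)

# Same steps as map_fun(): keep name through mass, add lol, filter on it.
result <- purrr$map(df_list, \(df) {
  df |>
    dplyr$select(name:mass) |>
    dplyr$mutate(lol = height * mass) |>
    dplyr$filter(lol > 1500)
})

print(result[["toy.csv"]])
```

Only row "b" survives: row "a" has lol = 1000, and row "c" has a missing mass, so its lol is NA and filter() drops it.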
Lastly the export.
# Function for exporting data
#' @export
export_data <- function(df_list) {
  box::use(vroom)
  box::use(purrr)

  # Export data
  purrr$map2(
    .x = df_list,
    .y = names(df_list),
    ~vroom$vroom_write(x = .x, file = paste0('data/output/', .y), delim = ',')
  )
}
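Since the export is run for its side effect (writing files), one possible variation is purrr's iwalk(), which passes each element together with its name. This is a sketch of that alternative, not the author's code: it writes to a temporary folder instead of data/output/, uses made-up data, and assumes {box}, {purrr}, and {vroom} are installed.

```r
# Assumes {box}, {purrr}, and {vroom} are installed.
box::use(purrr)
box::use(vroom)

# Temporary output folder, standing in for 'data/output/'.
out_dir <- file.path(tempdir(), "box_demo_output")
dir.create(out_dir, showWarnings = FALSE)

# Hypothetical named list, shaped like the output of load_all().
df_list <- list(
  `a.csv` = data.frame(x = 1:2),
  `b.csv` = data.frame(x = 3:5)
)

# iwalk() calls the function with (element, name) for each list entry.
purrr$iwalk(df_list, \(df, file_name) {
  vroom$vroom_write(df, file = file.path(out_dir, file_name), delim = ',')
})

print(file.exists(file.path(out_dir, names(df_list))))
```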
Voila! I think the power of boxing your functions is fairly apparent even to a fresh user, and for advanced users, eyes are most likely glowing!