Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
Today I am going to make a short post on the R package {box}, which was showcased to me quite nicely by Michael Miles. It was informative, and I was immediately able to see the usefulness of the {box} package.
So what is ‘box’? Well here is the description straight from their site:
‘box’ allows organising R code in a more modular way, via two mechanisms:
- It enables writing modular code by treating files and folders of R code as independent (potentially nested) modules, without requiring the user to wrap reusable code into packages.
- It provides a new syntax to import reusable code (both from packages and from modules) which is more powerful and less error-prone than library or require, by limiting the number of names that are made available.
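To make the second point concrete, here is a minimal sketch of the import syntax, assuming {box} is installed. It uses the base package tools only because it ships with R and is not attached by default; that choice is mine for illustration and is not part of the project shown later.

```r
# Import the package as a module object. Nothing is attached,
# so its functions are reached through `$`.
box::use(tools)
tools$file_path_sans_ext("report.csv")

# Attach only the names listed in brackets; the package's other
# exports stay out of scope.
box::use(tools[file_ext])
file_ext("report.csv")
```

Compared with `library(tools)`, which attaches every exported name, only `file_ext` becomes directly callable here.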
So let’s see how it all works.
Function
The main portion of the script looks like this:
```r
# Main script

# Script setup --------------------------------------

# Load box modules
box::use(./box/global_options/global_options)
box::use(./box/io/imports)
box::use(./box/io/exports)
box::use(./box/mod/mod)

# Load global options
global_options$set_global_options()

# Main script ---------------------------------------

# Load data, process it, and export results
all_data <-
  getOption('data_dir') |>
  # Load all data
  imports$load_all() |>
  # Modify dataset
  mod$modify_data() |>
  # Export data
  exports$export_data()
```
So what does this do? It grabs data from a predefined location, modifies it, and then re-exports it. Now let’s look at all the code behind it that allows us to do these things, and then you will see the power of using box.
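The four `box::use()` calls imply a folder layout. As I understand {box}, a local path such as `./box/io/imports` resolves, relative to the calling script, to a file `box/io/imports.R` (a folder containing `__init__.R` should also work). Under that assumption, this base R snippet lists the files the main script expects and reports which are present. The file names are inferred from the import paths, not quoted from the original project.

```r
# Module files implied by the import paths in the main script (inferred).
module_files <- file.path(
  "box",
  c(
    "global_options/global_options.R",
    "io/imports.R",
    "io/exports.R",
    "mod/mod.R"
  )
)

# TRUE for each module file found relative to the current working directory.
found <- file.exists(module_files)
names(found) <- module_files
found
```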
Example
Let’s take a look at the global options settings.
```r
# Set global options
#' @export
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}
```
OK, six lines boxed down to one call.
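The main script later reads the input folder back with `getOption('data_dir')`. Here is a small base R sketch of that round trip, using the same option names as above. The `output_dir` option and its fallback value are hypothetical and only illustrate what happens when an option has not been set.

```r
# Same options as set_global_options() in the post.
set_global_options <- function() {
  options(
    look_ups = 'look-ups/',
    data_dir = 'data/input/'
  )
}

set_global_options()

# Options are session-wide, so any module can read them back.
data_dir <- getOption('data_dir')

# An unset option returns NULL unless a default is supplied
# ('output_dir' is a hypothetical option name).
output_dir <- getOption('output_dir', default = 'data/output/')
```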
Now the import function.
```r
# Function for importing data
#' @export
load_all <- function(file_path) {
  box::use(purrr)
  box::use(vroom)

  file_path |>
    # Get all csv files from folder
    list.files(full.names = TRUE) |>
    # Set list names
    purrr$set_names(\(file) basename(file)) |>
    # Load all csvs into list
    purrr$map(\(file) vroom$vroom(file))
}
```
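To see the shape of what `load_all()` returns, here is a base R stand-in that runs without {purrr} or {vroom}. It is a sketch rather than the module itself: the temporary folder and the two tiny csv files are made up, and the `pattern` argument is my addition (the function above lists every file in the folder).

```r
# Create a throwaway folder with two small csv files (illustrative data).
input_dir <- file.path(tempdir(), "box_demo_input")
dir.create(input_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(input_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(input_dir, "b.csv"), row.names = FALSE)

# Base R stand-in for load_all(): list the files, name each entry by its
# file name, then read every file into a data frame.
files <- list.files(input_dir, pattern = "\\.csv$", full.names = TRUE)
df_list <- lapply(setNames(files, basename(files)), read.csv)

names(df_list)
```

The result is a list of data frames keyed by file name, and the export step relies on those names to build the output paths.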
Now the modify_data function.
```r
# Function for modifying data
#' @export
modify_data <- function(df_list) {
  box::use(dplyr)
  box::use(purrr)

  map_fun <- function(df) {
    df |>
      dplyr$select(name:mass) |>
      dplyr$mutate(lol = height * mass) |>
      dplyr$filter(lol > 1500)
  }

  # Apply mapping function to list
  purrr$map(df_list, map_fun)
}
```
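The transformation inside `map_fun()` can be checked by hand on a toy data frame. This base R sketch assumes only the three columns the code refers to (`name`, `height`, `mass`); the values are made up, and the `select()` step is skipped because the toy frame has no other columns.

```r
# Toy data with the columns the module refers to (values are made up).
df <- data.frame(
  name = c("tall_heavy", "short_light"),
  height = c(200, 90),
  mass = c(100, 15)
)

# Base R stand-in for map_fun(): add the product column, then keep
# only the rows where it exceeds 1500.
df$lol <- df$height * df$mass
kept <- df[df$lol > 1500, ]
kept$name
```

Only the first row survives, since 200 * 100 is above the threshold and 90 * 15 is not.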
OK, again a big saving here: instead of the above we simply call mod$modify_data(), which makes things cleaner and also modular, in that we can go to a very specific spot in our project to fix an error or to add or remove functionality.
Lastly the export.
```r
# Function for exporting data
#' @export
export_data <- function(df_list) {
  box::use(vroom)
  box::use(purrr)

  # Export data
  purrr$map2(
    .x = df_list,
    .y = names(df_list),
    ~ vroom$vroom_write(x = .x, file = paste0('data/output/', .y), delim = ',')
  )
}
```
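The export step writes one file per list element, using the element name as the file name, so it assumes the list is named and that the target folder is there. A base R stand-in under those assumptions is sketched below. It writes to a temporary folder rather than `data/output/`, creates that folder first, and uses made-up data.

```r
# Illustrative named list of data frames, shaped like the output of modify_data().
df_list <- list(
  "a.csv" = data.frame(x = 1:2),
  "b.csv" = data.frame(x = 3:5)
)

# Write into a temporary folder so the sketch leaves the project untouched.
output_dir <- file.path(tempdir(), "box_demo_output")
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)

# Base R stand-in for export_data(): one file per list element, named after it.
for (file_name in names(df_list)) {
  write.csv(df_list[[file_name]], file.path(output_dir, file_name), row.names = FALSE)
}

list.files(output_dir)
```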
Voila! I think even to a fresh user the power of boxing your functions is fairly apparent, and advanced users' eyes are most likely glowing!