Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Making changes to a legacy system can be daunting. How can we develop code when we don’t have unit tests for the part of the system we need to change? How do we ensure a new feature is unit-tested when the rest of the codebase lacks tests? One approach would be to just scrap the module (or even the whole project!) and start from scratch, but that is often not worth the cost. So what is the alternative?
Use the Sprout Technique with Test Driven Development.
Table of Contents:
- The Task
- Sprout Technique
- Identify Insertion Point
- Develop Code in Isolation
- Call Your Code from the Legacy Code
- Summarizing Sprout Technique and Test Driven Development
The Task
Let’s consider an app that displays the names of tables from a data lake that are used to build visuals. It informs users which version of the data the plots are derived from. This feature needs to be extended to show the tables’ change history, so that users can see previous versions of the data, and by whom and when they were changed. Let’s formulate this requirement in the form of acceptance criteria:
- Data changes history is displayed in a table.
- Table is displayed in “Data” section
Note that we don’t know yet how these criteria should be satisfied. Currently, we know only what needs to be delivered to meet the business goal.
Sprout Technique
- Identify insertion point.
- Develop code in isolation.
- Call your code from the Legacy Code.
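Stripped of Shiny specifics, the three steps can be sketched in a few lines of base R. Both functions below are hypothetical and exist only for illustration: format_history_label stands in for the new, tested code (the "sprout"), and legacy_summary stands in for untested legacy code that is changed by nothing more than a single call.

```r
# Step 2: new code, developed and tested in isolation (hypothetical example).
format_history_label <- function(user, updated) {
  sprintf("%s (%s)", user, updated)
}

# A check for the sprout that needs nothing from the legacy code.
stopifnot(
  format_history_label("user_1", "2023-06-29") == "user_1 (2023-06-29)"
)

# Steps 1 and 3: the legacy function stays as it was, except for one call
# to the sprout at the chosen insertion point.
legacy_summary <- function(tables, user, updated) {
  # ... existing, untested logic ...
  summary <- paste("Tables:", paste(tables, collapse = ", "))
  # Insertion point: the only line added to the legacy code.
  paste(summary, format_history_label(user, updated), sep = " | ")
}

legacy_summary(c("a", "b"), "user_1", "2023-06-29")
# returns "Tables: a, b | user_1 (2023-06-29)"
```

The legacy function is still untested, but the risk added by the change is limited to one line, and all new behavior lives in a function that has its own checks.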
Identify Insertion Point
There is a module that already displays the names of the tables used. We need to identify the boundaries of the feature we’re extending. In this case, it’s just a pair of uiOutput and renderUI in one legacy (untested) module that displays the names of the currently used tables. This is the place where we will inject our new code.
mod_data_ui <- function(id) {
  ns <- shiny::NS(id)
  shiny::div(
    shiny::h1("Data"),
    # ...,
    shiny::uiOutput(ns("data_summary"))
  )
}

mod_data_server <- function(input, output, session) {
  # ...
  output$data_summary <- renderUI({
    # Some business logic
  })
}
Learning about the code where you are injecting a new feature helps you understand what data is available there and how the new code can interact with its surroundings. It will guide your decision on what the new module’s interface can be.
Develop Code in Isolation
Let’s recall our criteria:
- Data changes history is displayed in a table.
- Table is displayed in “Data” section
From exploring the existing code, we see that an object called StudyData is available in the parent module, and that it has a get_data_history method for fetching the data we need.
StudyData <- R6::R6Class(
  classname = "StudyData",
  public = list(
    # Other methods
    # ...
    get_data_history = function() {
      # ...
    }
  )
)
Knowing that, we can create an implementation list for the feature:
- The changes history is fetched using the StudyData object.
- The changes history is displayed in the UI.
Let’s set up our first test. We will use the Arrange, Act, Assert pattern to guide our thinking:
describe("mod_data_history", {
  it("should fetch data history", {
    # Arrange

    # Act

    # Assert
  })
})
Let’s focus on the first item in our implementation list. Given that we need to use the StudyData object, there are two possibilities:
- We can pass StudyData to the module.
- Or we can pass the history data we fetched in the parent module.
Let’s stick to the first option to keep the door open for this module to fetch more data to display later. Let’s call our new module mod_data_history. We expect the module to call the study_data$get_data_history() method.
We need to mock a StudyData object to simulate its behavior, so that the test doesn’t rely on the actual implementation (which requires a connection to the data lake). Since StudyData is an R6 class, we could build a mock object with the same structure by hand, or we can create a simple, reusable routine that builds a mock from any R6 class generator:
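create_mock_r6_class <- function(generator) {
  checkmate::assert_class(generator, "R6ClassGenerator")
  structure(
    # Replace every public method with a fresh mock that records its calls.
    # mockery::mock() treats its arguments as return values, so the method
    # itself is deliberately not passed to it.
    purrr::map(generator$public_methods, function(method) mockery::mock()),
    class = generator$classname
  )
}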
Then our mock object generator will look like:
.create_mock_study_data <- function() {
  create_mock_r6_class(StudyData)
}
Note that we’re creating a wrapper for create_mock_r6_class(StudyData). It may seem redundant now, but it will allow us to quickly substitute or extend this mock in all tests that use it.
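To see what such a mock gives us, here is a small standalone sketch. FakeStudy is a hypothetical stand-in for StudyData, and the helper is rewritten with base R (lapply and stopifnot instead of {purrr} and {checkmate}) only to keep the example short; it assumes the {R6}, {mockery} and {testthat} packages are installed.

```r
# Hypothetical stand-in for StudyData; the real methods need the data lake.
FakeStudy <- R6::R6Class(
  classname = "FakeStudy",
  public = list(
    get_data_history = function() stop("needs a data lake connection"),
    get_tables = function() stop("needs a data lake connection")
  )
)

# Same idea as the helper above: one recording mock per public method.
create_mock_r6_class <- function(generator) {
  stopifnot(inherits(generator, "R6ClassGenerator"))
  structure(
    lapply(generator$public_methods, function(method) mockery::mock()),
    class = generator$classname
  )
}

study <- create_mock_r6_class(FakeStudy)

# Calling a mocked method never reaches the real implementation,
# so no connection is needed and no error is raised.
study$get_data_history()

# Each mock keeps track of how often it was called.
mockery::expect_called(study$get_data_history, n = 1)
mockery::expect_called(study$get_tables, n = 0)
```

Because every method is replaced, a module under test can call any part of the interface, and the test decides afterwards which calls it cares about.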
Then our test becomes:
.create_mock_study_data <- function() {
  create_mock_r6_class(StudyData)
}

describe("mod_data_history", {
  it("should fetch data history", {
    # Arrange
    study_data <- .create_mock_study_data()

    # Act

    # Assert
  })
})
For now, we only expect that this module will call the get_data_history method. We can use mockery::expect_called to check that this method has been called exactly once. Let’s put that into the Assert block:
describe("mod_data_history", {
  it("should fetch data history", {
    # Arrange
    study_data <- .create_mock_study_data()

    shiny::testServer(
      app = mod_data_history_server,
      args = list(study_data = study_data),
      {
        # Act

        # Assert
        mockery::expect_called(
          study_data$get_data_history,
          n = 1
        )
      }
    )
  })
})
Running the tests now throws an error: the mod_data_history_server object cannot be found. Let’s create the module:
mod_data_history_ui <- function(id) {
  ns <- shiny::NS(id)
}

mod_data_history_server <- function(id, study_data) {
  shiny::moduleServer(id, function(input, output, session) {

  })
}
Now the test fails with the expected message:
Failure (test-mod_data_history.R:12): mod_data_history: should fetch data history
mock object has not been called 1 time
To make the test pass we add a call to this method in the module:
mod_data_history_ui <- function(id) {
  ns <- shiny::NS(id)
}

mod_data_history_server <- function(id, study_data) {
  shiny::moduleServer(id, function(input, output, session) {
    data_history <- study_data$get_data_history()
  })
}

✔ | F W S  OK | Context
✔ |         1 | mod_data_history

══ Results ══════════════════════════════════════════════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ]

🎯 Your tests hit the mark 🎯
- Changes history is fetched using the StudyData object.
- Changes history is displayed in the UI.
Now we need to display the data that we just fetched. That means we need another test case for the second item in our list:
it("should display data history", {
  # Arrange
  study_data <- .create_mock_study_data()

  shiny::testServer(
    app = mod_data_history_server,
    args = list(study_data = study_data),
    {
      # Act

      # Assert
    }
  )
})
We don’t know yet what kind of HTML should be produced to display this data, but on the server side we can expect that a function that parses the history data and builds the HTML will be called. We will force output$data_history to evaluate and check whether the display function has been called. Note that we make only one assumption: which output slot receives the HTML. This test doesn’t lock us into a specific rendering engine (e.g., renderTable, renderUI), and it doesn’t explicitly check what value output$data_history yields.
.study_data_history <- function() {
  tibble::tibble(
    data_name = c("data_1_v1", "data_1_v2", "data_1_v3"),
    user = c("user_1", "user_2", "user_1"),
    updated = c("2023-06-29 17:49:12", "2023-05-29 17:49:12", "2023-04-29 17:49:12"),
    size = c(1000, 1000, 1000)
  )
}

.create_mock_study_data <- function() {
  mock <- create_mock_r6_class(StudyData)
  mock$get_data_history <- mockery::mock(.study_data_history())
  mock
}

# ...

it("should display data history", {
  # Arrange
  study_data <- .create_mock_study_data()
  mock_render_method <- mockery::mock()

  shiny::testServer(
    app = mod_data_history_server,
    args = list(study_data = study_data),
    {
      # Act
      output$data_history

      # Assert
      mockery::expect_args(
        mock_render_method,
        n = 1,
        data = .study_data_history()
      )
    }
  )
})
You can see how we extended the mock in .create_mock_study_data so that get_data_history returns a value. This test fails:
✔ | F W S  OK | Context
✖ | 1       1 | mod_data_history
─────────────────────────────────────────────────────────────────────────────────────────────────
Error (test-mod_data_history.R:40): mod_data_history: should display data history
Error in `.subset2(x, "impl")$getOutput(name)`: The test referenced an output that hasn't been defined yet: output$proxy1-data_history
Let’s add the outputs. We will use {reactable}, as it implements all the features we need to display the data in the shape we want.
mod_data_history_ui <- function(id) {
  ns <- shiny::NS(id)
  reactable::reactableOutput(ns("data_history"))
}

mod_data_history_server <- function(id, study_data) {
  shiny::moduleServer(id, function(input, output, session) {
    data_history <- study_data$get_data_history()
    output$data_history <- reactable::renderReactable({
      render_data_history(data_history)
    })
  })
}
The tests still fail:
Error (test-mod_data_history.R:40): mod_data_history: should display data history
Error in `render_data_history(data_history)`: could not find function "render_data_history"
We need to either stub render_data_history or inject it into the module. Let’s use stubbing for now, as we don’t need to parametrize this module with a rendering function:
it("should display data history", {
  # Arrange
  study_data <- .create_mock_study_data()
  mock_render_method <- mockery::mock()
  mockery::stub(mod_data_history_server, "render_data_history", mock_render_method)

  shiny::testServer(
    app = mod_data_history_server,
    args = list(study_data = study_data),
    {
      # Act
      output$data_history

      # Assert
      mockery::expect_args(
        mock_render_method,
        n = 1,
        data = .study_data_history()
      )
    }
  )
})
The tests are green now:
- Changes history is fetched using the StudyData object.
- Changes history is displayed in the UI.
✔ | F W S  OK | Context
✔ |         4 | mod_data_history [0.4s]

══ Results ══════════════════════════════════════════════════════════════════════════════════════
Duration: 0.4 s

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 4 ]
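For comparison, the injection alternative mentioned earlier could look roughly like the sketch below. The render parameter is an assumption introduced here for illustration, and the output uses shiny::renderUI only to keep the example free of {reactable}; the module in this article keeps the stubbed version. The sketch assumes {shiny}, {mockery} and {testthat} are installed.

```r
# Hypothetical variant of the module: the rendering function is a parameter.
# The default is evaluated lazily, so it is never looked up when a test
# passes its own function.
mod_data_history_server <- function(id, study_data, render = render_data_history) {
  shiny::moduleServer(id, function(input, output, session) {
    data_history <- study_data$get_data_history()
    output$data_history <- shiny::renderUI({
      render(data_history)
    })
  })
}

# Arrange: plain mocks instead of mockery::stub().
history <- data.frame(data_name = "data_1_v1", user = "user_1")
study_data <- list(get_data_history = mockery::mock(history))
mock_render <- mockery::mock(shiny::div("history"))

shiny::testServer(
  app = mod_data_history_server,
  args = list(study_data = study_data, render = mock_render),
  {
    # Act: force the output to evaluate
    output$data_history

    # Assert: the injected function received the fetched data
    mockery::expect_called(mock_render, n = 1)
    mockery::expect_args(mock_render, n = 1, history)
  }
)
```

Injection makes the dependency visible in the module’s signature, at the cost of one more argument that production code rarely changes. Stubbing keeps the signature small, which is why it was chosen here.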
The whole test file looks like this:
.study_data_history <- function() {
  tibble::tibble(
    data_name = c("data_1_v1", "data_1_v2", "data_1_v3"),
    user = c("user_1", "user_2", "user_1"),
    updated = c("2023-06-29 17:49:12", "2023-05-29 17:49:12", "2023-04-29 17:49:12"),
    size = c(1000, 1000, 1000)
  )
}

.create_mock_study_data <- function() {
  mock <- create_mock_r6_class(StudyData)
  mock$get_data_history <- mockery::mock(.study_data_history())
  mock
}

describe("mod_data_history", {
  it("should fetch data history", {
    # Arrange
    study_data <- .create_mock_study_data()

    shiny::testServer(
      app = mod_data_history_server,
      args = list(study_data = study_data),
      {
        # Act

        # Assert
        mockery::expect_called(
          study_data$get_data_history,
          n = 1
        )
      }
    )
  })

  it("should display data history", {
    # Arrange
    study_data <- .create_mock_study_data()
    mock_render_method <- mockery::mock()
    mockery::stub(mod_data_history_server, "render_data_history", mock_render_method)

    shiny::testServer(
      app = mod_data_history_server,
      args = list(study_data = study_data),
      {
        # Act
        output$data_history

        # Assert
        mockery::expect_args(
          mock_render_method,
          n = 1,
          data = .study_data_history()
        )
      }
    )
  })
})
Now we only need an implementation of the render_data_history function. The design of this function can also be driven by tests. Start by listing observable criteria: how do we expect this function to behave? Add the first test case and work through the red-green-refactor cycle. The function should take the data returned by StudyData$get_data_history and return a reactable object, since we chose this package for rendering the data.
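A first iteration of that cycle might look like the sketch below. The body of render_data_history is an assumption (the simplest thing that can make the test pass), not the implementation used in the app, and the example assumes {reactable} and {testthat} are installed.

```r
library(testthat)

# Assumed minimal implementation: pass the history straight to reactable.
# Column formatting, sorting and similar details would be added test by test.
render_data_history <- function(data) {
  reactable::reactable(data)
}

describe("render_data_history", {
  it("should return a reactable widget", {
    # Arrange
    history <- data.frame(
      data_name = c("data_1_v1", "data_1_v2"),
      user = c("user_1", "user_2")
    )

    # Act
    result <- render_data_history(history)

    # Assert
    expect_s3_class(result, "reactable")
  })
})
```

Further criteria, such as which columns are shown or how dates are formatted, can each get their own test case before the implementation grows.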
Call Your Code from the Legacy Code
Once we have implemented the render_data_history function, we can inject the new module into the existing code:
mod_data_ui <- function(id) {
  ns <- shiny::NS(id)
  shiny::div(
    shiny::h1("Data"),
    # ...,
    mod_data_history_ui(ns("data_summary"))
  )
}

mod_data_server <- function(input, output, session) {
  # ...
  mod_data_history_server("data_summary", study_data)
}
Now we can mark our acceptance criteria as done!
- Data changes history is displayed in a table.
- Table is displayed in “Data” section
Summarizing Sprout Technique and Test Driven Development
Using the Sprout Technique and Test Driven Development, we’ve successfully injected a new module into a legacy one. Thanks to the tests, we have a documented characterization of this new module. It:
- should fetch data history,
- should display data history.
We have robust tests that expect a call to an established interface and a call to a rendering function. This test suite characterizes what the module does without knowing those functions’ implementation details. Lower-level details are covered by unit tests for both functions, which allows this module and those functions to evolve independently of each other. Tests for the module will remain valid when:
- the structure of the data returned by get_data_history changes, e.g. when the API changes,
- the HTML returned by render_data_history changes, e.g. when implementing a new design or switching to a different table library.
Start using TDD now to iterate faster and more confidently with legacy codebases!
The post appeared first on appsilon.com/blog/.