Key considerations for retiring/superseding an R package

Posted on February 2, 2025 by James Mba Azam in R bloggers | 0 Comments

[This article was first published on Epiverse-TRACE developer space, and kindly contributed to R-bloggers].

Most of our work in Epiverse-TRACE involves either developing an R package from scratch or adopting and maintaining an existing R package. In the former case, decision-making during development is guided by internal policies documented in the Epiverse-TRACE blueprints. However, a less common scenario for us has been taking on the maintenance of an existing package, a situation we recently encountered with the {bpmodels} R package.

In this post, I want to share some considerations and lessons learned from maintaining {bpmodels}, originally developed by Sebastian Funk at the London School of Hygiene & Tropical Medicine (with contributions by Zhian Kamvar and Flavio Finger), and the decision to retire/supersede it with {epichains}. The aim is not to define strict rules but to spark a conversation about good enough practices and alternative approaches that the R developer community has used or would like to be used more widely.

One of the first considerations was the scope of the package. When maintaining or re-imagining an R package, assessing its scope and identifying opportunities for refinement is crucial. Some packages in the broader R ecosystem have evolved significantly to better align with user needs, and we will highlight a few examples. {plyr} was split into {dplyr} and {purrr} for manipulating data frame and list objects respectively, reflecting more specialized functionality based on object types. Similarly, {reshape} evolved into {reshape2}¹ and later into {tidyr}, with each iteration simplifying and improving upon its predecessor. Another example is the renaming² of {ggmissing} to the more general {naniar}. In the epidemiology ecosystem, two examples are the evolution of {EpiNow} into {EpiNow2} and of {incidence} into {incidence2}.

For {bpmodels}, we wanted to unify the simulation functions (previously two separate functions) and improve the function signature by renaming several of the arguments for readability. We also wanted to introduce an object-oriented workflow to aid interoperability with existing tools such as {epicontacts} and {epiparameter}. The object-oriented backend would also allow us to implement better methods for printing, summarising and aggregating the simulation output. Some of these changes would have been less disruptive than others, but changing the function names and signatures within {bpmodels} itself would have caused considerable disruption, including having to deprecate the existing functions and arguments.
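To make the cost of in-place deprecation concrete, here is a minimal sketch of what keeping an old function alive as a deprecated wrapper might look like in base R. The function and argument names (`simulate_old()`, `simulate_new()`, `n`, `mu`, `n_chains`, `offspring_mean`) are hypothetical and are not the actual {bpmodels} or {epichains} API; the Poisson draw is only a toy stand-in for a real simulation.

```r
# Hypothetical successor function with more readable argument names.
simulate_new <- function(n_chains, offspring_mean) {
  # Draw one Poisson offspring count per chain (toy stand-in for a real model).
  stats::rpois(n = n_chains, lambda = offspring_mean)
}

# Hypothetical old function, kept as a thin wrapper so existing scripts still run.
simulate_old <- function(n, mu) {
  # Emits a warning of class "deprecatedWarning" pointing users to the new name.
  .Deprecated(new = "simulate_new", old = "simulate_old")
  simulate_new(n_chains = n, offspring_mean = mu)
}

# Old calls keep working but warn; suppressWarnings() is used here only for the demo.
old_result <- suppressWarnings(simulate_old(n = 5, mu = 2))
length(old_result)
```

Every renamed function and argument needs a wrapper like this, plus documentation and tests, which is part of why a clean break under a new package name can be the less disruptive route.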

One thing is clear from the examples of scope changes: they often lead to name changes. Another important decision was therefore whether to rebrand the package with a new name. A new name can signal a fresh approach and address limitations of the original package. The most popular example is the renaming of {ggplot} to {ggplot2}. Other examples include the renaming of {reshape} to {reshape2} and then {tidyr}. In our case, we decided to fork the original {bpmodels} repository under Epiverse-TRACE and maintain the old package to avoid breaking scripts that rely on it. At the same time, we introduced {epichains} as the successor. The name reflects the fact that it is a package for analysing epidemiological transmission chains.

A second key consideration is the plans that the original package author(s) may have had and their views on any future changes. In our case, this was fairly straightforward because the maintainer of the original package was fully involved in the refactoring. Another package author who had made substantial contributions could be reached and gave their approval. More generally, however, bringing all package authors on board with, for example, changes in scope and name is an important step in taking on maintenance of a package and one that should not be neglected.

We also had to consider whether to archive {bpmodels} or allow it to coexist with {epichains}. We decided to keep {bpmodels} accessible to sustain the reproducibility of existing code using the package. The package was moved back to the epiforecasts GitHub organisation, where it originated. We did, however, add a lifecycle badge to communicate the package’s retired status, along with text in the README about our plans for low maintenance.

Another technical consideration was how to handle version control and commit histories. When forking a package, it’s important to decide whether to retain the commit history. Options include squashing the history to start with a clean slate, which risks losing visibility of past contributions, or tagging the HEAD commit of the original repository and building from there. For {bpmodels}, we chose to keep the history intact to retain the contributions of its original authors.

Semantic versioning was another key decision point. Since {epichains} was not going to be available immediately but would be developed in the open (on GitHub), we needed to consider how to communicate that to potential users. We decided to start at version 0.0.0.9999 to signal an experimental and unstable phase³ while iterating on features.
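As a small illustration of what such a version number communicates, base R can parse and compare it directly. The sketch below assumes the convention, as we understand it from the R packages book, that a fourth component of 9000 or above marks a development version; the helper `is_dev_version()` and the 0.1.0 release number are hypothetical and used only for illustration.

```r
# The development version mentioned above, parsed by base R.
dev_version <- package_version("0.0.0.9999")

# Hypothetical helper: TRUE when a version has a fourth component >= 9000,
# which we assume here marks an in-development version.
is_dev_version <- function(version) {
  parts <- unlist(package_version(version))
  length(parts) >= 4 && parts[4] >= 9000
}

is_dev_version("0.0.0.9999")  # TRUE
is_dev_version("0.1.0")       # FALSE

# A development version sorts before a (hypothetical) first release.
dev_version < package_version("0.1.0")  # TRUE
```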

Throughout this process, we drew inspiration from various sources. Hadley Wickham’s reasoning for {reshape2} as a reboot of {reshape} and Nicholas Tierney’s reasons for renaming {ggmissing} to {naniar} were helpful. Additionally, a talk at useR! 2024 entitled “retiring packages with extensive reverse dependencies” offered practical advice.

This transition has raised several questions for the community. How do you decide whether to supersede or deprecate a package? What strategies have worked for maintaining backward compatibility while introducing new tools? How do you document and communicate major changes to users? How is all of this done while appropriately crediting past contributions and retaining discoverability and citation/use tracking?

We’d love to hear your thoughts and experiences. Let’s start a conversation about maintaining and evolving open-source tools in a sustainable way.

Footnotes

¹ Notice the URL points to {reshape} instead of {reshape2} although the README mentions the latter. The README however lays out the reasons for the evolution.
² See the reasons given in the changelog here and in the README.
³ See more details on R package versioning and what they communicate in the R packages book.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{mba_azam2025,
  author = {Mba Azam, James and Funk, Sebastian},
  title = {Key Considerations for Retiring/Superseding an {R} Package},
  date = {2025-02-03},
  url = {https://epiverse-trace.github.io/posts/superseding-bpmodels/},
  langid = {en}
}

For attribution, please cite this work as:

Mba Azam, James, and Sebastian Funk. 2025. “Key Considerations for Retiring/Superseding an R Package.” February 3, 2025. https://epiverse-trace.github.io/posts/superseding-bpmodels/.
