February 2025 Top 40 New CRAN Packages
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In February, one hundred fifty-nine new packages made it to CRAN. Here are my Top 40 picks in fifteen categories: Artificial Intelligence, Computational Methods, Ecology, Genomics, Health Sciences, Mathematics, Machine Learning, Medicine, Music, Pharma, Statistics, Time Series, Utilities, Visualization, and Weather.
Artificial Intelligence
chores v0.1.0: Provides a collection of ergonomic large language model assistants designed to help you complete repetitive, hard-to-automate tasks quickly. After selecting some code, press the keyboard shortcut you’ve chosen to trigger the package app, select an assistant, and watch your chore be carried out. Users can create custom helpers just by writing some instructions in a markdown file. There are three vignettes: Getting started, Custom helpers, and Gallery.
gander v0.1.0: Provides a Copilot completion experience that knows how to talk to the objects in your R environment. ellmer
chats are integrated directly into your RStudio
and Positron
sessions, automatically incorporating relevant context from surrounding lines of code and your global environment. See the vignette to get started.
GitAI v0.1.0: Provides functions to scan multiple Git
repositories, pull content from specified files, and process it with LLMs. You can summarize the content, extract information and data, or find answers to your questions about the repositories. The output can be stored in a vector database and used for semantic search or as a part of a RAG (Retrieval Augmented Generation) prompt. See the vignette.
Computational Methods
nlpembeds v1.0.0: Provides efficient methods to compute co-occurrence matrices, point wise mutual information (PMI), and singular value decomposition (SVD), especially useful when working with huge databases in biomedical and clinical settings. Functions can be called on SQL
databases, enabling the computation of co-occurrence matrices of tens of gigabytes of data, representing millions of patients over tens of years. See Hong (2021) for background and the vignette for examples.
NLPwavelet v1.0: Provides functions for Bayesian wavelet analysis using individual non-local priors as described in Sanyal & Ferreira (2017) and non-local prior mixtures as described in Sanyal (2025). See README to get started.
pnd v0.0.9: Provides functions to compute numerical derivatives including gradients, Jacobians, and Hessians through finite-difference approximations with parallel capabilities and optimal step-size selection to improve accuracy. Advanced features include computing derivatives of arbitrary order. There are three vignettes on the topics: Compatibility with numDeriv, Parallel numerical derivatives, and Step-size selection.
rmcmc v0.1.1: Provides functions to simulate Markov chains using the proposal from Livingstone and Zanella (2022) to compute MCMC estimates of expectations with respect to a target distribution on a real-valued vector space. The package also provides implementations of alternative proposal distributions, such as (Gaussian) random walk and Langevin proposals. Optionally, BridgeStan’s R interface BridgeStan
can be used to specify the target distribution. There is an Introduction to the Barker proposal and a vignette on Adjusting the noise distribution.
sgdGMF v1.0: Implements a framework to estimate high-dimensional, generalized matrix factorization models using penalized maximum likelihood under a dispersion exponential family specification, including the stochastic gradient descent algorithm with a block-wise mini-batch strategy and an efficient adaptive learning rate schedule to stabilize convergence. All the theoretical details can be found in Castiglione et al. (2024). Also included are the alternated iterative re-weighted least squares and the quasi-Newton method with diagonal approximation of the Fisher information matrix discussed in Kidzinski et al. (2022). There are four vignettes, including introduction and residuals.
Data
acledR v0.1.0: Provides tools for working with data from ACLED (Armed Conflict Location and Event Data). Functions include simplified access to ACLED’s API, methods for keeping local versions of ACLED data up-to-date, and functions for common ACLED data transformations. See the vignette to get started.
Horsekicks v1/0/2: Provides extensions to the classical dataset Death by the kick of a horse in the Prussian Army first used by Ladislaus von Bortkeiwicz in his treatise on the Poisson distribution Das Gesetz der kleinen Zahlen. Also included are deaths by falling from a horse and by drowning. See the vignette.
OhdsiReportGenerator v1.0.1: Extracts results from the Observational Health Data Sciences and Informatics result database and generates Quarto
reports and presentations. See the package guide.
wbwdi v1.0.0: Provides functions to access and analyze the World Bank’s World Development Indicators (WDI) using the corresponding API. WDI provides more than 24,000 country or region-level indicators for various contexts. See the vignette.
Ecology
rangr v1.0.6: Implements a mechanistic virtual species simulator that integrates population dynamics and dispersal to study the effects of environmental change on population growth and range shifts. Look here for background and see the vignette to get started.
Economics
godley v0.2.2: Provides tools to define, simulate, and validate stock-flow consistent (SFC) macroeconomic models by specifying governing systems of equations. Users can analyze how macroeconomic structures affect key variables, perform sensitivity analyses, introduce policy shocks, and visualize resulting economic scenarios. See Godley and Lavoie (2007), Kinsella and O’Shea (2010) for background and the vignette to get started.
Genomics
gimap v1.0.3: Helps to calculate genetic interactions in CRISPR targets by taking data from paired CRISPR screens that have been pre-processed to count tables of paired gRNA reads. Output are genetic interaction scores, the distance between the observed CRISPR score and the expected CRISPR score. See Berger et al. (2021) for background and the vignettes Quick Start, Timepoint Experiment, and Treatment Experiment.
MIC v1.0.2: Provides functions to analyze, plot, and tabulate antimicrobial minimum inhibitory concentration (MIC) data and predict MIC values from whole genome sequence data stored in the Pathosystems Resource Integration Center (2013) database or locally. See README for examples.
Health Sciences
matriz v1.0.1: Implements a workflow that provides tools to create, update, and fill literature matrices commonly used in research, specifically epidemiology and health sciences research. See README to get started.
Mathematics
flint v0.0.3: Provides an interface to FLINT
, a C library for number theory which extends GNU MPFR and GNU MP with support for arithmetic in standard rings (the integers, the integers modulo n, the rational, p-adic, real, and complex numbers) as well as vectors, matrices, polynomials, and power series over rings and implements midpoint-radius interval arithmetic, in the real and complex numbers See Johansson (2017) for information on computation in arbitrary precision with rigorous propagation of errors and see the NIST Digital Library of Mathematical Functions for information on additional capabilities. Look here to get started.
Machine Learning
tall v0.1.1: Implements a general-purpose tool for analyzing textual data as a shiny
application with features that include a comprehensive workflow, data cleaning, preprocessing, statistical analysis, and visualization. See the vignette.
“}
Medicine
BayesERtools v0.2.1: Provides tools that facilitate exposure-response analysis using Bayesian methods. These include a streamlined workflow for fitting types of models that are commonly used in exposure-response analysis – linear and Emax for continuous endpoints, logistic linear and logistic Emax for binary endpoints, as well as performing simulation and visualization. Look here to learn more about the workflow, and see the vignette for an overview.
Medicine Continued
PatientLevelPrediction v6.4.0: Implements a framework to create patient-level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features, which can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps et al. (2017). There are fourteen vignettes, including Building Patient Level Prediction Models and Best Practices.
SimTOST v1.0.2: Implements a Monte Carlo simulation approach to estimating sample sizes, power, and type I error rates for bio-equivalence trials that are based on the Two One-Sided Tests (TOST) procedure. Users can model complex trial scenarios, including parallel and crossover designs, intra-subject variability, and different equivalence margins. See Schuirmann (1987), Mielke et al. (2018), and Shieh (2022) for background. There are seven vignettes including Introduction and Bioequivalence Tests for Parallel Trial Designs: 2 Arms, 1 Endpoint.
Music
musicXML v1.0.1: Implements tools to facilitate data sonification and create files to share music notation in the musicXML format. Several classes are defined for basic musical objects such as note pitch, note duration, note, measure, and score. Sonification functions map data into musical attributes such as pitch, loudness, or duration. See the blog and Renard and Le Bescond (2022) for examples and the vignette to get started.
Pharma
emcAdr v1.2: Provides computational methods for detecting adverse high-order drug interactions from individual case safety reports using statistical techniques, allowing the exploration of higher-order interactions among drug cocktails. See the vignette.
SynergyLMM v1.0.1: Implements a framework for evaluating drug combination effects in preclinical in vivo studies, which provides functions to analyze longitudinal tumor growth experiments using linear mixed-effects models, perform time-dependent analyses of synergy and antagonism, evaluate model diagnostics and performance, and assess both post-hoc and a priori statistical power. See Demidenko & Miller (2019 for the calculation of drug combination synergy and Pinheiro and Bates (2000) and Gałecki & Burzykowski (2013) for information on linear mixed-effects models. The vignette offers a tutorial.
vigicaen v0.15.6: Implements a toolbox to perform the analysis of the World Health Organization (WHO) Pharmacovigilance database, VigiBase, with functions to load data, perform data management, disproportionality analysis, and descriptive statistics. Intended for pharmacovigilance routine use or studies. There are eight vignettes, including basic workflow and routine pharmacoviligance.
Psychology
cogirt v1.0.0: Provides tools to psychometrically analyze latent individual differences related to tasks, interventions, or maturational/aging effects in the context of experimental or longitudinal cognitive research using methods first described by Thomas et al. (2020). See the vignette.
Statistics
DiscreteDLM v1.0.0: Provides tools for fitting Bayesian distributed lag models (DLMs) to count or binary, longitudinal response data. Count data are fit using negative binomial regression, binary are fit using quantile regression. Lag contribution is fit via b-splines. See Dempsey and Wyse (2025) for background and README for examples.
oneinfl v1.0.1: Provides functions to estimate Estimates one-inflated positive Poisson, one-inflated zero-truncated negative binomial regression models, positive Poisson models, and zero-truncated negative binomial models along with marginal effects and their standard errors. The models and applications are described in Godwin (2024). See README for and example.
Time Series
BayesChange v2/0/0: Provides functions for change point detection on univariate and multivariate time series according to the methods presented by Martinez & Mena (2014) and Corradin et al. (2022) along with methods for clustering time dependent data with common change points. See Corradin et. al. (2024). There is a tutorial.
echos v1.0.3: Provides a lightweight implementation of functions and methods for fast and fully automatic time series modeling and forecasting using Echo State Networks. See the vignettes Base functions and Tidy functions.
quadVAR v0.1.2: Provides functions to estimate quadratic vector autoregression models with the strong hierarchy using the Regularization Algorithm under Marginality Principle of Hao et al. (2018) to compare the performance with linear models and construct networks with partial derivatives. See README for examples.
Utilities
aftables v1.0.2: Provides tools to generate spreadsheet publications that follow best practice guidance from the UK government’s Analysis Function. There are four vignettes, including an Introduction and Accessibility.
watcher v0.1.2: Implements an R
binding for libfswatch
, a file system monitoring library, that enables users to watch files or directories recursively for changes in the background. Log activity or run an R
function every time a change event occurs. See the README for an example.
Visualization
jellyfisher v1.0.4: Generates interactive Jellyfish plots to visualize spatiotemporal tumor evolution by integrating sample and phylogenetic trees into a unified plot. This approach provides an intuitive way to analyze tumor heterogeneity and evolution over time and across anatomical locations. The Jellyfish plot visualization design was first introduced by Lahtinen et al. (2023). See the vignette.
xdvir v0.1-2: Provides high-level functions to render LaTeX
fragments as labels and data symbols in ggplot2
plots, plus low-level functions to author, produce, and typeset LaTeX
documents, and to produce, read, and render DVI
files. See the vignette.
Weather
RFplus v1.4-0: Implements a machine learning algorithm that merges satellite and ground precipitation data using Random Forest for spatial prediction, residual modeling for bias correction, and quantile mapping for adjustment, ensuring accurate estimates across temporal scales and regions. See the vignette.
SPIChanges v0.1.0: Provides methods to improve the interpretation of the Standardized Precipitation Index under changing climate conditions. It implements the nonstationary approach of Blain et al. (2022) to detect trends in rainfall quantities and quantify the effect of such trends on the probability of a drought event occurring. There is an Introduction and a vignette Monte Carlo Experiments and Case Studies.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.