December 2019: “Top 40” New R Packages
One hundred fifty-two packages made it to CRAN in December. Here are my “Top 40” picks in ten categories: Data, Genomics, Machine Learning, Mathematics, Medicine, Science, Statistics, Time Series, Utilities, and Visualization.
Data
climate v0.3.0: Provides access to meteorological and hydrological data from OGIMET, University of Wyoming – atmospheric vertical profiling data, and Polish Institute of Meteorology and Water Management – National Research Institute. There is a vignette.
CCAMLRGIS v3.0.1: Loads and creates spatial data, including layers and tools that are relevant to the activities of the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR). Have a look at the vignette.
schrute v0.1.1: Contains the complete scripts from the American version of the Office television show in tibble format. Have a look at the vignette and practice NLP.
simfinR v0.1.0: Provides access to SimFin financial data, including balance sheets, cash flow statements, and income statements, through the SimFin API. Look here for details.
statcanR v0.1.0: Provides access to Statistics Canada’s Web Data Service. See Warin & Le Duc (2019) and the vignette.
Genomics
ampir v0.1.0: Implements a toolkit to predict antimicrobial peptides from protein sequences on a genome-wide scale, including an SVM model trained on publicly available antimicrobial peptide data using calculated physico-chemical and compositional sequence properties described in Meher et al. (2017). There is a brief Introduction.
simplePHENOTYPES v1.0.5: Implements algorithms for simulating pleiotropy and Linkage Disequilibrium under additive, dominance and epistatic models. See Lipka et al. (2012) and Rice and Lipka (2019) for background and the vignette for an introduction.
TreeTools v0.1.3: Provides functions for the creation, modification and analysis of phylogenetic trees and the import and export of trees from Newick, Nexus (Maddison et al. 1997), and TNT formats. There are vignettes on Loading Data, Loading Trees, and Navigating the File System.
Machine Learning
AzureVision v1.0.0: Implements an interface to Azure Computer Vision and Azure Custom Vision which allow users to leverage the cloud to carry out visual recognition tasks using advanced image processing models. There is a vignette on Computer Vision and another on Custom Vision.
dann v0.1.0: Implements Discriminant Adaptive Nearest Neighbor classification, a variation of k nearest neighbors in which the neighborhood is elongated along class boundaries. See Hastie (1995) for details. There is an Introduction and a vignette on Sub-dann.
eventstream v0.1.0: Provides functions to extract and classify events in contiguous spatio-temporal data streams of 2 or 3 dimensions. For details see Kandanaarachchi et al. (2018). There is an example in README.
isotree v0.1.8: Provides multi-threaded implementations of isolation forest, extended isolation forest, SCiForest, and fair-cut forest for isolation-based outlier detection, clustered outlier detection, distance or similarity approximation, and imputation of missing values as described in Cortes (2019). Look here for an example.
mlr3proba v0.1.1: Extends mlr3 for probabilistic supervised learning that includes probabilistic and interval regression, survival modeling, and other specialized models. There is a vignette on Survival Analysis.
NLPclient v1.0: Implements an interface to the Stanford CoreNLP annotation client which includes a part-of-speech (POS) tagger, a named entity recognizer (NER), a parser, and a co-reference resolution system. See README for installation details.
stray v0.1.0: Modifies the HDoutliers package for outlier detection in high dimensional data to include the algorithm proposed in Talagala, Hyndman and Smith-Miles (2019).
tfhub v0.7.0: Provides an interface to TensorFlow Hub, a library for the publication, discovery, and consumption of reusable parts of machine learning models. Modules comprise self-contained parts of TensorFlow graphs along with weights and assets that can be reused across different tasks in a process known as transfer learning. There is an Overview and vignettes on Key Concepts and using TensorFlow with Keras.
Mathematics
dual v0.0.3: Implements automatic differentiation using dual numbers and returns the output value of a mathematical function along with its exact first derivative (or gradient). For more details see Baydin et al. (2018).
set6 v0.1.0: Uses R6 to provide an object-oriented interface for constructing and manipulating mathematical sets, including (countably finite) sets, tuples, intervals (countably infinite or uncountable), and fuzzy variants. See the vignette for an introduction.
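The dual-number idea behind the dual entry above can be sketched in a few lines of base R. This is only an illustration of the technique, not the dual package's API: the helper names (dual_num, d_add, d_mul, d_sin) are hypothetical and were made up for this example. A dual number a + b*eps, with eps^2 = 0, carries a function value and its derivative together, so evaluating a function on a seeded input yields the exact first derivative alongside the result.

```r
# A dual number a + b*eps (eps^2 = 0) is stored as a list:
# `value` holds a, `deriv` holds b. All names here are hypothetical.
dual_num <- function(value, deriv = 0) list(value = value, deriv = deriv)

# Sum rule: values add, derivatives add.
d_add <- function(x, y) dual_num(x$value + y$value, x$deriv + y$deriv)

# Product rule: (a + b*eps)(c + d*eps) = ac + (bc + ad)*eps.
d_mul <- function(x, y) {
  dual_num(x$value * y$value, x$deriv * y$value + x$value * y$deriv)
}

# Chain rule for sine: sin(a + b*eps) = sin(a) + cos(a)*b*eps.
d_sin <- function(x) dual_num(sin(x$value), cos(x$value) * x$deriv)

# Example function f(x) = x * sin(x) + x, written with the dual operations.
f <- function(x) d_add(d_mul(x, d_sin(x)), x)

# Seed the derivative part with 1 to differentiate with respect to x.
x0 <- dual_num(2, 1)
res <- f(x0)

res$value  # f(2)  = 2 * sin(2) + 2
res$deriv  # f'(2) = sin(2) + 2 * cos(2) + 1
```

Because the derivative is propagated by exact rules rather than finite differences, the result carries no truncation error beyond ordinary floating-point rounding.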
Medicine
LARisk v0.1.0: Provides functions to compute lifetime attributable risk of radiation-induced cancer. See Gonzalez et al. (2012) for background and the vignette for an example.
SCtools v0.3.0: Provides extensions to the synthetic controls analyses performed by the package Synth as detailed in Abadie et al. (2011) that include generating and plotting placebos, post/pre-MSPE (mean squared prediction error) significance tests and plots, and calculating average treatment effects for multiple treated units. There is a vignette on replicating the Basque Study and another on Alcohol Consumption.
Science
chronosphere v0.2.0: Provides functions to facilitate the spatial analyses in (paleo)environmental/ecological research and serves as a gateway to plate tectonic reconstructions, deep time global climate model results as well as fossil occurrence datasets such as the Paleobiology Database and the PaleoReefs Database. See the vignette for an introduction.
OCNet v0.1.1: Provides functions to generate and analyze Optimal Channel Networks (OCNs): oriented spanning trees reproducing all scaling features characteristic of real, natural river networks. See Rinaldo et al. (2014) for an overview on the OCN concept, Furrer and Sain (2010) for the construct used, and the vignette for examples.
Statistics
bnma v1.0.0: Provides functions for network meta-analyses using the Bayesian framework of Dias et al. (2013). See the vignette.
npsurvSS v1.0.1: Provides sample size and power calculations for common non-parametric tests in survival analysis including the difference in (or ratio of) t-year survival, difference in (or ratio of) p-th percentile survival, difference in (or ratio of) restricted mean survival time, and the weighted log-rank test. There are vignettes on Basic Functions, Optimal Randomization Ratio, and Delayed Treatment Effect.
sail v0.1.0: Implements sparse additive interaction learning with the strong heredity property, i.e., an interaction is selected only if its corresponding main effects are also included. See Bhatnagar et al. (2019) for background. There is also an Introduction and a vignette on supplying a user-defined design matrix.
SequenceSpikeSlab v0.1.1: Implements the algorithms described in Van Erven & Szabo (2018) to calculate the exact Bayes posterior for the Sparse Normal Sequence Model. See the vignette.
tcensReg v0.1.5: Implements maximum likelihood estimation (MLE) assuming an underlying left truncated normal distribution with left censoring described in Williams et al. (2019). See the vignette.
univariateML v1.0.0: Looks back to the roots of maximum likelihood estimation (Fisher (1921)) to provide functions for the ML estimation of univariate densities. There is an Overview and vignettes on Copula Modeling and Distributions.
Time Series
imputeFin v0.1.0: Provides functions to impute the missing values based on modeling the time series with a random walk or an autoregressive (AR) model, convenient to model log-prices and log-volumes in financial data. See Liu et al. (2019) for background and the vignette for examples.
VLTimeCausality v0.1.0: Implements a framework to infer causality on a pair of time series of real numbers based on variable-lag Granger causality and transfer entropy. See Zheleva & Berger-Wolf (2019) for the details and the vignette for examples.
Utilities
asciicast v1.0.0: Implements tools to record screen casts from R scripts and convert them to animated SVG images for use in README files and blog posts. It includes asciinema-player as an HTML widget, and a knitr engine, to embed ascii screen casts in R Markdown documents. There is a vignette.
funneljoin v0.1.0: Implements time-based joins to analyze sequences of events, both in memory and out of memory. See the vignette for details.
hardhat v0.1.1: Provides tools to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input. There is an Introduction and vignettes on Forging Data and Molding Data.
proffer v0.0.2: Builds on pprof to provide profiling tools capable of detecting sources of slowness in R code. Look here for more information.
ropenblas v0.2.0: Facilitates downloading, compiling and linking the OpenBLAS library for users of any GNU/Linux distribution. See README for help.
sortable v0.4.2: Provides functions to enable drag-and-drop behavior in Shiny apps by exposing the functionality of the SortableJS JavaScript library as an htmlwidget. There is a live demo on Using Sortable and another on Using Sortable widgets, and a vignette on the Interface to SortableJS.
sparkhail v0.1.1: Implements a sparklyr interface to Hail, an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data. Hail has been built to scale and to provide first-class support for multi-dimensional structured data, which is typical of genome-wide association studies. See README for information on how to use the package.
trimmer v0.8.1: Implements a lightweight toolkit to reduce the size of a list object based on user input. See the vignette.
Visualization
gggibbous v0.1.0: Extends ggplot2 to offer moon charts, pie charts where the proportions are shown as crescent or gibbous portions of a circle, like the lit and unlit portions of the moon. It is all illuminated in the vignette.
patchwork v1.0.0: Extends the ggplot2 API to allow for arbitrarily complex plot compositions by providing mathematical operators for combining multiple plots. See the vignette for examples.