September 2017 New Package Picks
There were so many interesting ideas among the 222 new packages that made it to CRAN in September that I found it exceptionally difficult to decide on the “Top 40” packages. In the end, I only managed to limit my selection to 40 by avoiding all packages that I would normally classify under “Data”: packages that are primarily intended to provide access to some data source. I hope to make up for this by providing a list of data packages sometime soon.
Below are my picks for September’s Top 40 in six categories: Computational Methods, Machine Learning, Science, Statistics, Utilities, and Visualizations.
Computational Methods
DES v1.0.0: Implements an event-oriented approach to Discrete Event Simulation. There is a tutorial.
JuliaCall v0.9.3: Implements an interface to Julia. The vignette illustrates basic usage.
Rlinsolve v0.1.1: Implements iterative solvers for sparse linear systems of equations, including basic stationary iterative solvers using Jacobi, Gauss-Seidel, Successive Over-Relaxation and SSOR methods and non-stationary, Krylov subspace methods. There is a vignette to get started. Detailed descriptions may be found in the SIAM book.
sdpt3r v0.1: Implements the SDPT3 method of Toh, Todd, and Tutuncu to solve Semi-Definite Linear Programming problems. There are several vignettes illustrating the use of the package in various applications, including D-Optimal Experimental Design and Distance Weighted Discrimination.
VeryLargeIntegers v0.1.4: Provides tools to work with arbitrarily large integers without loss of precision.
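To give a feel for the stationary iterative solvers mentioned under Rlinsolve, here is a minimal sketch of the Jacobi method. It is written in plain Python purely as an illustration of the algorithm and is not Rlinsolve's API; the function name `jacobi`, its parameters, and the small test system are all assumptions made for this example. The sketch assumes a strictly diagonally dominant matrix, which is sufficient for Jacobi iteration to converge.

```python
def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b by Jacobi iteration.

    A is a list of rows and is assumed to be strictly diagonally dominant.
    Returns the approximate solution and the number of iterations used.
    """
    n = len(b)
    x = [0.0] * n  # start from the zero vector
    for iteration in range(1, max_iter + 1):
        x_new = []
        for i in range(n):
            # Every component is updated from the previous iterate only.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new.append((b[i] - off_diag) / A[i][i])
        # Stop once the largest change between successive iterates is below tol.
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, iteration
        x = x_new
    raise RuntimeError("Jacobi iteration did not converge")


# Hypothetical 3x3 tridiagonal system whose exact solution is (1, 2, 3).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x, n_iter = jacobi(A, b)
```

Gauss-Seidel differs only in that each updated component is used immediately within the same sweep, and Successive Over-Relaxation adds a relaxation weight to that update.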
Machine Learning
bnclassify v0.3.3: Implements algorithms for learning discrete Bayesian network classifiers from data, including a number of those described in Bielza & Larranaga. There is an Introduction and vignettes giving Runtime Information and additional Technical Information.
DMRnet v0.1.0: Provides model selection algorithms for regression and classification, where the predictors can be numerical and categorical and the number of regressors exceeds the number of observations. See the papers by Maj-Kańska et al. and Pokarowski and Mielniczuk for the mathematical details.
ELMSurv v0.4: Implements an Extreme Learning Machine for Survival Analysis. Look here for details and here to get started.
fastrtext v0.2.1: Provides an interface to Facebook’s fastText library for text representation and classification. There is a List of Commands and vignettes on Supervised and Unsupervised learning.
FSelectorRcpp v0.1.8: Provides an Rcpp-based implementation of FSelector entropy-based feature selection algorithms based on a Multi-Interval Discretization, with sparse matrix support. There are vignettes on Getting Started and Benchmarks.
googleLanguageR v0.1.0: Provides an interface to Google Cloud machine-learning APIs for text and speech tasks. Call the Cloud Translation API for detection and translation of text, the Natural Language API to analyse text for sentiment, entities or syntax, and the Cloud Speech API to transcribe sound files to text. There is an Introduction and vignettes for the NLP, Speech, and Translation APIs.
leabRa v0.1.0: Implements the Leabra (local, error-driven and associative, biologically realistic) algorithm, which allows for the construction of artificial neural networks that are biologically realistic and that balance supervised and unsupervised learning within a single framework. See the vignette to get started and look here for details.
lime v0.3.0: Is a port of the Python package, which attempts to explain the outcome of black-box models by fitting local models around the points of interest. Look here for details. There is a vignette to get you started.
slowraker v0.1.0: Implements the RAKE algorithm, which can be used to extract keywords from documents without any training data. There is a Getting Started vignette and a list of FAQs.
udpipe v0.1.1: Provides a natural-language-processing toolkit for tokenization, parts-of-speech tagging, lemmatization, and dependency parsing of raw text. For details, see this paper and the vignettes on Annotating Text and Model Building.
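For readers curious how the RAKE algorithm behind slowraker can extract keywords without training data, the core idea can be sketched in a few lines. The sketch below is plain Python, shown only to illustrate the algorithm; it is not slowraker's API and it leaves out refinements a full implementation may add. The stop-word list, the function names, and the sample sentence are all assumptions made for this example. Candidate phrases are runs of words between stop words and punctuation, each word is scored by its degree divided by its frequency, and a phrase's score is the sum of its word scores.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "any", "are", "be", "can", "for", "from", "in",
              "is", "of", "on", "or", "the", "to", "used", "which", "with",
              "without"}


def candidate_phrases(text):
    """Split text into runs of consecutive non-stop words, breaking at punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text):
    """Return (phrase, score) pairs sorted from highest to lowest score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


text = ("Rapid automatic keyword extraction is used to extract keywords "
        "from documents without any training data.")
keywords = rake(text)
```

On this sample sentence the four-word phrase "rapid automatic keyword extraction" ranks first, because words that appear in longer candidate phrases accumulate a higher degree.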
Science
afpt v1.0.0: Implements the aerodynamic power model described in Klein Heerenbrink et al., and allows estimation and modelling of flight costs in vertebrate animal flight. There are vignettes on Basic Usage, the underlying Aerodynamic Model, and Multiple Birds.
soundgen v1.1.0: Provides tools for sound synthesis and acoustic analysis. There are vignettes on Acoustic Analysis and Sound Generation.
Statistics
cr17 v0.1.0: Provides tools for analyzing competing-risks models, including testing differences between groups (Gray, and Fine and Gray) and visualizations of survival and cumulative incidence curves. The vignette gives examples.
EAinference v0.2.1: Provides estimator augmentation methods for statistical inference on high-dimensional data, as described in Zhou and in Zhou and Min. The vignette describes how to use the package.
fdAnova v0.1.0: Provides functions to perform analysis of variance testing procedures for univariate and multivariate functional data. See Cuesta-Albertos and Febrero-Bande. There is a comprehensive vignette.
geex v1.0.3: Provides a general, flexible framework for estimating parameters and the empirical sandwich variance estimator from a set of unbiased estimating equations. See M-estimation as in Stefanski & Boos. There is an Introduction, as well as vignettes on M-estimation, Custom root solvers, Parameter Estimation, Software Design, and more.
mosaicModel v0.3.0: Provides functions for evaluating, displaying, and interpreting statistical models with the goal of abstracting the operations on models from the particular architecture of the model. The vignette shows how to use the package.
odr v0.3.2: Provides methods for calculating the optimal sample allocation that minimizes variance of treatment effects in a multilevel randomized trial under fixed budget and cost structure, and for performing power analyses with and without accommodating costs and budget. There is a vignette.
mvord v0.1.0: Provides a flexible framework for fitting multivariate ordinal regression models with composite likelihood methods. The vignette gives the details.
OutliersO3 v0.2.1: Provides methods for identifying potential outliers for all combinations of a dataset’s variables. The vignette shows how to use the package.
powerlmm v0.1.0: Implements both analytical and simulation methods to calculate power for two- and three-level multilevel longitudinal studies with missing data. The analytical calculations extend the method described in Galbraith et al. to three-level models. There are tutorials on Model Evaluation via Monte Carlo Simulation, Two-level Longitudinal Power Analysis, Three-level Longitudinal Power Analysis, and a vignette on the Details of Power Calculations.
randnet v0.1: Facilitates model-selection and parameter-tuning procedures for a class of random network models. Model selection can be done with a general cross-validation framework called ECV, with NCV, with a likelihood ratio method, or with spectral methods.
threshr v1.0.0: Provides functions for the selection of thresholds for use in extreme value models, based mainly on the methodology in Northrop, Attalides and Jonathan. There is a vignette.
tscount v1.4.0: Implements likelihood-based methods for model fitting and assessment, prediction, and intervention analysis of count time series following generalized linear models. The vignette provides the details.
Utilities
basictabler v0.1.0: Provides functions to create tables from data frames and matrices, manipulate tables row-by-row, column-by-column or cell-by-cell, and then publish them using HTML, HTML widgets or Excel. There is an Introduction and vignettes on Working with Cells, Outputs, Styling, Formatting, Shiny, and Excel.
bigstatsr v0.2.2: Uses file-backed matrices to provide scalable statistical tools.
keyring v1.0.0: Provides a platform-independent API to access the operating system’s credential store. It currently supports: Keychain on macOS, the Credential Store on Windows, the Secret Service API on Linux, and a simple, platform-independent store implemented with environment variables.
pinp v0.0.2: Offers a PNAS-like style for rmarkdown derived from the Proceedings of the National Academy of Sciences of the United States of America. The vignette shows how to get started.
re2r v0.2.0: Provides an interface to Google’s deterministic finite-automaton-based regular expression engine that is very fast at matching large amounts of text. There is an Introduction and a vignette on Syntax.
spiderbar v0.2.0: Provides a wrapper for the rep-cpp C++ library for processing robots.txt files in accordance with the Robots Exclusion Protocol, a set of standards for allowing or excluding robot/spider crawling of different areas of site content. Look in the README for an example of how to use the package.
tibbletime v0.0.2: Is an extension of the tibble package that allows for the creation of time-aware tibbles. Some immediate advantages include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and calling functions similar in spirit to the map family from purrr on time-based tibbles. There is an Introduction and vignettes on Time-based Filtering, Changing Periodicity, and Rolling Calculations.
Visualizations
egg v0.2.0: Provides miscellaneous functions to customize ggplot2 plots, including high-level functions to post-process layouts and allow alignment between plot panels, as well as setting panel sizes to fixed values. There is an Overview and a vignette for laying out multiple plots on a page.
ggridges v0.4.1: Extends ggplot2 to enable ridgeline plots, which are a way of visualizing changes in distributions over time or space. There is an introduction and a gallery of examples.
linemap v0.1.0: Provides functions to create maps from lines. The README file shows examples.