Really useful bits of code that are missing from R
There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere.
Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data.
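geomean <- function(x, na.rm = FALSE, trim = 0, ...) {
  # Average on the log scale, then back-transform.
  # log() always uses the natural base here so that exp() inverts it.
  exp(mean(log(x), na.rm = na.rm, trim = trim, ...))
}

geosd <- function(x, na.rm = FALSE) {
  # sd() has no ... argument, so nothing extra is passed on.
  exp(sd(log(x), na.rm = na.rm))
}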
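As a quick sanity check, the sketch below compares the log-scale calculation with the textbook nth-root-of-the-product formula. The helpers are repeated in simplified form (no `trim` or `...`) so that the snippet runs on its own, and the data values are made up purely for illustration.

```r
# Simplified copies of the helpers, repeated so this snippet is self-contained.
geomean <- function(x, na.rm = FALSE) exp(mean(log(x), na.rm = na.rm))
geosd   <- function(x, na.rm = FALSE) exp(sd(log(x), na.rm = na.rm))

x <- c(1, 10, 100)   # made-up data

geomean(x)                # 10, up to floating-point error
prod(x)^(1 / length(x))   # same value via the nth root of the product
geosd(x)                  # 10: the logs are evenly spaced by log(10)

# Missing values are skipped only when asked for.
geomean(c(x, NA))                 # NA
geomean(c(x, NA), na.rm = TRUE)   # 10
```

The log-scale version is usually the safer of the two for long vectors, since `prod(x)` can overflow or underflow well before the mean of the logs does.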
A drop option for nlevels. Sure, your factor has 99 levels, but how many of them actually crop up in your dataset?
nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])
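A small illustration of the difference, using a made-up factor with one unused level. The wrapper is copied here so the snippet is self-contained; the comparison with `droplevels()`, which is part of base R, is an addition for cross-checking rather than something from the snippet above.

```r
# Copy of the wrapper so the snippet runs on its own.
# Note that it masks base::nlevels in the current environment.
nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])

# Made-up factor: three declared levels, only two of which occur.
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

nlevels(f)                     # 3, all declared levels
nlevels(f, drop = TRUE)        # 2, only the levels that occur
base::nlevels(droplevels(f))   # 2, the same count via droplevels()
```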
A way of converting factors to numbers that is quicker than as.numeric(as.character(my_factor)) and easier to remember than the method suggested in the R FAQ.
factor2numeric <- function(f) {
  if (!is.factor(f)) stop("the input must be a factor")
  as.numeric(levels(f))[as.integer(f)]
}
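The reason a helper is worth having is that calling `as.numeric()` directly on a factor returns the internal level codes rather than the values. A minimal sketch with made-up values (the function is repeated so the snippet runs standalone):

```r
factor2numeric <- function(f) {
  if (!is.factor(f)) stop("the input must be a factor")
  # Convert each level once, then index by the integer codes.
  as.numeric(levels(f))[as.integer(f)]
}

f <- factor(c("10", "2.5", "10", "7"))   # made-up numbers stored as a factor

as.numeric(f)                 # 1 2 1 3: the level codes, not the values
as.numeric(as.character(f))   # 10 2.5 10 7
factor2numeric(f)             # 10 2.5 10 7
```

The speed-up presumably comes from converting only the distinct levels instead of every element, so it should matter most for long factors with few levels.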
A “not in” operator. Not many people know the precedence rules well enough to know that !x %in% y means !(x %in% y) rather than (!x) %in% y, but x %!in% y should be clear to all.
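The precedence claim is easy to check at the console. A minimal sketch with made-up vectors, chosen so that the two readings give different answers:

```r
x <- c(0, 2, 7)   # made-up values
y <- c(2, 3, 5)

!x %in% y      # TRUE FALSE TRUE: parsed as !(x %in% y)
!(x %in% y)    # TRUE FALSE TRUE
(!x) %in% y    # FALSE FALSE FALSE: negates x first, then looks it up in y
```

In the last line `!x` is the logical vector TRUE FALSE FALSE, which is matched against `y` as 1, 0, 0, so nothing is found.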
"%!in%" <- function(x, y) !(x %in% y)
I’m sure there are loads more snippets like this that would be useful to have; please contribute your own in the comments.