Really useful bits of code that are missing from R

richierocks

11 years ago

[This article was first published on 4D Pie Charts » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere.

Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data.

geomean <- function(x, na.rm = FALSE, trim = 0, ...)
{
exp(mean(log(x, ...), na.rm = na.rm, trim = trim, ...))
}

geosd <- function(x, na.rm = FALSE, ...)
{
exp(sd(log(x, ...), na.rm = na.rm, ...))
}

A drop option for nlevels. Sure your factor has 99 levels, but how many of them actually crop up in your dataset?

nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])

A way of converting factors to numbers that is quicker than as.numeric(as.character(my_factor)) and easier to remember than the method suggested in the FAQ on R.

factor2numeric <- function(f)
{
   if(!is.factor(f)) stop("the input must be a factor")
   as.numeric(levels(f))[as.integer(f)]
}

A “not in” operator. Not many people know the precedence rules well enough to know that !x %in% y means !(x %in% y) rather than (!x) %in% y, but x %!in% y should be clear to all.

"%!in%" <- function(x, y) !(x %in% y)

I’m sure there are loads more snippets like this that would be useful to have; please contribute your own in the comments.

Tagged: r

To leave a comment for the author, please follow the link and comment on their blog: 4D Pie Charts » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.