Site icon R-bloggers

Read the source code

[This article was first published on The stupidest thing... » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The other day, there was a bit of a twitter conversation about qqline in R.

It made me think: how exactly is the line produced by qqline chosen? I seemed to recall that the line was through the first and third quartiles.

An advantage of R is that you can just type the name of the function and see the code:

# qqline
function (y, datax = FALSE, distribution = qnorm, probs = c(0.25,
    0.75), qtype = 7, ...)
{
    stopifnot(length(probs) == 2, is.function(distribution))
    y <- quantile(y, probs, names = FALSE, type = qtype, na.rm = TRUE)
    x <- distribution(probs)
    if (datax) {
        slope <- diff(x)/diff(y)
        int <- x[1L] - slope * y[1L]
    }
    else {
        slope <- diff(y)/diff(x)
        int <- y[1L] - slope * x[1L]
    }
    abline(int, slope, ...)
}

I was right: They take the 25th and 75th percentiles of the data and of the theoretical distribution, calculate the slope and y-intercept of the line that goes through those two points, and use abline to draw the line.

Open source means the source is open, so why not take the time to look at it?

Sometimes typing the name of the function doesn’t tell you much:

# qqnorm
function (y, ...)
UseMethod("qqnorm")

In such cases, you could try typing, for example, qqnorm.default.

Still, the comments (if there were any) get stripped off, and for long functions, it’s not pretty. So I like to keep a copy of the source code (for example, R-3.0.1.tar.gz; extract it with tar xzf R-3.0.1.tar.gz). I use grep to find the relevant files.

For example,

grep -r 'qqline' R-3.0.1/src/

shows that I should look for qqline in

R-3.0.1/src/library/stats/R/qqnorm.R

For something like cor, you might want to do:

grep -r 'cor <-' R-3.0.1/src

Or maybe:

grep -r 'cor <-' R-3.0.1/src/library/stats/R

But for cor, you probably also want to look at the C code, which is in

R-3.0.1/src/library/stats/src/cov.c

You can learn a lot about programming from the source code for R.


To leave a comment for the author, please follow the link and comment on their blog: The stupidest thing... » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.