
Tools to tidy up R code

[This article was first published on SAS and R, and kindly contributed to R-bloggers.]
Last week we made an impassioned plea for attention to style in formatting R and SAS code.

While it’s always better to adopt a consistent style and use it whenever you write code, the reality is that sometimes formatting slips (or you end up repurposing code that others wrote). In those situations, the formatR package (due to Yihui Xie) is a useful tool to help tidy up code.

Imagine that the poorly formatted and somewhat obfuscated code from the last entry has been copied to the clipboard:
ds=read.csv("http://www.math.smith.edu/r/data/help.csv");attach(ds)
fOo=ks.test(age[female==1],age[female==0],data=ds)
plotdens=function(x,y,mytitle, mylab){densx = density(x)
densy = density(y);plot(densx,main=mytitle,lwd=3,xlab=mylab, 
bty="l");lines(densy,lty=2,col=2,lwd=3);xvals=c(densx$x, 
rev(densy$x));yvals=c(densx$y,rev(densy$y));polygon(xvals, 
yvals,col="gray")};mytitle=paste("Test of ages: D=",round(fOo$statistic,3),
" p=",round(fOo$p.value,2),sep="");plotdens(age[female==1],
age[female==0],mytitle=mytitle,mylab="age (in years)")
legend(50,.05,legend=c("Women","Men"),col=1:2,lty=1:2,lwd=2)

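Running the commands (the function reads from the clipboard by default; it is named tidy_source() in current formatR releases and was called tidy.source() in older ones):
library(formatR)
tidy_source()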

will generate the following output:
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
attach(ds)
fOo = ks.test(age[female == 1], age[female == 0], data = ds)
plotdens = function(x, y, mytitle, mylab) {
    densx = density(x)
    densy = density(y)
    plot(densx, main = mytitle, lwd = 3, xlab = mylab, bty = "l")
    lines(densy, lty = 2, col = 2, lwd = 3)
    xvals = c(densx$x, rev(densy$x))
    yvals = c(densx$y, rev(densy$y))
    polygon(xvals, yvals, col = "gray")
}
mytitle = paste("Test of ages: D=", round(fOo$statistic, 3), " p=", 
    round(fOo$p.value, 2), sep = "")
plotdens(age[female == 1], age[female == 0], mytitle = mytitle, 
    mylab = "age (in years)") 
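
The clipboard is only one possible source. As a minimal sketch, assuming a recent formatR release in which the function is named tidy_source(), the code can also be passed directly through the text argument, which makes the example reproducible in a script. The messy snippet below is a shortened, hypothetical stand-in for the code above, not the full example:

```r
library(formatR)

# A shortened, hypothetical stand-in for the messy clipboard contents
messy <- c(
  "plotdens=function(x,y){densx=density(x);densy=density(y)",
  "plot(densx,lwd=3);lines(densy,lty=2,col=2,lwd=3)}"
)

# text = supplies the code directly instead of reading the clipboard;
# output = FALSE suppresses printing so the result can be captured
res <- tidy_source(text = messy, output = FALSE)

# text.tidy holds the reformatted code as a character vector
cat(res$text.tidy, sep = "\n")
```

Capturing the result rather than printing it is handy when the tidied code needs to be written back to a file with writeLines().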

The cleaned-up code is much easier to read, and it took minimal effort to produce. Some quibbles: I’m not as fond of using 4 spaces of indentation. And I always worry about what will happen if there are syntax errors (yipes!). But this is a useful tool to have in your toolbox.
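
Both quibbles can be softened a little. The sketch below assumes a recent formatR release, where tidy_source() accepts an indent argument, and relies on the fact that code which cannot be parsed makes the call stop with an error. The helper safe_tidy() is a hypothetical name introduced here for illustration; it is not part of formatR:

```r
library(formatR)

# Hypothetical helper: tidy a character vector of code with a chosen
# indentation, falling back to the untouched input if tidying fails
# (for example because the code has a syntax error).
safe_tidy <- function(code, indent = 2) {
  tryCatch(
    tidy_source(text = code, indent = indent, output = FALSE)$text.tidy,
    error = function(e) {
      message("Code left unchanged: ", conditionMessage(e))
      code
    }
  )
}

# Valid code is reformatted with 2 spaces of indentation
tidy2 <- safe_tidy("f=function(x){y=x+1;y}")
cat(tidy2, sep = "\n")

# Unparseable code is returned as-is, with a message
broken <- safe_tidy("x = (1 +")
```

Returning the original text on failure is one possible design choice; depending on the workflow, letting the error propagate may be the safer way to find out that the code is broken.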
