Custom Summary Stats as Dataframe or List
[This article was first published on theBioBucket*, and kindly contributed to R-bloggers.]
On Stack Overflow I found this useful example of how to apply custom statistics to a data frame and return the results as a list or a data frame:
somedata <- data.frame(
  year    = rep(c(1990, 1995, 2000, 2005, 2010), times = 3),
  country = rep(c("US", "Brazil", "Asia"), each = 5),
  pct     = c(0.99, 0.99, 0.98, 0.05, 0.9,
              0.4, 0.5, 0.55, 0.5, 0.45,
              0.7, 0.85, 0.9, 0.85, 0.75)
)
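# per-group summary: trend slope, standard deviation and minimum of pct
someStats <- function(x) {
  # centre response and predictor so that a fit without intercept
  # returns the slope of pct on year
  dp   <- as.matrix(x$pct)  - mean(x$pct)
  indp <- as.matrix(x$year) - mean(x$year)
  f <- lm.fit(indp, dp)$coefficients  # slope of pct on year
  w <- sd(x$pct)                      # standard deviation of pct
  m <- min(x$pct)                     # minimum of pct
  results <- c(f, w, m)
  names(results) <- c("coef", "sdev", "minPct")
  results
}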
# summary statistics as list with by():
by(somedata, list(country=somedata$country), someStats)
# ..or as dataframe with ddply():
library(plyr)
ddply(somedata, .(country), someStats)
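
If plyr is not installed, much the same table can be put together in base R. The following is only a minimal sketch and not part of the original Stack Overflow example: it splits the data by country, applies someStats() to each piece and binds the named vectors into one row per country. The names statsMat and statsDF are hypothetical, and the data and function are repeated so that the snippet runs on its own.

```r
# data and function repeated from above so the snippet is self-contained
somedata <- data.frame(
  year    = rep(c(1990, 1995, 2000, 2005, 2010), times = 3),
  country = rep(c("US", "Brazil", "Asia"), each = 5),
  pct     = c(0.99, 0.99, 0.98, 0.05, 0.9,
              0.4, 0.5, 0.55, 0.5, 0.45,
              0.7, 0.85, 0.9, 0.85, 0.75)
)

someStats <- function(x) {
  dp   <- as.matrix(x$pct)  - mean(x$pct)
  indp <- as.matrix(x$year) - mean(x$year)
  f <- lm.fit(indp, dp)$coefficients
  results <- c(f, sd(x$pct), min(x$pct))
  names(results) <- c("coef", "sdev", "minPct")
  results
}

# split() gives one data frame per country; sapply() collects the named
# result vectors as columns, so t() turns them into one row per country
statsMat <- t(sapply(split(somedata, somedata$country), someStats))

# statsMat and statsDF are hypothetical names used for illustration;
# the country labels move from the row names into a regular column
statsDF <- data.frame(country = rownames(statsMat), statsMat)
rownames(statsDF) <- NULL
statsDF
```

Because the centred fit has no intercept, the coef column should agree with the year coefficient of lm(pct ~ year) fitted to each country separately, which makes for a quick sanity check.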