Use R to Analyze Players for your Fantasy Hockey League
[This article was first published on Brock's Data Adventure » R, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
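I am in a fantasy hockey league for the first time this season, and I wanted to use R to analyze players. Since I am relatively new to R, I am quite certain this code could be improved. The code below is functional, however, and while this isn't my complete analysis, I think it shows how powerful R is: it scrapes skater stats from Hockey Reference, cleans them up, ranks players on several stat categories, and averages goals, assists, and penalty minutes by position.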
NOTE: Have a wordpress.com blog and post R code? Check out this post:
http://www.r-statistics.com/2010/09/r-syntax-highlighting-for-bloggers-on-wordpress-com/
###############################################################################
# Analyze fantasy hockey skater stats using hockey reference
#
# Author: @BrockTibert
# Date: October 2010
#
# Help:
# http://stackoverflow.com/questions/3796266/change-the-class-of-many-columns-in-a-data-frame
###############################################################################

library(RCurl)
library(XML)
library(ggplot2) ## loads all sort of great packages from @HadleyWickham
library(plyr)    ## ddply() and rename(); loaded explicitly in case ggplot2 does not attach it

URL <- "http://www.hockey-reference.com/leagues/NHL_2010_skaters.html"
tables <- readHTMLTable(URL, header=FALSE)

## for tutorial sake -- it's the first table in the list
data <- tables[[1]]

## colnames
cn <- c("rk", "player", "age", "tm", "pos", "gp", "g", "a", "pts",
        "plus_minus", "pim", "ev", "pp", "sh", "gw", "s", "s_pct",
        "toi", "atoi")
colnames(data) <- cn

## remove the repeated header rows that break up the table
data <- data[data$rk!='Rk',]

## holds the column indexes for the loop
## need to wrap the seq so it properly uses ncol
index <- c(1,3,6:(ncol(data)-1))

## change to numeric, but since factor, need to change to character first
for(i in index) {
  data[,i] <- as.numeric(as.character(data[,i]))
}

## check to see if teams make sense
## some players have 'TOT' if they played for multiple teams
#table(data$tm)
data <- data[data$tm!='Tm', ]

## create basic ranks for stats that are ranked
## goals, assists, points, plus minus, PIM, PPG, SHG, GW goals, shots on goal
index <- c(7:9,11,13:16) ## want rank of 1 to be on the largest value
index2 <- c(10)          ## want the rank of 1 to be on smallest value

for(i in index) {
  data$temp <- rank(-data[,i], ties.method="min")
  name <- paste(colnames(data[i]), "_rank", sep="")
  names(data)[ncol(data)] <- name
}

for(i in index2) {
  data$temp <- rank(data[,i], ties.method="min")
  name <- paste(colnames(data[i]), "_rank", sep="")
  names(data)[ncol(data)] <- name
}

## Very Basic Analysis
table(data$pos)
data <- data[data$pos %in% c("C", "LW", "RW", "D"),]

## one way to remove unused levels -- just specify the levels you want
## http://www.statmethods.net/input/valuelabels.html
data$pos <- factor(data$pos, levels = c("C", "LW", "RW", "D"))

## create summary stats
## has to be an easier way to do this
mean.goals <- function(df) mean(df$g, na.rm=TRUE)
mean.assists <- function(df) mean(df$a, na.rm=TRUE)
mean.pim <- function(df) mean(df$pim, na.rm=TRUE)

ds.g <- ddply(data, c("pos"), mean.goals)
ds.g <- rename(ds.g, c(V1 = "goals")) ## rename() from plyr (reshape has one too)
ds.a <- ddply(data, c("pos"), mean.assists)
ds.a <- rename(ds.a, c(V1 = "assists"))
ds.pim <- ddply(data, c("pos"), mean.pim)
ds.pim <- rename(ds.pim, c(V1 = "pim"))

summ <- merge(ds.g, ds.a, by="pos")
summ <- merge(summ, ds.pim, by="pos")

## print basic stats by position
summ

## Clean up and quit without saving the workspace
rm(list=ls())
q(save = "no")
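The script's own comment wonders whether there is an easier way to build the per-position summary. One possibility, sketched here with base R's aggregate() instead of the three ddply() calls and two merges, computes all three means in a single step. The small skaters data frame is made-up illustration data, not real NHL stats; only the column names pos, g, a, and pim are taken from the script.

```r
## Hypothetical toy data standing in for the scraped skater table
skaters <- data.frame(
  pos = factor(c("C", "C", "LW", "RW", "D", "D"),
               levels = c("C", "LW", "RW", "D")),
  g   = c(30, 10, 20, 15, 5, NA),
  a   = c(40, 20, 25, 15, 20, 10),
  pim = c(20, 40, 60, 30, 80, 100)
)

## Mean goals, assists and PIM by position in one call.
## na.rm = TRUE is passed through to mean(), so missing values
## are dropped column by column within each position.
summ <- aggregate(skaters[, c("g", "a", "pim")],
                  by = list(pos = skaters$pos),
                  FUN = mean, na.rm = TRUE)

## Match the column names used in the original summary
names(summ) <- c("pos", "goals", "assists", "pim")

summ
```

On the real data, the same call should work with data in place of skaters, assuming the stat columns have already been converted to numeric as in the script.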
Filed under: Fantasy Hockey, R, Tutorial Tagged: Fantasy Hockey, R
