Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Yesterday's US election is pretty much over now: most of the results are in, the pundits have offered their political analysis, and there's even been a bit of mathematical analysis of the results, too. But last night, as the results were flowing in, R user Brock Tibert just wanted to track the Massachusetts governor's race. The Boston Globe was providing regular updates to the precinct-level counts, but it's hard to visualize the horse race from the raw numbers. So with a little hackery and ingenuity, Brock wrote some R code to scrape the results table from the Boston Globe webpage and visualize the results as a barplot. Here's an excerpt of his code (from his gist page):
    library(XML)  # provides readHTMLTable()

    # grab the tables from the results page
    # (URL must hold the address of the results page; it is not defined in this excerpt)
    tables <- readHTMLTable(URL)

    # take only the largest one
    n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
    results.temp <- tables[[which.max(n.rows)]]

    # let's clean up the data a little....
    names(results.temp) <- c("city", "pctreport", "baker", "cahill", "patrick", "stein")
    for (i in 3:6) {
      # convert to number, but need to remove the comma that gets pulled from the site
      results.temp[, i] <- as.numeric(gsub(",", "", as.character(results.temp[, i])))
    }
    results.temp$city <- as.character(results.temp$city)

    # create a city/town detail dataset
    results.detail <- results.temp[-nrow(results.temp), ]  # drop the final (totals) row

    # create a "dataset" that has the totals added up for you
    totals <- results.temp[nrow(results.temp), ]
    totals

    # a basic plot
    plot.data <- as.vector(t(totals[, 3:6]))
    names(plot.data) <- c("Baker", "Cahill", "Patrick", "Stein")
    race.plot <- barplot(plot.data, main = "2010 Gubernatorial Results using R", xlab = "Candidate")
    # barplot() returns the bar midpoints, so use them to center the labels
    text(race.plot, y = 20000, labels = plot.data)
Brock used the XML package to extract the table from the webpage, the stringr package to process the text results (the excerpt above does its cleanup with base R's gsub), and, after arranging the data, the barplot function to visualize the results.
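To try the cleaning and plotting steps without hitting a live results page, here is a minimal sketch in base R. The data frame `raw`, its town names and counts, and the helper `to_number` are all hypothetical placeholders standing in for a scraped table; they are not taken from Brock's gist or from the actual results.

```r
# Hypothetical stand-in for a scraped results table. Counts arrive as text
# with thousands separators, as they would from an HTML table.
raw <- data.frame(
  city    = c("Town A", "Town B", "Total"),
  baker   = c("1,200", "850", "2,050"),
  patrick = c("1,500", "900", "2,400"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip commas, then convert text to numbers.
to_number <- function(x) as.numeric(gsub(",", "", as.character(x), fixed = TRUE))

count.cols <- c("baker", "patrick")
raw[count.cols] <- lapply(raw[count.cols], to_number)

# Separate the per-town rows from the final totals row.
detail <- raw[-nrow(raw), ]
totals <- raw[nrow(raw), ]

# Sanity check: the town rows should add up to the reported totals.
stopifnot(all(colSums(detail[count.cols]) == unlist(totals[count.cols])))

# Named numeric vector for plotting.
plot.data <- unlist(totals[count.cols])

# barplot() returns the x positions of the bar midpoints, which makes it
# easy to center a label inside each bar.
bar.mids <- barplot(plot.data, main = "Placeholder results", xlab = "Candidate")
text(bar.mids, y = plot.data / 2, labels = format(plot.data, big.mark = ","))
```

Checking the town rows against the totals row is a cheap way to catch a parsing slip (for example, a column that failed to convert) before trusting the plot.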
And here's the resulting bar plot (which I ran today, so it now represents the final tallies):
Brock said in this tweet: "Not elegant, but works." Perhaps true, but the tweet making the code available was sent at 9 PM East Coast time, not long after the polls had closed and while the returns were still coming in. It may not be elegant, but it's a really impressive way to visualize the results in real time.