Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The recent elections in Pakistan on May 11 were a great success by all measures. In spite of threats of violence by Al-Qaeda and its local franchises in Pakistan against those who would vote, millions of Pakistanis stepped out to vote for an elected government. The Election Commission of Pakistan (ECP) claimed a voter turnout of 60%.
At a 60% turnout, one would expect 50.5 million votes polled by the 84.2 million registered voters in the 262 ridings of the National Assembly for which the ECP reported results. However, ECP's own data reported 44.9 million votes, resulting in a gap of approximately 5.7 million votes. The actual turnout was thus close to 53%.
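The arithmetic behind these figures can be checked with a few lines of R. This is only a sketch using the rounded numbers quoted above; the variable names are illustrative, and because the inputs are rounded the computed gap may differ slightly from the one reported in the text.

```r
# Rounded figures taken from the text (in millions of voters)
registered_voters <- 84.2   # registered voters in the 262 ridings
votes_polled      <- 44.9   # votes reported in ECP's own data
claimed_turnout   <- 0.60   # turnout claimed by the ECP

# Votes one would expect to see at the claimed turnout
expected_votes <- claimed_turnout * registered_voters

# Shortfall between expected and reported votes
vote_gap <- expected_votes - votes_polled

# Turnout implied by the reported votes
actual_turnout <- votes_polled / registered_voters

cat("Expected votes (millions):", round(expected_votes, 1), "\n")
cat("Gap (millions):", round(vote_gap, 1), "\n")
cat("Implied turnout (%):", round(100 * actual_turnout, 1), "\n")
```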
I used R to scrape the data for the 262 ridings, which ECP reported on separate web pages, one per riding. The R code is presented below.
library(XML)
# URL prefix shared by all riding pages
u1 <- "http://www.ecp.gov.pk/electionresult/Search.aspx?constituency=NA&constituencyid=NA-"
# Loop through the 272 ridings
for (i in 1:272) {
  # Riding number
  u2 <- i
  # Complete the URL
  url2 <- paste0(u1, u2)
  # Read the results table (the eighth table on the page)
  ridedata <- readHTMLTable(url2, header = TRUE, which = 8, stringsAsFactors = FALSE)
  # Read the raw HTML of the page
  web_page <- readLines(url2)
  # Pull out the line holding the riding name, using the identifier "Specialheading"
  ridename <- web_page[grep("Specialheading", web_page)]
  # Start of the riding name: the character after the opening parenthesis
  startx <- regexpr("(", ridename, fixed = TRUE)
  startx <- startx[1] + 1
  # End of the riding name: two characters before the "<span" tag
  endx <- regexpr("<span", ridename)
  endx <- endx[1] - 2
  # Extract the riding name
  ridename <- substr(ridename, startx, endx)
  # Store this riding's data, with its number and name, in its own object (fname1, fname2, ...)
  assign(paste0("fname", u2), cbind(ridedata, riding = i, rname = ridename))
}
I used a simple rbind command to assemble the data into one large file after first storing the data for each riding in separate files. I did this because the server timed out several times during execution, and storing the ridings separately allowed me to restart from the riding where the system failed rather than from the beginning every time.
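One way to make this kind of restartable run explicit is sketched below. It is not the code used for this post: fetch_riding() is a hypothetical stand-in that returns dummy data in place of the readHTMLTable() call, and the output directory and file names are assumptions. Each riding is saved to its own file, ridings that already have a file are skipped, and a failed request is simply left for the next run.

```r
# Hypothetical stand-in for the scraping step; a real version would
# download and parse the riding's page instead of returning dummy rows.
fetch_riding <- function(i) {
  data.frame(candidate = c("A", "B"),
             votes = c(100L, 50L) * i,
             riding = i,
             stringsAsFactors = FALSE)
}

# Assumed location for the per-riding files
out_dir <- file.path(tempdir(), "ridings")
dir.create(out_dir, showWarnings = FALSE)

scrape_all <- function(ids, fetch = fetch_riding) {
  for (i in ids) {
    f <- file.path(out_dir, sprintf("riding_%03d.rds", i))
    if (file.exists(f)) next                  # saved in an earlier run
    res <- tryCatch(fetch(i), error = function(e) NULL)
    if (is.null(res)) next                    # failed; retried on the next run
    saveRDS(res, f)
  }
}

# Rerunning this call only fetches ridings that are still missing
scrape_all(1:5)

# Assemble all saved ridings into one data frame with rbind
files <- list.files(out_dir, pattern = "^riding_[0-9]+\\.rds$", full.names = TRUE)
all_ridings <- do.call(rbind, lapply(files, readRDS))
```

Calling scrape_all() again after a timeout picks up where the previous run stopped, because completed ridings already have a file on disk.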