[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Following last week’s short examination, I now wanted to drill down a bit more in the voting behaviour as given in data from votewatch.eu on voting of MEPs.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Votewatch’s Data describe how often MEPs voted what in the European Parliament. For each MEP the number of votes, percentages Yes, No, Abstain, number of elections and number of elections not voted. A description of the data can be found in this pdf.
Data
Compared to last week there is a slight adaptation in reading data. The same preprocessed spreadsheet is used, but next processing has been adapted.library(gdata)
library(ggplot2)
r1 <- read.xls(‘votewatch-europe-yes-no-votes-data-11-december-2013.preprocessed.xls’,
stringsAsFactors=TRUE)
# adapted in souce file:
# headers
# number of decimals
# “
# fix a double
r1$National.Party <- as.character(r1$National.Party)
(uu <- unique(r1$National.Party[r1$Country==’Hungary’ &
grepl(‘ri Sz’,r1$National.Party)]))
[1] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”
[2] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”
r1$National.Party[r1$National.Party %in% uu] <- uu[1]
r1$National.Party <- factor(r1$National.Party)
r1$Date <- as.Date(r1$Sdate)
r1$Group <- relevel(r1$Group,’S&D’)
r1$nYES=round(r1$pYES*r1$TotalDone/100)
r1$nNO=round(r1$pNO*r1$TotalDone/100)
r1$nAbstain=round(r1$pAbstain*r1$TotalDone/100)
# 3 vars before sum to TotalDone
r1$nNoVote=round(r1$pNoVote*r1$TotalPossible/100)
r1$nNotIn <- r1$TotalPossible-r1$nNoVote-r1$TotalDone
# 5 vars before sum to TotalPossible
Group data
In this section data per group is analyzed. The analysis results are mean estimates, their dispersion is a function of the group size. The analysis result has three stages
The script itself consists of a wrapper around a glm() call. This was the most easy approach when I did think displaying ANOVA would be a good idea. It is retained because it makes the dependent variable used clearly readable.
- proportion present,
- proportion voted conditional on being present,
- proportion votes Yes, No, Abstain, conditional on voting.
The script itself consists of a wrapper around a glm() call. This was the most easy approach when I did think displaying ANOVA would be a good idea. It is retained because it makes the dependent variable used clearly readable.
test1 <- function(response) {
g1 <- glm(response ~ Group,
data=r1,family=binomial)
g0 <- glm(response ~ 1,
data=r1,family=binomial)
print(anova(g1,g0,test=’Chisq’))
g1
}
g1 <- test1(with(r1,
cbind(
nNotIn,
nIn=TotalPossible-nNotIn)))
h1 <- test1(with(r1,
cbind(
nNoVote,
nVote=TotalPossible-nNotIn-nNoVote)))
i1 <- test1(with(r1,
cbind(
nYES,
nNotYES=TotalDone-nYES)))
j1 <- test1(with(r1,
cbind(
nNO,
nNotNO=TotalDone-nNO)))
k1 <- test1(with(r1,
cbind(
nAbstain,
nNotAb=TotalDone-nAbstain)))
It would be nice to display the results graphically. For this a prediction variable is made. In addition, the results are merged with decent labels
ug <- unique(subset(r1,,Group))
lpreds <- lapply(list(g1,h1,i1,j1,k1),
function(x) {
predsg <-
as.data.frame(predict.glm(x,ug,type=’response’,se.fit=TRUE)[-3])
predsg$Group <- ug$Group
predsg$Q <- dimnames(x$model[[1]])[[2]][1]
predsg
})
preds <- do.call(rbind,lpreds)
rel <- data.frame(
Q=c(“nNotIn”, “nNoVote”, “nYES”, “nNO”, “nAbstain”),
Vote=factor(1:5,labels=c(
‘Not Present’,
‘Not Voted’,
‘Yes’,’No’,’Abstain’)))
preds <- merge(preds,rel)
limits <- aes(ymax = fit + 2*se.fit, ymin=fit- 2* se.fit)
p1 <- ggplot(preds, aes(y=fit, x=Group,col=Vote))
p1 + geom_point(stat=”identity”) +
geom_errorbar(limits, width=0.25) +
ylab(‘Proportion’) +
coord_flip()
As I found the result a bit disappointing, I made a mosaic plot where distribution of results are next to each other.
ag <- aggregate(subset(r1,,c(nYES,nNO,nAbstain,nNoVote,nNotIn)),
by=list(Group=r1$Group),
sum)
lag <- reshape(ag,direction=’long’,
idvar=’Group’,
times=names(ag[-1]),
timevar=’Q’,
v.names=’Counts’,
varying=list(names(ag[-1])))
lag <- merge(lag,rel)
mosaicplot(xtabs(Counts ~ Group + Vote ,data=lag),
color=rainbow(nlevels(lag$Group)),
main=’Votes’,
las=2,cex=.7)
By Party
At this point, I decided that the first plot would be interesting if displayed by party. However, there are far too many parties. So, to get a bit of order, the focus is on ‘interesting’ parties. Interesting in this case is anything not S&D, ADLE/ADLE and EPP because these are far too predictable. This leaves still quite some parties, so they have to be ordered by country too. Since there are a number of parties (e.g. independent) which are present in several countries, these to be labelled by country too.
First a National.PartyF, which is ordered by country.
tween <- unique(subset(r1,,c(Country,National.Party)))
tween$National.PartyC <- as.character(tween$National.Party)
selection <- tween$National.Party %in% c(‘Independent’,’Labour Party’,’Parti Socialiste’,’Partido Popular’)
tween$National.PartyC[selection] <-
paste(tween$National.PartyC[selection],tween$Country[selection])
tween <- tween[order(tween$Country,tween$National.Party),]
tween$National.PartyF <- factor(tween$National.PartyC,
levels=tween$National.PartyC,
labels=tween$National.PartyC)
r2 <- merge(r1,tween)
The analysis, stripped down to minimum.
test2 <- function(response) {
glm(response ~ National.PartyF,
data=r2,family=binomial)
}
g2 <- test2(with(r2,
cbind(
nNotIn,
nIn=TotalPossible-nNotIn)))
h2 <- test2(with(r2,
cbind(
nNoVote,
nVote=TotalPossible-nNotIn-nNoVote)))
i2 <- test2(with(r2,
cbind(
nYES,
nNotYES=TotalDone-nYES)))
j2 <- test2(with(r2,
cbind(
nNO,
nNotNO=TotalDone-nNO)))
k2 <- test2(with(r2,
cbind(
nAbstain,
nNotAb=TotalDone-nAbstain)))
Predicting
ug2 <- unique(subset(r2,,National.PartyF))
lpreds2 <- lapply(list(g2,h2,i2,j2,k2),
function(x) {
predsg <-
as.data.frame(predict.glm(x,ug2,type=’response’,se.fit=TRUE)[-3])
predsg$National.PartyF <- ug2$National.PartyF
predsg$Q <- dimnames(x$model[[1]])[[2]][1]
predsg
})
preds2 <- do.call(rbind,lpreds2)
preds2 <- merge(preds2,rel)
Adding supporting variables. Shortening a very long name.
preds2 <- merge(preds2,rel)
preds2 <- merge(preds2,unique(subset(r2,,c(National.PartyF,Country,Group))))
levels(preds2$National.PartyF)[
levels(preds2$National.PartyF)==
“People for Real, Open and United Democracy / Conservative Party for Democracy and Success”
] <- “People for Real, Open and United Democracy / …”
The plot is still a bit large. In the top the UK has a number of parties which do know where the No button is. As do Dutch ‘Partij voor de Vrijheid’ and French ‘Mouvement pour la France’. In contrast, ‘Front National’ is probably better described as Abstained. Do not think this is a right wing thing, the ‘Communist Party of Greece’ likes the No button too, though not as much as the parties mentioned earlier.
p1 <- ggplot(preds2[!(preds2$Group %in% c(‘S&D’,’EPP’,’ALDE/ADLE’)),],
aes(y=fit, x=National.PartyF,col=Vote))
png(‘long2.png’,width=800,height=800)
p1 + geom_point(stat=”identity”) +
geom_errorbar(limits, width=0.25) +
ylab(‘Proportion’) +
coord_flip() +
xlab(‘National Party’) +
theme(legend.position=”bottom”)
Other noticeable
Anything over 0.3 which was neither Yes nor No.
preds2[preds2$fit>.3 & !(preds2$Vote %in% c(‘Yes’,’No’)),c(1,3,5,6,7)]
National.PartyF fit Vote Country Group
136 Darbo partija 0.5702479 Not Present Lithuania ALDE/ADLE
138 Darbo partija 0.3653846 Not Voted Lithuania ALDE/ADLE
246 Freiheitliche Partei Österreichs 0.3248408 Abstain Austria NI
257 Front national 0.3598585 Abstain France NI
396 JOBBIK MAGYARORSZÁGÉRT MOZGALOM 0.3164147 Abstain Hungary NI
522 Mouvement pour la France 0.3060686 Not Voted France EFD
To note:
Darbo partija is the Labour Party
Jobbik is Radical Nationalist
FPÖ and FN got often enough in the newspapers, so don’t need a link. MPF is clearly anti EU just from the name.
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.