Voting clusters in the U.N.
After some more digging, and a suggestion by @theMexIndian, I decided to take a more in-depth look at the unvotes database that I wrote about some weeks ago.
This time, Amit suggested I do some hierarchical clustering of the votes. So here goes a very dirty first attempt…
Data and setup
Nothing too impressive here… (for a discussion of the package, see the original post).
There are more than 5k unique roll calls, so if we were to open up the dimensionality by roll call, the result is going to be huge. I’ll go ahead and do it anyway, to test a hypothesis towards the end…
‘Widen’ data…
Now we have a very high-dimensional data set: each variable is the vote in a roll call (for example, abstain_120, yes_120 and no_120 would be a count of abstain, yes and no votes in roll call 120). Since each country casts at most one vote per roll call, this data set is basically ones and zeros. Now to do some cleaning and get the distance matrix…
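As a rough illustration of the widening and distance steps, here is a minimal base R sketch on made-up data. The column names (country, rcid, vote), the toy countries and the choice of Euclidean distance with complete linkage are all assumptions for the example, not necessarily what the original analysis used.

```r
# Toy long-format votes: one row per country and roll call (made-up data).
votes <- data.frame(
  country = rep(c("A", "B", "C", "D"), each = 3),
  rcid    = rep(c(120, 121, 122), times = 4),
  vote    = c("yes", "no",  "yes",
              "yes", "no",  "abstain",
              "no",  "yes", "no",
              "no",  "yes", "no"),
  stringsAsFactors = FALSE
)

# One indicator column per vote/roll-call pair, e.g. "yes_120".
votes$key <- paste(votes$vote, votes$rcid, sep = "_")
wide <- unclass(table(votes$country, votes$key))

# Euclidean distances between countries, then hierarchical clustering.
# Both the distance and the linkage method are assumptions here.
d  <- dist(wide)
hc <- hclust(d, method = "complete")
```

Countries C and D vote identically in this toy set, so their distance is zero and they merge first in the tree.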
Let’s graph this hierarchical clustering using the ggdendro package…
I’m going to export these clusters and upload them to my GitHub for anyone to download.
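For anyone who wants to do a similar export, a minimal sketch with base R follows. The toy matrix, the choice of k = 2 and the temporary output file are assumptions for illustration; the real export would use the full vote matrix and a cut chosen from the dendrogram.

```r
# Made-up 0/1 vote matrix: rows are countries, columns are vote indicators.
wide <- rbind(
  A = c(1, 0, 1, 0),
  B = c(1, 0, 1, 1),
  C = c(0, 1, 0, 0),
  D = c(0, 1, 0, 1)
)
hc <- hclust(dist(wide))

# Cut the tree into k groups; k = 2 is an arbitrary choice for this toy data.
groups   <- cutree(hc, k = 2)
clusters <- data.frame(country = names(groups), cluster = unname(groups))

# Write to a temporary file; swap in a real path to keep the export.
out_file <- tempfile(fileext = ".csv")
write.csv(clusters, out_file, row.names = FALSE)
```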
By issues
Now, because the previous data set was very high-dimensional, I’m going to condense the analysis to just the votes on particular issues. The database has seven core issues, so I’m going to try to group by issue instead of by roll call. This might let us see whether the voting blocs differ from the earlier set (maybe countries vote the same, except when important issues come up).
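One possible way to condense the votes by issue is sketched below in base R. It encodes each country as its share of "yes" votes per issue, which is an assumption made for the example (counts of yes, no and abstain per issue would work too); the issue labels and data are made up.

```r
# Toy votes tagged with an issue (hypothetical labels and data).
votes <- data.frame(
  country = rep(c("A", "B", "C"), each = 4),
  issue   = rep(c("issue_a", "issue_a", "issue_b", "issue_b"), times = 3),
  vote    = c("yes", "yes", "no",  "yes",
              "yes", "yes", "no",  "yes",
              "no",  "no",  "yes", "yes"),
  stringsAsFactors = FALSE
)

# Share of "yes" votes per country and issue: one column per issue.
by_issue <- tapply(votes$vote == "yes",
                   list(votes$country, votes$issue),
                   mean)

# Cluster countries on this much smaller matrix.
hc_issue <- hclust(dist(by_issue))
```

With only a handful of columns, the distance matrix is far cheaper to compute than in the roll-call version, at the cost of hiding disagreement within an issue.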
I’ll export this too…
To disprove the earlier hypothesis, I’m going to find Mexico’s neighborhood and see whether many countries appear in both sets…
So 80% of the countries are “close” to Mexico whether the vote is by issue or by roll call. This is a rough first attempt (there are probably many slight errors), but there are some interesting things to be found.
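A comparison along these lines could be sketched as follows. The cluster assignments are invented for the example, the helper neighbors() is hypothetical, and measuring overlap as a share of the roll-call neighbors is just one reasonable definition; the toy numbers are not the real result.

```r
# Hypothetical cluster assignments from the two analyses (made-up labels).
by_rollcall <- c(Mexico = 1, Chile = 1, Peru = 1, Panama = 1,
                 Colombia = 1, France = 2)
by_issue    <- c(Mexico = 3, Chile = 3, Peru = 3, Panama = 3,
                 Colombia = 2, France = 2)

# Countries sharing a cluster with the target, excluding the target itself.
neighbors <- function(groups, target) {
  setdiff(names(groups)[groups == groups[[target]]], target)
}

n_rollcall <- neighbors(by_rollcall, "Mexico")
n_issue    <- neighbors(by_issue, "Mexico")

# Share of roll-call neighbors that are also issue neighbors.
overlap <- length(intersect(n_rollcall, n_issue)) / length(n_rollcall)
```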
In the issue groups, the outliers that form a group of their own are the United States and Israel (the Palestinian conflict is probably the culprit here; as I found earlier, they agree on 77% of the votes).
Then there are countries that seem to be very close culturally, and they show it in the votes…
Finally, some like-minded countries, like Chile, Colombia, Panama, Paraguay, Peru, etc., are in Mexico’s neighborhood (although it’s one of the largest groups).
Tweet me if you have any questions about the data!