[This article was first published on Daniel's Blog, and kindly contributed to R-bloggers].
Alea Jacta Est
This is the very last analysis before the election. A far-right nationalist candidate, Jair Messias Bolsonaro, is leading the polls with about 40% of vote intentions, while the runner-up, Fernando Haddad, of a leftist coalition, has about 25%; all the other candidates have 35% in total.
A very intriguing debate in the press over the last few days has been whether the next president will be elected outright in the first round. So, will the bandwagon effect, the shy Tory effect, or something else help elect a far-right nationalist candidate by an absolute majority of the vote? Bayes says not to worry about a Bolsonaro victory for now.
Although polling houses have shown Bolsonaro’s support growing steadily over the last few weeks, it’s fair to remember that pollsters did a very poor job of gauging the true vote shares in the last elections. For instance, in 2014 the main polling firms mispredicted both Dilma Rousseff’s and Aecio Neves’ true positions: they had Dilma winning a majority with a margin, but the decision went to a runoff between these two candidates.
The following numbers are a forecast based on polling data made available over the last three days. Since these polls contain a considerable number of undecided voters, I did some math to distribute the undecideds before computing the final likely results. It’s a simplified simulation exercise, as I do not account for time trends, house effects, etc.; I’m only accounting for the sample sizes.
The data
Poll of polls
Here is where the magic begins. I weight the polls to reflect their sample sizes. The new results are shown in the last line (7) of the table.
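A minimal sketch of a sample-size-weighted poll of polls, written in Python with the standard library only. The three polls, their sample sizes, and the `pool_polls` helper are made-up illustrations, not the polls or code used in this post.

```python
# Hypothetical polls: (sample size, {option: share in %}).
# Illustrative numbers only, NOT the data behind this post.
polls = [
    (2000, {"Bolsonaro": 40.0, "Haddad": 25.0, "Others": 25.0, "Undecided": 10.0}),
    (3000, {"Bolsonaro": 38.0, "Haddad": 24.0, "Others": 28.0, "Undecided": 10.0}),
    (1000, {"Bolsonaro": 42.0, "Haddad": 26.0, "Others": 24.0, "Undecided": 8.0}),
]


def pool_polls(polls):
    """Return (total sample size, sample-size-weighted shares in %)."""
    total_n = sum(n for n, _ in polls)
    names = list(polls[0][1])
    pooled = {
        name: sum(n * shares[name] for n, shares in polls) / total_n
        for name in names
    }
    return total_n, pooled


total_n, pooled = pool_polls(polls)
print(total_n, {name: round(share, 2) for name, share in pooled.items()})
```

Larger polls pull the pooled shares toward their own numbers, which is all that weighting by sample size amounts to here.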
Adjusting for the undecideds
After adjusting for undecided voters, the new results are in line (8) of the table.
Adjusting for wasted votes
Adjusting for wasted votes follows the same principle. The last line (9) of the following table has the new adjusted preference distribution, with the corrected sample size.
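One common way to distribute undecided voters is proportionally to each option's current share. The post does not spell out the exact rule used, so the sketch below is an assumption, with a hypothetical `redistribute` helper and made-up shares.

```python
def redistribute(shares, drop="Undecided"):
    """Drop one category and rescale the rest so they sum to 100%.

    Assumption: undecideds split in proportion to current support.
    """
    kept = {name: s for name, s in shares.items() if name != drop}
    total = sum(kept.values())
    return {name: 100.0 * s / total for name, s in kept.items()}


# Hypothetical pooled shares in %, for illustration only.
pooled = {"Bolsonaro": 40.0, "Haddad": 25.0, "Others": 25.0, "Undecided": 10.0}
adjusted = redistribute(pooled)
print({name: round(s, 2) for name, s in adjusted.items()})
```

Proportional allocation leaves the ratios between candidates unchanged; it only removes the undecided category from the denominator.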
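A sketch of the same idea applied to wasted votes, again in standard-library Python. It assumes the sample size is corrected by shrinking it to the respondents who cast a valid vote; the `drop_wasted` helper, the shares, and the sample size are all hypothetical.

```python
def drop_wasted(shares, n, wasted="Wasted"):
    """Rescale shares to valid votes only and shrink the sample size.

    Assumption: the effective sample is n times the valid-vote fraction.
    """
    valid_fraction = 1.0 - shares[wasted] / 100.0
    valid = {
        name: s / valid_fraction
        for name, s in shares.items()
        if name != wasted
    }
    return valid, round(n * valid_fraction)


# Hypothetical shares in % and pooled sample size, for illustration only.
shares = {"Bolsonaro": 36.0, "Haddad": 22.5, "Others": 31.5, "Wasted": 10.0}
valid_shares, valid_n = drop_wasted(shares, n=6000)
print(valid_n, {name: round(s, 2) for name, s in valid_shares.items()})
```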
Draw 1 million samples
Finally, I draw a lot of samples from the posterior distribution, using the weighted polls and uninformative priors to keep it simple.
Here we want to look at the margin of Bolsonaro over the combined opposition candidates. The more candidates contesting the seat, the greater the probability that the winning candidate will receive only a minority of the votes cast.
We can also use the middle 95% range to represent the uncertainty. The numbers say that in about 20% of the 1 million simulated elections Bolsonaro comes out ahead of the combined opposition. Therefore, it’s unlikely that he wins the election in the first round.
Finally, we can plot the posterior distribution of the simulated elections in which Bolsonaro’s share is greater than the combined opposing votes. Based on the polling data at hand, and with very little effort, we can believe the far-right nationalist candidate won’t make it this Sunday, even though some press pundits are suggesting he will.
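One simple model that fits this description is a multinomial likelihood with a flat Dirichlet prior, which gives a Dirichlet posterior over the vote shares. That model choice is an assumption, as are the counts and the `draw_posterior` helper below. The sketch uses only Python's standard library and draws 10,000 samples rather than 1 million to keep it quick.

```python
import random


def draw_posterior(counts, n_draws, prior=1.0, seed=42):
    """Draw vote-share vectors from Dirichlet(counts + prior).

    A Dirichlet draw is a set of independent Gamma(alpha_i, 1) draws
    divided by their sum. prior=1.0 is the flat (uniform) prior.
    """
    rng = random.Random(seed)
    names = list(counts)
    draws = []
    for _ in range(n_draws):
        gammas = [rng.gammavariate(counts[name] + prior, 1.0) for name in names]
        total = sum(gammas)
        draws.append({name: g / total for name, g in zip(names, gammas)})
    return draws


# Hypothetical respondent counts (share times sample size), illustration only.
counts = {"Bolsonaro": 2160, "Haddad": 1350, "Others": 1890}
draws = draw_posterior(counts, n_draws=10_000)
mean_bolsonaro = sum(d["Bolsonaro"] for d in draws) / len(draws)
print(round(mean_bolsonaro, 3))
```

Each element of `draws` is one simulated election: a full set of vote shares that sums to 1.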
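The margin, its middle 95% range, and the share of simulated elections with an outright win can be computed directly from posterior draws. The sketch below assumes a Dirichlet posterior with a flat prior and uses made-up counts, so its output is not meant to reproduce the figure of about 20% reported here.

```python
import random

rng = random.Random(7)

# Hypothetical respondent counts after all adjustments, illustration only.
counts = {"Bolsonaro": 2660, "Haddad": 1500, "Others": 1240}

margins = []
for _ in range(20_000):
    # One Dirichlet(counts + 1) draw via normalized Gamma variates.
    gammas = {name: rng.gammavariate(k + 1.0, 1.0) for name, k in counts.items()}
    total = sum(gammas.values())
    bolsonaro = gammas["Bolsonaro"] / total
    # Margin over the combined opposition, which holds the remaining share.
    margins.append(bolsonaro - (1.0 - bolsonaro))

margins.sort()
lower = margins[int(0.025 * len(margins))]       # 2.5th percentile
upper = margins[int(0.975 * len(margins)) - 1]   # 97.5th percentile
p_first_round = sum(m > 0 for m in margins) / len(margins)

print(f"95% range of margin: [{lower:.3f}, {upper:.3f}]")
print(f"P(outright first-round win): {p_first_round:.3f}")
```

A positive margin means more than half of the valid votes, which is the absolute-majority criterion for winning without a runoff.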