Visualizing US House Results with a Seats-Votes curve

[This article was first published on Offensive Politics » R, and kindly contributed to R-bloggers.]

A few weeks ago I wrote about ways to compare major-party returns in US House elections. I experimented with several visualizations, none as useful as the seats-votes curve. A traditional seats-votes curve measures average party performance against individual US House results. Our simplified curve uses a density plot to measure major-party (Democratic, in this case) support across all seats up for election. The seats-votes curve we use helps measure the following characteristics of the US House for a given election: the number of uncontested or weakly contested seats, the number of safe seats, and the number of close or tossup seats. By comparing plots from different years we can track changes in major-party support and electoral attitudes, both of which can have a dramatic effect on future elections and legislative priorities in the US House.

This exercise will explain the different components of a seats-votes plot, and then look at how Democratic party support has changed from 2002 to 2010.

Seats-Votes Explained

Though it may look simple, the modified seats-votes curve is very information-heavy and can be somewhat confusing. A small change in the contours of the curve can convey a lot of information. Please refer to the annotated graph (Figure 1) and the items below for instructions on how to read a seats-votes curve.

Figure 1

  • 1) Uncontested Seats - The height of the curve near 0 and near 100 represents the number of weakly contested or wholly uncontested seats. The left side of the graph measures seats weakly contested or uncontested by the Democrats, and the right side measures seats weakly contested or uncontested by the Republicans.
  • 2 & 3) Base Seats - The lump of seats to the left of the 50% vote share line represents the seats a Republican won (2); the lump to the right belongs to the Democrats (3). These lumps are the respective parties' bases and are considered probable wins. The distance from the 50% line shows how large the win was, and how safe the seat can be considered. A shift away from the 50% mark from one year to the next implies an electorally stronger base, and an increase in magnitude represents a larger base. A shift toward the 50% mark and a smaller magnitude represent an electorally weaker and smaller base, respectively.
  • 4) Competitive Seats - Seats at or near the 50% mark are considered competitive. This portion of the curve may be a trough between the two base lumps, or sometimes a lump of its own.
  • 5) Rug plot - The 1-dimensional rug plot across the bottom of the graph draws one hash mark per seat at that seat's vote share. More hash marks in a given location mean a higher count at that vote share, which raises the height of the density curve there.
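
The regions described above can be turned into rough counts by bucketing each seat's Democratic vote share. The sketch below is only an illustration: classify_seats is a hypothetical helper, the 5% and 95% cut-offs for "uncontested" are assumptions rather than values from this article, the 40% to 60% band matches the competitive range used later in the article, and the vote shares are made up.

```r
# Hypothetical helper: bucket Democratic vote shares (0-100) into the
# regions of the seats-votes curve. The uncontested cut-off is an assumption.
classify_seats <- function(voteshare, uncontested_cut = 5,
                           competitive_lo = 40, competitive_hi = 60) {
  cut(voteshare,
      breaks = c(-Inf, uncontested_cut, competitive_lo,
                 competitive_hi, 100 - uncontested_cut, Inf),
      labels = c("uncontested by Democrats", "Republican base",
                 "competitive", "Democratic base",
                 "uncontested by Republicans"),
      right = FALSE)  # intervals are closed on the left: [lo, hi)
}

# Made-up vote shares, for illustration only (not real returns)
shares <- c(0, 32, 38, 47, 52, 58, 66, 71, 100)
counts <- table(classify_seats(shares))
print(counts)
```

With real data, the same helper could be applied to the voteshare column for one year at a time to compare region sizes across elections.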

Latest Returns

Now that we can interpret the plot in terms of base, competitive, and safe seats, let's look at the modified seats-votes plot from the previous article, but with the latest election returns.

library(ggplot2)
# load the seats-votes data from offensivepolitics.net
df <- read.csv('http://offensivepolitics.net/data/seats-votes-2010.csv')
df$year <- as.factor(df$year)
png("2010-f2.png",width=672,height=671)
qplot(voteshare,data=df, geom=c("density", "rug"),
	xlab="Democratic Vote Share (%)", ylab="Density",
	main="Democratic Vote Share US House 2002-2010") + 
	facet_wrap(~year,nrow=3) + 
	geom_vline(xintercept=50,colour="gray50")
dev.off()

Using the annotated reference chart (Figure 1) to interpret the contours of the seats-votes curve, we can build a narrative for the mood of the electorate in any given year. Using the latest results chart (Figure 2), we can extend that narrative across electoral cycles.

Figure 2

For elections leading up to 2010 we see a prominent bimodal curve, with a reasonably stable number of safe seats for each party. In 2002 and 2004 we see a large number of seats in the 30-40 percent Democratic vote share range, which corresponds to a strong Republican majority. In 2006 we see the number of competitive seats increase, and in 2008 we see an increase in the number of safe Democratic seats. The changes in 2006 and 2008 track nicely with the Democrats narrowly taking control of the US House in 2006 and then strongly expanding their majority in 2008.

The 2002-2008 plots show a Republican base between 15-20%, a Democratic base between 15-18%, and competitive seats between 9-15% of the total US House. In 2010 the structure of the curve changed dramatically. The Republican base seats were right around 18%, which is in line with previous years. But the Democratic base seats look nothing like previous years. There is still a lump of safe Democratic seats, but the count dropped by about half. The rest of the seats shifted towards the 50% line, where they merged with the other competitive seats. This plot of the returns lines up nicely with the pre-election narrative of an outsize number of tossup races reported by CQ Politics, FiveThirtyEight, Cook Political Report, and The Rothenberg Political Report.

More Precise Numbers

The smoothed density function used in the seats-votes plot is an estimate, so it is difficult to determine exactly how many seats fall within a given vote range. We could use a histogram, but I prefer a cumulative distribution function (CDF) plot instead. To create a CDF plot for each year's results we'll use the built-in R ecdf function, ddply from the plyr package, and ggplot:

library(plyr)  # provides ddply()
# for each year, evaluate the empirical CDF at every observed vote share (as a percentage)
cdf <- ddply(df,"year",function(x) data.frame(share=x$voteshare,cdf=ecdf(x$voteshare)(x$voteshare)*100))
png("2010-f3.png",width=672,height=671)
qplot(x=share,y=cdf,data=cdf,geom="step",
	 main="Democratic Vote Share US House 2002-2010 (Cumulative)",
	 xlab="Vote Share (%)",ylab="Total %") + 
	facet_wrap(~year,nrow=3)
dev.off()

Figure 3

Using the CDF plot we can see that Democrats received 50% or less in 60% of the races in 2010, but received 50% or less in only 42% of the races in 2008. In 2010 a full 40% of the total races fell into the competitive category, defined as races where the Democrat received between 40% and 60% of the vote. In 2008 and 2006 that number was closer to 12%, so by these figures the competitive share more than tripled in a single election cycle.
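
Figures like these can also be read directly from the empirical CDF instead of off the plot. The following is a minimal sketch on made-up vote shares, not the real returns; the toy data frame and variable names are assumptions for illustration. Note that ecdf(x)(v) gives the fraction of observations at or below v, so the difference Fn(60) - Fn(40) covers shares above 40 and up to 60.

```r
# Made-up vote shares, for illustration only (not real returns)
shares <- c(0, 32, 38, 47, 52, 58, 66, 71, 100)

Fn <- ecdf(shares)  # Fn(v) = fraction of races with share <= v

# Percentage of races where the Democrat received 50% or less
pct_at_or_below_50 <- Fn(50) * 100

# Percentage of races with 40 < share <= 60 (the competitive band)
pct_competitive <- (Fn(60) - Fn(40)) * 100

# The same calculation per year, on a toy data frame with the column
# names used in this article (year, voteshare)
toy <- data.frame(year = c(rep(2008, 4), rep(2010, 4)),
                  voteshare = c(35, 55, 65, 70, 30, 45, 48, 62))
by_year <- sapply(split(toy$voteshare, toy$year),
                  function(v) ecdf(v)(50) * 100)

print(round(c(pct_at_or_below_50, pct_competitive), 1))
print(by_year)
```

Applied to the article's data, the split() call would take df$voteshare and df$year in place of the toy columns.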

The combination of the seats-votes plot and the CDF gives us powerful insights into the current electoral power of each major party in the US House. We have hard numbers and a narrative for the 2010 US House Democratic loss that goes beyond parroting the number of seats lost. We also have some historical perspective on major-party electoral returns, and it will be interesting to see if the 2010 competitive seats remain that way in 2012.
