Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
One of my favorite places to find new datasets is Jeremy Singer-Vine’s “Data is Plural” newsletter, which comes out most Thursdays. There is always something there to pique my interest. From a recent edition I learned about law professor Brandon L. Garrett’s death penalty dataset, which identifies and lists all of the death penalty verdicts issued in the U.S. since 1991. To map it, I’ll be using the choroplethr package, which makes it super easy to visualize spatial data.
First, I wanted to get a sense of what the trend is with respect to the death penalty. I’m aware that violent crime rates have been dropping, so I was curious how that affected death penalty verdicts. I used the FBI’s Uniform Crime Reporting data tool to access homicide statistics.
    library(lubridate)
    library(ggplot2)
    library(tidyr)
    library(dplyr)
    library(reshape2)
    library(gridExtra)
    library(plotly)
    library(choroplethr)
    library(choroplethrMaps)
    library(RColorBrewer)
    theme_set(theme_minimal())

    # Load the death penalty verdicts and the FBI homicide counts
    death_penalty <- read.csv("1991_2017_individualFIPS.csv", header = TRUE, stringsAsFactors = FALSE)
    homicides <- read.csv("CrimeTrendsInOneVar.csv", header = TRUE)

    # Reshape homicides from one column per state to long format
    homicides_gathered <- homicides %>%
      gather(state, count, -Year)

    # Total homicides per year
    homicides_total <- homicides_gathered %>%
      rename(year = Year) %>%
      group_by(year) %>%
      summarise(total_hom = sum(count))

    # Total death penalty verdicts per year
    deathpen_total <- death_penalty %>%
      group_by(year) %>%
      count(defendant) %>%
      summarise(total_pen = sum(n))

    # Keep only the years present in both series
    hom_deathpen <- inner_join(deathpen_total, homicides_total, by = "year")

    g1 <- ggplot(hom_deathpen, aes(x = year, y = total_hom)) +
      geom_line() +
      ylim(10000, 25000) +
      xlab(NULL) +
      ylab("Homicides")

    g2 <- ggplot(hom_deathpen, aes(x = year, y = total_pen)) +
      geom_line() +
      ylim(0, 350) +
      xlab(NULL) +
      ylab("Death Penalties")

    # Show the two trends side by side
    grid.arrange(g1, g2, nrow = 1, top = "U.S. Homicides/Death Penalties Over Time")
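To make the reshaping step concrete, here is a toy version of the wide-to-long conversion and the yearly totals. The numbers and the three state columns are made up for illustration, and I'm using pivot_longer(), which tidyr now recommends in place of gather(); treat it as a sketch rather than the exact code behind the plots.

```r
library(dplyr)
library(tidyr)

# Made-up counts in the same wide shape as the FBI export:
# one row per year, one column per state (these are NOT real data)
toy_homicides <- data.frame(
  Year    = c(1995, 1996),
  Alabama = c(10, 8),
  Alaska  = c(3, 2),
  Arizona = c(7, 5)
)

# Long format: one row per year/state pair, then sum within each year
toy_totals <- toy_homicides %>%
  pivot_longer(-Year, names_to = "state", values_to = "count") %>%
  rename(year = Year) %>%
  group_by(year) %>%
  summarise(total_hom = sum(count))

print(toy_totals)
```

The result has one row per year, which is the shape inner_join() needs in order to line up the homicide totals with the death penalty totals.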
So it looks like there is a big drop in both homicides and death penalty verdicts after the mid-90s, when the crack epidemic was raging. Next, let’s visualize changes in the homicide rate over time on a map with the choroplethr package. For this I will be using rates rather than raw counts, and I’ll be using data I scraped from the Death Penalty Information Center, which tracks such things.
    homRate <- read.csv("Homicide Rate Over Time.csv", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)

    library(Hmisc)  # provides cut2()

    # choroplethr requires columns named 'region' and 'value'
    homs <- homRate %>%
      gather(year, rate, -state) %>%
      mutate(value = cut2(rate, g = 7)) %>%
      rename("region" = "state")
    homs$year <- as.integer(homs$year)

    mycols <- brewer.pal(7, "YlGnBu")

    hom1996 <- homs %>%
      filter(year == 1996) %>%
      select(region, value)
    hom1996$region <- tolower(hom1996$region)  # state names must be lower case in choroplethr

    choro2 <- StateChoropleth$new(hom1996)
    choro2$title <- "Homicide Rate (1996)"
    choro2$ggplot_scale <- scale_fill_manual(name = "Rate", values = mycols, drop = TRUE)
    choro2$render()
    hom2016 <- homs %>%
      filter(year == 2016) %>%
      select(region, value)
    hom2016$region <- tolower(hom2016$region)

    choro4 <- StateChoropleth$new(hom2016)
    choro4$title <- "Homicide Rate (2016)"
    choro4$ggplot_scale <- scale_fill_manual(name = "Rate\n(per 100,000)", values = mycols, drop = TRUE)
    choro4$render()
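A note on the binning: the rates are cut into seven quantile groups across all years pooled together, before filtering to a single year, so a given color means the same range in both maps. The sketch below shows the general idea of quantile binning using only base R with made-up rates. It approximates what cut2(rate, g = 7) is doing rather than reproducing it exactly, since cut2 chooses its cut points and labels somewhat differently.

```r
# Made-up rates per 100,000 (NOT real data), one value per hypothetical state
rates <- c(1.2, 2.5, 3.1, 4.0, 4.8, 5.5, 6.3, 7.9, 9.4, 12.0)

# Cut points at evenly spaced quantiles give bins with similar counts
n_bins <- 5
breaks <- quantile(rates, probs = seq(0, 1, length.out = n_bins + 1))

# include.lowest = TRUE keeps the minimum value inside the first bin
binned <- cut(rates, breaks = breaks, include.lowest = TRUE)

# Tabulate how many observations landed in each bin
print(table(binned))
```

With real data, ties can leave the groups uneven, which is one reason a map may end up showing fewer distinct colors than the number of groups requested.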
Wow! Big difference in the states’ homicide rates between the two years, with lots of dark blue states in 1996 and just a few in 2016. Now let’s look at death penalty verdicts over time using choroplethr_animate from the choroplethr package. I can’t embed the player into the page, but if you click on the screenshot below, it will open up the GitHub page that has the animation.
    # Count death penalty verdicts per state and year, then bin the counts
    deathpen <- death_penalty %>%
      group_by(year, state) %>%
      count(defendant) %>%
      summarise(count = sum(n)) %>%
      rename("region" = "state") %>%
      ungroup() %>%
      mutate(value = cut2(count, g = 8))

    library(reshape2)

    dp_ts <- deathpen %>%
      select(year, region, value)
    dp_ts$region <- tolower(dp_ts$region)

    # convert to wide format
    dp_ts_wide <- dcast(dp_ts, region ~ year, value.var = "value")

    # create a list of choropleths of death penalty verdicts per state for each year
    choropleths <- list()
    for (i in 2:(ncol(dp_ts_wide))) {
      df <- dp_ts_wide[, c(1, i)]
      colnames(df) <- c("region", "value")
      title <- paste0("Death Penalty Verdicts: ", colnames(dp_ts_wide)[i])
      choropleths[[i - 1]] <- state_choropleth(df, title = title) +
        scale_fill_manual(values = mycols, name = "Death Penalty Verdicts")
    }

    choroplethr_animate(choropleths)
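The wide reshape is one way to get a separate data frame for each year. A possible alternative, sketched here with base R's split() and made-up values, is to split the long data by year directly. This is not the approach used above, and the toy_dp data frame is purely hypothetical.

```r
# Toy long-format data (made-up values, NOT real verdict counts)
toy_dp <- data.frame(
  year   = c(1994, 1994, 1995, 1995),
  region = c("texas", "ohio", "texas", "ohio"),
  value  = c("high", "low", "high", "high"),
  stringsAsFactors = FALSE
)

# split() returns one data frame per year, named by the year
by_year <- split(toy_dp[, c("region", "value")], toy_dp$year)

# Build a matching title for each per-year data frame
titles <- paste0("Death Penalty Verdicts: ", names(by_year))

# Each element already has the 'region' and 'value' columns
# that state_choropleth() expects
str(by_year[["1994"]])
print(titles)
```

Each element of by_year could then be handed to state_choropleth() together with its title, in place of the column indexing inside the loop.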
It’s hard not to notice how much lighter the maps get over time, but it’s just as striking that the states with the highest homicide rates are not necessarily the ones with the most death penalty verdicts. For instance, California consistently has the most death penalty verdicts, even though it doesn’t have the highest murder rate.
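If you want to go beyond eyeballing the maps, one option is a rank correlation between each state's homicide rate and its verdict count for a given year. The sketch below uses made-up numbers for five hypothetical states, so the result says nothing about the real data; it only shows the mechanics.

```r
# Made-up numbers for five hypothetical states (NOT real data)
toy <- data.frame(
  region   = c("a", "b", "c", "d", "e"),
  hom_rate = c(2.1, 4.5, 6.0, 8.2, 10.4),  # homicides per 100,000
  verdicts = c(3, 20, 1, 5, 2)             # death penalty verdicts
)

# Spearman's rho compares rankings, so a single state with a very
# large verdict count does not dominate the result
rho <- cor(toy$hom_rate, toy$verdicts, method = "spearman")
print(rho)
```

A value near zero or below would be consistent with the impression from the maps, while a value near 1 would contradict it.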
The post Death penalty trend choropleths in R appeared first on my (mis)adventures in R programming.