Revisiting homicide rates

[This article was first published on Quantum Forest » rblogs, and kindly contributed to R-bloggers.]

A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty, but I was unhappy about the way the plots conveyed the information: what matters is relative risk, and the raw number of homicides is misleading because it also depends on country population (this problem often comes up in our discussions in Stats Chat).
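To make the difference concrete, here is a minimal sketch with made-up numbers (the two countries, their counts and their populations are hypothetical, not taken from the dataset). The country with more homicides ends up with the lower rate per 100,000 people:

```r
# Hypothetical figures, for illustration only (not from homicides.csv)
toy = data.frame(country = c('Big', 'Small'),
                 homicides = c(5000, 1000),
                 population = c(50e6, 4e6))

# Rate per 100,000 people removes the effect of population size
toy$rate = toy$homicides / toy$population * 1e5

# Sort from highest to lowest rate
toy[order(-toy$rate), ]
```

Sorting by rate rather than by count reverses the ranking here, which is the kind of distortion that plotting raw counts can hide.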

Instead of just complaining, I decided to try a few alternatives (disclaimer: I do not have a good eye for colors or design; I am only looking for ways to show relative risk better). I downloaded the MS Excel file, which also contained a lot of information for other countries, and extracted only the information relevant to these plots, which you can obtain here: homicides.csv (4 KB). Some quick code produces the following graph:

require(ggplot2)

setwd('~/Dropbox/quantumforest')
kill = read.csv('homicides.csv', header = TRUE)

kp = ggplot(kill, aes(x = year, y = country, fill = rate))

# Colors coming from
# http://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/
png('homicides-tile.png', width = 500, height = 500)
kp = kp + geom_tile() + scale_x_continuous(name = 'Year', expand = c(0, 0)) +
     scale_y_discrete(name = 'Country', expand = c(0, 0)) +
     scale_fill_gradient(low = 'white', high = 'steelblue', name = 'Homicide rate') +
     theme_bw() +
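     # opts() and theme_line() were removed from ggplot2;
     # theme() and element_blank() replace them
     theme(panel.grid.major = element_blank(),
           panel.grid.minor = element_blank())
print(kp)  # kp was only assigned above, so print it to draw on the png device
dev.off()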

Tile graph for homicides.

It is also possible to use a line graph, but with all countries in one panel it quickly gets very messy, so I created totally arbitrary violence categories and used one panel per category:

# Totally arbitrary classification
kill$type = ifelse(kill$country %in% c('Brazil', 'Colombia', 'Venezuela'),
                   'Freaking violent',
                   ifelse(kill$country %in% c('Ecuador', 'Surinam', 'Guyana'),
                          'Plain violent',
                          'Sort of quiet'))

kp2 = ggplot(kill, aes(x = year, y = rate, colour = country))

png('homicides-lines.png', width=1000, height = 300)
kp2 + geom_line() + facet_grid(. ~ type) +
      scale_y_continuous('Homicides/100,000 people') +
      scale_x_continuous('Year') + theme_bw() +
      # theme() and element_text() replace the removed opts() and theme_text()
      theme(axis.text.x = element_text(size = 10),
            axis.text.y = element_text(size = 10),
            legend.position = 'none')
dev.off()

Another view, which still requires labeling countries.
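One possible way to add the missing labels is to write each country's name at the end of its line. The sketch below is only that, a sketch: it uses a small made-up data frame (`toy`, with invented rates) in place of homicides.csv, and the names `last` and `kp3` are introduced here for illustration.

```r
library(ggplot2)

# Toy data standing in for homicides.csv (all values are made up)
toy = data.frame(country = rep(c('A', 'B'), each = 3),
                 year = rep(2000:2002, times = 2),
                 rate = c(10, 12, 11, 30, 28, 33))
toy$type = ifelse(toy$country == 'B', 'Freaking violent', 'Sort of quiet')

# Keep the last observed year of each country as the label position
last = do.call(rbind, lapply(split(toy, toy$country),
                             function(d) d[which.max(d$year), ]))

kp3 = ggplot(toy, aes(x = year, y = rate, colour = country)) +
      geom_line() +
      # Left-align each label just to the right of the end of its line
      geom_text(data = last, aes(label = country), hjust = 0, nudge_x = 0.1) +
      facet_grid(. ~ type) +
      # Extra horizontal expansion leaves room for the labels
      scale_x_continuous('Year', expand = c(0.2, 0)) +
      scale_y_continuous('Homicides/100,000 people') +
      theme_bw() + theme(legend.position = 'none')
print(kp3)
```

With the real data, labels for countries whose lines end close together would probably overlap, so some manual nudging (or fewer countries per panel) may still be needed.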

I hope others will download the data and come up with much better ways to display violence. If you do, please add a link in the comments.
