Crime Analysis – Denver-Part 2

[This article was first published on R-Projects – Stoltzmaniac, and kindly contributed to R-bloggers].

Getting More Granular

Where we’re going
Having noticed crime rates heading up over the last few years, I thought a closer look was warranted. I want to look at “CRIME” first, before looking into “TRAFFIC” in the data set. It sounds more interesting, and I hope the results don’t keep me up at night.

What we’ll do in this post

  • Load the csv, format the data
    • This will all be hidden and can be found in the previous post (Part 1)
  • Look into apparent growth in crime rates from 2012 – 2014
  • We’ll focus only on incidents that fit the “IS_CRIME” definition, not “IS_TRAFFIC”

Let’s dive in!

Exploration of Data
Data provided by https://www.denvergov.org/opendata/dataset/city-and-county-of-denver-crime

Data Import & Formatting – shown in prior post: Crime Analysis – Denver-Part 1

Looking at Crime Incidents by Year

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year)

p = ggplot(df,aes(x=year,y=incidents,label=incidents))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Volume by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5))  

[Figure: barplotCrime, bar chart of crime volume by year]

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year) %>%
  # lag() returns the previous year's count; round the change to a whole percent
  mutate(YoYPercentageChange=round(100*(incidents-lag(incidents))/lag(incidents),0))
df = df[!is.na(df$YoYPercentageChange),]  
p = ggplot(df,aes(x=year,y=YoYPercentageChange,label=YoYPercentageChange))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1)+ ggtitle('Crime Percentage Change Year-Over-Year') + xlab('Year') + ylab('YoY Incident % Change') + theme(plot.title = element_text(hjust = 0.5))  

[Figure: barplotCrimeChange, bar chart of year-over-year percentage change in crime]

Observations

  • Crime rose the most between 2012 and 2013 (39% increase)
  • Crime increased each year after but at a decreasing rate
  • Examine years 2012 – 2014 to see growth changes
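The year-over-year percentages above come from comparing each year's count with the previous year's count via lag(). Here is a minimal sketch of that calculation on a toy table. The numbers and the names toy and toy_yoy are made up for illustration and are not the Denver data.

```r
library(dplyr)

# Hypothetical yearly counts, for illustration only (not the Denver data)
toy <- data.frame(
  year      = c(2012, 2013, 2014),
  incidents = c(100, 139, 150)
)

toy_yoy <- toy %>%
  arrange(year) %>%
  # lag(incidents) is the previous row's count; the first year has no
  # predecessor, so its change is NA
  mutate(YoYPercentageChange = round(100 * (incidents - lag(incidents)) / lag(incidents), 0))

print(toy_yoy)
```

The first row has no prior year to compare against, which is why the post drops NA rows before plotting.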

Highest volume of “IS_CRIME” types
Identify the offense by OFFENSE_CATEGORY_ID and exclude months we have not seen so far this year.

#Isolate Years 2012 - 2014
data = data[data$year <= 2014 & data$year >= 2012,]  

#Find the latest year in the data and the latest month observed in that year
maxYear = max(data$year)  
maxMonthYTD = max(data$month[data$year==maxYear])

#Look into IS_CRIME only
df = data %>%  
  filter(IS_CRIME==1) %>%
  group_by(year,OFFENSE_CATEGORY_ID) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(desc(incidents))

p = ggplot(df,aes(x=factor(year),y=incidents,fill=year))  
p + geom_bar(stat='identity') + ggtitle('Crime Incidents Reported by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3)

[Figure: barplotCrimeCategories, crime incidents by year, faceted by offense category]
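The snippet above computes maxYear and maxMonthYTD but does not apply them in the code shown. A minimal sketch of how such a year-to-date filter could look follows, assuming month is stored as a number from 1 to 12 (an assumption; the formatting step lives in Part 1). The toy data frame is hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for the post's `data`: one row per incident,
# with numeric `year` and `month` columns (assumed, see Part 1 for the real format)
toy <- data.frame(
  year     = c(2013, 2013, 2013, 2014, 2014),
  month    = c(2, 5, 11, 1, 5),
  IS_CRIME = 1
)

maxYear     <- max(toy$year)
maxMonthYTD <- max(toy$month[toy$year == maxYear])

# Keep only months already observed in the latest year, so that every
# year covers the same January-to-maxMonthYTD window
toy_ytd <- toy %>% filter(month <= maxMonthYTD)

print(toy_ytd)
```

Once the data is restricted to complete years such as 2012 through 2014, maxMonthYTD would be 12 and this filter would leave the data unchanged.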

Observations
It appears that “all-other-crimes” moved the needle the most between 2012 and 2014. This is not a very specific category. It’s also worth noting that “other-crimes-against-persons” grew as well. Together, these lead to some speculation that these vague types of crimes started being reported during this period and perhaps hadn’t been documented before.

  • Growth categories:
    • “larceny”
    • “drug-alcohol”
    • “public-disorder”
  • Declining categories:
    • “theft-from-motor-vehicle”
    • “robbery”
    • “burglary”

Many of the other categories have a much lower volume of incidents. Growth is more difficult to see in visualizations for these cases.

Here’s a look at growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  # absolute change versus the same category's previous year
  mutate(YoYchange=round(incidents-lag(incidents),0)) %>%
  filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=0.5, size=5,col='red', fontface='bold')  

[Figure: barplotCrimeCategoryChange, change in crime incidents vs previous year, faceted by offense category]

Here’s a look at % growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  # percentage change versus the same category's previous year
  mutate(YoYchange=round(100*(incidents-lag(incidents))/lag(incidents),0)) %>%
  filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('% Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY % Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=1,col='red',size=5,fontface='bold')  

[Figure: barplotCrimeCategoryPercentageChange, % change in crime incidents vs previous year, faceted by offense category]

Observations

  • “all-other-crimes” is the outright leader in change in both volume and percentage growth year-over-year with an astonishing 380% increase between 2012 and 2013
  • “drug-alcohol” grew by 173% between 2012 and 2013 but dropped down to only 27% growth the next year
  • “murder” didn’t change much in volume compared to everything else (up 7, then down 10), but that amounted to 19% growth in 2013 and a 23% decline in 2014
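The “murder” bullet shows why volume and percentage can tell different stories: the same absolute change is a large percentage of a small base and a negligible percentage of a large one. A small sketch of that arithmetic follows. The helper yoy_pct and all counts are hypothetical and chosen only for illustration; they are not values from the Denver data.

```r
# Hypothetical helper: percentage change from `previous` to `current`,
# rounded to a whole percent
yoy_pct <- function(current, previous) {
  round(100 * (current - previous) / previous, 0)
}

# The same +7 incidents on a small base and on a large base
# (made-up counts, not taken from the data set)
small_base <- yoy_pct(current = 42,   previous = 35)
large_base <- yoy_pct(current = 3507, previous = 3500)

print(c(small_base = small_base, large_base = large_base))
```

This is one reason to read the volume chart and the percentage chart together rather than relying on either alone.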

Final Thoughts (for now)
Due to the vague nature of the types of crimes which grew the most, I can’t determine exactly what happened in Denver during 2013. In the less vague crimes, “drug-alcohol” saw the largest increase. This was followed by “public-disorder” and perhaps there’s a relationship there. My assumption is that one may perhaps cause the other…

I’m still curious about the seasonality and month-to-month effects. Perhaps certain types of crimes are more common during certain times. I’m also very interested to see if a new population was perhaps added to the mix in 2013. If a certain part of Denver was added in 2013 that would certainly help to explain the situation.
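As a starting point for the seasonality question, incidents could be counted per year and month. The following is only a sketch under assumptions: it uses a hypothetical toy data frame with numeric year and month columns in place of the real data, and the name monthly is introduced here for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the post's `data`: one row per incident
toy <- data.frame(
  year     = c(2012, 2012, 2012, 2013, 2013, 2013, 2013),
  month    = c(1, 1, 7, 1, 7, 7, 7),
  IS_CRIME = 1
)

monthly <- toy %>%
  filter(IS_CRIME == 1) %>%
  group_by(year, month) %>%
  # one row per year-month combination with its incident count
  summarise(incidents = n(), .groups = "drop") %>%
  arrange(year, month)

print(monthly)
```

Plotting incidents against month with one line per year would then show whether the same months peak every year.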

What I’ll do in the next crime posts

  • Look for patterns by location
  • Lay out some visualizations on maps
  • Try to identify areas with high volumes of traffic incidents (maybe I can avoid a ticket)
  • Answer the question: What types of crimes have grown the most in the last 5 years?

Code used in this post is on my GitHub
