Crime Analysis – Denver-Part 2

[This article was first published on R-Projects – Stoltzmaniac, and kindly contributed to R-bloggers].

Getting More Granular

Where we’re going
Having noticed crime rates heading up over the last few years, I wanted to take a closer look. I’ll look at “CRIME” first, before moving on to “TRAFFIC” in the data set. It sounds more interesting, and I hope the results don’t keep me up at night.

What we’ll do in this post

Let’s dive in!

Exploration of Data
Data provided by https://www.denvergov.org/opendata/dataset/city-and-county-of-denver-crime

Data Import & Formatting – shown in prior post: Crime Analysis – Denver-Part 1

Looking at Crime Incidents by Year

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year)

p = ggplot(df,aes(x=year,y=incidents,label=incidents))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1) + ggtitle('Crime Volume by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5))

df = data %>%  
  filter(IS_CRIME==1) %>%
  filter(year!=max(year(date))) %>%
  group_by(year) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(year) %>%
  mutate(YoYPercentageChange=round(100*(incidents-lag(incidents))/lag(incidents),0))
df = df[!is.na(df$YoYPercentageChange),]  
p = ggplot(df,aes(x=year,y=YoYPercentageChange,label=YoYPercentageChange))  
p + geom_bar(stat='identity') + geom_text(fontface='bold',size=6,col='white',vjust=1) + ggtitle('Crime Percentage Change Year-Over-Year') + xlab('Year') + ylab('YoY Incident % Change') + theme(plot.title = element_text(hjust = 0.5))
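
To make the year-over-year arithmetic concrete, here is a minimal sketch of the same lag-based calculation on a tiny data frame. The numbers in `toy` are invented for illustration and are not Denver data; the column names simply mirror the ones used above.

```r
library(dplyr)

# Hypothetical yearly totals, made up for this example (not real Denver data)
toy <- data.frame(
  year      = c(2011, 2012, 2013, 2014),
  incidents = c(100, 110, 132, 99)
)

yoy <- toy %>%
  arrange(year) %>%
  # lag(incidents) is the previous row's value, i.e. the prior year's total
  mutate(YoYPercentageChange = round(100 * (incidents - lag(incidents)) / lag(incidents), 0)) %>%
  # The first year has no prior year to compare against, so its value is NA
  filter(!is.na(YoYPercentageChange))

print(yoy)
```

The first year always drops out because there is nothing to compare it with, which is why the chart starts one year later than the volume chart.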

Observations

Highest volume of “IS_CRIME” types
Identify the offense by OFFENSE_CATEGORY_ID and exclude months we have not seen so far this year.

#Isolate Years 2012 - 2014
data = data[data$year <= 2014 & data$year >= 2012,]  

#Find the latest year in the data and the latest month observed in that year
maxYear = max(data$year)  
maxMonthYTD = max(data$month[data$year==maxYear])

#Look into IS_CRIME only
df = data %>%  
  filter(IS_CRIME==1) %>%
  group_by(year,OFFENSE_CATEGORY_ID) %>%
  summarise(incidents=sum(IS_CRIME)) %>%
  arrange(desc(incidents))

p = ggplot(df,aes(x=factor(year),y=incidents,fill=year))  
p + geom_bar(stat='identity') + ggtitle('Crime Incidents Reported by Year') + xlab('Year') + ylab('Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3)
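
The snippet above computes `maxYear` and `maxMonthYTD` but does not use them in the filter shown. A minimal sketch of how a year-to-date filter could be applied, assuming a numeric `month` column like the one referenced above; the rows in `toy` are invented for illustration only:

```r
library(dplyr)

# Hypothetical incident rows, made up for this example (not real Denver data)
toy <- data.frame(
  year     = c(2013, 2013, 2013, 2014, 2014),
  month    = c(1, 6, 12, 1, 6),
  IS_CRIME = c(1, 1, 1, 1, 1)
)

maxYear     <- max(toy$year)
maxMonthYTD <- max(toy$month[toy$year == maxYear])

# Keep only months already observed in the latest year, so every year
# covers the same January-to-maxMonthYTD window
ytd <- toy %>% filter(month <= maxMonthYTD)

counts <- ytd %>%
  group_by(year) %>%
  summarise(incidents = sum(IS_CRIME))

print(counts)
```

When every year in the data is complete, `maxMonthYTD` is 12 and the filter changes nothing; it only matters when the latest year is partial.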

Observations
It appears that “all-other-crimes” moved the needle the most between 2012 and 2014. This is not a very specific category. It’s also worth noting that “other-crimes-against-persons” grew as well. Both of these lead to some speculation that these vague types of crimes perhaps started being reported during this period and hadn’t been documented before.

Many of the other categories have a much lower volume of incidents, so growth is more difficult to see in the visualizations for these cases.

Here’s a look at growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  mutate(YoYchange=round(incidents-lag(incidents),0)) %>% filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=0.5,size=5,col='red',fontface='bold')

Here’s a look at % growth year-over-year:

df2 = df %>%  
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID,year) %>%
  mutate(YoYchange=round(100*(incidents-lag(incidents))/lag(incidents),0)) %>% filter(year != 2012)

p = ggplot(df2,aes(x=factor(year),y=YoYchange,fill=year,label=YoYchange))  
p + geom_bar(stat='identity') + ggtitle('% Change in Crime Incidents vs Previous Year') + xlab('Year') + ylab('YoY % Change in Incidents') + theme(plot.title = element_text(hjust = 0.5),legend.position = 'none') + guides(fill = guide_legend(title='Year')) + coord_flip() + facet_wrap(~OFFENSE_CATEGORY_ID,ncol=3) + geom_text(hjust=1,col='red',size=5,fontface='bold')
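
Both per-category tables depend on `lag()` being evaluated inside each `OFFENSE_CATEGORY_ID` group, so the first year of one category is never compared with the last year of another. A small sketch of that behavior; the category names and counts in `toy` are hypothetical, and `YoYpct` is a name introduced here only to show both measures side by side:

```r
library(dplyr)

# Hypothetical per-category yearly totals, made up for this example
toy <- data.frame(
  year                = rep(c(2012, 2013, 2014), times = 2),
  OFFENSE_CATEGORY_ID = rep(c("category-a", "category-b"), each = 3),
  incidents           = c(100, 150, 120, 40, 50, 60)
)

yoy <- toy %>%
  group_by(OFFENSE_CATEGORY_ID) %>%
  arrange(OFFENSE_CATEGORY_ID, year) %>%
  mutate(
    # Within a grouped mutate, lag() restarts at the first row of each category
    YoYchange = incidents - lag(incidents),
    YoYpct    = round(100 * (incidents - lag(incidents)) / lag(incidents), 0)
  ) %>%
  # 2012 is the first year of every category, so it has no prior year
  filter(year != 2012) %>%
  ungroup()

print(yoy)
```

Without the `group_by()`, the 2012 row of the second category would be compared against the 2014 row of the first, which is why the grouping step comes before the `mutate()`.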

Observations

Final Thoughts (for now)
Due to the vague nature of the types of crimes which grew the most, I can’t determine exactly what happened in Denver during 2013. In the less vague crimes, “drug-alcohol” saw the largest increase. This was followed by “public-disorder” and perhaps there’s a relationship there. My assumption is that one may perhaps cause the other…

I’m still curious about the seasonality and month-to-month effects. Perhaps certain types of crimes are more common at certain times of year. I’m also very interested to see whether a new population was added to the mix in 2013. If a certain part of Denver was added that year, it would go a long way toward explaining the situation.

What I’ll do in the next crime posts

Code used in this post is on my GitHub
