Site icon R-bloggers

Motor Vehicle Collision Density in NYC

[This article was first published on Stable Markets » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In a previous post, I visualized crime density in Boston using R’s ggmap package. In this post, I use ggmap to visualize vehicle accidents in New York City. R code and data are included in this post.

The data comes from NYC Open Data. My data cut ranges from 2012 to 2015. The data tracks the type of vehicle, the name of the street on which the accident occurs, as well as the longitude and latitude coordinates of the accident. Both coordinates are saved as a single character variable called “LOCATION”.

Below, I load the ggmap and gridExtra packages, load the data, drop all accidents without location coordinates, and parse the LOCATION variable to get the longitude and latitude coordinates. I also parse the date variable to create a year variable and use that variable to create two data sets: one with all vehicle accidents in 2013 and another with all vehicle accidents in 2014

 
library(ggmap)
library(gridExtra)

d=read.csv('.../NYPD_Motor_Vehicle_Collisions.csv')

d_clean=d[which(regexpr(',',d$LOCATION)!=-1),]

comm=regexpr(',',d_clean$LOCATION)
d_clean$loc=as.character(d_clean$LOCATION)
d_clean$lat=as.numeric(substr(d_clean$loc,2,comm-1))
d_clean$long=as.numeric(substr(d_clean$loc,comm+1,nchar(d_clean$loc)-1))
d_clean$year=substr(d_clean$DATE,7,10)

d_2013=d_clean[which(d_clean$year=='2013'),c('long','lat')]
d_2014=d_clean[which(d_clean$year=='2014'),c('long','lat')]

Next, I use get_map() to query Google Maps and get a map of NYC. I add a two-dimensional density layer to this map using stat_density2d(). I do this for both 2013 and 2014 data and use gridExtra’s grid.arrange() to place the maps side by side.

ny_plot=ggmap(get_map('New York, New York',zoom=12, maptype='terrain'))

plot1=ny_plot+
stat_density2d(data= d_2013, aes(x = d_2013$long, y = d_2013$lat, alpha=.75,fill=..level..),bins = 10, geom = 'polygon')+
guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
scale_alpha(guide = FALSE)+ 
xlab(' ')+ylab(' ')+
ggtitle('Vehicle Accidents - NYC - 2013')

plot2=ny_plot+
stat_density2d(data= d_2014, aes(x = d_2014$long, y = d_2014$lat, alpha=.75,fill=..level..),bins = 10, geom = 'polygon')+
guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
scale_alpha(guide = FALSE)+ 
xlab(' ')+ylab(' ')+
ggtitle('Vehicle Accidents - NYC - 2014')

grid.arrange(plot1, plot2,nrow=1,ncol=2) 

Next, I plot 2013 accident densities by borough. I write a function boro() that drops all observations with a missing street name and subsets the 2013 data based on borough. I stack them together while adding borough names, then assign each borough a color in vector col_vals. I use stat_density2d’s group parameter to plot each borough’s density layers separately with different colors.

boro=function(x){
 d_clean2=d_clean[which(d_clean$ON.STREET.NAME!='' & d_clean$BOROUGH==x),]
 d_2013_2=d_clean2[which(d_clean2$year=='2013'),c('long','lat','ON.STREET.NAME')]
return(d_2013_2)
}

man=boro('MANHATTAN')
bronx=boro('BRONX')
brook=boro('BROOKLYN')
si=boro('STATEN ISLAND')
q=boro('QUEENS')

full=rbind(cbind(man,group='Manhattan'),
            cbind(bronx,group='Bronx'),
            cbind(brook,group='Brooklyn'),
            cbind(si,group='Staten Island'),
            cbind(q,group='Queens'))
Borough=full$group

col_vals=c('Manhattan'='dodgerblue', 'Bronx'='darkred',
            'Brooklyn'='violet','Staten Island'='darkgreen',
            'Queens'='darkgoldenrod4')

ny_plot=ggmap(get_map('New York, New York',zoom=11, maptype='terrain'))

plot4=ny_plot+
  stat_density2d(data=full, geom='polygon',bins = 10, aes(x=full$long,y=full$lat,fill = Borough, alpha=..level..))+
  scale_fill_manual(values=col_vals)+
  #guides(fill = guide_colorbar(barwidth = 1, barheight = 12)) +
  scale_alpha(guide = FALSE)+ 
  xlab(' ')+ylab(' ')+
  ggtitle('NYC Vehicle Accident Density by Borough, 2013')
plot4


To leave a comment for the author, please follow the link and comment on their blog: Stable Markets » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.