Plotting Russian AiRstRikes in SyRia
[This article was first published on Fear and Loathing in Data Science, and kindly contributed to R-bloggers.]
“Who do we think will rise if Assad falls?”
“Do we have a “government in a box” that we think we can fly to Damascus and put into power if the Syrian army collapses, the regime falls and ISIS approaches the capital?”
“Have we forgotten the lesson of “Animal Farm”? When the animals revolt and take over the farm, the pigs wind up in charge.”
Patrick J. Buchanan
In my new book, “Mastering Machine Learning with R”, I wanted to include geo-spatial mapping in the chapter on cluster analysis. I actually completed the entire chapter doing a cluster analysis on the Iraq Wikileaks data, plotting the clusters on a map and building a story around developing an intelligence estimate for the Al-Doura Oil Refinery, which I visited on many occasions during my 2009 “sabbatical”. However, the publisher convinced me that the material was too sensitive for such a book and I totally re-wrote the analysis with a different data set. I may or may not publish it on this blog at some point, but I want to continue to explore building maps in R. As luck would have it, I stumbled into a data set showing the locations of Russian airstrikes in Syria at the following site:
http://russia-strikes-syria.silk.co/
The data includes the latitude and longitude of the strikes along with other background information. What data was collected, how, and why is explained here:
https://www.bellingcat.com/news/mena/2015/10/26/what-russias-own-videos-and-maps-reveal-about-who-they-are-bombing-in-syria/
In short, the site tried to independently verify locations, targets, etc., and includes what it claims are the reported versus the actual strike locations. When I pulled the data, the site had analyzed 60 strikes. It was unable to determine the locations of 11 of them, so we have 49 data points.
I built the data in Excel and saved it as a .csv file, which I’ve already loaded. Here is the structure of the data.
> str(airstrikes)
'data.frame':   120 obs. of  4 variables:
 $ Airstrikes   : chr  "Strike 1" "Strike 10" "Strike 11" "Strike 12" ...
 $ Lat          : chr  "35.687782" "35.725846" "35.734952" "35.719518" ...
 $ Long         : chr  "36.786667" "36.260419" "36.073837" "36.072385" ...
 $ real_reported: chr  "real" "real" "real" "real" ...
> head(airstrikes)
  Airstrikes       Lat      Long real_reported
1   Strike 1 35.687782 36.786667          real
2  Strike 10 35.725846 36.260419          real
3  Strike 11 35.734952 36.073837          real
4  Strike 12 35.719518 36.072385          real
5  Strike 13 35.309074 36.620506          real
6  Strike 14 35.817206 36.124503          real
> tail(airstrikes)
    Airstrikes       Lat      Long real_reported
115  Strike 59 35.644864 36.338568      reported
116   Strike 6 35.740134 36.247029      reported
117  Strike 60  36.09346 37.085198      reported
118   Strike 7 35.702113 36.563525      reported
119   Strike 8 35.822472 36.018779      reported
120   Strike 9 35.725846 36.260419      reported
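For anyone following along, the loading step might look something like the sketch below. It is only an illustration: the three rows are copied from the output above, the inline text stands in for the real file, and a name such as "airstrikes.csv" would be a placeholder, not the actual file name. I use colClasses = "character" purely to reproduce the all-character structure shown above.

```r
# Illustrative sample only: three rows in the same layout as the data above.
# With a real file this would be read.csv("airstrikes.csv", ...), where the
# file name is a placeholder.
csv_text <- "Airstrikes,Lat,Long,real_reported
Strike 1,35.687782,36.786667,real
Strike 10,35.725846,36.260419,real
Strike 9,35.725846,36.260419,reported"

# Read every column as character, mirroring the str() output in the post
airstrikes_demo <- read.csv(text = csv_text,
                            colClasses = "character",
                            stringsAsFactors = FALSE)

str(airstrikes_demo)
```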
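Since Lat and Long are stored as character, I need to convert them to numeric and also keep a subset containing only the actual (“real”) strike locations. The coercion warnings below presumably come from the strikes whose locations could not be determined, which have no numeric coordinates and so become NA.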
> airstrikes$Lat = as.numeric(airstrikes$Lat)
Warning message:
NAs introduced by coercion
> airstrikes$Long = as.numeric(airstrikes$Long)
Warning message:
NAs introduced by coercion
> real = subset(airstrikes, airstrikes$real_reported == "real")
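To see where a coercion warning like the one above can come from, here is a small sketch on made-up data. The "Hypothetical strike" row and its "unknown" placeholder are my own assumptions for illustration; I am not claiming this is how the missing locations were actually encoded in the file.

```r
# Toy data: two rows from the post plus one invented row with no coordinates
demo <- data.frame(
  Airstrikes    = c("Strike 1", "Hypothetical strike", "Strike 10"),
  Lat           = c("35.687782", "unknown", "35.725846"),
  Long          = c("36.786667", "unknown", "36.260419"),
  real_reported = c("real", "real", "real"),
  stringsAsFactors = FALSE
)

# Non-numeric strings become NA; suppressWarnings() hides the coercion warning
demo$Lat  <- suppressWarnings(as.numeric(demo$Lat))
demo$Long <- suppressWarnings(as.numeric(demo$Long))

# Count how many rows lost their coordinates
sum(is.na(demo$Lat))

# Keep only the real strikes that still have usable coordinates
real_demo <- subset(demo, real_reported == "real" & !is.na(Lat) & !is.na(Long))
real_demo
```

Dropping the NA rows explicitly is optional, since geom_point() should skip points with missing coordinates and warn about them, but it keeps the row count honest.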
> library(ggmap)
Loading required package: ggplot2
Google Maps API Terms of Service: http://developers.google.com/maps/terms.
Please cite ggmap if you use it: see citation('ggmap') for details.
> citation('ggmap')
To cite ggmap in publications, please use:
D. Kahle and H. Wickham. ggmap: Spatial Visualization with ggplot2. The
R Journal, 5(1), 144-161. URL
http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf
The first map will be an overall view of the country with the map type as “terrain”. Note that “satellite”, “hybrid” and “roadmap” are also available.
> map1 = ggmap(
    get_googlemap(center = "Syria", zoom = 7, maptype = "terrain"))
With the map created as object “map1”, I plot the locations using “geom_point()”.
> map1 + geom_point(
    data = real, aes(x = Long, y = Lat), pch = 19, size = 6, col = "red3")
With the exception of what looks like one strike near Ar Raqqah, we can see they are concentrated between Aleppo and Homs with some close to the Turkish border. Let’s have a closer look at that region.
> map2 = ggmap(
    get_googlemap(center = "Ehsim, Syria", zoom = 9, maptype = "terrain"))
> map2 + geom_point(data = real, aes(x = Long, y = Lat),
    pch = 18, size = 9, col = "red2")
East of Ghamam is a large concentration, so let’s zoom in on that area and add the strike number as labels.
> map3 = ggmap(
    get_googlemap(center = "Dorien, Syria", zoom = 13, maptype = "hybrid"))
> map3 + geom_point(
    data = real, aes(x = Long, y = Lat), pch = 18, size = 9, col = "red3") +
    geom_text(data = real, aes(x = Long, y = Lat, label = Airstrikes),
    size = 5, vjust = 0, hjust = -0.25, color = "white")
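The map centers so far are place names picked by eye. An alternative, sketched here with base R only, is to derive a center from the coordinates themselves by taking the midpoint of the longitude and latitude ranges. The four points are copied from the head() output earlier, and the resulting vector uses the lon/lat naming that get_googlemap() accepts for its center argument. Treat it as a rough starting point, not a replacement for looking at the map.

```r
# Four of the real strike locations shown earlier in the post
pts <- data.frame(
  Lat  = c(35.687782, 35.725846, 35.734952, 35.719518),
  Long = c(36.786667, 36.260419, 36.073837, 36.072385)
)

# Midpoint of the bounding box: average of min and max in each direction
centre <- c(lon = mean(range(pts$Long)),
            lat = mean(range(pts$Lat)))
centre
```

A named vector like this could then be passed as center = centre in place of a place name.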
The last thing I want to do is focus in on the site for Strike 28. To do this we will require the lat and long, which we can find with the which() function.
> which(real$Airstrikes == "Strike 28")
[1] 21
> real[21,]
   Airstrikes      Lat     Long real_reported
21  Strike 28 35.68449 36.11946          real
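The same lookup can be done without noting the row index by hand. This is a minimal sketch on a two-row stand-in for the real subset (values copied from the output in this post); the names real_demo, s28 and centre28 are mine, chosen only for illustration.

```r
# Two-row stand-in for the 'real' subset used in the post
real_demo <- data.frame(
  Airstrikes = c("Strike 12", "Strike 28"),
  Lat        = c(35.719518, 35.68449),
  Long       = c(36.072385, 36.11946),
  stringsAsFactors = FALSE
)

# Logical subsetting picks the row directly, no which() or row number needed
s28 <- real_demo[real_demo$Airstrikes == "Strike 28", ]

# Named lon/lat vector in the form get_googlemap() takes for 'center'
centre28 <- c(lon = s28$Long, lat = s28$Lat)
centre28
```

This avoids retyping the coordinates, which matters if the data is refreshed and the row order changes.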
It is now a simple matter of passing those coordinates to get_googlemap() as the map center.
> map4 = ggmap(
    get_googlemap(center = c(lon = 36.11946, lat = 35.68449), zoom = 17, maptype = "satellite"))
> map4 + geom_point(
    data = real, aes(x = Long, y = Lat), pch = 22, size = 12, col = "red3") +
    geom_text(data = real, aes(x = Long, y = Lat, label = Airstrikes),
    size = 9, vjust = 0, hjust = -0.25, color = "white")
From the looks of it, this seems to be an isolated location, so it was probably some sort of base or logistics center. If you’re interested, the Russian Ministry of Defense posts videos of these strikes and you can see this one on YouTube.
https://www.youtube.com/watch?v=Ape5grS9MEM
OK, so that is a quick tutorial on using ggmap, a very powerful package. We’ve just scratched the surface of what it can do. I will continue to monitor the site for additional data, and perhaps publish a Shiny app if the data becomes large and “rich” enough.
Cheers,
CL