Plane Crash Data – Part 2: Google Maps Geocoding API Request
This is the second part of our series about plane crash data. To execute the code below, you’ll first need to execute the code from the first part of this series to obtain the prepared plane crash dataset.
In this part I’d like to get the geocoordinates from the Google Maps Geocoding API for the crash location and the point of departure as well as for the intended point of arrival. The location of the crash is contained in the location variable. The other two pieces of information are contained in the route variable, so we first need to extract them and store them in separate variables.
<pre class="r"><code>separators <- " - |- | -"
data <- data %>%
  # split route variable into "from" and "to"
  separate(route, sep = separators, into = c("from", "to"), extra = "merge") %>%
  # if there was a pit stop, "to" sometimes still contains two locations.
  # we need only the last one.
  separate(to, sep = separators, into = c("pitStop", "to"), fill = "left") %>%
  select(-pitStop)</code>
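To see what this split does to individual routes, here is a minimal sketch on three made-up route strings. It uses base R's `strsplit()` instead of `separate()` so that it runs without any packages; the vectors `routes`, `from` and `to` are illustrative names and not part of the dataset.

```r
separators <- " - |- | -"

# made-up routes: a direct flight, a flight with a pit stop, and a route without a destination
routes <- c("Paris - London", "Lima - Cusco - Arequipa", "Test flight")

# split every route at the separators
pieces <- strsplit(routes, separators)

# "from" is the first piece, "to" is the last piece (NA if there is only one piece)
from <- vapply(pieces, function(p) p[1], character(1))
to <- vapply(pieces,
             function(p) if (length(p) > 1) p[length(p)] else NA_character_,
             character(1))

data.frame(route = routes, from = from, to = to)
```

The pit stop (Cusco) is dropped, and a route without a separator ends up with a missing `to`, which matters for the next step.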
To avoid sending missing locations to the API, we exclude incomplete cases (rows with an NA in any variable) right from the start.
<pre class="r"><code># exclude observations with NA
data <- data[complete.cases(data), ]</code>
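As a quick illustration of what `complete.cases()` keeps, here is a small sketch on a made-up data frame (`toy` is a hypothetical stand-in for the real dataset):

```r
# made-up stand-in for the plane crash data
toy <- data.frame(
  location = c("Paris, France", NA, "Lima, Peru"),
  from     = c("London", "Rome", NA),
  to       = c("Paris", "Madrid", "Cusco"),
  stringsAsFactors = FALSE
)

# TRUE only for rows without any NA
keep <- complete.cases(toy)

toyComplete <- toy[keep, ]
toyComplete
```

Only the first row survives, because each of the other two rows has an NA in at least one variable.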
To send requests to the Google Maps Geocoding API, which converts addresses into geocoordinates, you need an API key. You can obtain one from Google: Get Google API key
Let us store our key in an R object:
<pre class="r"><code>apiKey <- "ENTER_API_KEY_HERE"</code>
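If you would rather not type the key into a script that you might share, one option is to read it from an environment variable. This is only a sketch: the helper `readApiKey` and the variable name `GOOGLE_MAPS_API_KEY` are assumptions made for this example, not something Google or the first part of this series prescribes.

```r
# hypothetical helper: read the key from an environment variable
# ("GOOGLE_MAPS_API_KEY" is a name chosen for this sketch)
readApiKey <- function(envVar = "GOOGLE_MAPS_API_KEY") {
  key <- Sys.getenv(envVar, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", envVar, " is not set.")
  }
  key
}

# usage, once the variable is set (e.g. in your .Renviron file):
# apiKey <- readApiKey()
```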
Now we have almost everything ready: we have complete data containing the locations of departure, intended arrival, and crash as strings, and we have an API that converts these strings into geocoordinates. However, this API returns the geocoordinates in the form of a JSON string, which we can’t use right away. So we need a function that extracts the relevant information from this JSON string and stores it in our dataset. For this, we load the jsonlite package.
<pre class="r"><code>library("jsonlite")</code>
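To get a feeling for what `fromJSON()` makes of such a response, here is a sketch with a heavily shortened, hand-written JSON string. The address and the coordinates are made up, and a real response contains many more fields.

```r
library("jsonlite")

# shortened, hand-written example response (values are made up)
sampleJson <- '{
  "results": [
    {
      "formatted_address": "Example Town, Exampleland",
      "geometry": {"location": {"lat": 48.5, "lng": 2.25}}
    }
  ],
  "status": "OK"
}'

parsed <- fromJSON(sampleJson)

# the status tells us whether the request was successful
parsed$status

# the coordinates sit in a nested data frame
str(parsed$results$geometry$location)
```

`fromJSON()` turns the array of results into a data frame with nested data frame columns, which is why the coordinates can be reached with a chain of `$`.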
Look at the following function. It takes two arguments: the location and the API key. It returns a one-row data frame containing the geocoordinates of the location. If the status of the request is “OK”, the API returns the geocoordinates (latitude lat and longitude lng), and our function takes the first match and writes it into a data frame. If the Google API cannot find any coordinates for the requested location, it returns the status “ZERO_RESULTS”, and our function returns NAs instead. This can happen if the location is recorded as unknown (“?”) or as something like Sightseeing, for example.
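<pre class="r"><code>getGeoCoord <- function(loc, apiKey) {
  # create request; URLencode() escapes spaces and other special characters
  request <- paste0("https://maps.googleapis.com/maps/api/geocode/json?",
                    "address=", URLencode(loc, reserved = TRUE),
                    "&key=", apiKey)
  # send the request and parse the JSON response into a list
  result <- fromJSON(request)
  if (result$status == "OK") {
    # keep only the first match
    result <- result$results$geometry$location[1, ]
  } else {
    # "ZERO_RESULTS" means no match; warn about any other status (e.g. an invalid key)
    if (result$status != "ZERO_RESULTS") {
      warning("Geocoding '", loc, "' returned status ", result$status)
    }
    result <- data.frame(lat = NA_real_, lng = NA_real_)
  }
  result %>% data.frame
}</code>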
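Because getGeoCoord needs a valid key and a network connection, it can be handy to check the branching on the status without sending a request. The sketch below does this with two hand-written responses. `coordFromParsed` is a hypothetical helper, not part of the original code, and for simplicity it treats every status other than “OK” as missing.

```r
library("jsonlite")

# hypothetical helper: pick the coordinates out of an already parsed response
coordFromParsed <- function(parsed) {
  if (parsed$status == "OK") {
    # first match only
    parsed$results$geometry$location[1, ]
  } else {
    # no usable result: return NAs with the same column names
    data.frame(lat = NA_real_, lng = NA_real_)
  }
}

# hand-written responses (coordinates are made up)
okJson <- '{"results":[{"geometry":{"location":{"lat":48.5,"lng":2.25}}}],"status":"OK"}'
emptyJson <- '{"results":[],"status":"ZERO_RESULTS"}'

coordOk <- coordFromParsed(fromJSON(okJson))
coordEmpty <- coordFromParsed(fromJSON(emptyJson))

coordOk
coordEmpty
```

Both calls return a one-row data frame with the columns lat and lng, so the results can be stacked row by row later on.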
Now let us use the function and extract geocoordinates for the plane crash locations, the departure locations and the locations of intended arrival. We first store them in objects called coordCrash, coordFrom and coordTo. Then we add them to our existing dataframe.
<pre class="r"><code># send requests:
# crash location
coordCrash <- lapply(data$location, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>%
  setNames(paste0(names(.), "CrashLoc"))
# departure location
coordFrom <- lapply(data$from, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>%
  setNames(paste0(names(.), "From"))
# intended arrival location
coordTo <- lapply(data$to, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>%
  setNames(paste0(names(.), "To"))
# add the new columns to data
data <- cbind(data, coordCrash, coordFrom, coordTo)</code>
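The code above sends one request per row, even if the same location appears many times. If you want to save requests, one possible approach is to geocode every distinct location only once and then copy the result back to all matching rows. The following is a sketch of that idea, not part of the original workflow: `geocodeUnique` and `fakeGeoCoord` are hypothetical names, and the fake geocoder only exists so that the example runs without an API key.

```r
# hypothetical helper: geocode each distinct location once, then expand back
geocodeUnique <- function(locations, geocodeFun, ...) {
  uniqueLocs <- unique(locations)
  coords <- do.call(rbind, lapply(uniqueLocs, geocodeFun, ...))
  # match() maps every original entry to the row of its unique location
  coords <- coords[match(locations, uniqueLocs), , drop = FALSE]
  rownames(coords) <- NULL
  coords
}

# stand-in for getGeoCoord so the sketch runs without network access;
# it counts its calls and returns made-up coordinates
calls <- 0
fakeGeoCoord <- function(loc, apiKey) {
  calls <<- calls + 1
  data.frame(lat = nchar(loc), lng = -nchar(loc))
}

coordDemo <- geocodeUnique(c("Paris", "Lima", "Paris"), fakeGeoCoord, apiKey = "dummy")
coordDemo
calls
```

Three rows come back, but the geocoder is called only twice. With the real function, the call would look like `geocodeUnique(data$location, getGeoCoord, apiKey = apiKey)`.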
Now the data is ready to be visualised. This happens in the third part of the series.
Further parts of the Plane Crash series:
- Plane Crash Data – Part 1: Web Scraping
- Plane Crash Data – Part 2: Google Maps Geocoding API Request
- Plane Crash Data – Part 3: Visualisation