
Placement: An R package to Access the Google Maps API

[This article was first published on R from Stata, and kindly contributed to R-bloggers.]

A few months ago I set out to write an R package for accessing the Maps API with my employer’s (paid) Google for Work/Premium account. At the time, I was unable to find an R package that could generate the encrypted signature, send the URL to Google, and process the JSON returns in one fell swoop. Following Google’s directions for Python, however, I was able to write an R function that creates valid signatures for a URL request using the digest package’s implementation of the SHA-1 algorithm. Along the way I added a few features that are useful in our workgroup, including (1) a function to retrieve Google Maps’ distance and travel time estimates (via public transit, driving, cycling, or walking) between two places (drive_time), (2) a general-purpose function for stripping address vectors of nasty characters that may break a geocode request (address_cleaner), and (3) methods for accessing the Google API with a (free) standard account (see also the excellent ggmap package, which provides a similar facility for geocoding with Google’s standard API).
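
For readers curious about the signing step, here is a minimal sketch of the procedure Google documents for premium accounts: decode the URL-safe base64 key, compute an HMAC-SHA1 over the path and query of the request, and append the URL-safe base64 result as a `signature` parameter. This is not the package's actual code. The function names `sign_string` and `sign_url`, the use of the base64enc package, and the dummy key and client ID below are all assumptions made for illustration.

```r
# Sketch only: sign_string()/sign_url() are hypothetical helpers, not part of placement.
# Assumes the digest and base64enc packages are installed.

sign_string <- function(text, privkey) {
  # Google keys use the URL-safe base64 alphabet; map it back before decoding
  key_raw <- base64enc::base64decode(chartr("-_", "+/", privkey))
  # HMAC-SHA1 of the text, returned as raw bytes
  sig_raw <- digest::hmac(key_raw, text, algo = "sha1", raw = TRUE)
  # Re-encode the signature with the URL-safe alphabet
  chartr("+/", "-_", base64enc::base64encode(sig_raw))
}

sign_url <- function(url, privkey) {
  # Only the path and query are signed, not the scheme and host
  path_and_query <- sub("^https?://[^/]+", "", url)
  paste0(url, "&signature=", sign_string(path_and_query, privkey))
}

# Usage with a made-up client ID and a dummy key (base64 of "dummy-key")
signed <- sign_url(
  "https://maps.googleapis.com/maps/api/geocode/json?address=New+York&client=clientID",
  "ZHVtbXkta2V5"
)
```

Splitting the work into two helpers keeps the cryptographic step testable on its own, separate from the URL handling.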

In daily use I’ve seen few issues thus far, and I’ve used earlier versions of this package to geocode about a quarter million physical locations in North America. The placement package, which includes examples, can be viewed on GitHub and installed in the usual way:

library(devtools)
install_github("DerekYves/placement")
library(placement)

Here are a few examples using the standard (free) API (see here to get a free API key from Google, which gives you higher quota limits than supplying an empty string):

# Get coordinates for the Empire State Building and Google
address <- c("350 5th Ave, New York, NY 10118, USA",
			 "1600 Amphitheatre Pkwy,
			 Mountain View, CA 94043, USA")

coordset <- geocode_url(address, auth="standard_api", privkey="",
            clean=TRUE, add_date='today', verbose=TRUE)
## Sending address vector (n=2) to Google...
## Finished. 2 of 2 records successfully geocoded.
# View the returns
print(coordset[ , 1:5])
##        lat        lng location_type
## 1 40.74844  -73.98566       ROOFTOP
## 2 37.42234 -122.08437       ROOFTOP
##                                             formatted_address status
## 1 Empire State Building, 350 5th Ave, New York, NY 10118, USA     OK
## 2        1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA     OK
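
To give a rough idea of what a standard-API geocoding request looks like before it is sent, the sketch below builds the request URL by hand in base R. It is an illustration under assumptions, not the internals of geocode_url: the helper name `build_geocode_url` is hypothetical, and only the endpoint and the `address` and `key` query parameters come from Google's public Geocoding API.

```r
# Sketch only: build_geocode_url() is a hypothetical helper, not part of placement.
build_geocode_url <- function(address, key = "") {
  base <- "https://maps.googleapis.com/maps/api/geocode/json"
  # Percent-encode spaces, commas, and other reserved characters
  query <- paste0("address=", URLencode(address, reserved = TRUE))
  # Append the API key only when one is supplied
  if (nzchar(key)) query <- paste0(query, "&key=", key)
  paste0(base, "?", query)
}

url <- build_geocode_url("350 5th Ave, New York, NY 10118, USA")
print(url)
```

The returned JSON could then be fetched and parsed with whichever HTTP and JSON packages you prefer; geocode_url handles those steps for you.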

Distance calculations (note that some transit options are not accessible with the standard API):

# Bike from NYC to Google!
address <- c("350 5th Ave, New York, NY 10118, USA",
			 "1600 Amphitheatre Pkwy, 
			 Mountain View, CA 94043, USA")

# Google allows you to supply geo coordinates *or* a physical address 
# for the distance API. In this example, we will supply coordinates
# from our previous call. Google requires a string format of: 
#   "lat,lng" (with no spaces) for coordinates.

start <- paste(coordset$lat[1],coordset$lng[1], sep=",")
end   <- paste(coordset$lat[2],coordset$lng[2], sep=",")

# Get the travel time by bike (more than 11 days!) and distance in miles:
howfar_miles <- drive_time(address=start, dest=end, auth="standard_api",
						   privkey="", clean=FALSE, add_date='today',
						   verbose=FALSE, travel_mode="bicycling",
						   units="imperial")

# Get the distance in kilometers using physical addresses instead of lat/lng:
howfar_kms <- drive_time(
	address="350 5th Ave, New York, NY 10118",
	dest="1600 Amphitheatre Pkwy, Mountain View, CA",
	auth="standard_api", privkey="", clean=FALSE,
	add_date='today', verbose=FALSE, travel_mode="bicycling",
	units="metric"
)

with(howfar_kms,
	 cat("Cycling from NYC to ", destination,
	 	":\n", dist_txt, " over ",
	 	time_txt, sep=""))
## Cycling from NYC to 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA:
## 5,345 km over 11 days 17 hours

Address cleaning function:

# Clean a "messy" or otherwise incompatible address vector:
address <- c(" 350 5th Ave. ½, New York, NY 10118, USA ",
			 "  ª1600  Amphitheatre Pkwy, 
			 Mountain View, CA 94043, USA")

# View the return:
address_cleaner(address)
## 	* Replacing non-breaking spaces
## 	* Removing control characters
## 	* Removing leading/trailing spaces, and runs of spaces
## 	* Transliterating latin1 characters
## 	* Converting special address markers
## 	* Removing all remaining non-ASCII characters
## 	* Remove single/double quotes and asterisks
## 	* Removing leading, trailing, and repeated commas
## 	* Removing various c/o string patterns
## [1] "350 5th Ave.  1/2, New York, NY 10118, USA"           
## [2] "a1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA"
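
To make a few of the steps listed in that output concrete, here is a simplified base R sketch of the same idea. It is not the address_cleaner source: the function name `clean_sketch` is hypothetical, it covers only some of the steps, and it drops non-ASCII characters outright instead of transliterating them (so "ª" is removed, not turned into "a").

```r
# Sketch only: clean_sketch() is a hypothetical, simplified stand-in for address_cleaner().
clean_sketch <- function(x) {
  x <- enc2utf8(x)
  x <- gsub("\u00a0", " ", x, fixed = TRUE)   # non-breaking spaces become plain spaces
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")  # drop remaining non-ASCII bytes
  x <- gsub("[[:cntrl:]]", " ", x)            # control characters (tabs, newlines)
  x <- gsub("[\"'*]", "", x)                  # single/double quotes and asterisks
  x <- gsub("\\s+", " ", x)                   # collapse runs of whitespace
  x <- gsub("(,\\s*)+", ", ", x)              # collapse repeated commas
  gsub("^[ ,]+|[ ,]+$", "", x)                # trim leading/trailing spaces and commas
}

messy <- c("  350 5th Ave,,  New York ",
           "\u00aa1600  Amphitheatre Pkwy,\n\t Mountain View, CA")
cleaned <- clean_sketch(messy)
print(cleaned)
```

The order matters here: non-ASCII characters are handled first so that the later regular expressions only ever see plain ASCII text.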

If you would like to apply this function to multiple address fields stored in separate columns (e.g., only “street 1” and “city”), you might try something like:

# address_df is assumed to be an existing data frame of character columns
address_df[] <- lapply(address_df, placement::address_cleaner)
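
The call above needs a data frame that already exists in your session. As a self-contained sketch of the same column-wise pattern, the example below builds a small hypothetical `address_df` and applies a stand-in cleaner, `tidy_spaces` (also hypothetical), so that it runs without the package installed. In real use you would swap in placement::address_cleaner.

```r
# Sketch only: address_df and tidy_spaces() are made up for illustration.
address_df <- data.frame(
  street1 = c(" 350 5th Ave ", "1600  Amphitheatre Pkwy"),
  city    = c("New York ", " Mountain View"),
  stringsAsFactors = FALSE
)

# Stand-in cleaner: collapse runs of spaces, then trim both ends
tidy_spaces <- function(x) trimws(gsub(" +", " ", x))

# Assigning into address_df[] keeps the data frame structure and column names
address_df[] <- lapply(address_df, tidy_spaces)
print(address_df)
```

Using lapply here returns one cleaned vector per column, which maps directly back onto the data frame's columns.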

Using your Google for Work account requires a client ID and API key; how to supply them is well documented in the package help files. Feel free to shoot me an email if you run into any issues!
