The analysis of geospatial information is currently a big trend in medicine and public health. Even though some may want to convince you that this can only be achieved with the latest and most expensive software, I am not convinced. First, the analysis of spatial data dates back to at least 1854, when John Snow investigated cholera outbreaks in London. Second, as I try to demonstrate today, some very interesting analyses and data can be retrieved essentially for free.
While a previous post showed how to plot freely available geospatial data in R, this post will show you how to use Python to access the Google Maps database and gather, e.g., travel times and distances to/from various locations with known zip codes.
Please note that this is my first Python script, so it will certainly not meet the high standards you might have developed based on previous posts. On the upside, you will get the baby-step instructions.
Update 2011/07/03: A much more user-friendly version of the script, which adds GUIs for selecting a CSV file containing the start and end addresses and for storing the results, can be found here. If you are afraid of Python, you can use the stand-alone Mac app "batchtimer", which basically contains all the necessary files, from here.
A. Installing Python
- Download and install Python and the Python setuptools package so that you can use easy_install.
- Install the google directions package: Just type easy_install google.directions
B. Run the script
The complete script as well as an example file with zip codes can be downloaded here.
Here is a more thorough description of what it does. Parts you may want to change are marked in bold. Basically, the script consists of four parts.
1. Load the necessary packages and set-up (you need a google directions key).
import csv
from google.directions import GoogleDirections
gd = GoogleDirections("your-google-directions-key")
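If you would rather not paste the key into the script itself, one option is to read it from an environment variable. The following is only a minimal sketch written for Python 3; the helper `load_api_key` and the variable name `GOOGLE_DIRECTIONS_KEY` are made up for illustration and are not part of the google.directions package.

```python
import os


def load_api_key(env_var="GOOGLE_DIRECTIONS_KEY", environ=None):
    """Return the API key stored in an environment variable.

    Both the function and the variable name are hypothetical;
    pick whatever name suits your setup.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable first" % env_var)
    return key


# Example with an explicit dictionary instead of the real environment.
print(load_api_key(environ={"GOOGLE_DIRECTIONS_KEY": "your-google-directions-key"}))
```

The returned string could then be passed to GoogleDirections in place of the literal key.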
2. Read zip-codes from a file
The example file is read like this:
zip_codes = csv.reader(open('/zips.csv', "rb"), delimiter=' ', quotechar='|')
zips = list(zip_codes)
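Note that csv.reader yields every row as a list of fields, so zips ends up as a list of lists. If you prefer plain strings, a small helper along these lines may be useful. This is a sketch for Python 3, not the original script; `read_zip_codes` is a hypothetical name, and the sample zip codes stand in for the real file.

```python
import csv
import io


def read_zip_codes(handle, delimiter=" ", quotechar="|"):
    """Return the first field of every non-empty row as a string."""
    reader = csv.reader(handle, delimiter=delimiter, quotechar=quotechar)
    return [row[0].strip() for row in reader if row and row[0].strip()]


# Stand-in for zips.csv; the values are made up for illustration.
sample = io.StringIO("10115\n20095\n\n80331\n")
zips = read_zip_codes(sample)
print(zips)
```

With a real file, you would pass an open file handle (in Python 3, opened with newline="") instead of the StringIO object.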
3. Loop through the list of zips
times = []
miles = []
for i in range(len(zips)):
    start = str(zips[i]) + ", Germany"
    end = "BERLIN," + "Germany"
    res = gd.query(start, end)
    temp = res.result["Directions"]["Duration"]["seconds"]
    times.append(temp)
    miles.append(res.distance)
    print i
Please check if the distance is given in miles or km!
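Once you know the unit, converting the collected values is straightforward. The sketch below (Python 3) assumes, purely for illustration, that the distance comes back in meters and the duration in seconds; verify this against your own results before relying on it. All function names are hypothetical.

```python
METERS_PER_MILE = 1609.344  # international mile


def meters_to_km(meters):
    """Convert a distance in meters to kilometers."""
    return meters / 1000.0


def meters_to_miles(meters):
    """Convert a distance in meters to miles."""
    return meters / METERS_PER_MILE


def format_duration(seconds):
    """Turn a duration in seconds into a readable 'H h MM min' string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes = remainder // 60
    return "%d h %02d min" % (hours, minutes)


# Assumption: 289000 is a distance in meters, 10800 a duration in seconds.
print(meters_to_km(289000))
print(round(meters_to_miles(289000), 1))
print(format_duration(10800))
```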
4. Write the output
out = csv.writer(open('/results.csv', 'wb'), delimiter=';', quotechar='X', quoting=csv.QUOTE_MINIMAL)
for i in range(len(times)):
    # writerow expects a sequence of fields; a single string would be split into characters
    out.writerow([str(zips[i]), str(times[i]), str(miles[i])])
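To see what the output looks like without touching the file system, the writing step can be tried on an in-memory buffer. This is a minimal Python 3 sketch with made-up values; `write_results` and the header names are hypothetical and not part of the original script.

```python
import csv
import io


def write_results(handle, zips, times, distances):
    """Write one semicolon-separated row per zip code, preceded by a header."""
    writer = csv.writer(handle, delimiter=";", quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    writer.writerow(["zip", "seconds", "distance"])
    for row in zip(zips, times, distances):
        writer.writerow(row)


# Made-up example values, only to show the resulting layout.
buffer = io.StringIO()
write_results(buffer, ["10115", "20095"], [0, 10800], [0, 289000])
print(buffer.getvalue())
```

Each row is passed to writerow as a sequence of fields, so the delimiter ends up between the zip code, the time, and the distance.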