
Visualizing iPhone location tracking with R and Google Maps

[This article was first published on Offensive Politics » R, and kindly contributed to R-bloggers.]

According to security researchers, the iPhone 4 is logging location data in the background, and apparently sending some part of that data to Apple every day or every few days (Wired). Silently recording location data is bad enough, but the data itself is also easily recoverable from an iPhone backup. Some enterprising guys (@aallen, @petewarden) wrote an OS X application, iPhone Tracker, to parse and visualize the location data on a map. As appalled as I was that this data exists, I was also really interested in rewriting their visualization code in R.

Researcher Drew Conway beat me to it with stalkR, but my code is sufficiently different that I think people can learn from both. I’ll walk through the code; links to the GitHub repo are at the end of the post.

Since the location database is stored inside an iOS backup, we’ll need to understand the structure of that backup. The backup contains a bunch of files named with a long hex string, and a few files that provide a binary table of contents. There is some nice Python code (iPhone Backup Decoder) to open up the table of contents and locate specific files. I was going to translate this code to R, but I decided on a brute-force approach instead. The file we’re looking for is a SQLite database that contains several uniquely named tables. I just try to open every file in a given backup directory as a SQLite database, and look for a known table name (CellLocation). If the file isn’t a database or the table doesn’t exist, we move on.

library(RSQLite)
library(RgoogleMaps)
 
findLocationDB <- function (basePath) {
	filename <- NA
	drv <- dbDriver("SQLite")
	# full.names=TRUE returns complete paths, so this works whether or not
	# basePath ends in a slash
	for(testFileName in list.files(basePath, full.names=TRUE) ) {
		# brute-force the connection
		con <- dbConnect(drv,testFileName)
		# try and list the tables. 
		# this will fail if the file is not a sqlite db
		tableList <- tryCatch(dbListTables(con), error=function(e) e)	
		# if class of tableList is character then we've got a sqlite DB 
		if(any(class(tableList) == "character")) {
			# look for the CellLocation table
			if(length(grep("CellLocation", tableList))>0) {
				# we've found it. save this filename
				filename <- testFileName
				dbDisconnect(con)
				break
			}
		}
		dbDisconnect(con)
	}
	dbUnloadDriver(drv)	
	return(filename)
}
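The brute-force loop above opens every file through the SQLite driver. A cheaper pre-filter, which is not what the original code does, is to check the first 16 bytes of each file, since SQLite 3 databases begin with the fixed header string "SQLite format 3" followed by a zero byte. The sketch below uses only base R; isSQLiteFile is a hypothetical helper name introduced for illustration.

```r
# Hypothetical helper (not part of the original post): returns TRUE when the
# file starts with the 16-byte SQLite 3 header "SQLite format 3\0".
isSQLiteFile <- function(path) {
	magic <- c(charToRaw("SQLite format 3"), as.raw(0))
	# missing files and directories are simply reported as "not SQLite"
	if (!file.exists(path) || file.info(path)$isdir) return(FALSE)
	# read at most 16 bytes; shorter files just return fewer bytes
	header <- tryCatch(readBin(path, what="raw", n=length(magic)),
	                   error=function(e) raw(0))
	identical(header, magic)
}

# quick self-check with two throwaway files
sqliteLike <- tempfile()
writeBin(c(charToRaw("SQLite format 3"), as.raw(0), as.raw(1:10)), sqliteLike)
plainText <- tempfile()
writeLines("just some text", plainText)

isSQLiteFile(sqliteLike)  # TRUE
isSQLiteFile(plainText)   # FALSE
```

A header match only says the file looks like a SQLite database; the table-name check is still needed to decide whether it is the location database.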

Now that we’ve got a function to find a location database, we can access the database and load it into a data frame.

fetchLatLongTimestamp <- function(dbLocation,loc.table.name,accuracy=1.0) {	
	ldata <- NA
	con <- dbConnect(dbDriver("SQLite"), dbLocation)
	ldata <- dbReadTable(con, loc.table.name)
	# drop data where lat == 0.0 && long == 0.0
	# (guarded, because ldata[-integer(0),] would drop every row)
	zeroRows <- which(ldata$Latitude == 0.0 & ldata$Longitude == 0.0)
	if(length(zeroRows) > 0) ldata <- ldata[-zeroRows,]
	# convert the mac timestamp (seconds since 2001-01-01) to a date-time
	ldata$datetime <- as.POSIXct(ldata$Timestamp, origin="2001-01-01")
	# divide the lat long by accuracy; the default of 1.0 leaves them unchanged
	ldata$Latitude <- ldata$Latitude / accuracy
	ldata$Longitude <- ldata$Longitude / accuracy
	ldata <- ldata[,c("Latitude","Longitude", "datetime")]
	dbDisconnect(con)
	return(ldata)
}

This fetchLatLongTimestamp function will load the entire location database into a data frame, and then clean up the timestamps and remove bad location data. I had originally seen the time stamp correction code on Prof Jackman’s blog, so thanks to him for that (and pscl!).
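To make the two cleanup steps concrete, here is a minimal base-R sketch on a made-up data frame. It assumes the Timestamp column counts seconds since 2001-01-01 00:00:00 UTC (the Mac epoch); the toy values and the macEpochOffset name are illustrative and not taken from a real backup.

```r
# toy rows standing in for the CellLocation table (values are made up)
toy <- data.frame(
	Timestamp = c(322074189, 322074189, 0),
	Latitude  = c(38.90612, 0.0, 38.90563),
	Longitude = c(-77.03961, 0.0, -77.03929)
)

# drop rows where both coordinates are exactly zero; logical indexing keeps
# every row when nothing matches, whereas negative indexing with an empty
# which() result would return no rows at all
toy <- toy[!(toy$Latitude == 0.0 & toy$Longitude == 0.0), ]

# seconds since 2001-01-01 00:00:00 UTC -> POSIXct
toy$datetime <- as.POSIXct(toy$Timestamp, origin="2001-01-01", tz="UTC")
format(toy$datetime, "%Y-%m-%d %H:%M:%S")
# "2011-03-17 17:03:09" "2001-01-01 00:00:00"

# the same shift expressed as a number: seconds between the Unix epoch
# (1970-01-01) and the Mac epoch (2001-01-01)
macEpochOffset <- as.numeric(as.POSIXct("2001-01-01", tz="UTC"))
macEpochOffset  # 978307200
```

Passing tz="UTC" explicitly keeps the printed times independent of the machine's local time zone; the function above omits it and therefore displays local time.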

Now we’ve got a data frame of Latitude, Longitude, and datetime stamp that looks more or less like this:

Lat       Lon        Timestamp
38.90612  -77.03961  2011-03-17 17:03:09
38.90563  -77.03929  2011-03-17 17:03:09
38.90567  -77.03957  2011-03-17 17:03:09
38.90574  -77.03988  2011-03-17 17:03:09
38.90561  -77.03967  2011-03-17 17:03:09

The Lat/Lon represents downtown DC, near where I bought my iPhone last month.

Now that we’ve got a data frame full of juicy location data, we need to plot it on a map. I used the fantastic RgoogleMaps package, and ripped most of the vignette (pdf: RgoogleMaps: An R Package for plotting on Google map tiles within R) for loading a map and plotting points by latitude and longitude.

If I’ve got my location data in a data frame called ldata, I can use the following to find the correct bounds and zoom level, fetch a map, and plot my location data. Again, the drawing code is basically ripped from the RgoogleMaps vignette.

## plot a map of all the positions
bb <- qbbox(ldata$Latitude, ldata$Longitude)
# zoomlevel 4 works for my data (US only) 
zoomlevel <- 4
# grab the map
map <- GetMap.bbox(bb$lonR, bb$latR,zoom=zoomlevel,maptype="mobile")
# plot the points as circles 
PlotOnStaticMap(map,lon=ldata$Longitude,lat=ldata$Latitude,col="blue",verbose=0)

Which gives us:

Obviously I spent a lot of time in Washington, DC, New York, Boston, and Las Vegas. Because we’re just using R, I can easily slice and dice the data. Let’s say I just wanted to see my Las Vegas data (April 1st – April 4th):

	ldata.lv <- ldata[which(ldata$datetime >= as.POSIXlt('2011-04-01 23:00:00') & ldata$datetime <= as.POSIXlt('2011-04-04 14:00:00')),]
	bb.lv <- qbbox(ldata.lv$Latitude, ldata.lv$Longitude)
	# a zoom level of 12 centers nicely on the las vegas strip
	zoom.lv <- 12
	map.lv <- GetMap.bbox(bb.lv$lonR, bb.lv$latR,zoom=zoom.lv,maptype="mobile")
	PlotOnStaticMap(map.lv,lon=ldata.lv$Longitude,lat=ldata.lv$Latitude,col="blue",verbose=0)

Which gives us:

Yes, I spent a lot of time at the Wynn, Caesars Palace, and In-N-Out Burger.
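The date-window filter used for the Las Vegas slice can be wrapped in a small helper so other trips are easy to pull out. This is only a sketch: filterByWindow is a hypothetical name, the toy coordinates are made up, and it assumes the datetime column is POSIXct. It takes an explicit time zone because a date string parsed without one is interpreted in the machine's local zone.

```r
# Hypothetical helper: keep rows whose datetime falls inside [from, to].
filterByWindow <- function(df, from, to, tz="UTC") {
	start <- as.POSIXct(from, tz=tz)
	end   <- as.POSIXct(to, tz=tz)
	df[df$datetime >= start & df$datetime <= end, , drop=FALSE]
}

# toy location data (made-up points: one in DC, two near Las Vegas)
toy <- data.frame(
	Latitude  = c(38.90612, 36.12650, 36.11470),
	Longitude = c(-77.03961, -115.16560, -115.17280)
)
toy$datetime <- as.POSIXct(
	c("2011-03-17 17:03:09", "2011-04-02 10:00:00", "2011-04-03 21:30:00"),
	tz="UTC")

trip <- filterByWindow(toy, "2011-04-01 23:00:00", "2011-04-04 14:00:00")
nrow(trip)  # 2
```

The resulting data frame has the same columns as the input, so it can be handed to qbbox and PlotOnStaticMap in the same way as ldata.lv.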

Here is the full driver code:

## change this to the full path of a backup of an ios 4 device
backupPath <- "C:/Documents and Settings/YourUser/Application Data/apple computer/MobileSync/Backup/a6ddb1824738f61a15b3e3c87e3e8172599b7134/"
 
dbLoc <- findLocationDB(backupPath)
 
if(!is.na(dbLoc)) {
	print(sprintf("Found location database in path: %s!",dbLoc))
 
	## for Verizon phones
	# ldata <- fetchLatLongTimestamp(dbLoc, "CdmaCellLocation")
	## for AT&T phones
	ldata <- fetchLatLongTimestamp(dbLoc, "CellLocation")
 
	## plot a map of all the positions
	bb <- qbbox(ldata$Latitude, ldata$Longitude)
	# zoomlevel 4 works for my data (US only) 
	zoomlevel <- 4
	# grab the map
	map <- GetMap.bbox(bb$lonR, bb$latR,zoom=zoomlevel,maptype="mobile")
	png("all-tracks.png", width=640,height=640)
	# plot the points as circles 
	PlotOnStaticMap(map,lon=ldata$Longitude,lat=ldata$Latitude,col="blue",verbose=0)
	dev.off()
 
	## limit the data to 4/1-4/4. I was in las vegas at the time.
	ldata.lv <- ldata[which(ldata$datetime >= as.POSIXlt('2011-04-01 23:00:00') & ldata$datetime <= as.POSIXlt('2011-04-04 14:00:00')),]
	bb.lv <- qbbox(ldata.lv$Latitude, ldata.lv$Longitude)
	# a zoom level of 12 centers nicely on the strip
	zoom.lv <- 12
	map.lv <- GetMap.bbox(bb.lv$lonR, bb.lv$latR,zoom=zoom.lv,destfile="lv.png",maptype="mobile")
	png("lv-tracks.png",width=640,height=640)
	PlotOnStaticMap(map.lv,lon=ldata.lv$Longitude,lat=ldata.lv$Latitude,col="blue",verbose=0)
	dev.off()
 
} else {
	print(sprintf("Could not find location database in path: %s",backupPath))
}

You can see this code in my iPhone location with R GitHub repo. One big feature of the original application that is missing here is animation, which I may add later. Patches and comments are greatly appreciated!
