Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This is a Christmas present to myself to celebrate 10,000km of commuting on my bicycle: a lovely frame print of all my GPS traces on a home-made map of London. Here's how I made it.
This blog was contributed to R Bloggers
Step one: getting Strava data
The process has become slightly convoluted in May 2018. In the past, it was possible to download a Strava archive which contained all activities as GPX; however, GDPR regulations led to a change in bulk export format.
Anyway: request your Strava archive, wait for the email, download it.
activities.csv
gives us a list of activities… which we can then scrape from Strava.
$ cat activities/activities.csv | head -5 id,date,name,type,description,elapsed_time,distance,commute,gear,filename 1988225457,2018-11-27 19:11:32,Evening Ride,Ride,"",1468,6075.0,true,1x1x8,activities/2125648250.fit.gz 1992478096,2018-11-29 10:02:58,Morning Ride,Ride,"",1449,5869.0,true,1x1x8,activities/2130096306.fit.gz 1992478209,2018-11-29 23:04:16,Night Ride,Ride,"",1166,5930.0,true,1x1x8,activities/2130096453.fit.gz 1992479540,2018-11-30 08:51:29,Morning Ride,Ride,"",1427,6563.0,true,Michel le fixie,activities/2130097780.fit.gz
Clone hugovk/strava-tools, a set of Python tools to do automate interactions with Strava. There are some instructions at the top of download_fit_as_gpx.py
.
Follow these before running $ python download_fit_as_gpx.py
Some time later… here we are!
$ ls activities/activities | head -5 20160908-170525-Ride.gpx 20160909-080222-Ride.gpx 20160909-180740-Ride.gpx 20160912-081014-Ride.gpx 20160912-172611-Ride.gpx
Step two: Ingesting activities in R
Install marcusvolz/strava, an excellent suite of R functions that help us load activities in a (big, big) dataframe.
library(devtools) devtools::install_github("marcusvolz/strava") library(strava) library(tidyverse) # main ingestion method # ACHTUNG: slow! data <- process_data("./activities/activities/") data %>% head() id lat lon ele time dist_to_prev cumdist time_diff_to_prev cumtime 1 1 45.78514 4.717282 295.7 2015-12-26 11:21:48 0.000000000 0.00000000 0 0 2 1 45.78512 4.717559 295.3 2015-12-26 11:22:12 0.021643479 0.02164348 24 24 3 1 45.78512 4.717573 295.2 2015-12-26 11:22:13 0.001111138 0.02275462 1 25 4 1 45.78512 4.717583 295.2 2015-12-26 11:22:17 0.000785528 0.02354015 4 29 5 1 45.78509 4.717684 295.1 2015-12-26 11:22:38 0.008715879 0.03225602 21 50 6 1 45.78508 4.717694 295.1 2015-12-26 11:22:39 0.001181249 0.03343727 1 5
Step three: Visualising our activities
We can piece together a simple plot (besides throwing a caveman-like plot(data)
) limited to London:
library(ggplot2) theme_opts<-list(theme(panel.grid.minor = element_blank(), panel.background = element_blank(), plot.background = element_blank(), axis.line = element_blank(), axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), axis.title.x = element_blank(), axis.title.y = element_blank(), plot.title = element_blank(), panel.grid.major = element_line(colour = 'transparent'), legend.position = "none")) ggplot(data, aes(lon, lat, group = id)) + geom_path(colour="steelblue",alpha = 0.2) + coord_map(projection = "mercator", xlim = c(-0.41, 0.08), ylim = c(51.42, 51.73), clip = "on") + theme_opts
Obviously we are going to need some geographical landmarks to get our bearings. geotheory/londonShapefiles is one of the simplest and quickest ways to get London boundaries loaded in – and that includes the river Thames.
library(devtools) install_github("geotheory/londonShapefiles") library("londonShapefiles") thames <- load_thames() thames.proj <- spTransform(thames, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")) thames.df <- fortify(thames.proj) ggplot() + geom_polygon(thames.df, mapping=aes(long, lat, group = group, fill = I("skyblue")), alpha = 0.4) + geom_path(data, mapping=aes(lon, lat, group = id, alpha = 0.9, color = I("tomato")), size=1/3) + coord_map(projection = "mercator", xlim = c(-0.41, 0.08), ylim = c(51.42, 51.73)) + theme_opts
But behold, we can also get data straight from Open Street Map thanks to osmdata. The syntax is relatively concise and OSM's web interface allows you to query features directly from there:
In our example, we grab all the parks in London and them use geom_sf()
to represent both the polygons
and multipolygons
(the osmdata
object has plenty inside it). We need both polygons and multipolygons because, according to the OSM wiki:
Any area that is more complex than that (e.g., because its outline consists of several ways joined together, or because the area consists of multiple disjunct parts, or has holes) requires a multipolygon relation.
library(osmdata) parks2 <- opq(bbox = c(-0.41, 51.42, 0.08, 51.73)) %>% add_osm_feature(key = 'leisure', value = "park") %>% osmdata_sf() ggplot() + geom_polygon(thames.df, mapping=aes(long, lat, group = group, fill = I("skyblue")), alpha = 0.4) + geom_sf(parks2$osm_polygons, mapping=aes(fill = I("springgreen4"), color = I("transparent")), alpha = 0.4) + geom_sf(parks2$osm_multipolygons, mapping=aes(fill = I("springgreen4"), color = I("transparent")), alpha = 0.4) + geom_path(data, mapping=aes(lon, lat, group = id, alpha = 0.9, color = I("tomato")), size=1/3) + coord_sf(xlim = c(-0.41, 0.08), ylim = c(51.42, 51.73)) + theme_opts
Well, we're almost there. For this first draft I stopped at the map above, exported an SVG and sent that to the lab for printing.
There is no limit to how much you can play with this data. One thing that really struck me was how imprecise my GPS tracking devices were in the City of London (despite being prosumer equipment). Below is a representation of one year of data, on which I've overlaid the OSM paths corresponding to the streets around Bishopsgate, Cheapside, Monument, Cannon Street, Aldgate, Smithfields, London Bridge and Blackfriars.
streets <- opq(bbox = c(-0.21, 51.47, 0.05, 51.61)) %>% add_osm_feature(key = 'highway', value = "primary") %>% osmdata_sf() streets2 <- opq(bbox = c(-0.21, 51.47, 0.05, 51.61)) %>% add_osm_feature(key = 'highway', value = "secondary") %>% osmdata_sf() streets3 <- opq(bbox = c(-0.21, 51.47, 0.05, 51.61)) %>% add_osm_feature(key = 'highway', value = "tertiary") %>% osmdata_sf() + geom_sf(streets$osm_lines, mapping=aes(fill = I("transparent"), color = I("darkgray")), size = 0.5) + geom_sf(streets2$osm_lines, mapping=aes(fill = I("transparent"), color = I("darkgray")), size = 0.3) + geom_sf(streets3$osm_lines, mapping=aes(fill = I("transparent"), color = I("darkgray")), size = 0.1)
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.