
An API for @racently

[This article was first published on R | datawookie, and kindly contributed to R-bloggers.]

@racently is a side project that I have been nursing along for a couple of years. It addresses a problem that I have as a runner: my race results are distributed across a variety of web sites. This makes it difficult to create a single view on my running performance (or lack thereof) over time. I suspect that I am not alone in this. Anyway, @racently was built to scratch my personal itch: my running results are now all aggregated in one place.

A few months ago @DanielCunnama suggested that I add the ability to create running groups in @racently. This sounded like a good idea. It also sounded like a bit of work and TBH I just did not have the time. So I made a counter-suggestion: how about an API so that he could effectively aggregate the data in any way he wanted? He seemed happy with the idea, so it immediately went onto my backlog. And there it stayed. But @DanielCunnama is a persistent guy (perhaps this is why he’s a class runner!) and he pinged me relentlessly about this… until Sunday when I relented and created the API.

And now I’m happy that I did, because it gives me an opportunity to write up a quick post about how these data can be accessed from R.

Profiles on @racently

I’m going to use Gerda Steyn as an example. I hope she doesn’t mind. This is what Gerda’s profile looks like on @racently.

Now there are a couple of things I should point out:

  1. This profile is far from complete. Gerda has run a lot more races than that. These are just the ones that we currently have in our database. We’re adding more races all the time, but it’s a long and arduous process.
  2. The result for the 2019 Comrades Marathon was when she won the race!

A view like this can be created for any runner on the system. Most runners in South Africa should have a profile (unless they have explicitly requested that we remove it!).

Pulling Data with the API

Suppose you wanted to do some analytics on the data. You’d want to pull it into R or Python. You could scrape the site, but the API makes it a lot easier to access the data.

Load up some helpful packages.

library(glue)
library(dplyr)
library(purrr)
library(httr)

Set up the URL for the API endpoint and the key for Gerda’s profile.

URL = "https://www.racently.com/api/athlete/{key}/"

key = "7ef6fbc8-4169-4a98-934e-ff5fa79ba103"

Send a GET request and extract the results from the response object, parsing the JSON into an R list.

response <- glue(URL) %>% GET() %>% content()
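The one-liner above assumes that the request succeeds. Here is a slightly more defensive sketch that builds the URL from a key and stops with an error if the server returns a failure status. The function names `athlete_url()` and `fetch_athlete()` are hypothetical helpers, not part of the API, and the URL pattern is simply the one shown above.

```r
# Build the endpoint URL for an athlete key (same pattern as URL above).
athlete_url <- function(key) {
  sprintf("https://www.racently.com/api/athlete/%s/", key)
}

# Fetch and parse an athlete profile, failing loudly on HTTP errors.
# The httr package is only needed when this function is actually called.
fetch_athlete <- function(key) {
  response <- httr::GET(athlete_url(key))
  httr::stop_for_status(response)          # raise an R error on 4xx/5xx
  httr::content(response, as = "parsed")   # parse the body into an R list
}

athlete_url("7ef6fbc8-4169-4a98-934e-ff5fa79ba103")
```

Calling `fetch_athlete(key)` should then give the same kind of list as `response` above, but with an informative error rather than a confusing list if something goes wrong.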

Extract some basic information from the response.

response$url
## [1] "http://www.racently.com/api/athlete/7ef6fbc8-4169-4a98-934e-ff5fa79ba103/"
response$name
## [1] "Gerda Steyn"
response$gender
## [1] "F"

Now get the race results. This requires a little more work because of the way that the JSON is structured: an array of licenses, each of which has a nested array of race result objects.

response$license %>%
  map_dfr(function(license) {
    license$result %>%
      map_dfr(as_tibble) %>%
      mutate(
        club = license$club,
        number = license$number,
        date = as.Date(date)
      )
  }) %>%
  arrange(desc(date))
##   date       race          distance time     club    number
## 1 2019-06-09 Comrades      86.8 km  05:58:53 Nedbank     NA
## 2 2018-06-10 Comrades      90.2 km  06:15:34 Nedbank   8300
## 3 2018-05-20 RAC           10.0 km  00:35:38 Nedbank   8300
## 4 2018-05-01 Wally Hayward 10.0 km  00:35:35 Nedbank   8300
## 5 2017-06-04 Comrades      86.7 km  06:45:45 Nedbank     NA
## 6 2016-05-29 Comrades      89.2 km  07:08:23 Nedbank     NA
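To make the nesting concrete, here is a small sketch that runs without touching the API. The `mock` list is an assumption about the shape of the parsed response (licenses containing results), populated with a few values copied from the output above. The grouping of results under licenses is guessed for illustration. The helper `flatten_results()` is hypothetical and uses only base R, so it is an alternative to the purrr version rather than a replacement.

```r
# Mock of the parsed response. The nesting is an assumption for illustration.
mock <- list(
  name = "Gerda Steyn",
  license = list(
    list(
      club = "Nedbank", number = 8300,
      result = list(
        list(date = "2018-06-10", race = "Comrades", distance = "90.2 km", time = "06:15:34"),
        list(date = "2018-05-20", race = "RAC", distance = "10.0 km", time = "00:35:38")
      )
    ),
    list(
      club = "Nedbank", number = NULL,   # license without a number
      result = list(
        list(date = "2019-06-09", race = "Comrades", distance = "86.8 km", time = "05:58:53")
      )
    )
  )
)

# Build one data frame per license, then stack them and sort by date.
flatten_results <- function(response) {
  per_license <- lapply(response$license, function(license) {
    rows <- lapply(license$result, as.data.frame, stringsAsFactors = FALSE)
    results <- do.call(rbind, rows)
    results$club <- license$club
    # A missing number becomes NA so that every license has the same columns.
    results$number <- if (is.null(license$number)) NA else license$number
    results$date <- as.Date(results$date)
    results
  })
  results <- do.call(rbind, per_license)
  results <- results[order(results$date, decreasing = TRUE), ]
  rownames(results) <- NULL
  results
}

flatten_results(mock)
```

The club and number live on the license rather than on each result, which is why they have to be copied onto every row after the inner list has been flattened.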

For good measure, let’s throw in the results for @DanielCunnama.

##    date       race               distance time     club              number
##  1 2019-09-29 Grape Run          21.1 km  01:27:49 Harfield Harriers   4900
##  2 2019-06-09 Comrades           86.8 km  07:16:21 Harfield Harriers   4900
##  3 2019-02-17 Cape Peninsula     42.2 km  03:08:47 Harfield Harriers   4900
##  4 2019-01-26 Red Hill Marathon  36.0 km  02:52:55 Harfield Harriers   4900
##  5 2019-01-13 Bay to Bay         30.0 km  02:15:55 Harfield Harriers   7935
##  6 2018-11-10 Winelands          42.2 km  02:58:56 Harfield Harriers   7935
##  7 2018-10-14 The Gun Run        21.1 km  01:22:30 Harfield Harriers   7935
##  8 2018-10-07 Grape Run          21.1 km  01:36:46 Harfield Harriers   8358
##  9 2018-09-23 Cape Town Marathon 42.2 km  03:11:52 Harfield Harriers   7935
## 10 2018-09-09 Ommiedraai         10.0 km  00:37:46 Harfield Harriers  11167
## 11 2018-06-10 Comrades           90.2 km  07:19:25 Harfield Harriers   7935
## 12 2018-02-18 Cape Peninsula     42.2 km  03:08:27 Harfield Harriers   7935
## 13 2018-01-14 Bay to Bay         30.0 km  02:11:50 Harfield Harriers   7935
## 14 2017-10-01 Grape Run          21.1 km  01:27:18 Harfield Harriers   7088
## 15 2017-09-17 Cape Town Marathon 42.2 km  02:57:55 Harfield Harriers   7088
## 16 2017-06-04 Comrades           86.7 km  07:46:18 Harfield Harriers   7088
## 17 2016-10-16 The Gun Run        21.1 km  01:19:09 Harfield Harriers     NA
## 18 2016-09-10 Mont-Aux-Sources   50.0 km  05:42:23 Harfield Harriers     NA
## 19 2016-05-29 Comrades           89.2 km  07:22:53 Harfield Harriers     NA
## 20 2016-02-21 Cape Peninsula     42.2 km  03:17:12 Harfield Harriers     NA
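The `distance` and `time` columns come back as strings, so they need converting before you can do any arithmetic with them. As a small illustration of the sort of analysis this enables, here is a sketch that computes pace for two of the results above. The helpers `time_to_seconds()` and `distance_to_km()` are hypothetical, and they assume the "HH:MM:SS" and "<number> km" formats seen in the output.

```r
# Convert "HH:MM:SS" strings to seconds.
time_to_seconds <- function(x) {
  vapply(
    strsplit(x, ":", fixed = TRUE),
    function(parts) sum(as.numeric(parts) * c(3600, 60, 1)),
    numeric(1)
  )
}

# Convert "21.1 km" style strings to numeric kilometres.
distance_to_km <- function(x) {
  as.numeric(sub(" km", "", x, fixed = TRUE))
}

# Two rows copied from the table above.
results <- data.frame(
  race = c("Grape Run", "Comrades"),
  distance = c("21.1 km", "86.8 km"),
  time = c("01:27:49", "07:16:21"),
  stringsAsFactors = FALSE
)

# Pace in seconds per kilometre.
results$pace_sec_per_km <-
  time_to_seconds(results$time) / distance_to_km(results$distance)

results
```

Dividing the pace by 60 gives minutes per kilometre, which is the unit most runners would recognise.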

Wrapping Up

Let’s digress for a moment to look at a bubble plot showing the number of races on @racently broken down by runner. There are some really prolific runners.

We’ve currently got just under one million individual race results across over a thousand races. If you have the time and inclination then there’s definitely some interesting science to be done using these results. I’d be very interested in collaborating, so just shout if you’d like to get involved.

Feel free to grab some data via the API. At the moment you’ll need to search for an athlete on the main website in order to find their API key. I’ll implement some search functionality in the API when I get a chance.

Finally, here’s a talk I gave about @racently at the Bulgaria Web Summit (2017) in Sofia, Bulgaria. A great conference, incidentally. Well worth making the trip to Bulgaria.
