WebDev4R: The Ultimate Guide To Get Data Through APIs With {httr2} and R

  • Today, we’re talking about something that has been a mystery to me for quite some time. We are talking about APIs and how to get data from them using R. And we will do that with the help of the {httr2} package. This is quite a broad topic so this will be quite a long ride. Here’s our playbook for today:

    • Level 1 (easy): We dip our toes into the water and learn what an API is and how to make a request to one. At this stage we’re just getting data from the US National Weather Service, which does not require any authentication. Authentication can be a pain in the ass, so that’s why this is the easy level.

    • Level 2 (intermediate): In Level 1 we hit the API of the weather service to get the weather data at a specific coordinate. But what if we want to get the weather forecast for an address? That’s where Google’s geocoding API comes in. It returns the coordinates of the addresses that we pass to it. But this API requires authentication.

    • Level 3 (intermediate): Another form of authentication uses Bearer tokens. It’s very similar to what we learn in Level 2, so it’s not a huge step up. Still, it’s good to see how that one works. One example of that is OpenAI’s API. We’re going to use it to have a chat with ChatGPT from R.

    • Level 4 (hard): I already said that authentication can be a pain in the ass. In particular, the YouTube API is hard to navigate because it uses something called OAuth 2.0 authentication. This one is quite a complicated construct but {httr2} helps us here too. I’ll show you how that works.

    So that’s our game plan for this blog post. As I’ve said, this one will be a long one. Hence the name “Ultimate Guide”. But if you want to lean back and enjoy all of this content in video form, I’ve got you covered. You can find the video version of this blog post on my YouTube channel.

    < section id="level-1-getting-to-know-apis" class="level2">

    Level 1: Getting to know APIs

    Alright, Level 1 is all about taking small steps with APIs. Let’s dive in.

    < section id="what-is-an-api" class="level3">

    What is an API?

    API stands for Application Programming Interface. Broadly speaking, an API is anything that we can throw code at to get results that we want. Often, this refers to some data source that we tap into. But sometimes it also simply means the syntax of code. For example, ggplot2 has a very distinctive API, i.e. a code syntax to create a chart. In this blog post, though, we will just refer to APIs as data sources.

    < section id="making-requests-to-an-api" class="level3">

    Making requests to an API

    If you’ve never worked with APIs, it can feel like the data is hidden away behind them. Thankfully, the {httr2} package helps us a lot, even if we’ve never dealt with APIs before. This package allows us to use a step-by-step workflow that is always the same. Here’s how it looks, and don’t worry if this doesn’t mean a thing to you yet:

    1️⃣ Stick a base URL into request().

    2️⃣ Pipe that request into all kinds of req_*() functions to fill your request with the parameters and authentication your request needs.

    3️⃣ Perform your request with req_perform() after you’ve filled in all the parameters you need.

    4️⃣ Extract the data from the response via resp_body_json() to turn the JSON response into a list.
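To make these four steps a bit more concrete, here’s a minimal sketch of how they could be chained together. The helper name get_json() and its arguments are made up for illustration (they are not part of {httr2}), and the sketch assumes an API that returns JSON and needs no authentication.

```r
library(httr2)

# Hypothetical helper (not part of {httr2}) that chains the four steps.
# Assumes a JSON API without authentication.
get_json <- function(base_url, path) {
  request(base_url) |>             # 1. stick the base URL into request()
    req_url_path_append(path) |>   # 2. refine the request with req_*() functions
    req_perform() |>               # 3. perform the request (needs internet access)
    resp_body_json()               # 4. turn the JSON response into a list
}
```

Nothing is sent until get_json() is actually called with a real base URL and path.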

    So these are the steps that make up an API request to get data. And the cool thing is: With the US National Weather Service we can do that manually first to understand what’s going on. Just head to the following url using your web browser:,-77.0352

    If you navigate there, you will get cryptic data like that:

    This is what is known as a JSON file. More on that later. For now, notice that what you see at the end of the URL after points/ corresponds to the coordinates that are given in the JSON output. (Compare that with the things the red arrow is pointing at.)

    This means that the recipe for calling the weather API is simple: Append points/{lat},{long} at the end of the base_url, i.e. In this case, {lat},{long} corresponds to the latitude and longitude of the location you want to get weather forecasts for.
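Just to spell out that recipe, here’s a small sketch in plain R. The function name nws_points_url() is hypothetical, and both the base URL (the public api.weather.gov endpoint) and the latitude 38.8894 are assumptions filled in for illustration.

```r
# Hypothetical helper: builds the "points" URL for a latitude/longitude pair.
# The default base URL is an assumption (the public NWS endpoint).
nws_points_url <- function(lat, long, base_url = "https://api.weather.gov") {
  paste0(base_url, "/points/", lat, ",", long)
}

# Assumed example coordinates
nws_points_url(38.8894, -77.0352)
```

This is only string pasting, of course. It doesn’t talk to the API yet.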

    < section id="making-a-request-with-httr2" class="level3">

    Making a request with {httr2}

    The {httr2} syntax to make this work mimics this quite well. Here’s how it looks.

    NWS_base_url <- ''
    request(NWS_base_url) |> 
    ## <httr2_request>
    ## GET,-77.0352
    ## Body: empty

    Basically, at the core of every request is the request() function that needs to know the base_url. This returns an <httr2_request> object that can be passed to further req_*() functions to modify the request.
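Here’s a minimal sketch of that first step that you can run without sending anything over the network. The base URL and the latitude are assumptions filled in for illustration; only the longitude -77.0352 is taken from the output above.

```r
library(httr2)

# Assumed values for illustration: public NWS endpoint and example coordinates
base_url <- "https://api.weather.gov"
coords <- "38.8894,-77.0352"

# Build (but do not send) the request
req <- request(base_url) |>
  req_url_path_append("points", coords)

class(req)  # an httr2_request object
req$url     # the full URL that would be requested
```

Looking at req$url is a cheap way to check that the request points where you think it does before you actually perform it.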

    Here, we used req_url_path_append() to modify the request but there are also other functions. We’ll learn a couple more soon enough. Finally, to actually make the request, you can pass everything to req_perform().

    NWS_base_url <- ''
    NWS_response <- request(NWS_base_url) |> 
      ) |> 
    ## <httr2_response>
    ## GET,-77.0352
    ## Status: 200 OK
    ## Content-Type: application/geo+json
    ## Body: In memory (3091 bytes)
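
One way to package this step is sketched below. The base URL, the latitude, the function name fetch_nws_point() and the user agent string are all assumptions or placeholders. As far as I know, the weather service asks API users to identify themselves via a User-Agent header; {httr2} sends a default one, so setting your own is optional here.

```r
library(httr2)

# Assumed endpoint and coordinates, filled in for illustration
nws_req <- request("https://api.weather.gov") |>
  req_url_path_append("points", "38.8894,-77.0352") |>
  req_user_agent("my-weather-script (me@example.com)")  # hypothetical identifier

# Wrapped in a function so that the request is only sent when you call it
fetch_nws_point <- function(req = nws_req) {
  resp <- req_perform(req)  # by default, HTTP errors (4xx/5xx) become R errors
  message("Status: ", resp_status(resp), " ", resp_status_desc(resp))
  resp
}

# NWS_response <- fetch_nws_point()  # uncomment to actually hit the API
```
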
    < section id="understanding-the-response" class="level3">

    Understanding the response

    As you’ve just seen, your request will return a <httr2_response> and if everything went well, the output will also show you Status: 200 OK. You can get the actual content (the JSON that you’ve seen in your web browser earlier) via one of the many resp_*() functions that handle responses. Here, resp_body_json() will give you a highly nested list (which is why the output looks so huge).
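
If you want to poke at the resp_*() functions without hitting the API, here’s a minimal sketch that uses a hand-built response. The JSON below is a made-up, tiny stand-in (the real output is much bigger), and the field names and values in it are only illustrative.

```r
library(httr2)

# Made-up stand-in for the JSON a weather API might return
fake_body <- '{"properties": {"gridId": "XYZ", "forecastHourly": "https://example.com/hourly"}}'

# response() builds a response object by hand, which is handy for experiments
fake_resp <- response(
  status_code = 200,
  headers = list("Content-Type" = "application/geo+json"),
  body = charToRaw(fake_body)
)

resp_status(fake_resp)                # the HTTP status code
parsed <- resp_body_json(fake_resp)   # JSON body as a nested list
str(parsed, max.level = 2)            # peek at the structure
parsed$properties$forecastHourly      # drill down with $
```

With a real response, the same str() trick helps you find your way around the nested list before you start extracting things from it.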

    NWS_response |> 
