REST APIs and Plumber
Moving R resources from development to production can be a challenge, especially when the resource isn’t something like a shiny application or rmarkdown document that can be easily published and consumed. Consider, as an example, a customer success model created in R. This model is responsible for taking customer data and returning a predicted outcome, like the likelihood the customer will churn. Once this model is developed and validated, there needs to be some way for the model output to be leveraged by other systems and individuals within the company.
Traditionally, moving this model into production has involved one of two approaches: either running customer data through the model on a batch basis and caching the results in a database, or handing the model definition off to a development team to translate the work done in R into another language, such as Java or Scala. Both approaches have significant downsides. Batch processing works, but it misses real-time updates. For example, if the batch job runs every night and a customer calls in the next morning and has a heated conversation with support, the model output will have no record of that exchange when the customer calls the customer loyalty department later the same day to cancel their service. In essence, model output is served on a lag, which can sometimes lead to critical information loss. However, the other option requires a large investment of time and resources to convert an existing model into another language just for the purpose of exposing that model as a real-time service. Neither of these approaches is ideal; to solve this problem, the optimal solution is to expose the existing R model as a service that can be easily accessed by other parts of the organization.
plumber is an R package that allows existing R code to be exposed as a web service through special decorator comments. With minimal overhead, R programmers and analysts can use plumber to create REST APIs that expose their work to any number of internal and external systems. This solution provides real-time access to processes and services created entirely in R, and can effectively eliminate the need to perform batch operations or technical hand-offs in order to move R code into production.
This post will focus on a brief introduction to RESTful APIs, then an introduction to the plumber package and how it can be used to expose R services as API endpoints. In subsequent posts, we’ll build a functioning web API using plumber that integrates with Slack and provides real-time customer status reports.
Web APIs
For some, APIs (Application Programming Interfaces) are things heard of but seldom seen. However, whether seen or unseen, APIs are part of everyday digital life. In fact, you’ve likely used a web API from within R, even if you didn’t recognize it at the time! Several R packages are simply wrappers around popular web APIs, such as tidycensus and gh. Web APIs are a common framework for sharing information across a network, most commonly through HTTP.
HTTP
To understand how HTTP requests work, it’s helpful to know the players involved. A client makes a request to a server, which interprets the request and provides a response. An HTTP request can be thought of simply as a packet of information sent to the server, which the server attempts to interpret and respond to. Every time you visit a URL in a web browser, an HTTP request is made and the response is rendered by the browser as the website you see. It is possible to inspect this interaction using the development tools in a browser.
In the browser’s development tools, this request is shown as a URL and a request method, which in the case of a web browser accessing a website is GET.
Request
There are several components of an HTTP request, but here we’ll mention only a few.
- URL: the address or endpoint for the request
- Verb / method: a specific method invoked on the endpoint (GET, POST, DELETE, PUT)
- Headers: additional data sent to the server, such as who is making the request and what type of response is expected
- Body: data sent to the server outside of the headers, common for POST and PUT requests
In the browser example above, a GET request was made by the web browser to www.rstudio.com.
Response
The API response mirrors the request to some extent. It includes headers that contain information about the response and a body that contains any data returned by the API. The headers include the HTTP status code that informs the client how the request was received, along with details about the content that’s being delivered. In the example of a web browser accessing www.rstudio.com, the response headers include the status code (200) along with details about the response content, including the fact that the content returned is HTML. This HTML content is what the browser renders into a webpage.
httr
The httr package provides a nice framework for working with HTTP requests in R. The following basic example demonstrates some of what we’ve already learned by using httr and httpbin.org, which provides a playground of sorts for HTTP requests.
library(httr)

# A simple GET request
response <- GET("http://httpbin.org/get")
response
## Response [http://httpbin.org/get]
##   Date: 2018-07-23 14:57
##   Status: 200
##   Content-Type: application/json
##   Size: 266 B
## {"args":{},"headers":{"Accept":"application/json, text/xml, application/...
In this example we’ve made a GET request to httpbin.org/get and received a response. We know our request was successful because we see that the status is 200. We also see that the response contains data in JSON format. The Getting started with httr page provides additional examples of working with HTTP requests and responses.
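Since the status code is how a client learns whether a request succeeded, it is common to check it before using the body. The following is a minimal sketch of that check using httr’s status_code() and http_error(); the helper name describe_status is hypothetical and not part of httr.

```r
library(httr)

# Hypothetical helper: summarize the outcome of a request from its status code
describe_status <- function(resp) {
  code <- status_code(resp)
  if (http_error(resp)) {
    # http_error() is TRUE for 4xx and 5xx status codes
    sprintf("Request failed with status %d", code)
  } else {
    sprintf("Request succeeded with status %d", code)
  }
}
```

Applied to the response object from the example above, describe_status(response) would report a successful request, since the status shown there is 200.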
REST
Representational State Transfer (REST) is an architectural style for APIs that includes specific constraints for building APIs to ensure that they are consistent, performant, and scalable. REST defines six constraints. In order to be considered truly RESTful, an API must meet the first five listed below; the last, code on demand, is optional:
- Uniform interface: clearly defined interface between client and server
- Stateless: state is managed via the requests themselves, not through reliance on an external service
- Cacheable: responses should be cacheable in order to improve scalability
- Client-Server: clear separation of client and server, each with its own distinct responsibilities in the exchange
- Layered System: there may be intermediaries between the client and the server, but the client should be unaware of them
- Code on Demand (optional): the response can include logic executable by the client
We could spend a lot of time diving further into each of these specifications, but that is beyond the scope of this post. More detail about REST can be found here.
Plumber
Creating RESTful APIs using R is straightforward using the plumber package. Even if you have never written an API, plumber makes it easy to turn existing R functions into API endpoints. Developing plumber endpoints is simply a matter of providing specialized R comments before R functions. plumber recognizes both #' and #* comments, although the latter is recommended in order to avoid potential conflicts with roxygen2. The following defines a plumber endpoint that simply returns the data provided in the request query string.
library(plumber)

#* @apiTitle Simple API

#* Echo provided text
#* @param text The text to be echoed in the response
#* @get /echo
function(text = "") {
  list(
    message_echo = paste("The text is:", text)
  )
}
Here we’ve defined a simple function that takes a parameter, text, and returns it with some additional comments as part of a list. By default, plumber will serialize the object returned from a function into JSON using the jsonlite package. We’ve provided specialized comments to inform plumber that this endpoint is available at api-url/echo and will respond to GET requests.
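To get a feel for that default serialization, the sketch below calls jsonlite directly on the same kind of list the endpoint returns. It is only an approximation of what happens inside plumber, not its internal code, and the variable names are illustrative.

```r
library(jsonlite)

# The same kind of list the /echo handler returns
result <- list(message_echo = paste("The text is:", "Hi there"))

# By default, jsonlite writes length-one vectors as JSON arrays
boxed <- toJSON(result)

# auto_unbox = TRUE writes length-one vectors as JSON scalars instead
unboxed <- toJSON(result, auto_unbox = TRUE)

cat(as.character(boxed), "\n")
# {"message_echo":["The text is: Hi there"]}
cat(as.character(unboxed), "\n")
# {"message_echo":"The text is: Hi there"}
```

This is why values in responses from an endpoint like this one appear wrapped in square brackets. If scalar output is preferred, plumber provides an unboxed JSON serializer that can be selected with a #* @serializer unboxedJSON comment; the documentation for the installed version is the place to confirm the details.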
There are a few ways this plumber script can be run locally. First, assuming the file is saved as plumber.R, the following code would start a local web server hosting the API.
plumber::plumb("plumber.R")$run(port = 5762)
Once the web server has started, the API can be interacted with using any set of HTTP tools. We could even interact with it using httr as demonstrated earlier, although we would need to open a separate R session to do so since the current R session is busy serving the API.
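As a rough sketch of what that second R session might run, the code below wraps a GET request to the /echo endpoint with httr. The helper name query_echo and the base_url variable are hypothetical, and the port is assumed to match the run() call above.

```r
library(httr)

# Assumed base URL: the port matches the run() call shown earlier
base_url <- "http://localhost:5762"

# Hypothetical helper: call /echo and return the parsed JSON body
query_echo <- function(text, base = base_url) {
  resp <- GET(base, path = "echo", query = list(text = text))
  stop_for_status(resp)         # raise an R error on 4xx/5xx responses
  content(resp, as = "parsed")  # a JSON body is parsed into an R list
}

# The URL httr builds for such a request (no running API needed for this line)
echo_url <- modify_url(base_url, path = "echo", query = list(text = "Hi there"))
```

With the API running in the first session, calling query_echo("Hi there") from the second session should return a list containing the message_echo element.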
The other method for running the API requires a recent preview build of the RStudio IDE. Recent preview builds include features that make it easier to work with plumber. When editing a plumber script in a recent version of the IDE, a “Run API” icon will appear in the top right-hand corner of the source editor. Clicking this button will automatically run a line of code similar to the one we ran above to start a web server hosting the API. A Swagger-generated UI will be rendered in the Viewer pane, and the API can be interacted with directly from within this UI.
Now that we have a running plumber API, we can query it using curl from the command line to investigate its behavior.
$ curl "localhost:5762/echo" | jq '.'
{
  "message_echo": [
    "The text is: "
  ]
}
In this case, we queried the API without providing any additional data or parameters. As a result, the text parameter is the default empty string, as seen in the response. In order to pass a value to our underlying function, we can define a query string in the request as follows:
$ curl "localhost:5762/echo?text=Hi%20there" | jq '.'
{
  "message_echo": [
    "The text is: Hi there"
  ]
}
In this case, the text parameter is defined as part of the query string, which is appended to the end of the URL. Additional parameters could be defined by separating each key-value pair with &. It’s also possible to pass the parameter as part of the request body. However, to leverage this method of data delivery, we need to update our API definition so that the /echo endpoint also accepts POST requests. We’ll also update our API to consider multiple parameters, and return the parsed parameters along with the entire request body.
library(plumber)

#* @apiTitle Simple API

#* Echo provided text
#* @param text The text to be echoed in the response
#* @param number A number to be echoed in the response
#* @get /echo
#* @post /echo
function(req, text = "", number = 0) {
  list(
    message_echo = paste("The text is:", text),
    number_echo = paste("The number is:", number),
    raw_body = req$postBody
  )
}
With this new API definition, the following curl request can be made to pass parameters to the API via the request body.
$ curl --data "text=Hi%20there&number=42&other_param=something%20else" "localhost:5762/echo" | jq '.'
{
  "message_echo": [
    "The text is: Hi there"
  ],
  "number_echo": [
    "The number is: 42"
  ],
  "raw_body": [
    "text=Hi%20there&number=42&other_param=something%20else"
  ]
}
Notice that we passed more than just text and number in the request body. plumber parses the request body and matches any arguments found in the R function definition. Additional arguments, like other_param in this case, are ignored. plumber can parse the request body if it is URL-encoded or JSON. The following example shows the same request, but with the request body encoded as JSON.
$ curl --data '{"text":"Hi there", "number":"42", "other_param":"something else"}' "localhost:5762/echo" | jq '.'
{
  "message_echo": [
    "The text is: Hi there"
  ],
  "number_echo": [
    "The number is: 42"
  ],
  "raw_body": [
    "{\"text\":\"Hi there\", \"number\":\"42\", \"other_param\":\"something else\"}"
  ]
}
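The parse-and-match behavior seen in these two requests can be approximated in a few lines of R. This is only a sketch of the idea, not plumber’s actual implementation: the helpers parse_form_body and match_args are hypothetical, and the handler is a simplified copy of the endpoint function without the req argument.

```r
library(jsonlite)

# Hypothetical helper: decode a URL-encoded body into a named list
parse_form_body <- function(body) {
  pairs <- strsplit(strsplit(body, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
  values <- lapply(pairs, function(p) {
    if (length(p) < 2) "" else utils::URLdecode(p[2])
  })
  names(values) <- vapply(pairs, function(p) utils::URLdecode(p[1]), character(1))
  values
}

# Hypothetical helper: keep only the parameters the handler declares
match_args <- function(params, handler) {
  params[intersect(names(params), names(formals(handler)))]
}

# Simplified copy of the endpoint function (no req argument)
echo <- function(text = "", number = 0) {
  list(
    message_echo = paste("The text is:", text),
    number_echo = paste("The number is:", number)
  )
}

# The two request bodies used above; number arrives as the string "42" in both
form_params <- parse_form_body("text=Hi%20there&number=42&other_param=something%20else")
json_params <- fromJSON('{"text":"Hi there", "number":"42", "other_param":"something else"}')

# other_param is dropped because echo() has no argument with that name
form_result <- do.call(echo, match_args(form_params, echo))
json_result <- do.call(echo, match_args(json_params, echo))
```

Both calls produce the same result, which mirrors the identical message_echo and number_echo values in the two responses. One caveat: curl’s --data option sends a form content type by default, and depending on the plumber version a JSON body may only be parsed as JSON when the request also carries a Content-Type: application/json header, so adding that header with curl’s -H option is a reasonable precaution.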
While these examples are fairly simple, they demonstrate the extraordinary facility of plumber. Thanks to plumber, it is now a fairly straightforward process to expose R functions so they can be consumed and leveraged by any number of systems and processes. We’ve only scratched the surface of its capabilities and, as mentioned, future posts will walk through the creation of a Slack app using plumber. Comprehensive documentation for plumber can be found here.
Deploying
Up until now, we’ve just been interacting with our APIs in our local development environment. That’s great for development and testing, but when it comes time to expose an API to external services, we don’t want our laptop held responsible (at least, I don’t!). There are several deployment methods for plumber outlined in the documentation. The most straightforward method of deployment is to use RStudio Connect. When editing a plumber script in recent versions of the RStudio IDE, a blue publish button will appear in the top right-hand corner of the source editor. Clicking this button brings up a menu that enables the user to publish the API to an instance of RStudio Connect. Once published, API access and performance can be configured through RStudio Connect and the API can be leveraged by external systems and processes.
Conclusion
Web APIs are a powerful mechanism for providing systematic access to computational processes. Writing APIs with plumber makes it easy for others to take advantage of the work you’ve created in R without the need to rely on batch processing or code rewriting. plumber is exceptionally flexible and can be used to define a wide variety of endpoints. These endpoints can be used to integrate R with other systems. As an added bonus, downstream consumers of these APIs require no knowledge of R. They only need to know how to properly interact with the API via HTTP. plumber provides a convenient and reliable bridge between R and other systems and/or languages used within an organization.