
Twitter’s REST API v1.1 with R (for Linux and Windows)

[This article was first published on joy of data » R, and kindly contributed to R-bloggers.]

In this tutorial I describe a straightforward way to use Twitter’s REST API v1.1. For that purpose I wrote a small package (RTwitterAPI), so that requesting data only requires the API URL, the API parameters and a vector containing the OAuth parameters.

Before you can get started you have to log in to your Twitter account on dev.twitter.com, create an application and generate an “Access Token” for it. So let’s jump right in and fetch the IDs of 10 accounts that @hrw (Human Rights Watch) follows. Note that the friends/ids endpoint used below returns the accounts a user follows, whereas followers/ids would return the user’s followers. The necessary code is located on GitHub as a package named RTwitterAPI, which can be installed using devtools::install_github().

The Linux Way …

… (which might also work for some Windows installations – not mine though) uses RCurl::getURL() for executing the GET request.

#install.packages("devtools")
library(devtools)

devtools::install_github("joyofdata/RTwitterAPI")
library(RTwitterAPI)

params <- c(
  "oauth_consumer_key"     = "[API Key]", 
  "oauth_nonce"            = NA,
  "oauth_signature_method" = "HMAC-SHA1",
  "oauth_timestamp"        = NA,
  "oauth_token"            = "[Access Token]",
  "oauth_version"          = "1.0",
  "consumer_secret"        = "[API Secret]",
  "oauth_token_secret"     = "[Access Token Secret]"
);

url   <- "https://api.twitter.com/1.1/friends/ids.json";
query <- c(cursor=-1, screen_name="hrw", count=10);

result <- RTwitterAPI::twitter_api_call(url, query, params)

The result is a JSON string containing 10 IDs, which we print prettified using jsonlite::prettify():

> jsonlite::prettify(result)
{
	"ids" : [
		31563425,
		277165873,
		43856403,
		22190185,
		16421475,
		20131383,
		61083773,
		156289017,
		147493686,
		61197041
	],
	"next_cursor" : 1479682130324026931,
	"next_cursor_str" : "1479682130324026931",
	"previous_cursor" : 0,
	"previous_cursor_str" : "0"
}
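A practical note on the two cursor fields in this response: the numeric next_cursor is larger than what an R double can represent exactly, so the string form next_cursor_str is the safer one to carry into the next request. The following is a minimal base-R sketch with the values copied from the output above; the variable names are only illustrative.

```r
# Values copied from the response shown above.
next_cursor_num <- 1479682130324026931     # R stores this literal as a double
next_cursor_str <- "1479682130324026931"   # exact, because it is a string

# A double carries 53 bits of precision, so the numeric form gets rounded.
sprintf("%.0f", next_cursor_num) == next_cursor_str   # FALSE

# Build the query for the next page from the string form.
next_query <- c(cursor = next_cursor_str, screen_name = "hrw", count = 10)
```

Passing next_query instead of query to the API call would then request the following page of IDs.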

The Windows Way …

… requires you to install Cygwin first (which I recommend anyway because it is pretty awesome), and in Cygwin you have to install cURL. The package then crafts a full cURL command and feeds this command string to Cygwin’s bash.exe via system(). The reason for this workaround is an obscure certificate issue which I did not manage to resolve properly. After you have installed Cygwin (and curl), all that changes is the invocation of RTwitterAPI::twitter_api_call(). In the following example I assume Cygwin to be located under “C:\cygwin64”.

result <- twitter_api_call(url, query, params, 
    use_cygwin = TRUE, # yupp
    print_cmd = TRUE,  # let's see the command
    cygwin_bash = "c:\\cygwin64\\bin\\bash.exe"  # backslashes must be doubled in R strings
)

The bash.exe command is printed and can be run directly from a DOS prompt, or from within Cygwin if you restrict it to the cURL part of the string.

c:\cygwin64\bin\bash.exe -c "/usr/bin/curl --silent --get 'https://api.twitter.com/1.1/friends/ids.json' --data 'cursor=-1&screen_name=hrw&count=10' --header 'Authorization: OAuth oauth_consumer_key="Y...0", oauth_nonce="1...5", oauth_signature="H...D", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1411422203", oauth_token="1...t", oauth_version="1.0"'"
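To make the idea of crafting such a command more concrete, here is a rough base-R sketch that assembles a string of the same shape. The helper build_cygwin_cmd() is hypothetical and not part of RTwitterAPI; it does not percent-encode the query values and leaves the Authorization header as a placeholder.

```r
# Hypothetical helper (not the package's code): builds a bash/cURL command
# string shaped like the one printed above.
build_cygwin_cmd <- function(bash, url, query, auth_header) {
  # Join the query parameters as key=value pairs separated by "&".
  data <- paste0(names(query), "=", query, collapse = "&")
  curl <- sprintf("/usr/bin/curl --silent --get '%s' --data '%s' --header '%s'",
                  url, data, auth_header)
  # Wrap the cURL call so that bash.exe receives it through -c "...".
  sprintf('%s -c "%s"', bash, curl)
}

cmd <- build_cygwin_cmd(
  bash        = "c:\\cygwin64\\bin\\bash.exe",
  url         = "https://api.twitter.com/1.1/friends/ids.json",
  query       = c(cursor = -1, screen_name = "hrw", count = 10),
  auth_header = "Authorization: OAuth ..."   # placeholder for the real header
)
cat(cmd, "\n")
# system(cmd, intern = TRUE) would run the command and capture its output.
```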

OAuth Version 1

Twitter provides access to its service via a REST API whose current version is 1.1. Authorization is handled through OAuth version 1.0a. Because of that, using the API is less trivial than just appending your password to the request, but also considerably more secure. In contrast to standard password-based authentication, OAuth distinguishes between the server (Twitter), the third-party client (the program calling the API) and the resource owner (the owner of the API account). It specifies an authentication flow at the end of which the resource owner grants access to the third-party client by programmatically providing a secret token. But as in this case the third-party client and the resource owner are effectively identical, the process simplifies to manually creating an access token and its corresponding “token secret” and then using those directly within the script. Referring to the OAuth 1 authentication flow chart, we can skip steps A to F and focus entirely on G.

Signing the Request for OAuth1

The authorization of a request itself happens by gluing together a string which contains all the details about that request (URL, query parameters, OAuth keys and values) and then signing this so-called “signature base string” with the two secret tokens, oauth_token_secret and consumer_secret, applying an algorithm referred to as HMAC-SHA1. HMAC-SHA1 is provided by the digest package. What you get in the end, after some more processing, is the oauth_signature, which is sent along with the request and verified by the server. The creation of that signature is implemented in RTwitterAPI/oauth1_signature.R; a detailed description may be found here.
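The signing steps can be sketched roughly as follows. This is an illustration of the general OAuth 1.0a scheme, not the code from oauth1_signature.R: the helpers pct(), signature_base_string() and sign_request() are made up for this example, the nonce is a placeholder, and the use of jsonlite::base64_enc() for the final Base64 step is my own choice.

```r
# Percent-encode per RFC 3986: A-Z, a-z, 0-9, "-", ".", "_" and "~" stay as
# they are; every other byte becomes an upper-case %XX.
pct <- function(x) {
  unreserved <- as.integer(charToRaw(
    paste0(c(LETTERS, letters, 0:9, "-", ".", "_", "~"), collapse = "")))
  vapply(enc2utf8(as.character(x)), function(s) {
    bytes <- charToRaw(s)
    out   <- sprintf("%%%02X", as.integer(bytes))
    keep  <- as.integer(bytes) %in% unreserved
    out[keep] <- rawToChar(bytes[keep], multiple = TRUE)
    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# METHOD & encoded URL & encoded, sorted "key=value&key=value" string.
signature_base_string <- function(method, url, params) {
  k   <- pct(names(params))
  v   <- pct(params)
  ord <- order(k, v, method = "radix")   # byte-wise sort by key, then value
  param_string <- paste0(k[ord], "=", v[ord], collapse = "&")
  paste(toupper(method), pct(url), pct(param_string), sep = "&")
}

# HMAC-SHA1 over the base string, keyed with both secrets, then Base64.
sign_request <- function(base_string, consumer_secret, token_secret) {
  key <- paste(pct(consumer_secret), pct(token_secret), sep = "&")
  mac <- digest::hmac(key, base_string, algo = "sha1", raw = TRUE)
  jsonlite::base64_enc(mac)
}

# Query parameters and oauth_* values are signed together.
all_params <- c(
  cursor = "-1", screen_name = "hrw", count = "10",
  oauth_consumer_key     = "[API Key]",
  oauth_nonce            = "[random string]",   # placeholder
  oauth_signature_method = "HMAC-SHA1",
  oauth_timestamp        = "1411422203",
  oauth_token            = "[Access Token]",
  oauth_version          = "1.0"
)
base <- signature_base_string(
  "GET", "https://api.twitter.com/1.1/friends/ids.json", all_params)

if (requireNamespace("digest", quietly = TRUE) &&
    requireNamespace("jsonlite", quietly = TRUE)) {
  oauth_signature <- sign_request(base, "[API Secret]", "[Access Token Secret]")
}
```

The two secrets only enter through the signing key; they are never part of the base string or of the request itself.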

Structure of the GET Request

The GET request which is finally sent via RCurl needs a properly set up header containing an “Authorization” section providing all the various oauth_* settings. This part is implemented in RTwitterAPI/twitter_api_call.R. The meaning of the oauth_* key/values, as well as the composition of the header, is described here.
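As a small illustration of how such a header value is composed, the sketch below joins the oauth_* pairs into the format visible in the cURL command further up. The helper oauth_header_value() is hypothetical, the values are the truncated placeholders from that command, and it assumes the values have already been percent-encoded.

```r
# Hypothetical helper: turns a named vector of (already percent-encoded)
# oauth_* values into the value of the Authorization header.
oauth_header_value <- function(oauth) {
  # Sorting is only done here to mirror the order in the command shown above.
  oauth <- oauth[order(names(oauth), method = "radix")]
  pairs <- sprintf('%s="%s"', names(oauth), oauth)
  paste("OAuth", paste(pairs, collapse = ", "))
}

oauth <- c(
  oauth_consumer_key     = "Y...0",
  oauth_nonce            = "1...5",
  oauth_signature        = "H...D",
  oauth_signature_method = "HMAC-SHA1",
  oauth_timestamp        = "1411422203",
  oauth_token            = "1...t",
  oauth_version          = "1.0"
)

header_line <- paste("Authorization:", oauth_header_value(oauth))
cat(header_line, "\n")
```

The two secrets (consumer_secret and oauth_token_secret) are deliberately absent here: they are only used for signing and must not be sent with the request.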

A few Notes on Escaping with RCurl::curlEscape() and URLencode()

For proper generation of the signature string it is important that characters to be encoded are represented with upper-case hexadecimal digits and that “.”, “_”, “-” and “~” are not encoded. This was a bit tedious to figure out. Both requirements are documented here.

The upper case condition is met by RCurl::curlEscape() but not by URLencode():

> RCurl::curlEscape("ü")
[1] "%C3%BC"

> URLencode("ü")
[1] "%c3%bc"

The second requirement was met by my Linux R setup, but oddly not by my Windows setup, where RCurl::curlEscape() would escape “.”, “-”, “_” and “~”. This is why I added a condition to oauth1_signature() which substitutes those characters back if necessary.
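Such a resubstitution could look roughly like the following base-R sketch. The function restore_unreserved() is a made-up name for illustration and is not taken from the package; it expects a string that has already been escaped.

```r
# Hypothetical post-processing step: turn over-eager escapes of the four
# unreserved punctuation characters back into the literal characters.
restore_unreserved <- function(escaped) {
  map <- c("%2E" = ".", "%2D" = "-", "%5F" = "_", "%7E" = "~")
  for (code in names(map)) {
    # ignore.case also catches lower-case escapes such as "%2e".
    escaped <- gsub(code, map[[code]], escaped, ignore.case = TRUE)
  }
  escaped
}

restore_unreserved("screen%5Fname%3Dhrw%2Etest")
# [1] "screen_name%3Dhrw.test"
```

Because a literal percent sign is itself escaped as %25, an already escaped string cannot contain one of these four codes by accident, so the replacement does not touch other escapes.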


(original article published on www.joyofdata.de)
