Caching Encyclopedia of Life API calls

[This article was first published on rOpenSci Blog - R, and kindly contributed to R-bloggers].

In a recent blog post we discussed caching web API calls locally, on your own computer. Just as you can cache data on your own machine, a data provider can cache it on theirs. Most of the data providers we work with do not offer caching. However, at least one does: the Encyclopedia of Life (EOL). EOL lets you set how long (in seconds) the result of a call is cached; if you make the same call again within that time, you get the data back faster. Our taxize package has a number of functions that interface with EOL.
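To make the time-to-live idea concrete, here is a toy, purely local model of such a cache. It is only a sketch: `make_ttl_cache`, `slow_fetch`, and the fake clock are hypothetical names invented for illustration, and nothing here reflects how EOL actually implements caching on its servers.

```r
# Toy model of a time-to-live (TTL) cache. Purely local and hypothetical:
# it does not call EOL or taxize.
make_ttl_cache <- function(fetch, now = Sys.time) {
  store <- new.env(parent = emptyenv())  # key -> list(value, expires)
  function(key, ttl) {
    t <- as.numeric(now())
    entry <- get0(key, envir = store, inherits = FALSE)
    # Serve from the cache only while the entry has not expired
    if (!is.null(entry) && t < entry$expires) {
      return(list(value = entry$value, hit = TRUE))
    }
    # Miss or expired: fetch again and restart the TTL window
    value <- fetch(key)
    assign(key, list(value = value, expires = t + ttl), envir = store)
    list(value = value, hit = FALSE)
  }
}

# A fake clock (in seconds) keeps the demo deterministic and instant
clock <- 0
fake_now <- function() clock

# Stand-in for an expensive remote call; counts how often it really runs
calls <- 0
slow_fetch <- function(key) {
  calls <<- calls + 1
  toupper(key)
}

lookup <- make_ttl_cache(slow_fetch, now = fake_now)

r1 <- lookup("lion", ttl = 5)  # time 0: nothing cached yet
clock <- 3
r2 <- lookup("lion", ttl = 5)  # time 3: inside the 5-second window
clock <- 7
r3 <- lookup("lion", ttl = 5)  # time 7: the entry has expired
```

With a 5-second lifetime, the second lookup (3 seconds in) is served from the cache, while the third (7 seconds in) has to fetch again. That is the same three-step pattern the timing function below probes against the real service.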

Install and load taxize and ggplot2.

install.packages(c("taxize", "ggplot2"))

library(taxize)
library(ggplot2)

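To easily visualize the benefit of using EOL's caching, let's define a function that makes the same search three times (once with nothing cached, once while the cache is still valid, and once after the cache has timed out), times each call, and plots the results: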

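testcache <- function(terms, cache){
  # First call: nothing is cached yet
  first <- system.time( eol_search(terms=terms, cache_ttl = cache) )
  # Second call: made right away, while the cache should still be valid
  second <- system.time( eol_search(terms=terms, cache_ttl = cache) )
  # Wait until the cache lifetime (plus a 2-second margin) has passed
  Sys.sleep(cache+2)
  # Third call: the cache should have timed out by now
  third <- system.time( eol_search(terms=terms, cache_ttl = cache) )

  # Element 3 of system.time() output is the elapsed (wall-clock) time in seconds
  df <- data.frame(labs=c('nocache','withcache','cachetimedout'),
                   vals=c(first[[3]], second[[3]], third[[3]]))
  # Fix the factor levels so the bars are plotted in call order
  df$labs <- factor(df$labs, levels = c('nocache','withcache','cachetimedout'))
  ggplot(df, aes(labs, vals)) +
    geom_bar(stat='identity') +
    theme_grey(base_size = 20) +
    ggtitle(sprintf("search term: '%s'\n", terms)) +
    labs(y='Time to get data (seconds)\n', x='')
}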

Search for the term "lion" with a 5-second cache:

testcache(terms = "lion", cache = 5)

Search for the term "beetle" with a 10-second cache:

testcache(terms = "beetle", cache = 10)

Caching works the same way with the eol_pages function. None of the other API services wrapped by taxize support server-side caching by the data provider. Of course, you can do your own caching using knitr or other methods, some of which we discussed in an earlier post.
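As one illustration of doing your own caching, the sketch below stores each result as an .rds file and reuses it on later calls. The helper `cache_to_disk`, its arguments, and the stand-in `fake_search` function are hypothetical names made up for this example; they are not part of taxize or knitr, and this is not necessarily the approach from the earlier post.

```r
# Hypothetical helper: keep the result of fetch(key) in an .rds file and
# reuse it while the file is no older than max_age seconds.
cache_to_disk <- function(key, fetch, dir = tempdir(), max_age = Inf) {
  # make.names() gives a file-safe name; distinct keys could collide,
  # which is acceptable for a sketch but not for production use
  path <- file.path(dir, paste0("cache_", make.names(key), ".rds"))
  if (file.exists(path)) {
    age <- as.numeric(difftime(Sys.time(), file.mtime(path), units = "secs"))
    if (age <= max_age) return(readRDS(path))
  }
  value <- fetch(key)
  saveRDS(value, path)
  value
}

# Stand-in for a real remote call so the example runs offline. With taxize
# loaded, something like function(x) eol_search(terms = x) could be passed
# instead (assuming the eol_search signature used in this post).
n_calls <- 0
fake_search <- function(term) {
  n_calls <<- n_calls + 1
  data.frame(term = term, n = nchar(term))
}

cache_dir <- file.path(tempdir(), "eol_cache_demo")
dir.create(cache_dir, showWarnings = FALSE)
unlink(list.files(cache_dir, full.names = TRUE))  # start from an empty cache

a <- cache_to_disk("lion", fake_search, dir = cache_dir)  # runs fake_search
b <- cache_to_disk("lion", fake_search, dir = cache_dir)  # read from disk
```

Unlike EOL's server-side cache, a local cache like this one survives for as long as the files do, so `max_age` is the only thing that forces a fresh request.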
