Following a few weeks of testing, OpenCPU 1.6 has been released. OpenCPU is a production-ready system for embedded statistical computing with R. It provides a neat API for remotely calling R functions over HTTP via e.g. JSON or Protocol Buffers. The OpenCPU server implementation is stable and has been thoroughly tested. It runs on all major Linux distributions and plays nicely with the RStudio server IDE (demo).
Like shiny, OpenCPU can run as a single-user development server within an interactive R session, and as a multi-user (cloud) server for deployments on Linux. Unlike shiny, however, the cloud server comes at no extra cost. On the contrary: you are encouraged to take advantage of the cloud server, which is much faster and includes cool features like user libraries, concurrent sessions, continuous integration, customizable security policies, etc.
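To make the idea of calling R functions over HTTP concrete, here is a minimal sketch of a client written with httr and jsonlite. The helper names `ocpu_fun_url` and `ocpu_call` are hypothetical and not part of any OpenCPU package; the sketch assumes the function-call endpoint has the form `/ocpu/library/{package}/R/{function}/json` and that the server accepts arguments as a JSON object.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the URL for calling `fun` from package `pkg`
# and requesting the result as JSON. Trailing slashes on `server` are dropped.
ocpu_fun_url <- function(server, pkg, fun) {
  sprintf("%s/ocpu/library/%s/R/%s/json", sub("/+$", "", server), pkg, fun)
}

# Hypothetical helper: POST the named arguments as a JSON object and parse
# the JSON response back into an R object.
ocpu_call <- function(server, pkg, fun, args) {
  res <- POST(ocpu_fun_url(server, pkg, fun), body = args, encode = "json")
  stop_for_status(res)  # turn HTTP 4xx/5xx responses into R errors
  fromJSON(content(res, as = "text", encoding = "UTF-8"))
}

# Usage (needs network access to a running OpenCPU server):
# x <- ocpu_call("https://public.opencpu.org", "stats", "rnorm", list(n = 5))
```

Keeping the URL construction separate from the request makes it easy to check the endpoint without touching the network.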
Improvements: protolite and feather
The OpenCPU API has not changed from the 1.4 and 1.5 branches. The version bump indicates that this version targets R 3.3 and supports the new Ubuntu 16.04. Furthermore, the underlying stack of bundled R packages has been upgraded. Navigate to /ocpu/info on your OpenCPU server to inspect the exact versions of all packages used by the system.
This version introduces two major improvements for binary data interchange. First, the RProtoBuf dependency has been replaced by the much smaller protolite package, which has an optimized version of protobuf object serialization. OpenCPU already had an API for exporting data to Protocol Buffers; it’s just much faster now.
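library(httr)
library(protolite)

# Request the diamonds dataset as a Protocol Buffers payload
req <- GET("https://demo.ocpu.io/ggplot2/data/diamonds/pb")
# Decode the response body into an R object
mydiamonds <- unserialize_pb(content(req))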
New in this version is the feather output format, which can be parsed/generated with the new feather package.
library(curl)
library(feather)

# Save the feather payload to disk, then read it into a data frame
curl_download("https://demo.ocpu.io/ggplot2/data/diamonds/feather", "diamonds.feather")
mydiamonds <- read_feather("diamonds.feather")
Both pb and feather are binary alternatives to the text-based json format:
library(curl)
library(jsonlite)

# Stream the JSON payload from a connection and parse it into a data frame
con <- curl("https://demo.ocpu.io/ggplot2/data/diamonds/json")
mydiamonds <- fromJSON(con)
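The difference between the text and binary formats can also be explored locally, without a server. The following sketch encodes the built-in `iris` data frame with jsonlite and protolite and compares the payload sizes; it only illustrates the two encodings and says nothing about how the OpenCPU server itself performs.

```r
library(jsonlite)
library(protolite)

# Encode the same data frame both ways
json_txt <- toJSON(iris)        # text: a JSON array of row objects
pb_raw   <- serialize_pb(iris)  # binary: a raw vector of protobuf bytes

# Compare the payload sizes in bytes
sizes <- c(json = nchar(json_txt, type = "bytes"), pb = length(pb_raw))
print(sizes)

# Both payloads decode back to a data frame with the same shape.
# Note that JSON carries no factor type, so Species comes back as character.
from_json <- fromJSON(json_txt)
from_pb   <- unserialize_pb(pb_raw)
print(dim(from_json))
print(dim(from_pb))
```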
Installation and upgrading
The download page has instructions for installing the opencpu server on various distributions, either from source or using precompiled binaries. To upgrade an existing installation of opencpu on Ubuntu, simply run:
Note that this will also upgrade the version of R to 3.3.0 (if you have not already done so) which might require that you reinstall some of your R packages.
You can also install opencpu-server on any version of Debian/Ubuntu/Fedora/CentOS/RHEL by building the deb/rpm installation package from source. This is really easy; see the readme for deb or rpm.
Getting started
For those completely new to OpenCPU, there are several resources to get started. The presentation from last year’s useR conference gives a broad overview of the system, including some basic demos. The example apps and jsfiddle scripts show how to use the opencpu.js JavaScript client. The server manual contains documentation on configuring your opencpu cloud server (although installation should work out of the box).
Finally this paper from my thesis describes more generally the challenges of embedded scientific computing, and the benefits (both technical and human) of decoupling your statistical computing from your front-end or application layer.
The public demo server
To deploy your OpenCPU apps on the public server, simply push your R package to GitHub and configure the webhook in your repository. Whenever you push an update to GitHub, the package will be reinstalled on the server and can directly be used remotely by anyone on the internet. You can either use the full url or the ocpu.io shorthand url:
https://public.opencpu.org/ocpu/github/{username}/{package}/
https://{username}.ocpu.io/{package}/
These urls are fully equivalent. Simply replace {username} with your GitHub username, and {package} with your package name. Note that the package name must be identical to the GitHub repository name (as is usually the case).
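As a small illustration of the two URL forms, the sketch below fills in both templates for a given user and package. The helper `ocpu_github_urls` and the example names `jane` and `mypackage` are hypothetical.

```r
# Hypothetical helper: build the full and shorthand URLs for a package
# hosted on GitHub under `username` in a repository named `package`.
ocpu_github_urls <- function(username, package) {
  c(
    full  = sprintf("https://public.opencpu.org/ocpu/github/%s/%s/", username, package),
    short = sprintf("https://%s.ocpu.io/%s/", username, package)
  )
}

urls <- ocpu_github_urls("jane", "mypackage")
print(urls)
```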
On writing packages
One prerequisite for using OpenCPU is knowing how to create an R package. There is no way around this; packages are the natural container format for shipping and deploying code/data/manuals in R, and the OpenCPU API assumes this format. Luckily, writing R packages is super easy these days and can be done in less than 10 seconds using, for example, RStudio.
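For readers who prefer the console over RStudio, base R can also generate a package skeleton. This is a minimal sketch using `utils::package.skeleton`; the package name `hellopkg` and the function `hello` are hypothetical, and the generated DESCRIPTION and documentation stubs still need to be filled in before the package is ready to share.

```r
# Put the function to be packaged in its own environment
env <- new.env()
env$hello <- function(name = "world") paste0("Hello, ", name, "!")

# Generate the package source tree inside a fresh temporary directory
pkg_parent <- tempfile("pkgs")
dir.create(pkg_parent)
utils::package.skeleton(
  name = "hellopkg",
  list = "hello",
  environment = env,
  path = pkg_parent
)

# Inspect the generated files (DESCRIPTION, NAMESPACE, R/, man/, ...)
print(list.files(file.path(pkg_parent, "hellopkg")))
```

Once such a package is installed on an OpenCPU server, its functions should be reachable under the package's `/R/` path in the API, for example `/ocpu/library/hellopkg/R/hello` in this hypothetical case.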
The good thing is that once you have passed this little hurdle, the full power and flexibility of R and its packaging system become available to your applications and APIs. Hadley’s latest book on writing R packages gives a nice overview of the R packaging system, and the OpenCPU API provides an easy HTTP interface to all of these features.