Announcing OpenCPU 2.0: Building and Deploying Scalable R Apps and Services
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
OpenCPU 2.0 provides the most robust system available today for building and deploying R based apps and services. The server exposes a simple HTTP API for calling with R functions, scripts and managing data, which provides a very solid basis for intergrating R into any environment. The OpenCPU 2.0 cloud server naturally scales up to many concurrent users and is entirely available under the business friendly Apache2 license – at no extra cost.
The 2.0 branch is the biggest upgrade to the system since the 1.0 release 4 years ago. The server API is backwards compatible so that existing clients and apps will keep working. Internals have been rewritten to make development easier and further enhance the performance and robustness of the server system.
The version 2.0.3 is available from CRAN, Launchpad, Dockerhub, OBS and the server archive. Below a brief overview of improvements in OpenCPU 2.0!
OpenCPU Apps
The 2.0 version makes it even easier to build and deploy R webapps. An app in OpenCPU is simply an R package which may include a web frontend that interacts with R functions from the same package via the OpenCPU API. By using the R package format as a container for shipping web applications OpenCPU apps natively support for dependencies, namespaces, embedded data, documentation, etc.
Apps can be run or deployed in many ways.
- Run or develop locally using the single user server in R using
opencpu::ocpu_start_app()
- Deploy for free on
<yourname>.ocpu.io
orcloud.opencpu.org
using the CI webhook - Host your own opencpu-server, either internally or on the internet
- Ship and deploy apps in docker containers
Several example apps are available from rwebapps Github repository. You can try each app on the public cloud server or you can run it locally in R using the single-user server.
Single User server
Ther OpenCPU single-user server allows for running OpenCPU inside an interactive R session on any platform. To install the latest version in R:
install.packages("opencpu")
Version 2.0 has made it much easier to run and develop OpenCPU apps using the single user server. For example to run the rwebapps/stockapp app:
opencpu::ocpu_start_app("rwebapps/stockapp")
Or try the very cool rwebapps/markdownapp:
opencpu::ocpu_start_app("rwebapps/markdownapp")
Also try any of the other rwebapps. Each of these apps can also be used on https://rwebapps.ocpu.io/<app>
.
Cloud Server and OCPU.IO
The new version makes it super easy to publish your apps and packages on the public cloud server via the Github CI. All you need to do is set the OpenCPU webhook in your Github repository or Github organization.
Upon your next git push, your package will immediately become available on a fancy private subdomain https://<yourname>.ocpu.io/<pkg>
named after your github username or organization.
Note again that in OpenCPU an app is just an R package. You can start deploying any R package on ocpu.io to call it remotely or just for fun, even if the package does not contain any special web front-end.
Dependency Remotes
Your app or package might depend on other CRAN packages as specified in the package DESCRIPTION
file according to the standard R mechanics. However sometimes your package depends on an R package which is not on CRAN, for example from Github.
To deploy packages on OpenCPU which have non-cran dependencies, specify the Remote
in the DESCRIPTION
according to the devtools vignette. Internally the OpenCPU webhook simply uses devtools::install_github()
to install your package, so it supports everything that install_github
does.
You can even pass custom arguments to install_github
by adding them to the webhook URL as http parameters.
Improved Data Interchange
The most difficult part of building R apps and services is data interchange: getting complex structures efficiently and reliably in and out of R. A lot of energy in OpenCPU 2.0 has gone into further optimizing this critical part of the system.
The three major data formats in OpenCPU are now fully implemented by myself in highly optimized C/C++ packages:
- json: opencpu uses
jsonlite::fromJSON()
for reading andjsonlite::toJSON()
for writing json. - protobuf: opencpu uses
protolite::serialize_pb()
andprotolite::unserialize_pb()
to convert between objects and protocol buffers. - multipart/form-data: (POST only) opencpu uses
webutils::parse_multipart()
for parsing multipart.
Obviously these packages are not limited to OpenCPU; they may be used by other systems as well.
DataFrames
A special role in R is reserved for Data Frames, the common data structure for storing tabular data sets. OpenCPU adds additional output types for retrieving data frames in NDJSON, SPSS, SAS or STATA format.
For example the following URLS retrieve the “diamonds” dataset from the “ggplot2” package in various formats:
https://cran.ocpu.io/ggplot2/data/diamonds/csv
https://cran.ocpu.io/ggplot2/data/diamonds/json
https://cran.ocpu.io/ggplot2/data/diamonds/ndjson
https://cran.ocpu.io/ggplot2/data/diamonds/pb
https://cran.ocpu.io/ggplot2/data/diamonds/feather
https://cran.ocpu.io/ggplot2/data/diamonds/rda
https://cran.ocpu.io/ggplot2/data/diamonds/rds
https://cran.ocpu.io/ggplot2/data/diamonds/spss
https://cran.ocpu.io/ggplot2/data/diamonds/sas
https://cran.ocpu.io/ggplot2/data/diamonds/stata
This also shows an additional use case for OpenCPU: publishing datasets in an format agnostic way using the “lazydata” feature from R packaging format.
It is completely valid to create an R package which contains only a dataset (no functions) and deploy it on OCPU.IO to make it available in a dozen formats at once!
Server Binaries
OpenCPU 2.0 has further improved opencpu-server
, the highly configurable multi-user server implementation, to run on various distributions as well as docker. This makes installing (and uninstalling) an opencpu production server easy for users or system administrators.
The recommended platform is still Ubuntu 16.04 (Xenial) because it supports AppArmor. This is also the platform we use to host cloud.opencpu.org and ocpu.io. Installation is easy:
# Requires Ubuntu 16.04 (Xenial)
sudo add-apt-repository -y ppa:opencpu/opencpu-2.0
sudo apt-get update
sudo apt-get upgrade
# Installs OpenCPU server
sudo apt-get install -y opencpu-server
# Optional: installs rstudio in http://yourhost/rstudio
sudo apt-get install -y rstudio-server
New in version 2.0 is that we provide binary installation packages for Debian 9, Fedora 25, CentOS 6 and 7. These binaries are built on dockerhub:opencpu and can also be dowloaded from https://archive.opencpu.org.
Docker
We now provide serveral docker images for running opencpu-server both for development or deployment. The opencpu/rstudio docker image runs both opencpu-server as well as rstudio-server which is nice for development. To start the docker container on port 80 with name “mybox” you would run:
docker run --name mybox -t -p 80:80 opencpu/rstudio
If port 80 is taken on your machine you can also use 8004. Once this runs you can navigate to http://localhost/ocpu and http://localhost/rstudio in your browser to get started. You can login rstudio with username/password: opencpu/opencpu.
To get a root shell on the server (for example to install system libraries needed by certain R packages) simply run:
# Replace 'mybox' with the --name above
docker exec -i -t mybox /bin/bash
From the shell you can easily install R packages or apt-get install
system libraries or modify the server configuration in /etc/opencpu
.
Roadmap
OpenCPU 2.0 server is a major step forward towards a robust system for building and deploying R based apps and services. We will keep improving the server implementations based on our experiences and feedback from users and developers.
Next up is updating the documentation to explain some of the powerful new features that were introduced in the 2.0 branch. We will also be updating the opencpu.js JavaScript client and build some cool new R webapps, which is what OpenCPU was built for in the first place!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.