Enhancements to the AzureML package to connect R to AzureML Studio
by Andrie de Vries
We have written on several occasions about AzureML, the Microsoft machine learning studio that is part of the Cortana Analytics suite:
- Running R in the Azure ML cloud
- Call R functions from any application with the AzureML package
- Using miniCRAN in Azure ML
In September we announced that the AzureML package for R allows you to publish R functions as Azure web services. This is a brilliantly easy way to deploy your functions to other users and clients. For example, you can publish a function from R, then consume that function from Excel!
I am pleased to announce that we have completed a significant rewrite of the AzureML package. This rewrite adds several enhancements. Specifically, AzureML now also allows you to interact with:
- Workspace: connect to and manage AzureML workspaces
- Datasets: upload and download datasets to and from AzureML workspaces
- Experiments: download intermediate datasets from AzureML experiments
We have also significantly enhanced the functionality to publish and consume models:
- Publish: define a custom function or train a model and publish it as an Azure Web Service
- Consume: use available web services from R in a variety of convenient formats
Interacting with datasets
This version of the AzureML package adds new functionality to interact with datasets and experiments.
The code to do this is very simple:
    library(AzureML)

    # Create a workspace object
    ws <- workspace()

    # List datasets
    datasets(ws, filter = "sample")

    # Download a dataset
    frame <- download.datasets(ws, name = "Forest fires data")
    head(frame)
As expected, this displays the first few lines of the resulting data frame:
      X Y month day FFMC  DMC    DC  ISI temp RH wind rain area
    1 7 5   mar fri 86.2 26.2  94.3  5.1  8.2 51  6.7  0.0    0
    2 7 4   oct tue 90.6 35.4 669.1  6.7 18.0 33  0.9  0.0    0
    3 7 4   oct sat 90.6 43.7 686.9  6.7 14.6 33  1.3  0.0    0
    4 8 6   mar fri 91.7 33.3  77.5  9.0  8.3 97  4.0  0.2    0
    5 8 6   mar sun 89.3 51.3 102.2  9.6 11.4 99  1.8  0.0    0
    6 8 6   aug sun 92.3 85.3 488.0 14.7 22.2 29  5.4  0.0    0
Publishing an R function as a web service
We made many improvements to the mechanism that publishes a web service.
In particular, it is now very easy to provide a data frame as input to the publishing function. You no longer have to specify the classes of every column. Instead, the publishWebService() function automatically determines the column classes of the inputs as well as the results.
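The idea of deriving column classes from an example data frame can be sketched in base R. This is only an illustration of the concept, not the package's internal implementation; the helper name `inferSchema` and the example data frame are hypothetical.

```r
# Hypothetical helper: report the class of each column in a data frame.
# This illustrates the concept only; it is not AzureML's own code.
inferSchema <- function(df) {
  stopifnot(is.data.frame(df))
  # Use the first class of each column so that every column is
  # described by a single string (e.g. "ordered" for ordered factors)
  vapply(df, function(col) class(col)[1], character(1))
}

# A small made-up data frame with mixed column types
example <- data.frame(
  Days    = c(0, 1, 2),
  Subject = factor(c("308", "308", "309")),
  Comment = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

schema <- inferSchema(example)
print(schema)
```

Passing a representative data frame in this way lets the column types travel with the data, rather than being written out by hand for every column.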
To illustrate, here is an example from the help:
    library(AzureML)
    ws <- workspace()

    # Publish a simple model using the lme4::sleepstudy data
    library(lme4)
    set.seed(1)
    train <- sleepstudy[sample(nrow(sleepstudy), 120), ]
    m <- lm(Reaction ~ Days + Subject, data = train)

    # Define a prediction function to publish based on the model:
    sleepyPredict <- function(newdata) {
      predict(m, newdata = newdata)
    }

    ep <- publishWebService(ws, fun = sleepyPredict, name = "sleepy lm",
                            inputSchema = sleepstudy,
                            data.frame = TRUE)

    # OK, try this out, and compare with raw data
    ans <- consume(ep, sleepstudy)$ans
    plot(ans, sleepstudy$Reaction)
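Before publishing, it can help to check locally that the prediction function accepts a data frame and returns one value per row, which is what the comparison plot above relies on. The sketch below makes that check using the built-in `mtcars` data and a hypothetical `carPredict` function in place of the lme4 example, so it runs without an AzureML workspace or extra packages.

```r
# Fit a simple model on a random subset of a built-in dataset
set.seed(1)
train <- mtcars[sample(nrow(mtcars), 24), ]
m <- lm(mpg ~ wt + hp, data = train)

# Hypothetical prediction function, analogous to sleepyPredict above
carPredict <- function(newdata) {
  predict(m, newdata = newdata)
}

# Call the function locally on the full data frame, standing in for
# the input a published service would receive
local_ans <- carPredict(mtcars)

# Check the shape of the result before publishing
stopifnot(is.numeric(local_ans), length(local_ans) == nrow(mtcars))
head(local_ans)
```

If the local call fails or returns the wrong number of values, the problem can be fixed in R before anything is deployed.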
Installation instructions
Right now, the new version is only available on GitHub. To install the package, use:
    if (!require("devtools")) install.packages("devtools")
    devtools::install_github("RevolutionAnalytics/AzureML")
Additional resources:
The package has extensive help with many examples as well as a vignette. You can also:
- view the vignette at Getting Started with the AzureML Package.
- take a look at the bug bash instructions, a walk-through guide with installation and configuration instructions as well as sample code
GitHub: AzureML, An R interface to AzureML experiments, datasets, and web services