Publish R functions as stored procedures with the sqlrutils package
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
If you've created an R function (say, a routine to clean up missing values in a data set, or a function to make forecasts using a machine learning model), and you want to make it easy for DBAs to use it, it's now possible to publish R functions as a SQL Server 2016 stored procedure. The sqlrutils package provides tools to convert an existing R function to a stored procedure which can then be executed by anyone with authenticated access to the database — even if they don't know any R.
To use an R function as a stored procedure, you'll need SQL Server 2016 with R Services installed. You'll also need to use the sqlrutils package to publish the function as a stored procedure: it's included with both Microsoft R Client (available free) and Microsoft R Server (included with SQL Server 2016), version 9.0 or later.
With that set up, the process is fairly simple:
- Using Microsoft R (Client or Server), call
library(sqlrutils)
to make the publishing functions available in R.- Call
help(package="sqlrutils")
to see the list of functions provided.
- Call
- Create the R function you want to publish. It should include one
data.frame
argument (this will be the input from the database) and return adata.frame
as the value of the stored procedure.- The function can also return NULL, if the goal of the function is to modify the database directly.
- You can use functions from the RevoScaleR package. For example, use
RxSqlServerData
to access data in the database with a SQL query, useRxDataStep
to transform data using R, or use functions likerxLogit
to train a predictive model. - Note that the function will only access data passed in as inputs — don't try reaching out to global variables, as they won't be there when the function runs within SQL Server.
- Declare the input parameter for the stored procedure with the
InputData
function. In addition to naming the parameter, you can specify a default query to use if none is provided. - Optionally, define one or more additional input parameters for the stored procedure. These will become inputs to the R function defined above.
- Only R objects of type POSIXct, numeric, character, integer, logical, and raw are allowed as inputs.
- Prepare the R function for publishing with the
StoredProcedure
function. This is where you name the stored procedure and define the input parameters declared above. - Publish the stored procedure to the database with
registerStoredProcedure
.- You'll need to provide a connection string with location and authentication information for the database. You can also provide this in the previous step, when you create the stored procedure.
- You can test out the stored procedure from your R console with
executeStoredProcedure
.
That's it! Now your R function is available as an ordinary stored procedure, and can be executed (with its provided input data and optional parameters) like any other stored procedure.
This is just an outline, and I've omitted some of the options for clarity. For more details on the process, you can find complete documentation at the link below.
Microsoft Docs: Generating an R Stored Procedure for R Code using the sqlrutils Package
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.