What’s New in Release 6.2: Additional ScaleR Features

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

by Thomas Dinsmore

Revolution R Enterprise Release 6.2 is on track for General Availability on April 22. In previous posts, I've commented on support for open source R 2.15.3 and Stepwise Regression. Today I'll wrap up this series with a summary of some of the other new features supported in this release.

Parallel Random Number Generation

For analysts seeking to perform Monte Carlo analysis (and similar techniques), Revolution R Enterprise Release 6.2 supports parallel random number generation. The new functions rxRngNewStream, rxRngGetStream, rxRngSetStream, and rxRngDelStream provide an R interface to the parallel random number generators supplied with the Intel MKL libraries. These allow high-quality parallel random numbers to be used in distributed computations performed by RevoScaleR’s rxExec function.
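To illustrate why independent streams matter for Monte Carlo work, here is a minimal sketch that uses only open source R's parallel package, not the new RevoScaleR functions or the Intel MKL generators. The task function estimate_pi, the number of streams, and the seed value are assumptions made for illustration, and the workers are simulated sequentially rather than dispatched through rxExec.

```r
library(parallel)

# Switch to the L'Ecuyer-CMRG generator, which supports independent streams.
# Note: this changes the RNG kind for the whole R session.
RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Derive one independent stream seed per (hypothetical) worker.
n_streams <- 4
seeds <- vector("list", n_streams)
seeds[[1]] <- .Random.seed
for (i in seq_len(n_streams - 1)) {
  seeds[[i + 1]] <- nextRNGStream(seeds[[i]])
}

# Hypothetical Monte Carlo task: estimate pi from n uniform random points.
estimate_pi <- function(seed, n = 10000) {
  # Install this worker's stream before drawing any random numbers.
  assign(".Random.seed", seed, envir = .GlobalEnv)
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

# Each call draws from its own stream, so the estimates do not share
# random numbers and the whole run is reproducible from the single seed.
estimates <- vapply(seeds, estimate_pi, numeric(1))
combined <- mean(estimates)
```

Because each task receives its own stream, rerunning the script gives the same estimates regardless of the order in which the tasks execute, which is the property that distributed Monte Carlo computations rely on.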

This capability is a necessary foundation for randomized decision trees (trademarked by Breiman and Cutler as "Random Forests"), a technique we plan to support in Revolution R Enterprise Release 7.0.

Fast Fixed Format Text Data Source

At the request of several customers, we've provided a faster way to import or analyze fixed format text data when using the rxImport function.
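For readers unfamiliar with fixed format text, the sketch below shows what such a file looks like and how open source R's read.fwf parses it. It does not use rxImport or the new fast data source; the file contents, column widths, and column names are all made up for illustration.

```r
# Write a small, hypothetical fixed-width file: 4-char id, 7-char name,
# 4-char amount, with no delimiters between fields.
path <- tempfile(fileext = ".txt")
writeLines(c("0001Alice  0250",
             "0002Bob    1200"), path)

# Parse it by declaring the width of each field.
records <- read.fwf(
  path,
  widths = c(4, 7, 4),
  col.names = c("id", "name", "amount"),
  colClasses = c("character", "character", "numeric"),
  strip.white = TRUE
)

unlink(path)
```

The field boundaries come entirely from the widths argument, which is why fixed format files need a layout description in addition to the data itself.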

By-Group Summary Statistics

Revolution R Enterprise users sometimes want to "roll up" data from one level to another, for example to summarize transaction facts at the customer level. The rxCube and rxSummary functions now provide options to write by-group counts or summary statistics directly to a high-performance .xdf file for further analysis. Users also now have more control over which summary statistics are reported.
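As a small in-memory illustration of the roll-up idea, the following sketch summarizes transaction rows at the customer level using base R only. It does not call rxCube or rxSummary and does not write an .xdf file; the transactions data frame and its column names are hypothetical.

```r
# Hypothetical transaction-level data: one row per purchase.
transactions <- data.frame(
  customer_id = c("A", "A", "B", "B", "B", "C"),
  amount      = c(10, 20, 5, 15, 25, 40)
)

# By-group counts and totals, keyed by customer.
counts <- tapply(transactions$amount, transactions$customer_id, length)
totals <- tapply(transactions$amount, transactions$customer_id, sum)

# Roll up to one row per customer with the chosen summary statistics.
customer_summary <- data.frame(
  customer_id = names(counts),
  n           = as.vector(counts),
  total       = as.vector(totals),
  mean        = as.vector(totals / counts)
)
```

The result has one row per customer and can feed further analysis in the same way the by-group output file would, although at data sizes that fit in memory.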

Sort, Merge and Split Enhancements

By default, the rxSort and rxMerge functions now make better use of available memory; previously, they used a fixed default buffer size that could be unnecessarily small on computers with generous amounts of RAM. Also, the rxSplit function now creates .xdf files by default if the data source is an .xdf file.

We welcome your comments and suggestions about Revolution R Enterprise.  If you have any comments or questions, please sound off in the Comments section below, or send me a note at thomas.w.dinsmore@revolutionanalytics.com.