Site icon R-bloggers

Introducing mirai – a minimalist async evaluation framework for R

[This article was first published on shikokuchuo{net}, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
< aside> Shikokuchuo


{mirai} is a minimalist async evaluation framework for R, released this week to CRAN.

未来 みらい mirai is Japanese for ‘future’.

It provides an extremely simple and lightweight method for concurrent / parallel code execution.

Design Notes

Whilst frameworks for parallelisation exist for R, {mirai} is designed for simplicity.

The package provides just 2 functions:

{mirai} has a tiny pure R code base, relying on a single package – {nanonext}. {nanonext} itself is a lightweight wrapper for the NNG C library with zero package dependencies.

Background R processes are created and evaluation occurs independently. mirai employs nanonext/NNG as a concurrency framework – a blazing-fast, lightweight solution for moving data between these processes seamlessly. Crucially it provides a true cross-platform abstraction layer across Linux, Windows, MacOS, the BSDs, Solaris, Illumos etc. i.e. everywhere R can go. This means that we can just call the above two functions without worrying about the underlying system implementation.

This means there is no need to specify core counts, devise work plans and the such beforehand. Also no need to separate writing code that is ready for parallel execution from how it is ultimately executed. Just wrap your expressions in eval_mirai() and run them in another process.

For scripts, this provides the ultimate control as you can map specific code to a specific process. For example if you have 8 model fits to run, you can send each one to it’s own process. This provides a simpler and more robust solution than leaving it to the system to decide, which also runs the risk of over-optimisation – you may wish to refer to this classic presentation on Python’s GIL (global interpreter lock): http://www.dabeaz.com/python/GIL.pdf. R inherits similar limitations being an interpreted language.

It can be equally handy for interactive work – if you have specified a model, are now ready to fit it and know this will take an hour to run, simply eval_mirai() and have it run in the background whilst you continue with your work. When you need the results just call_mirai() for the return value.

Use Cases

Minimise execution times by performing long-running tasks concurrently in separate processes.

Ensure execution flow of the main process is not blocked.

library(mirai)

Example 1: Compute-intensive Operations

Multiple long computes (model fits etc.) would take more time than if performed concurrently on available computing cores.

Use eval_mirai() to evaluate an expression in a separate R process asynchronously.

A ‘mirai’ object is returned immediately.

mirai <- eval_mirai(rnorm(n) + m, n = 1e8, m = runif(1))

mirai
#> < mirai >
#>  ~ use call_mirai() to resolve

Continue running code concurrent to the async operation.

# do more...

Use call_mirai() to retrieve the evaluated result when required.

call_mirai(mirai)

mirai
#> < mirai >
#>  - $value for evaluated result

str(mirai$value)
#> num [1:100000000] 1.485 -0.804 0.965 -0.128 -0.555 ...

Example 2: I/O-bound Operations

Processing high-frequency real-time data, writing results to file/database can be slow and potentially disrupt the execution flow.

Cache data in memory and use eval_mirai() to perform periodic write operations in a separate process.

A ‘mirai’ object is returned immediately.

mirai <- eval_mirai(write.csv(x, file = file), x = rnorm(1e8), file = tempfile())

Use call_mirai() to confirm the operation has completed.

call_mirai(mirai)$value
#> NULL

Above, the return value is called directly. NULL is the expected return value for write.csv().

Links

{mirai} website: https://shikokuchuo.net/mirai/
{mirai} on CRAN: https://cran.r-project.org/package=mirai

{nanonext} website: https://shikokuchuo.net/nanonext/
{nanonext} on CRAN: https://cran.r-project.org/package=nanonext

To leave a comment for the author, please follow the link and comment on their blog: shikokuchuo{net}.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.