A common theme over the last few decades was that we could afford to simply
sit back and let computer (hardware) engineers take care of increases in
computing speed thanks to
Moore’s law.
That same line of thought now frequently points out that we
are getting closer and closer to the physical limits of what
Moore’s law can
do for us.
So the new best hope is (and has been) parallel processing. Even our smartphones have
multiple cores, and most if not all retail PCs now possess two, four or more
cores. Real computers, aka somewhat decent servers, can be had with 24, 32
or more cores as well, and all that is before we even consider GPU
coprocessors or other upcoming changes.
And sometimes our tasks are embarrassingly parallel, as is the case with many
data-parallel jobs: we can use higher-level operations such as those offered
by the base R package parallel
to spawn multiple processing tasks and gather the results. I covered all this
in some detail in previous talks
on High Performance Computing with R (and you can also consult the
Task View on High Performance Computing with R
which I edit).
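As a minimal sketch of such a data-parallel job (the inputs and the squaring function are merely toy choices of mine):

```r
## Sketch of a data-parallel job with the base R 'parallel' package:
## apply an independent computation to each element, gather the results.
library(parallel)

inputs <- 1:1000

## mclapply() forks one worker per core on Unix-alikes; on Windows one
## would set up a cluster via makeCluster() and use parLapply() instead
res <- mclapply(inputs, function(x) x * x, mc.cores = detectCores())

## collect the per-element results back into a single vector
res <- unlist(res)
```

Because each element is processed independently, no algorithmic changes are needed: we merely spawn workers and gather results.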
But sometimes we can’t use data-parallel approaches. Hence we have to redo our algorithms. Which is
really hard. R itself has been relying on the (fairly mature) OpenMP
standard for some of its operations. Luke Tierney’s
(awesome) keynote in May at our
(sixth) R/Finance conference mentioned some of the issues related to
OpenMP. One which matters
is that OpenMP works really well on Linux, and
either not so well (Windows) or not at all (OS X, due to the usual issue with
the gcc/clang switch enforced by Apple; the good news is that the OpenMP
toolchain is expected to make it to OS X in some more performant form
“soon”). R is still expected to make wider use of OpenMP in future versions.
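For readers who have not seen it, here is an illustrative sketch of the OpenMP idiom (not actual code from R itself): a single pragma is enough to spread a loop across cores, provided the compiler supports it, which is exactly where the platforms above differ:

```cpp
// Illustrative OpenMP sketch (not code from R itself): the pragma
// below distributes the loop iterations across the available cores.
// With gcc this requires the -fopenmp flag; the clang shipped by
// Apple at the time lacked such support, hence the OS X situation.

// scale every element of x by a, in parallel
void scaleVector(double* x, int n, double a) {
#pragma omp parallel for
    for (int i = 0; i < n; i++)
        x[i] *= a;
}
```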
Another tool which has been around for a few years, and which can be considered
to be equally mature is the
Intel Threaded Building Blocks
library, or TBB. JJ recently started to wrap this up for use by R. The first
approach resulted in a (now superseded, see below) package TBB.
But hardware and OS issues bite once again, as the Intel TBB does not really
build that well with the Windows toolchain used by R (which is based on MinGW).
(And yes, there are two more options. But Boost Threads requires linking,
which precludes easy header-only use via, e.g., our
BH package. And C++11 with its
threads library (based on Boost Threads) is not yet as widely available as R
and Rcpp, which means that it is not a real deployment option yet.)
Now, JJ, being as awesome as he is, went back to the drawing board and integrated a
second threading toolkit: TinyThread++,
a small header-only library without further dependencies. Not as
feature-rich as Intel Threaded Building Blocks,
but at least available everywhere. So a new package
RcppParallel, so far
only on GitHub, wraps around both TinyThread++
and Intel Threaded Building Blocks and
offers a consistent interface available on all platforms used by R.
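To give a flavour of that interface, here is a minimal sketch in the spirit of the Gallery pieces linked below (the worker and function names are merely illustrative): one derives from the Worker class, implements operator() over an index range, and lets parallelFor() split the work across threads:

```cpp
// Minimal RcppParallel sketch: square roots of a vector, in parallel.
// A Worker processes the half-open index range [begin, end); the
// parallelFor() call splits the full range across available threads.

// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>
#include <algorithm>
#include <cmath>

using namespace RcppParallel;

struct SqrtWorker : public Worker {
    const RVector<double> input;   // thread-safe read accessor
    RVector<double> output;        // thread-safe write accessor

    SqrtWorker(const Rcpp::NumericVector input, Rcpp::NumericVector output)
        : input(input), output(output) {}

    // invoked, possibly concurrently, for different index ranges
    void operator()(std::size_t begin, std::size_t end) {
        std::transform(input.begin() + begin, input.begin() + end,
                       output.begin() + begin, ::sqrt);
    }
};

// [[Rcpp::export]]
Rcpp::NumericVector parallelVectorSqrt(Rcpp::NumericVector x) {
    Rcpp::NumericVector output(x.size());
    SqrtWorker worker(x, output);
    parallelFor(0, x.size(), worker);
    return output;
}
```

After compiling this via Rcpp::sourceCpp(), calling parallelVectorSqrt() from R should return the same result as sqrt(), just computed across all cores.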
Better still, JJ also authored several pieces demonstrating this new package for the
Rcpp Gallery:
- A parallel matrix transformation
- A parallel vector summation
- A parallel inner product
- Parallel Distance Matrix Calculation with RcppParallel