Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions here.
1. Install Open MPI on Fedora Core
First install the openmpi libraries using:
yum install openmpi openmpi-devel openmpi-libs
The default installation on Fedora still doesn’t quite work, so you need to execute the following command as root (only once is required, after installation of the package):
ldconfig /usr/lib64/openmpi/lib/
You are not quite done: for R to work right with the libraries, you need to modify the LD_LIBRARY_PATH environment variable to include the path to the Open MPI libraries. I have the following in my ~/.bash_profile:
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}/usr/lib64/openmpi/lib/"
Add the same line to your own file and also execute it at the command prompt; then you are ready to continue.
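As a minimal sketch of why the export line is written that way: the ${VAR:+:} expansion emits the colon separator only when the variable already has a value, so an empty LD_LIBRARY_PATH does not end up with a leading colon (the dynamic linker treats an empty entry as the current directory). The helper name append_path is hypothetical and is used here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original instructions): append a
# directory to a colon-separated path string, inserting ":" only when the
# existing value is non-empty.
append_path() {
    current="$1"
    dir="$2"
    printf '%s\n' "${current}${current:+:}${dir}"
}

# Empty starting value: no leading colon is produced.
append_path "" "/usr/lib64/openmpi/lib/"
# Non-empty starting value: a single colon separates the entries.
append_path "/opt/lib" "/usr/lib64/openmpi/lib/"
```

The first call prints the Open MPI directory on its own; the second prints /opt/lib, a colon, and then the Open MPI directory.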
2. Install the Rmpi package for R
Now that your Open MPI libraries are set up, what you do next depends on which version of Rmpi you are installing. Most likely you are installing the latest version, in which case the following section applies. The instructions for older versions are retained in a later section for reference.
2.1. Current versions of the Rmpi package
Make sure you have executed the ldconfig command and set the LD_LIBRARY_PATH environment variable as described in the previous section before you continue.
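One possible way to confirm the environment variable before starting R is sketched below. The function path_contains is a hypothetical helper, not something Rmpi or Open MPI provides, and it does an exact string comparison, so the trailing slash has to match whatever you put in your ~/.bash_profile.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $2 is one of the entries of the
# colon-separated list $1 (exact string match, trailing slash included).
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "${LD_LIBRARY_PATH:-}" "/usr/lib64/openmpi/lib/"; then
    echo "LD_LIBRARY_PATH includes the Open MPI library directory"
else
    echo "LD_LIBRARY_PATH does not include /usr/lib64/openmpi/lib/"
fi
```

Wrapping both arguments in colons keeps the match from succeeding on a mere prefix of a longer directory name.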
Since at least version 0.5-8 of the Rmpi package, you can install it from the R command line once you have fixed the Open MPI install. At the R prompt do:
install.packages("Rmpi", configure.args = c("--with-Rmpi-include=/usr/include/openmpi-x86_64/", "--with-Rmpi-libpath=/usr/lib64/openmpi/lib/", "--with-Rmpi-type=OPENMPI"))
It should install without errors. This is obviously quite a mouthful to remember, but help is at hand through the options() mechanism in R. In your ~/.Rprofile you can add something like:
local({
  my.configure.args <- list(
    "Rmpi" = c("--with-Rmpi-include=/usr/include/openmpi-x86_64/",
               "--with-Rmpi-libpath=/usr/lib64/openmpi/lib/",
               "--with-Rmpi-type=OPENMPI"),
    ## Not needed for Rmpi but shown to illustrate the format
    "ncdf" = c("-with-netcdf_incdir=/usr/include/netcdf",
               "-with-netcdf_libdir=/usr/lib64/")
  )
  options("configure.args" = my.configure.args)
})
Then you can just type install.packages("Rmpi") at the R command prompt to install the package.
2.2. Older versions of the Rmpi package
The problem is the configuration file configure.ac which is, unfortunately, completely brain-damaged: it hard-codes assumptions about which subdirectories contain the header and library files and offers no way of overriding them.
Download the Rmpi source package (0.5-7 in this example) from CRAN and unpack it using tar zxvf Rmpi_0.5-7.tar.gz. Go to the new Rmpi directory and replace the file configure.ac with the one below (for an x86_64 system; for 32 bit you probably need to change -64 to -32):
# Process this file with autoconf to produce a configure script.

AC_INIT(DESCRIPTION)
AC_PROG_CC

MPI_LIBS=`pkg-config --libs openmpi-1.3.1-gcc-64`
MPI_INCLUDE=`pkg-config --cflags openmpi-1.3.1-gcc-64`
MPITYPE="OPENMPI"
MPI_DEPS="-DMPI2"

AC_CHECK_LIB(util, openpty, [ MPI_LIBS="$MPI_LIBS -lutil" ])
AC_CHECK_LIB(pthread, main, [ MPI_LIBS="$MPI_LIBS -lpthread" ])

PKG_LIBS="${MPI_LIBS} -fPIC"
PKG_CPPFLAGS="${MPI_INCLUDE} ${MPI_DEPS} -D${MPITYPE} -fPIC"

AC_SUBST(PKG_LIBS)
AC_SUBST(PKG_CPPFLAGS)
AC_SUBST(DEFS)

AC_OUTPUT(src/Makevars)
The number 1.3.1 may change in future releases of Fedora: see /usr/lib64/pkgconfig/openmpi-*.pc for the current value.
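To find the module name to put in configure.ac, something along these lines may help. It is only a sketch: the function list_openmpi_modules is a hypothetical helper that strips the .pc suffix from each matching file name, which is the name pkg-config expects.

```shell
#!/bin/sh
# Hypothetical helper: print the pkg-config module name for every
# openmpi-*.pc file found in the directory given as $1.
list_openmpi_modules() {
    pcdir="$1"
    for pc in "$pcdir"/openmpi-*.pc; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$pc" ] || continue
        basename "$pc" .pc
    done
}

# Directory taken from the text above; adjust it for a 32 bit system.
list_openmpi_modules /usr/lib64/pkgconfig
```

Each name printed can be substituted for openmpi-1.3.1-gcc-64 in the two pkg-config lines of configure.ac. If nothing is printed, no matching .pc file was found in that directory.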
Still in the Rmpi directory, do the following in your shell:
autoconf
cd ..
tar zcvf Rmpi_0.5-7-F11.tar.gz Rmpi
R CMD INSTALL Rmpi_0.5-7-F11.tar.gz
3. Test it
Now Rmpi should be working in R:
> library("Rmpi")
> mpi.spawn.Rslaves(nslaves=2)
        2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: server
slave1 (rank 1, comm 1) of size 3 is running on: server
slave2 (rank 2, comm 1) of size 3 is running on: server
> x <- c(10,20)
> mpi.apply(x,runif)
[[1]]
 [1] 0.25142616 0.93505554 0.03162852 0.71783194 0.35916139 0.85082154
 [7] 0.35404191 0.14221315 0.60063773 0.71805190

[[2]]
 [1] 0.84157864 0.63481773 0.38217188 0.67839089 0.27827728 0.35429266
 [7] 0.04898744 0.96601584 0.25687905 0.77381186 0.69011927 0.37391028
[13] 0.19017369 0.51196594 0.51970563 0.15791524 0.21358237 0.69642478
[19] 0.12690207 0.44177656