Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Optimizing target functions with long evaluation times can be tedious or even infeasible. When the target function itself is hard to parallelize, one might wonder whether the optimization method can be parallelized instead. Parallel implementations are indeed available for some stochastic optimizers; see the CRAN Task View on Optimization. But the widely used gradient-based optimization methods, such as the “L-BFGS-B” method from optim(), can also profit from parallelization: at each step, the target function and the (approximate) gradient can be evaluated in parallel. Building on this idea, we have developed the R package optimParallel, which provides parallel versions of the gradient-based optimization methods of optim(). Its main function optimParallel() has the same usage and output as optim() and can reduce the optimization time substantially.
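To make the idea of evaluating the target function and its approximate gradient in parallel more concrete, here is a minimal sketch using only the base package parallel. It is an illustration under stated assumptions, not the internal implementation of optimParallel: the helper name parallel_fn_gr, the step size eps, and the use of a central-difference approximation are choices made for this example.

```r
library(parallel)

# Hypothetical helper (not part of optimParallel): evaluate fn and a
# central-difference gradient approximation at 'par' in parallel.
parallel_fn_gr <- function(cl, fn, par, eps = 1e-3, ...) {
  p <- length(par)
  # Build 1 + 2p evaluation points: par, par + eps*e_i, and par - eps*e_i
  points <- c(
    list(par),
    lapply(seq_len(p), function(i) { q <- par; q[i] <- q[i] + eps; q }),
    lapply(seq_len(p), function(i) { q <- par; q[i] <- q[i] - eps; q })
  )
  # One function evaluation per task, distributed over the workers
  vals <- unlist(parLapply(cl, points, fn, ...))
  list(
    value    = vals[1],
    gradient = (vals[2:(p + 1)] - vals[(p + 2):(2 * p + 1)]) / (2 * eps)
  )
}

cl <- makeCluster(2)
f <- function(par) sum((par - c(1, 2))^2)  # toy target with known gradient
res <- parallel_fn_gr(cl, f, par = c(0, 0))
stopCluster(cl)

res$value     # 5
res$gradient  # approximately c(-2, -4)
```

With enough workers, all evaluation points in this sketch are processed at the same time, so one step costs roughly one function evaluation in elapsed time instead of 1 + 2p.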
A simple example
Executing a gradient-based optim() call in parallel requires the following steps:
- install and load optimParallel from CRAN,
- set up a default cluster for parallel execution using the R package parallel,
- replace optim() by optimParallel().
For illustration, we consider the following optimization task. Note the use of Sys.sleep() to mimic a computationally intensive function.
set.seed(13)
x <- rnorm(1000, 5, 2)
negll <- function(par, x) {
  Sys.sleep(1)
  -sum(dnorm(x=x, mean=par[1], sd=par[2], log=TRUE))
}
optim(par=c(1,1), fn=negll, x=x, method = "L-BFGS-B", lower=c(-Inf, .0001))
The parallel version of the same task is:
install.packages("optimParallel")
library("optimParallel")
cl <- makeCluster(5)       # set the number of processor cores
setDefaultCluster(cl=cl)   # set 'cl' as default cluster
optimParallel(par=c(1,1), fn=negll, x=x, method = "L-BFGS-B", lower=c(-Inf, .0001))
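The cluster set-up step can be made a little more defensive. The following sketch uses only functions from the base package parallel; the choice to leave one core free and to cap the cluster at five workers is an assumption made for this example, not a recommendation from the package authors.

```r
library(parallel)

# detectCores() may return NA on some platforms, so fall back to 2
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 2
# Leave one core free for the main R session; cap at 5 as in the example above
n_workers <- min(5, max(1, n_cores - 1))

cl <- makeCluster(n_workers)
setDefaultCluster(cl = cl)

# Confirm that the workers respond (one process ID per worker)
pids <- unlist(clusterEvalQ(cl, Sys.getpid()))
length(pids) == n_workers

# ... optimParallel() calls would go here ...

setDefaultCluster(cl = NULL)  # unregister the default cluster
stopCluster(cl)               # shut down the worker processes
```

Stopping the cluster once the optimization is finished releases the worker processes, which otherwise stay alive until the R session ends.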
Reduction of the optimization time
The following figure shows the results of a benchmark experiment comparing the “L-BFGS-B” method from optimParallel() and optim(); see the arXiv preprint for more details. Plotted are the elapsed times per iteration (y-axis) and the evaluation time of the target function (x-axis). The colors indicate the number of parameters of the target function and whether an analytic gradient was specified. The elapsed times of optimParallel() (solid line) are smaller for all tested scenarios.
Trace the optimization path
Besides the parallelization, optimParallel() provides additional features. For example, it can return log information, which allows the user to trace the optimization path.
optimParallel(par=c(1,1), fn=negll, x=x, method = "L-BFGS-B", lower=c(-Inf, .0001),
              parallel=list(loginfo=TRUE))
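As a hedged sketch of how the log information might be inspected, the example below stores the result and looks at its loginfo element. It assumes that optimParallel is installed, and it uses a variant of negll without Sys.sleep() and a cluster of two workers so that it runs quickly; both are choices made for this example.

```r
library(parallel)
library(optimParallel)

cl <- makeCluster(2)
setDefaultCluster(cl = cl)

set.seed(13)
x <- rnorm(1000, 5, 2)
# Same target as above, without Sys.sleep() so the sketch runs quickly
negll <- function(par, x) {
  -sum(dnorm(x = x, mean = par[1], sd = par[2], log = TRUE))
}

o <- optimParallel(par = c(1, 1), fn = negll, x = x,
                   method = "L-BFGS-B", lower = c(-Inf, .0001),
                   parallel = list(loginfo = TRUE))

head(o$loginfo)  # first few logged steps
o$par            # final estimate, for comparison with the last logged row

setDefaultCluster(cl = NULL)
stopCluster(cl)
```

Plotting the parameter columns of o$loginfo against the row index is one simple way to visualize the optimization path.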
Links
- optimParallel on CRAN
- optimParallel GIT repository
- optimParallel package manual
- optimParallel vignette on arXiv.org
- optimParallel in the R community blog edited by RStudio
- optimParallel in the Data Analytics & R blog