Monitoring progress of a foreach parallel job
by Andrie de Vries
R has strong support for parallel programming, both in base R and additional CRAN packages.
For example, we have previously written about foreach and parallel programming in the articles Tutorial: Parallel programming with foreach and Intro to Parallel Random Number Generation with RevoScaleR.
The foreach package provides simple looping constructs in R, similar to lapply() and friends, and makes it easy to execute each iteration of the loop in parallel. You can find the packages on CRAN: foreach: Foreach looping construct for R and doParallel.
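A minimal example of the basic pattern (the worker count and loop body are illustrative):

library(foreach)
library(doParallel)

# sequential execution, much like lapply(): returns a list of results
foreach(i = 1:3) %do% sqrt(i)

# the same loop in parallel, after registering a doParallel backend
cl <- makeCluster(2)
registerDoParallel(cl)
foreach(i = 1:3) %dopar% sqrt(i)
stopCluster(cl)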
Tracking progress of parallel computing tasks
Parallel programming can reduce the total completion time of your project. However, for tasks that take a long time to run, you may wish to track the progress of the task while it is running.
This seems like a simple request, but it turns out to be remarkably hard to achieve. The reason boils down to this:
- Each parallel worker runs in a different R session (see the sketch after this list).
- In some parallel computing setups, the workers don't communicate with the initiating process until the final combining step.
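You can see the first point for yourself by asking each worker for its process ID. A small sketch using doParallel (the worker count is arbitrary):

library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

# each iteration returns the PID of the R session that executed it;
# none of them match the master's own Sys.getpid()
foreach(i = 1:4, .combine = c) %dopar% Sys.getpid()

stopCluster(cl)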
So, if it is difficult to track progress directly, what can be done?
It seems to me the typical answers to this question fall into three classes:
- Use operating system monitoring tools, i.e. tools external to R.
- Print messages to a file (or connection) in each worker, then read from this file, again outside of R.
- Use specialist back-ends that support this capability, e.g. the Redis database and the doRedis package.
This is an area with many avenues of exploration, so I plan to briefly summarize each method and point to at least one question on StackOverflow that may help.
Method 1: Use external monitoring tools.
The question Monitoring Progress/Debugging Parallel R Scripts asks if it is possible to monitor a parallel job.
In his answer to this question, Dirk Eddelbuettel mentions that parallel back ends like MPI and PVM have job monitors, such as slurm and TORQUE. However, simpler tools like snow do not come with monitoring facilities, and in that case you may be forced to fall back on methods like printing diagnostic messages to a file.
For parallel jobs using the doParallel backend, you can use standard operating system monitoring tools to see whether the job is running on multiple cores. On Windows, for example, the Task Manager shows this clearly: once the script starts, the CPU utilization of each core goes to maximum.
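For example, here is a deliberately CPU-bound job you can watch from the Task Manager (or top/htop on Linux and Mac); the worker count and workload are illustrative:

library(doParallel)
cl <- makeCluster(4)  # adjust to the number of cores on your machine
registerDoParallel(cl)

# while this runs, per-core CPU utilization should go to maximum
res <- foreach(i = 1:8) %dopar% {
  sum(sort(runif(5e6)))
}

stopCluster(cl)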
Method 2: Print messages to a file (or connection) in each worker, then read from this file, again outside of R
Sometimes it may be sufficient, or desirable, to print status messages from each of the workers. Simply adding a print() statement will not work, since the parallel workers do not share the standard output of the master job.
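One direct workaround is to have each worker append status lines to a known log file, and then watch that file from outside R, e.g. with tail -f. A minimal sketch using doParallel; the file name is illustrative (an absolute path is safer, since workers may not share your working directory), and concurrent appends from several workers can interleave:

library(doParallel)
cl <- makeCluster(4)
registerDoParallel(cl)

res <- foreach(i = 1:8) %dopar% {
  # each worker appends one line per task; monitor with `tail -f progress.log`
  cat(sprintf("task %d started at %s\n", i, Sys.time()),
      file = "progress.log", append = TRUE)
  sqrt(i)
}

stopCluster(cl)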
The question How can I print when using %dopar% asks how to do this using a snow parallel backend.
Steve Weston, the author of foreach (and one of the original founders of Revolution Analytics) wrote an excellent answer to this question.
Steve explains that output produced by the snow workers is thrown away by default, but that you can change this with the outfile argument of makeCluster(). Setting outfile to the empty string ("") prevents snow from redirecting the output, and the output from your print messages will often show up on the terminal of the master process.
He suggests creating and registering your cluster with something like this:
library(doSNOW)
cl <- makeCluster(4, outfile = "")  # outfile = "" stops snow from discarding worker output
registerDoSNOW(cl)
He continues: your foreach loop doesn't need to change at all. This works with both SOCK clusters and MPI clusters using Rmpi built with Open MPI. On Windows you won't see any output if you're using Rgui; if you use Rterm.exe instead, you will. In addition to your own output, you'll see messages produced by snow itself, which can also be useful.
Also note that this solution seems to work with doSNOW, but is not supported by the doParallel backend.
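Continuing from the doSNOW cluster created above, a sketch of a loop whose worker messages appear on the master's terminal (the messages and the Sys.sleep() stand-in for real work are illustrative):

result <- foreach(i = 1:8) %dopar% {
  cat(sprintf("starting task %d\n", i))  # surfaces on the master's terminal via outfile = ""
  Sys.sleep(1)                           # stand-in for real work
  i^2
}
stopCluster(cl)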
Method 3: Use specialist back-ends that support this capability, e.g. the Redis database and the doRedis package
The final approach is a novel idea by Bryan Lewis, and uses the Redis database as a parallel back end.
Specifically, the R package rredis enables message passing between R and Redis, and the doRedis package lets you use foreach with Redis as the parallel backend. What's interesting about Redis is that it allows the user to create queues, and each parallel worker fetches jobs from such a queue. This allows for a dynamic network of workers, even across different machines.
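A minimal sketch of this pattern, assuming a Redis server is already running on localhost at the default port (the queue name and worker count are arbitrary):

library(doRedis)

registerDoRedis("jobs")                   # register a work queue named "jobs"
startLocalWorkers(n = 2, queue = "jobs")  # spawn two local workers that poll the queue

foreach(i = 1:10, .combine = c) %dopar% sqrt(i)

removeQueue("jobs")                       # clean up the queue when done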
The package has a wonderful vignette. Also take a look at the video demo at http://bigcomputing.com/doredis.html.
But what about actual progress bars?
While researching this topic, I could not find a published, reliable way of creating progress bars with foreach.
I came across some tantalising hints, e.g. at How do you create a progress bar when using the “foreach()” function in R?
Sadly, the proposed mechanism didn’t actually work.
What next?
I think there might be a way of getting progress bars with foreach and the doParallel package, at least in some circumstances.
I plan to pen my ideas in a follow-up blog post.
Meanwhile, can you do better? Is there a way of creating progress bars with foreach in a parallel job?