
clustermq falsely claims to submit jobs when running in local mode #196

Closed
dapritchard opened this issue Jun 4, 2020 · 3 comments
Comments

@dapritchard

Thank you so much for this package. I've started using it as a parallel backend for a drake pipeline and have been very impressed by the performance improvements I've observed. I'm interested in evaluating clustermq / rzmq in settings outside of drake, but I can't seem to get the foreach example listed in the User Guide (in the subsection titled "As parallel foreach backend") to work. What am I missing here?

In the example below, on my 4-core machine, I would expect the following code to run in close to 5 seconds, yet it takes close to 20 seconds. When I use similar code to run some heavy processing, only one core does significant work.

library(foreach)
(n_cores <- parallel::detectCores())
#> [1] 4
clustermq::register_dopar_cmq(n_jobs = n_cores)
system.time(foreach(i = seq_len(n_cores)) %dopar% Sys.sleep(5))
#> Submitting 4 worker jobs (ID: 6856) ...
#>    user  system elapsed 
#>   0.118   0.022  20.187

(Note that the text of this issue is copied from a Stack Overflow question: https://stackoverflow.com/questions/62134030/using-clustermq-r-package-as-a-parallel-backend-for-foreach).


mschubert commented Jun 4, 2020

Are you using the local instead of the multicore backend?

If so, the following should fix it:

getOption("clustermq.scheduler") # check which scheduler is set
options(clustermq.scheduler = "multicore") # select multicore
# foreach example..

The reason for this is that we never select multicore automatically: it's bad locally because it may duplicate memory (and crash your computer), and it's bad on a computing cluster because it may exceed the number of cores you reserved.
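Putting the two pieces together, a minimal sketch of what the corrected session might look like (assuming a 4-core machine, as in the original example):

library(foreach)
options(clustermq.scheduler = "multicore")  # must be set before workers are started
n_cores <- parallel::detectCores()
clustermq::register_dopar_cmq(n_jobs = n_cores)
# with multicore workers the four 5-second sleeps overlap,
# so elapsed time should now be close to 5 seconds rather than 20
system.time(foreach(i = seq_len(n_cores)) %dopar% Sys.sleep(5))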

When you load library(clustermq) and haven't set up your scheduler, it will tell you this:

Option 'clustermq.scheduler' not set, defaulting to ‘LOCAL’
--- see: https://mschubert.github.io/clustermq/articles/userguide.html#configuration

In your case it won't, because you only access the package namespace via clustermq::, and printing messages on namespace access alone is considered bad practice by CRAN.

But I can see how this is confusing. It should probably be more obvious than it is right now, or at least not claim to be submitting jobs when it isn't.


dapritchard commented Jun 4, 2020

Yes, that was exactly it! Thanks so much for your response. I have to admit that I had seen the Option 'clustermq.scheduler' not set, defaulting to 'LOCAL' message, but I actually thought that was the correct setting, although in retrospect I probably should have deduced otherwise.

A couple of things would have tipped me off immediately as to what the problem was:

  1. If the "As parallel foreach backend" section of the User Guide had the options(clustermq.scheduler = "multicore") command mentioned in it.
  2. If the defaulting to 'LOCAL' message had the word "sequential" in it somewhere (maybe something like defaulting to 'SEQUENTIAL' or defaulting to 'LOCAL' (sequential)).
  3. As you mentioned, if the Submitting 4 worker jobs didn't appear when using the LOCAL scheduler.

I'd be happy to try and tackle any of those tasks in a PR if you'd be interested (presumably the first 2 are trivial). Thanks again for this wonderful package!

@mschubert mschubert changed the title Using clustermq locally for parallel processing clustermq falsely claims to submit jobs when running in local mode Jun 6, 2020
@mschubert

I'd be happy to take a PR. I suggest 1 and 3:

  1. Add a comment line here: # set up the scheduler first, otherwise this will run sequentially
  2. Renaming the "LOCAL" scheduler is a no-go because it would break existing user setups. Adding a special case for the message would work
  3. Easiest would be to check qsys$id in workers.r, but it would probably be better to somehow handle this in the qsys's
