You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
ducttape-0.3 defaults to depth-first traversal of the realization graph in order to try different kind of tasks quickly (and fail fast). But when the user elects to run multiple processes, this ordering for certain workflows will actually prevent parallelism. For example, if the realization graph is like
a.1 --> b.1 -+-> c --> d
a.2 --> b.2 /
(tasks a and b having two realizations each, both of which are prerequisites to c), then the depth-first strategy will yield a.1 b.1 a.2 b.2 c d, where the only opportunity for parallelism is b.1 and a.2. Because c depends on two branches, this would actually fail more slowly than the breadth-first strategy of a.1 a.2 b.1 b.2 c d, because the a.1/a.2 and b.1/b.2 might both run in parallel given -j2.
I suspect a smarter strategy might be a hybrid depth/breadth traversal, where the breadth is constrained by the number of things allowed to execute in parallel. Thus -j2 would be equivalent to breadth-first for the structure above, while -j3 would cause the traversal to start with up to 3 ready tasks, and so forth.
Or, can the traversal order be computed on the fly? Then after starting a.1 with -j2 it would realize that b.1 is not ready and therefore start a.2 in parallel.
The text was updated successfully, but these errors were encountered:
A related problem is that if a realization of a task fails, it is easy to miss it when other things are running in parallel. Maybe there could be an option to have ducttape immediately suspend other running or scheduled realizations of a task if one of them fails?
I'm skeptical about immediately stopping. More often than not, I don't want to kill a tuning run that's been going for 12 hours on 32 cores just because I made a simple typo in a results formatting script for some barely-related realization.
So far I haven't seen enough painful situations come out of this. I generally just open up the number of tasks to execute in parallel to something huge and let a job scheduler figure it out.
ducttape-0.3 defaults to depth-first traversal of the realization graph in order to try different kind of tasks quickly (and fail fast). But when the user elects to run multiple processes, this ordering for certain workflows will actually prevent parallelism. For example, if the realization graph is like
(tasks
a
andb
having two realizations each, both of which are prerequisites toc
), then the depth-first strategy will yielda.1 b.1 a.2 b.2 c d
, where the only opportunity for parallelism isb.1
anda.2
. Becausec
depends on two branches, this would actually fail more slowly than the breadth-first strategy ofa.1 a.2 b.1 b.2 c d
, because thea.1/a.2
andb.1/b.2
might both run in parallel given-j2
.I suspect a smarter strategy might be a hybrid depth/breadth traversal, where the breadth is constrained by the number of things allowed to execute in parallel. Thus
-j2
would be equivalent to breadth-first for the structure above, while-j3
would cause the traversal to start with up to 3 ready tasks, and so forth.Or, can the traversal order be computed on the fly? Then after starting
a.1
with-j2
it would realize thatb.1
is not ready and therefore starta.2
in parallel.The text was updated successfully, but these errors were encountered: