When tuning is active during multi-GPU runs, each GPU independently tunes each kernel. As a result, different GPUs can end up using different launch configurations for the final kernel launch, which ultimately makes binary reproducibility impossible. This was first discovered in #182.
While a simple global reduction over the elapsed tuning times would help in synchronous runs, it causes hangs with asynchronous algorithms such as DD, where each GPU works on a local problem and may never even enter the tuning path for a given kernel; see the sketch below.
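A minimal sketch of the problem, assuming an MPI backend and hypothetical types (`LaunchConfig`, `tuneLocally` are illustrative, not QUDA's actual tuning code): the reduction over tuning timings is a blocking collective, so it only completes if every rank reaches it.

```cpp
#include <mpi.h>

// Hypothetical stand-in for a tuned launch configuration (not QUDA's TuneParam).
struct LaunchConfig { int block; int grid; float time; };

// Each rank benchmarks its candidate configurations locally (details elided)
// and returns the locally fastest one.
LaunchConfig tuneLocally() { return {256, 1024, 0.12f}; }

LaunchConfig tuneCollectively(MPI_Comm comm)
{
  LaunchConfig best = tuneLocally();

  // Agree on the globally fastest time with a blocking collective:
  // EVERY rank in comm must reach this call for it to complete.
  float localTime = best.time, globalTime = 0.0f;
  MPI_Allreduce(&localTime, &globalTime, 1, MPI_FLOAT, MPI_MIN, comm);

  // With an asynchronous algorithm (e.g. DD), a rank that never reaches the
  // tuning path for this kernel never enters the collective, and the ranks
  // that did call MPI_Allreduce block here forever.
  // (Picking which rank's {block, grid} wins is a second, also collective, step.)
  return best;
}
```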
This also relates to the issue flagged in tune.cpp:
//FIXME: We should really check to see if any nodes have tuned a kernel that was not also tuned on node 0, since as things
// stand, the corresponding launch parameters would never get cached to disk in this situation. This will come up if we
// ever support different sub volumes per GPU (as might be convenient for lattice volumes that don't divide evenly).
We need a non-blocking solution for this; one possible direction is sketched below.
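The sketch below assumes an MPI backend; `TuneRecord`, `localCache`, and `mergeCachesOnRankZero` are hypothetical names, not QUDA API. The idea is to do no synchronization inside the tuning path at all and instead reconcile the per-rank results at a point every rank is guaranteed to reach, such as when the tune cache is saved to disk. That would also cover the tune.cpp FIXME above, since entries tuned only on non-zero ranks would no longer be lost; keeping launch configurations consistent within the first tuning run itself would still need separate handling.

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical fixed-size record for one tuned kernel (not QUDA's cache format).
struct TuneRecord {
  char  key[64];   // kernel/problem identifier
  int   block, grid;
  float time;
};

// Per-rank results, filled lazily as kernels get tuned during the run.
std::vector<TuneRecord> localCache;

// Called at a point every rank is guaranteed to reach (e.g. end of run, just
// before the tune cache is written to disk), so blocking collectives are safe
// here even though tuning itself was asynchronous.
void mergeCachesOnRankZero(MPI_Comm comm)
{
  int nRanks = 0, myRank = 0;
  MPI_Comm_size(comm, &nRanks);
  MPI_Comm_rank(comm, &myRank);

  // Gather how many records each rank holds.
  int myCount = static_cast<int>(localCache.size());
  std::vector<int> counts(nRanks);
  MPI_Gather(&myCount, 1, MPI_INT, counts.data(), 1, MPI_INT, 0, comm);

  // Gather the records themselves as raw bytes (TuneRecord is POD here).
  std::vector<int> byteCounts(nRanks), byteDispls(nRanks);
  int totalBytes = 0;
  if (myRank == 0) {
    for (int r = 0; r < nRanks; r++) {
      byteCounts[r] = counts[r] * static_cast<int>(sizeof(TuneRecord));
      byteDispls[r] = totalBytes;
      totalBytes += byteCounts[r];
    }
  }
  std::vector<char> all(totalBytes);
  MPI_Gatherv(localCache.data(), myCount * static_cast<int>(sizeof(TuneRecord)), MPI_BYTE,
              all.data(), byteCounts.data(), byteDispls.data(), MPI_BYTE, 0, comm);

  if (myRank == 0) {
    // Merge: keep one entry per key (e.g. the fastest), write the merged cache
    // to disk, and optionally broadcast it so the next run is reproducible.
    // (Merging and I/O elided.)
  }
}
```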