On my local machine, tests that do not specify MAX_NUM_PROC are reported as passed although they were never executed. This happens, for example, with make -j8 check_python ARGS='-R experimental_decorator -V', which results in:
...
Start 32: experimental_decorator
32: Test command: /usr/bin/mpiexec "-n" "13" "/home/areinauer/Documents/hiwi/espresso/build/pypresso" "/home/areinauer/Documents/hiwi/espresso/build/testsuite/python/experimental_decorator.py"
32: Test timeout computed to be: 300
32: --------------------------------------------------------------------------
32: A request was made to bind to that would result in binding more
32: processes than cpus on a resource:
32:
32: Bind to: NUMA
32: Node: ibex
32: #processes: 6
32: #cpus: 6
32:
32: You can override this protection by adding the "overload-allowed"
32: option to your binding directive.
32: --------------------------------------------------------------------------
1/1 Test #32: experimental_decorator ........... Passed 0.11 sec
The following tests passed:
experimental_decorator
100% tests passed, 0 tests failed out of 1
the CPU is hyperthreaded to expose 24 logical cores, mpiexec -n 12 doesn't produce the warning
with mpiexec -n 12 an espresso runtime error is raised
with mpiexec -n 4 a different espresso runtime error is raised
with mpiexec -n 1 and without mpi the test runs fine
uses mpiexec (OpenRTE) 2.1.1
When the binding warning is printed, mpiexec probably exits with error code 0 or produces a signal that isn't caught by ctest, which means those tests fail silently. This can be resolved by simply capping $NP to 4.
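The silent failure described above (exit status 0 although the test body never ran) could in principle be caught by checking the launcher's output as well as its return code. The following is only a minimal sketch: it assumes the Open MPI message always contains the phrase seen in the log, and `ran_successfully` and `cap_num_procs` are hypothetical helper names, not part of espresso or ctest.

```python
import re

# Phrase copied from the Open MPI message in the log above. Matching on it is
# an assumption and may not hold for other Open MPI versions.
BINDING_REFUSAL = re.compile(r"binding more\s+processes than cpus on a resource")


def ran_successfully(returncode: int, output: str) -> bool:
    """Count a run as passed only if it exited with 0 and did not print
    the binding refusal message."""
    # ctest -V prefixes every output line with "<test number>: "; strip it so
    # the message can be matched across line breaks.
    cleaned = re.sub(r"^\d+: ?", "", output, flags=re.MULTILINE)
    return returncode == 0 and BINDING_REFUSAL.search(cleaned) is None


def cap_num_procs(requested: int, limit: int = 4) -> int:
    """Cap the number of MPI processes (the role $NP plays above) at `limit`."""
    return max(1, min(requested, limit))
```

With the log shown in the issue, `ran_successfully(0, log)` would report a failure even though mpiexec returned 0, and `cap_num_procs(13)` would reduce the 13 requested processes to 4.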
By default, OpenMPI should allow spawning more threads than cores. I cannot reproduce the warning on my machine. We need to find out which mpirun flag can be used to detect whether this binding safety is enabled, so that we can introduce a guard to prevent ctest from running if there are not enough cores for the value of $NP. @mkuron any idea? I've looked for the documentation of --bind-to core:overload-allowed but couldn't find anything helpful.
Please don't change --bind-to. Just use -oversubscribe. That will run with as many processes as you specify, even if that puts multiple processes on the same core. Then you don't need any guards etc.
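As a rough illustration of this suggestion, the test command from the log could be assembled with the oversubscribe flag placed before the process count. This is a sketch under assumptions: `mpiexec_command` is a hypothetical helper, the flag spelling is copied from the comment above, and the paths in the usage note are placeholders rather than the real build paths.

```python
from typing import List


def mpiexec_command(num_procs: int, program: List[str],
                    mpiexec: str = "mpiexec") -> List[str]:
    """Build an mpiexec argument list that allows more processes than cores.

    The "-oversubscribe" spelling is taken from the comment above; whether the
    installed launcher accepts it is an assumption to verify locally.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return [mpiexec, "-oversubscribe", "-n", str(num_procs), *program]
```

For example, `mpiexec_command(13, ["build/pypresso", "experimental_decorator.py"])` yields a list that can be passed to `subprocess.run`, so no guard on the number of available cores is needed.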
jngrad changed the title from "local false pass in python testsuite" to "some python tests fail silently on some architectures with OpenMPI 2.X" on Nov 21, 2019