-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce the number of procs of MPI test for robust CI #136
Conversation
.travis.yml
Outdated
- (for NP in 1 2 3; do PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' mpiexec -n ${NP} nosetests -v -a '!nccl,!gpu,!slow' || exit $?; done) | ||
# - cd tests | ||
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests | ||
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is "--bind-to none" used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it for future safety to prevent potential errors, but it works without it now. I guess we can remove it as far as it works. Removed it.
.travis.yml
Outdated
# - cd tests | ||
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests | ||
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi | ||
- (for NP in 1 2; do for T in $(find tests -name "test_*.py") ; do mpiexec -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the tests run separately?
I think it is not needed to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was also "safety guard" for future potential errors... but I removed it for the same reason.
.travis.yml
Outdated
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests | ||
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi | ||
- (for NP in 1 2; do for T in $(find tests -name "test_*.py") ; do mpiexec -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done) | ||
# - (for NP in 1 2 4; do for T in $(find tests -name "test_*.py") ; do PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' mpiexec ${NOBIND} -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove this comments because I do not think that it is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Currently, CI on Travis CI often fails because of deadlocks in 3-process test_mnist.
Although I don't find the exact reason of the deadlock, tests with up to 2 processes
does not suffer from the deadlock issue.
This PR changes the number of processes in the MPI tests (1,2,3) to two processes in
.travis.yml
.Also, this PR changes to invoke
nosetests
command for eachtest_*.py
files so it can avoid deadlocks.