Reduce the number of procs of MPI test for robust CI #136

Merged
merged 11 commits into from
Nov 21, 2017

keisukefukuda
Member

@keisukefukuda keisukefukuda commented Nov 16, 2017

Currently, CI on Travis CI often fails because of deadlocks in the 3-process run of test_mnist.

Although I have not found the exact cause of the deadlock, tests with up to 2 processes
do not suffer from it.

This PR changes the process counts for the MPI tests in .travis.yml from (1, 2, 3) to at most two processes.

Also, this PR invokes the nosetests command separately for each test_*.py file to help avoid deadlocks.
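
For readers who want to try the per-file approach locally, here is a minimal sketch of the loop described above. It is not the exact `.travis.yml` content, and the per-file invocation was later dropped during review. The `RUNNER` variable and the `run_tests_per_file` function are hypothetical names introduced only for illustration. The `nosetests` options are the ones quoted in the diff below.

```shell
#!/bin/sh
# Sketch only: run every test_*.py file in its own launcher invocation,
# once per process count, and stop at the first failure.
# RUNNER is a hypothetical hook (not in the PR) so the loop can be
# exercised with a stub command on machines without MPI installed.
RUNNER="${RUNNER:-mpiexec}"

run_tests_per_file() {
    test_dir="$1"
    for np in 1 2; do
        # Assumes test file paths contain no whitespace.
        for t in $(find "$test_dir" -name 'test_*.py' | sort); do
            "$RUNNER" -n "$np" nosetests -s -v -a '!nccl,!gpu,!slow' "$t" || return $?
        done
    done
}
```

Calling `run_tests_per_file tests` would then launch one `mpiexec` run per file and per process count, so a hang is confined to a single file's run and is easier to attribute.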

@keisukefukuda keisukefukuda changed the title More robust Travis CI test [WIP] More robust Travis CI test Nov 17, 2017
@keisukefukuda keisukefukuda changed the title [WIP] More robust Travis CI test Reduce the number of procs of MPI test for robust CI Nov 20, 2017
@shu65 shu65 self-requested a review November 20, 2017 07:32
.travis.yml Outdated
- (for NP in 1 2 3; do PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' mpiexec -n ${NP} nosetests -v -a '!nccl,!gpu,!slow' || exit $?; done)
# - cd tests
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi
Member

Why is "--bind-to none" used?

Member Author

I added it as a safeguard against potential future errors, but the tests work without it for now. I guess we can leave it out as long as they keep working. Removed it.

.travis.yml Outdated
# - cd tests
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi
- (for NP in 1 2; do for T in $(find tests -name "test_*.py") ; do mpiexec -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done)
Member

Why are the tests run separately?
I don't think this change is needed.

Member Author

This was also a "safety guard" against potential future errors... but I removed it for the same reason.

.travis.yml Outdated
# - PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' nosetests -a '!gpu,!slow' --with-doctest chainer_tests
- if mpiexec --version | grep -q OpenRTE; then NOBIND="--bind-to none"; else NOBIND= ; fi
- (for NP in 1 2; do for T in $(find tests -name "test_*.py") ; do mpiexec -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done)
# - (for NP in 1 2 4; do for T in $(find tests -name "test_*.py") ; do PYTHONWARNINGS='ignore::FutureWarning,module::DeprecationWarning' mpiexec ${NOBIND} -n ${NP} nosetests -s -v -a '!nccl,!gpu,!slow' $T || exit $?; done; done)
Member

Could you remove this commented-out line? I do not think it is necessary.

Member Author

Done!

@shu65 shu65 merged commit 93dad5b into master Nov 21, 2017
@shu65 shu65 deleted the robust-travis-test branch November 21, 2017 00:54
@iwiwi iwiwi added this to the v1.1.0 milestone Dec 13, 2017
@kuenishi kuenishi added the test label Dec 19, 2017