Test examples in parallel #417
Tests take ~3min longer, seems like a win
This is great, thanks! It raises the question of which of the other unit tests or integration tests should be in parallel. The nice thing about the examples is that they cover most of the capabilities so this should give us reasonable coverage.
Just one small suggestion...
4882454 to 5ad6f86
I agree, it's probably worth discussing at some point which of the unit/integration tests should be run in parallel. Using the [...] A short-term suggestion would be: if you find any parallel bugs, add a parallel test or update an existing test to run in parallel.
29987b9 to 6d477e7
39206c6 to 30b8f5d
30b8f5d to 4004100
Twice in a row! Nice! I will pull out my debugging code and then this is ready for review.
I'd really like us to start parallel testing by default, so I thought it would be helpful to update the branch. But as we saw a few months ago, we can still get failures. On this latest run it was the [...] On the first call to the mixed solver, one rank seems to fall behind the others and not begin the [...]
Output from one log file:
In the failing example that I posted above, the hang comes between the [...]
I wonder if we're caching something we shouldn't on a mesh hierarchy. We recently fixed failing tests in Firedrakeland by ensuring a fresh mesh hierarchy each time. That approach wouldn't work here but might be a good place to start. It would be good to whittle this down to a minimal failing example to work on.
I don't think we actually use a mesh hierarchy in any tests -- we only use algebraic multigrid and not geometric, although this is definitely something that we need to do to improve performance. I agree about the minimal failing example; the hard thing is that sometimes it will pass!!
Some recent changes in Firedrake may have fixed some parallel bugs. I'm running CI again and crossing my fingers 🤞. If this still doesn't work we should try tackling this at the hackathon.
This has been discussed extensively offline. Thanks Jack, I approve!
Given that most users will probably try to run an example from the examples directory in parallel, enable parallel testing for all examples. I do not anticipate this adding significant time to CI runs, as the parallel tests will benefit from having a hot cache.
Any test in unit-tests or integration-tests can be made to run in parallel by using the Python decorator:
@pytest.mark.parallel(nprocs=4)
The above example would run the test with 4 MPI ranks.
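As a rough illustration of how the decorator is applied, here is a minimal sketch. It assumes the project's pytest configuration provides the parallel marker through an MPI-aware plugin. Without such a plugin, pytest treats the marker as unknown and runs the test serially. The helper global_sum_of_squares and the test name are hypothetical and not taken from this repository.

```python
import pytest


def global_sum_of_squares(n):
    """Hypothetical stand-in for the work a real test would exercise."""
    return sum(i * i for i in range(n))


# Request 4 MPI ranks, as in the PR description. This only has an effect
# when a plugin that understands the "parallel" marker is installed
# (assumed here, not verified against this repository's setup).
@pytest.mark.parallel(nprocs=4)
def test_sum_of_squares():
    # Every rank executes this body, so the expected value must not
    # depend on which rank is running it.
    assert global_sum_of_squares(10) == 285
```

Because the marker is attached to an ordinary function, the same test can still be collected and run in serial, which is convenient when debugging a failure locally before reproducing it across ranks.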
Fixes #377.
Requires #416 to be addressed.