Restart reproducibility tests are failing #11

Closed
aidanheerdegen opened this issue Feb 22, 2024 · 2 comments

@aidanheerdegen
Member

The restart reproducibility tests are currently not run in CI because they fail.

https://github.com/ACCESS-NRI/access-om2-configs/blob/main/test/test_bit_reproducibility.py#L78
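For readers unfamiliar with the idea, here is a minimal sketch of what a restart reproducibility check compares, assuming the model output can be reduced to a per-field checksum. One continuous run is compared with the same period run as two shorter legs joined by a restart. The function names, field names and checksum values below are hypothetical and are not taken from `test_bit_reproducibility.py`.

```python
def diff_checksums(continuous, restarted):
    """Return {field: (continuous_value, restarted_value)} for every field
    whose checksum differs or is missing from one of the two runs."""
    mismatches = {}
    for field in sorted(set(continuous) | set(restarted)):
        a = continuous.get(field)
        b = restarted.get(field)
        if a != b:
            mismatches[field] = (a, b)
    return mismatches


def restart_is_reproducible(continuous, restarted):
    """True only when every field checksum matches exactly."""
    return not diff_checksums(continuous, restarted)


# Hypothetical checksums, for illustration only.
continuous = {"temp": "A1B2C3", "salt": "0F0F0F"}
restarted_ok = dict(continuous)
restarted_bad = {"temp": "A1B2C3", "salt": "0F0F10"}

print(restart_is_reproducible(continuous, restarted_ok))
print(diff_checksums(continuous, restarted_bad))
```

Returning the mismatching fields rather than a bare boolean makes a failing CI run easier to diagnose.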

The original COSIMA ACCESS-OM2 tests, from which these were adapted, ran against builds with reproducibility flags turned on

https://github.com/COSIMA/access-om2/blob/master/test/exp_test_helper.py#L217-L218

though these flags were not used in production builds

https://github.com/COSIMA/access-om2/blob/master/install.sh#L49

We need a repro variant of the spack package to test whether we can replicate the COSIMA reproducibility results, and to see what effect that option has on performance.

@access-hive-bot

This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/modifying-access-om2-spack-package/1847/6

@aidanheerdegen
Member Author

We have a fix for the restart repro.

  1. The first step is to add the restart repro checks to CI so they will run for the next step.

  2. Once there is a PR to add the fix to ACCESS-OM2, the prerelease deployment can be used in a PR to a config, say release-1deg_jra55_ryf.

  3. Other performance tests will also be done with the prerelease PR.

  4. If the CI tests pass and the performance tests are acceptable, the ACCESS-OM2 PR should be merged. The release-1deg_jra55_ryf PR should then be updated with the paths to the properly deployed ACCESS-OM2 and merged, assuming all tests still pass.
