bit-for-bit diagnostics #204

Closed
apcraig opened this issue Oct 10, 2018 · 3 comments

Comments

@apcraig
Contributor

apcraig commented Oct 10, 2018

CICE currently has three different flags that affect bit-for-bit diagnostics. The first, DITTO, is an environment variable in cice.settings. It triggers the second, the cpp flag -DREPRODUCIBLE, but only for a few machine/compiler implementations; if it is specified for any, it should probably be specified for all. Finally, there is bfbflag in the namelist, which turns on yet another calculation for the global diagnostics, but only on the MPI side.

I see several issues:

  • multiple independent approaches to address the same issue
  • inconsistent application in the scripts across machines and compilers
  • inconsistency in implementation between mpi and serial versions (serial does not have a bfbflag implementation)
  • lack of reuse in implementation of sums, mpi vs serial, int/float/double, prod sums, etc.
  • no recent testing as far as I know
  • really does need to be cleaned up

I would like to refactor this with the following requirements:

  • run time flag only
  • implementation in both mpi and serial, recognizing this is not only a pe count issue but also a decomp/block issue.
  • code reuse where possible
  • leverage new techniques such as the Worley ddpdd approach to reduce cost
  • potentially support multiple approaches via a consistent, well documented namelist input
  • test and validate
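To illustrate why the decomposition matters and not just the pe count, here is a minimal Python sketch (not CICE code). It sums the same values with different block sizes, first with plain floating-point partial sums and then with a double-double accumulator in the spirit of the ddpdd approach mentioned above. The function names `two_sum`, `dd_add`, `naive_block_sum`, and `dd_block_sum` are hypothetical and exist only for this example, and the input values are chosen to make the rounding effect visible.

```python
def two_sum(a, b):
    """Error-free transformation: s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double values, each stored as a (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    hi = s + e
    lo = e - (hi - s)  # renormalize so that hi carries the leading bits
    return hi, lo


def naive_block_sum(values, block_size):
    """Plain float sum per block, then a plain float sum of the block partials."""
    total = 0.0
    for start in range(0, len(values), block_size):
        partial = 0.0
        for v in values[start:start + block_size]:
            partial += v
        total += partial
    return total


def dd_block_sum(values, block_size):
    """Same blocking, but partials and the final reduction use double-double."""
    total = (0.0, 0.0)
    for start in range(0, len(values), block_size):
        partial = (0.0, 0.0)
        for v in values[start:start + block_size]:
            partial = dd_add(partial, (v, 0.0))
        total = dd_add(total, partial)
    return total[0]


values = [1e16, 1.0, -1e16, 1.0]  # exact sum is 2.0

print(naive_block_sum(values, 4))  # 1.0
print(naive_block_sum(values, 2))  # 0.0
print([dd_block_sum(values, bs) for bs in (1, 2, 4)])  # [2.0, 2.0, 2.0]
```

The plain sum changes with the block size because floating-point addition is not associative. The double-double accumulator carries the rounding error along with the sum, so the result is far less sensitive to how the data is split. This reduces order dependence but is not a proof of reproducibility for arbitrary input, which is one reason the "test and validate" requirement matters.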

This feature is useful, especially for quick bit-for-bit comparison of log files during testing. If the cost is low enough, it may even be a viable default setting.

It is probably not a lot of work: I have implemented the same thing in other projects and would leverage some of that work. I would like to propose that it be added to the CICE6 release project.

Pinging @eclare108213, @mattdturner, @dabail10 for any comments or feedback.

@eclare108213
Contributor

eclare108213 commented Oct 10, 2018

This makes sense to me. DITTO and REPRODUCIBLE came from UKMO a long time ago, specifically because they wanted to be able to compare diagnostic output across configurations (these only affect the diagnostic output). bfbflag came from CESM 4 years ago (the "blame" functionality in github is very handy). @apcraig, yes, please clean this up.

@duvivier
Contributor

Also get rid of the following in cice.settings:
setenv CAM_ICE no # set to yes for CAM runs (single column)
setenv BARRIERS no # prevent MPI buffer overflow during gather/scatter

@apcraig
Contributor Author

apcraig commented Mar 27, 2019

PR #300 complete

apcraig closed this as completed Mar 27, 2019