Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add -ffpe-summary=none to stand-alone MPAS Makefile for gnu #4643

Merged
merged 4 commits into from
Nov 23, 2021

Conversation

mark-petersen
Copy link
Contributor

@mark-petersen mark-petersen commented Nov 4, 2021

Currently, gnu-compiled executables produce a summary table of exceptions at the end, which is some version of:

Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_DENORMAL

and prints one line per processor on finalize for both debug and optimized runs. When running on hundreds of cores, this is a headache.

Looking at on-line material 1 2 3 other modeling groups see this with gnu 5+ as well, and say to ignore it or add -ffpe-summary=none. See gnu compile flags.

I have had numerous students and post-docs tell me they had errors when they see this. The error messages are confusing and not useful, because there is no back-trace. Note that actual IEEE errors are caught during debug with -ffpe-trap=invalid,zero,overflow and include a back-trace. The -ffpe-trap is the normal way we catch the errors.

This affects MPAS-seaice and MALI as well. This Makefile is not used for E3SM compiling.

[BFB]

@mark-petersen
Copy link
Contributor Author

mark-petersen commented Nov 4, 2021

Tested pr regression suite on grizzly with make gfortran and cori with make gnu-nersc. Passes pr suite for both debug and optimized on both. Ran some QU240 and baroclinic channel forward runs with before/after executables, and could see that summary table of IEEE exceptions summary was removed.

@mark-petersen mark-petersen requested a review from xylar November 4, 2021 17:51
@ndkeen
Copy link
Contributor

ndkeen commented Nov 4, 2021

Ah. We might want to use this flag in other places as well. Thanks Mark.

@mark-petersen
Copy link
Contributor Author

In this process, I found that gnu-nersc is not using -ffpe-trap=invalid,zero,overflow like the other gnu compilers in the Makefile. I have to think that was an oversight. I added an intentional divide by zero, and indeed the current settings don't produce an error, but with this last commit, we get a floating-point exception and trace-back, as expected.

Copy link
Contributor

@philipwjones philipwjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also please add the flag to the gfortran-clang target?

I built/ran a version and confirm that it ran successfully and did not generate the usual underflow messages. They were kinda annoying so thanks @mark-petersen

@mark-petersen
Copy link
Contributor Author

Added -fbounds-check -fbacktrace to gnu-nersc debug flags, so it matches gfortran. I tested on cori, including with and without an intentional divide by zero, and it all works as expected. Also added -ffpe-summary=none to gfortran-clang per @philipwjones's request.

@matthewhoffman and @akturner if you could run a standard test on MALI and MPAS-seaice with this Makefile, I think that is enough to test this PR.

@mark-petersen
Copy link
Contributor Author

@xylar I think additional testing from you is not required. You can read my testing above. You've been involved with Makefile changes in the past, so wanted you to see this PR.

Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and passed the nightly regression suite on my laptop with gnu compilers. I see:

Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL

without these changes but not with them. Thanks for this work!

Copy link
Contributor

@matthewhoffman matthewhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested on Cori with gnu using MALI, and ran into no problems. It will be nice to eliminate those confusing output messages!

Copy link
Contributor

@akturner akturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passed MPAS-Seaice testing

jonbob added a commit that referenced this pull request Nov 19, 2021
…4643)

Add -ffpe-summary=none to stand-alone MPAS Makefile for gnu

Currently, gnu-compiled executables produce a summary table of
exceptions at the end and prints one line per processor on finalize for
both debug and optimized runs. When running on hundreds of cores, this is
a headache.  Other modeling groups see this with gnu 5+ as well, and say
to ignore it or add -ffpe-summary=none. See gnu compile flags.

This affects MPAS-seaice and MALI as well. This Makefile is not used for
E3SM compiling.

[BFB]
@jonbob
Copy link
Contributor

jonbob commented Nov 19, 2021

passes sanity testing, merged to next

@jonbob jonbob merged commit 488ac4c into master Nov 23, 2021
@jonbob
Copy link
Contributor

jonbob commented Nov 23, 2021

merged to master

@jonbob jonbob deleted the mark-petersen/mpas/update-Makefile-gnu branch November 23, 2021 17:44
hollyhan added a commit to MALI-Dev/E3SM that referenced this pull request Dec 2, 2021
xylar added a commit to xylar/compass that referenced this pull request Feb 4, 2022
commit hash: 44814ae3ce06ec1d175fc03a6c4471f9d3fad274

This update includes the following MPAS-Ocean PRs:
* Corrected coupling of ice biogeochemistry for MARBL
  E3SM-Project/E3SM#4641
* GPU port of ocean vmix routine (non-bit-for-bit in compass)
  E3SM-Project/E3SM#4680
* Fix filename interval for 2 analysis member restarts
  E3SM-Project/E3SM#4687
* fix calculation of the ML-averaged Brunt-Vaisala frequency
  E3SM-Project/E3SM#4715
* Fix interface locations for 60-layer PHC grid
  E3SM-Project/E3SM#4729
* Update PGI compiler runs of E3SM on GPUs
  E3SM-Project/E3SM#4748
* Initialize Coriolis param in its own subroutine
  E3SM-Project/E3SM#4750
* Add mode specification to conservation check streams
  E3SM-Project/E3SM#4769

And the following MPAS framework PRS:
* add -ffpe-summary=none to stand-alone MPAS Makefile for gnu
  E3SM-Project/E3SM#4643
* Fix missing dependency in one case of gen_f90_targets
  E3SM-Project/E3SM#4736
* new reproducible global sum module for MPAS components
  E3SM-Project/E3SM#4700
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BFB PR leaves answers BFB mpas-framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants