RRFS debug & 2threads variants fixed plus many boundary condition bugs #1437

Merged

SamuelTrahanNOAA
Collaborator

@SamuelTrahanNOAA SamuelTrahanNOAA commented Sep 22, 2022

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.

  • Results for one or more of the regression tests change and the reasons for the changes are understood and explained below.

  • New or updated input data is required by this PR. If checked, please work with the code managers to update input data sets on all platforms.

Description

This PR fixes many bugs in fv_regional_bc.F90 and adds workarounds for two issues that could not be fixed. In addition, two variables in module_sf_ruclsm caused trouble in gfortran debug mode, because subnormal number truncation is disabled in that mode. These changes are expected to improve some of the boundary issues seen in RRFS runs, and they will be incorporated into RRFS parallels as soon as possible.
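For readers unfamiliar with subnormal numbers, here is a minimal Python sketch of what "subnormal number truncation" means. It is not the module_sf_ruclsm code: the helper names `is_subnormal` and `flush_to_zero` are hypothetical, and the sketch only assumes IEEE 754 double precision. A subnormal is a nonzero value smaller in magnitude than the smallest normal float. With truncation (flush-to-zero) enabled, such values silently become zero. With it disabled, they survive and can reach later arithmetic.

```python
import math
import sys

# Smallest positive *normal* double; anything nonzero below this is subnormal.
SMALLEST_NORMAL = sys.float_info.min


def is_subnormal(x: float) -> bool:
    """True if x is finite, nonzero, and smaller in magnitude than the smallest normal double."""
    return x != 0.0 and math.isfinite(x) and abs(x) < SMALLEST_NORMAL


def flush_to_zero(x: float) -> float:
    """Hypothetical helper: mimic subnormal truncation by replacing a subnormal with a signed zero."""
    return math.copysign(0.0, x) if is_subnormal(x) else x


# Halving the smallest normal value yields a subnormal (2**-1023), not zero.
tiny = SMALLEST_NORMAL / 2.0

print("tiny is subnormal:", is_subnormal(tiny))
print("tiny after flush-to-zero:", flush_to_zero(tiny))
```

Code that is only ever run with truncation enabled never sees values like `tiny`, which is one way a variable can behave differently once a debug build turns truncation off.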

As a result, RRFS can run in debug mode and the 2threads variant matches the control; the debug and 2threads variants are now enabled in the conf files. Unfortunately, the restart and decomp variants do not match the control; they are in the tests/tests directory but are commented out in the conf files. All rrfs conus13km tests now have a timestep of 120s instead of 60s, which should slightly offset the cost of having more of them.

RRFS-Smoke fails all variants, and crashes during restart, for known reasons. This is being fixed in the RRFS_dev branch (NOAA-GSL fork) which has a much newer and improved RRFS-Smoke. Fixing problems in the much older version in ufs-community would take more labor than is available, so those fixes will come in when the RRFS changes are merged to the authoritative repositories. Syncing the repositories is a high priority for RRFS developers.

All tests that have regional boundary conditions or use the RUC LSM may change results.

(Partially) Incorporated PRs

Pull request #1431, which corrected Rocoto support, was used during testing, so that is merged into this PR.

The GFDL_atmos_cubed_sphere changes for #1158 have already been merged to dev/emc, so they're in this PR's dependency at that level (NOAA-GFDL/GFDL_atmos_cubed_sphere#219). However, the rest of that PR has NOT been combined into this one. This doesn't seem to have had any impact on the regression tests. That is expected because #1158 did not change any actual code in GFDL_atmos_cubed_sphere. The meat of that PR is elsewhere.

The GFDL_atmos_cubed_sphere PR NOAA-GFDL/GFDL_atmos_cubed_sphere#220 from @kaiyuan-cheng is included, which fixes most of the issues mentioned here in a simpler manner than my own fixes.

Issue(s) addressed

Fixes NOAA-GFDL/GFDL_atmos_cubed_sphere#218
Fixes ufs-community/ccpp-physics#5
Fixes #1436

This solves half of the problem described in #1222. Do NOT close that issue. The other half, getting the decomp and restart variants to work, is not solved yet.

These are newly-discovered bugs, for which this PR adds workarounds. The bugs should be fixed, and the workarounds removed:

workaround for NOAA-EMC/fv3atm#586 (Do NOT close that issue.)

Testing

All regression tests in rt.conf and rt_gnu.conf pass on hera.gnu, jet.intel, and hera.intel.

Some test variants that fail are commented out, but I kept them in tests/tests for debugging purposes. Specifically, these are the decomp and restart variants. The conus13km tests are all warm starts from data assimilation system output, so their restart tests require an extra variable. All extra logic needed to run a restart is present in the disabled conus13km files.
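To see which variants are currently enabled and which are commented out, a conf file can be scanned for active and commented `RUN` lines. The following is a rough sketch, not part of this PR. It assumes pipe-delimited lines of the form `RUN | test_name | ...` with `#` marking a disabled line, and the test names in `SAMPLE` are made-up placeholders rather than real test names.

```python
from typing import List, Tuple


def split_tests(conf_text: str) -> Tuple[List[str], List[str]]:
    """Return (enabled, disabled) test names from pipe-delimited RUN lines.

    Assumed layout: 'RUN | test_name | ...'; a leading '#' disables the line.
    """
    enabled: List[str] = []
    disabled: List[str] = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        commented = line.startswith("#")
        # Drop any leading '#' characters, then split the remaining fields.
        fields = [f.strip() for f in line.lstrip("#").split("|")]
        if len(fields) < 2 or fields[0] != "RUN":
            continue  # skip COMPILE lines, blank lines, and plain comments
        (disabled if commented else enabled).append(fields[1])
    return enabled, disabled


# Hypothetical sample content; these names are placeholders only.
SAMPLE = """\
COMPILE | -DAPP=ATM | | fv3 |
RUN | example_control  | | fv3 |
RUN | example_2threads | | fv3 |
# RUN | example_decomp | | fv3 |
#RUN | example_restart | | fv3 |
# a plain comment line
"""

enabled, disabled = split_tests(SAMPLE)
print("enabled: ", enabled)
print("disabled:", disabled)
```

Pointing the same function at the contents of rt.conf or rt_gnu.conf would list the disabled decomp and restart variants, provided those files follow the layout assumed here.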

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • acorn.intel
  • opnReqTest for newly added/changed feature
  • CI

Dependencies

NOAA-GFDL/GFDL_atmos_cubed_sphere#219
NOAA-EMC/fv3atm#587
ufs-community/ccpp-physics#6

…nNOAA/ufs-weather-model into bugfix/rrfs-debug-mode
@zach1221 zach1221 added the jenkins-ci Jenkins CI: ORT build/test on docker container label Oct 13, 2022
@zach1221 zach1221 removed the jenkins-ci Jenkins CI: ORT build/test on docker container label Oct 13, 2022
@jkbk2004
Collaborator

@SamuelTrahanNOAA all tests are done. We can start merging in dependencies first.

@jkbk2004
Collaborator

@SamuelTrahanNOAA go ahead and update the fv3atm submodule pointer and revert the branch, and then we can merge.

@SamuelTrahanNOAA
Collaborator Author

@jkbk2004 The FV3 submodule hash and URL have been updated. This PR is ready for merge.

Labels
Baseline Updates Current baselines will be updated.
6 participants