Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ufs-release/public-v2: bit-for-bit reproducibility for regional runs when changing MPI decomposition #68

Merged

Conversation

climbfuji
Copy link

@climbfuji climbfuji commented Feb 5, 2021

This PR fixes b4b mismatches when changing the MPI decomposition for regional runs (see ufs-community/ufs-weather-model#409 and ufs-community/ufs-weather-model#288) for more details.

  • add missing call to mpp_update_domain for q in model/fv_tracer2d.F90
  • add missing calls and update existing calls to update domain and halos, fix compiler warnings about additional text after #endif statements in model/dyn_core.F90
  • remove two unused variables from model/sw_core.F90
  • fix a few trailing whitespaces

Note 1. Each of the added/modified calls to update the domain or halos is required. After achieving b4b identical results, I removed each of the changes shown here and left the rest, and every time the results were no longer b4b.

Note 2. The perhaps most confusing change is to move the call to call start_group_halo_update(i_pack(8), u, v, domain, gridtype=DGRID_NE) to further down in the code in model/sw_core.F90, because there is a call to a regional boundary update for u and v after the original location.

Note 3. I am not an expert in the FV3 dycore, and it is possible that there are alternative and easier ways to achieve the same.

Associated PRs:

#68
NOAA-EMC/fv3atm#245
ufs-community/ufs-weather-model#409

For regression testing, see ufs-community/ufs-weather-model#409.

@climbfuji
Copy link
Author

@climbfuji
Copy link
Author

@bensonr

Copy link
Contributor

@XiaqiongZhou-NOAA XiaqiongZhou-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed. It looks good to me.

@bensonr bensonr requested a review from lharris4 February 8, 2021 15:37
@@ -749,6 +749,9 @@ subroutine tracer_2d_nested(q, dp1, mfx, mfy, cx, cy, gridstruct, bd, domain, np
reg_bc_update_time, &
iq )
enddo
call timing_on('COMM_TOTAL')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK but could likely be avoided by having the regional boundary update fill boundary rows that lie in the haloes too: ie. if this is a west edge, then have the BCs fill in (-2:0) x (jsd:jed) instead of just (-2:0) x (jsc:jec).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the nesting code already does this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lharris4 Can you explain please what you mean? I tried to skip this one update call, and the results were different. I did that with every single change in this PR, actually. Took them out on by one but left the others and reran the tests. Every time they failed, thus I think that all of them are required (unless changes are made to the regional BC update code as you mentioned).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. That is exactly what I mean: I think that if the regional BC update code is changed, this mpp_update_domains call is unnecessary. Lucas

Got it, thank you. Do you think it is ok to get the changes into the public release code as they are now, or should we try to follow your suggestion. The reason I am asking is that the SRW App is on a tight release schedule. The code freeze was last week and we were hoping to get the b4b issues fixed and merged as an "emergency bugfix".

We should definitely follow your suggestions for the authoritative/development branches and not try to bring over the public release code changes. Doing so would probably allow us to remove several of the already existing calls to mpp_update_domain, if I understand this correctly, but it will require a bit of time for the development to happen.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lharris4 @bensonr I would like to follow up with this issue, do we have the boundary fill updates in the master and dev/emc branch? We still have decomposition reproducibility issue with regional test and would like to see if we can try the fix you mentioned.

@lharris4
Copy link
Contributor

lharris4 commented Feb 8, 2021 via email

@lharris4
Copy link
Contributor

lharris4 commented Feb 8, 2021 via email

@climbfuji
Copy link
Author

Hi, Dom. Sure, this is fine. Lucas

Thanks, Lucas. I'll post another note when all the regression tests have passed and we are ready to merge.

@climbfuji
Copy link
Author

@lharris4 this is ready to merge whenever it is convenient for you - thanks!

@bensonr
Copy link
Contributor

bensonr commented Feb 9, 2021

With @lharris4 okaying this PR, I'll merge it now.

@bensonr bensonr merged commit a2078d7 into NOAA-GFDL:ufs-release/public-v2 Feb 9, 2021
kaiyuan-cheng added a commit to kaiyuan-cheng/GFDL_atmos_cubed_sphere that referenced this pull request Feb 16, 2021
@lharris4
Copy link
Contributor

lharris4 commented Jun 7, 2021 via email

@junwang-noaa
Copy link
Collaborator

That's great, thanks for letting me know!

MicroTed pushed a commit to MicroTed/GFDL_atmos_cubed_sphere that referenced this pull request Sep 22, 2021
…erturbations (NOAA-GFDL#239)

* Update .gitmodules and submodule pointers for ccpp-framework and ccpp-physics for gsl/develop branch
* RUC ice for gsl/develop (replaces NOAA-GFDL#54 and NOAA-GFDL#56) (NOAA-GFDL#60) Implementation of RUC LSM ice model in CCPP
* Fix bug in gfsphysics/GFS_layer/GFS_typedefs.F90 from merge
* Remove lsm_ruc_sfc_sice from suite FV3_GSD_v0_unified_ugwp_suite and update submodule pointer for ccpp-physics
* Remove sfc_sice from ccpp/suites/suite_FV3_GSD_v0_unified_ugwp_suite.xml
* Update gsl/develop from develop 2020/12/08 (NOAA-GFDL#61)
* Fix for updating stochastic physics on separate time-step. (NOAA-GFDL#199)
This bug fix allows the random patterns in the stochastic physics persist the for a period of time (defined as SKEBINT,SPPTINT, etc.) before calculating new patterns.
The fix is to move the allocation of the saved variables into the init section of stochastic_physics_wrapper, and remove the deallocates in the run section.
* Bug fixes in (1) running with frac_grid=T and GFDL MP and (2) restarting with frac_grid=T (NOAA-GFDL#204)
* -- Pointing to Moorthi's modifications in ccpp/physics, which fixed the crash when running GFDL MP with frac_grid=T;
-- Not setting fice to zero in order to leave lake ice untouched;
-- Restart in the coupled model with the default physics is reproducible, if bad water temperature is only filtered at initial time;
Co-authored-with: Shrinivas Moorthi <[email protected]>
Co-authored-with: Denise Worthen <[email protected]>
* Revert change to .gitmodules and update submodule pointer for ccpp-physics
* Update submodule pointer for ccpp-physics - MYNN surface layer updates and bugfixes (NOAA-GFDL#63)
* Land stochastic perturbations (wrapper PR for NOAA-GFDL#65) (NOAA-GFDL#68)
* Move initialization of stochastic physics after the physics
initialization in CCPP.
* Add albedo variables to land perturbations with lndp_type=2 option. Change to accommodate soil perturbations with RUC LSM.
* Max/min soil moisture variables are introduced via GFS_Control_type
variables instead of through the use of namelist_soilveg*. This is a
more flexible way for different LSMs.
* Added pores and resid variables for max/min soil moisture to GFS_typedefs.f90.
* Remove tracer_sanitizer from all suites and from CCPP prebuild config
* Add namelist option to apply land surface perturbations at every time step, clean up stochastic_physics/stochastic_physics_wrapper.F90
* Stochastic land perturbations: add roughness length over land to the perturbed variables (NOAA-GFDL#70)
* Added roughness length over land to the perturbed variables.
* Bugfix in gfsphysics/GFS_layer/GFS_typedefs.F90: remove Diag%cldcov, in particular the reset call because the variable is not allocated
* Update .gitmodules and submodule pointer for GFDL_atmos_cubed_sphere for code review and testing
* Revert change to .gitmodules for ccpp-physics, update submodule pointer for ccpp-physics
* Revert change to .gitmodules and update submodule pointer for GFDL_atmos_cubed_sphere
Co-authored-by: DomHeinzeller <[email protected]>
Co-authored-by: Phil Pegion <[email protected]>
Co-authored-by: shansun6 <[email protected]>
Co-authored-by: tanyasmirnova <[email protected]>
laurenchilutti added a commit to laurenchilutti/GFDL_atmos_cubed_sphere that referenced this pull request Feb 18, 2022
laurenchilutti added a commit that referenced this pull request Feb 18, 2022
* Revert "revise external_ic.F90 and fv_nudge.F90 (#68)"

This reverts commit 32b44d9.

* Revert "fixing call to pmaxmin to no longer get a compilation error when compiling with GNU."

This reverts commit 2aa049c.

* Revert "Update external_ic.F90 and fv_nudge.F90 to use allocatable arrays (#65)"

This reverts commit 44211c0.

* Revert "update external_ic.F90 and fv_nudge.F90 (#63)"

This reverts commit 81b9be0.

* Revert "Fix OVERLOAD_R4 ifdef block as suggested by @junwang-noaa (#60)"

This reverts commit 63a4603.

* Revert "Update code to use 'constantsR4_mod' module (#59)"

This reverts commit 7b8ee4c.
binli2337 added a commit to binli2337/GFDL_atmos_cubed_sphere that referenced this pull request Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants