update to MOM6 main 20240729 commit (gfdl-to-main-2024-05-31) #2381

Merged

jiandewang
Collaborator

@jiandewang jiandewang commented Jul 31, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

The MOM6 main repo merged "gfdl-to-main-2024-05-31" on 20240729 (see details at mom-ocean/MOM6#1631). It needs to be pulled into dev/emc correspondingly.
Note that this PR will require a new BL because:

a new variable "h_ML" is being added to the restart file (see details at mom-ocean/MOM6@7305528)

there is a bug fix for the "incorrect PPM:H3 tracer advection stencil" (see details at NOAA-EMC/MOM6@3fac1c5). Note that the original answers can be retained by setting USE_HUYNH_STENCIL_BUG=True.

we will also set DEFAULT_ANSWER_DATE=20231231 in the 5x5 setting, as it currently uses its default value, which is too aggressive (suggested by GFDL)
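As an illustration of the settings listed above, here is a minimal Python sketch that parses MOM_input-style text and reports the three parameters this PR touches. It assumes the common `NAME = value ! comment` layout of MOM6 parameter files and ignores block syntax and quoted strings; the helper name `parse_mom_params` and the sample text are hypothetical and are not taken from the actual test configurations.

```python
def parse_mom_params(text):
    """Parse 'NAME = value ! comment' lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing '!' comments
        if "=" not in line:
            continue  # skip blank lines and anything that is not an assignment
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


# Hypothetical excerpt; the values mirror the settings described in this PR.
SAMPLE_MOM_INPUT = """
DEFAULT_ANSWER_DATE = 20231231   ! set in the 5x5 case
USE_HUYNH_STENCIL_BUG = True     ! retain the original PPM:H3 answers
USTAR_GUSTLESS_BUG = True        ! replaces FIX_USTAR_GUSTLESS_BUG = False
"""

params = parse_mom_params(SAMPLE_MOM_INPUT)
for key in ("DEFAULT_ANSWER_DATE", "USE_HUYNH_STENCIL_BUG", "USTAR_GUSTLESS_BUG"):
    print(key, "=", params.get(key, "<not set>"))
```

A check like this could be pointed at a real MOM_input file by reading it with `open(path).read()`; the values are kept as strings on purpose so nothing is reinterpreted.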

Commit Message:

      add MOM_dynamics_split_RK2b.F90 and MOM_EOS_base_type.F90 in cmake list
      add extra 16PE for cpld_debug_pdlib job
      set DEFAULT_ANSWER_DATE=20231231 in 5x5 case
      using MOM6 restart file as BL for sfs and hafs_mom jobs
      switching FIX_USTAR_GUSTLESS_BUG to USTAR_GUSTLESS_BUG in MOM_input
      USE_HUYNH_STENCIL_BUG=True (false will have mpi jobs fail to match their control)

Please provide concise information for The UFS-WM and/or each sub-component:

  • MOM6 -

Priority:

  • High: Time-sensitive project.

  • High: Reason: this PR has taken 2 months to reach its final stage, and a new MOM6 PR has arrived. This PR needs to be finished before I can run any clean tests for the new PR.

Git Tracking

UFSWM:

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

  • PR Updates/Changes Baselines.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

jiandewang and others added 7 commits July 29, 2024 23:13
      add MOM_dynamics_split_RK2b.F90 and MOM_EOS_base_type.F90 in cmake list
      add extra 8PE for cpld_debug_pdlib job
      set DEFAULT_ANSWER_DATE=20231231 in 5x5 case
      add USE_HUYNH_STENCIL_BUG = True in MOM_input
      using MOM6 restart file as BL for sfs and hafs_mom jobs
  switch FIX_USTAR_GUSTLESS_BUG=F to USTAR_GUSTLESS_BUG=T
@jkbk2004
Collaborator

@jiandewang can you sync up branch?

@jiandewang
Collaborator Author

@jkbk2004 just sync-ed my PR branch

@jiandewang
Collaborator Author

Note: for hafs_regional_storm_following_1nest_atm_ocn_wav_mom6 and cpld_control_sfs, you need to create a RESTART directory inside the BL dir and copy MOM*res*nc to it.
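The manual step described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `stage_mom_restarts`, the glob pattern, and the directory layout in the demo are hypothetical, and the demo runs on throwaway temporary directories rather than on any real baseline path.

```python
import shutil
import tempfile
from pathlib import Path


def stage_mom_restarts(run_dir, baseline_dir, pattern="*MOM.res*.nc"):
    """Copy MOM restart files from run_dir into baseline_dir/RESTART."""
    restart_dir = Path(baseline_dir) / "RESTART"
    restart_dir.mkdir(parents=True, exist_ok=True)  # create RESTART if missing
    copied = []
    for src in sorted(Path(run_dir).glob(pattern)):
        shutil.copy2(src, restart_dir / src.name)  # copy2 keeps timestamps
        copied.append(src.name)
    return copied


# Demo on temporary directories so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run" / "RESTART"
    bl_dir = Path(tmp) / "baseline" / "cpld_control_sfs_intel"
    run_dir.mkdir(parents=True)
    (run_dir / "MOM.res.nc").write_bytes(b"demo")
    (run_dir / "MOM.res_1.nc").write_bytes(b"demo")
    staged = stage_mom_restarts(run_dir, bl_dir)
    print(staged)
```

The leading `*` in the assumed pattern is there so that date-prefixed names such as 20200825.150000.MOM.res.nc would also match.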

@zach1221 zach1221 added Baseline Updates Current baselines will be updated. Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. labels Aug 13, 2024
@zach1221 zach1221 added jenkins-ort run ORT testing and removed jenkins-ort run ORT testing labels Aug 13, 2024
@jiandewang
Collaborator Author

are we running this PR ? I don't see any email traffic on it today.

@zach1221
Collaborator

zach1221 commented Aug 14, 2024

are we running this PR ? I don't see any email traffic on it today.

Yes, tests are running.

DeniseWorthen
DeniseWorthen previously approved these changes Aug 14, 2024
@FernandoAndrade-NOAA
Collaborator

Leaving a note that Jet will likely have to be skipped. There is a system issue preventing git clones; I've let the system admins know so they can take a look.

@FernandoAndrade-NOAA
Collaborator

Apologies for the delayed Hera and Gaea logs. RTs were started, but I am unable to log onto the machines to check up on the results due to the token login issues today.

@jkbk2004
Collaborator

@jkbk2004 so you need to re-generate BL for this hafs-mom6 job on all machines

@jiandewang thanks for the information! @FernandoAndrade-NOAA @zach1221 FYI

@FernandoAndrade-NOAA
Collaborator

@DusanJovic-NOAA Are we skipping Acorn for this PR?

@DusanJovic-NOAA
Collaborator

@DusanJovic-NOAA Are we skipping Acorn for this PR?

Yes.

@jkbk2004
Collaborator

Hera is very busy; I see a long job queue there. The Jet lfs4 issue continues. If we need to recover anything, we will have to follow up on those two machines with the next PR.

@FernandoAndrade-NOAA
Collaborator

Looks like we'll be skipping Hera and Jet for this PR due to system issues, then. We'll catch up on the next PR @jkbk2004

@DeniseWorthen
Collaborator

DeniseWorthen commented Aug 16, 2024

@FernandoAndrade-NOAA The next PR is currently scheduled to be my warmstart PR. If you plan on skipping baseline creation on Hera, then the next PR must create all baselines---mine only requires one.

I'm also not clear on what the hurry is. It is late on Friday afternoon. Are you planning to start a new PR today?

@zach1221
Collaborator

@DeniseWorthen I think this PR is listed as high priority above, so we were trying to get it merged before the weekend.

@DeniseWorthen
Collaborator

DeniseWorthen commented Aug 16, 2024

And anyone running RTs this weekend on hera will fail.

@jiandewang
Collaborator Author

MOM6 merged (hash # 00f8ea20), submodule reverted

@jiandewang
Collaborator Author

And anyone running RTs this weekend on hera will fail.

hafs_regional_storm_following_1nest_atm_ocn_wav_mom6 will fail on HERA

@jiandewang
Collaborator Author

And anyone running RTs this weekend on hera will fail.

hafs_regional_storm_following_1nest_atm_ocn_wav_mom6 will fail on HERA

I just checked HERA BL /scratch2/NAGAPE/epic/UFS-WM_RT/NEMSfv3gfs/develop-20240813/hafs_regional_storm_following_1nest_atm_ocn_wav_mom6_intel/RESTART

-rw-r--r-- 1 role.epic epic 4197540137 Aug 16 13:41 20200825.150000.MOM.res.nc
-rw-r--r-- 1 role.epic epic 2276048626 Aug 16 13:41 20200825.150000.MOM.res_1.nc

these two were generated ~1.5hr ago so they will be fine

@jkbk2004 jkbk2004 merged commit 6d0454c into ufs-community:develop Aug 17, 2024
3 checks passed
@jiandewang
Collaborator Author

@jkbk2004 @FernandoAndrade-NOAA my test indicates that the BL for "hafs_regional_storm_following_1nest_atm_ocn_wav_mom6_intel" has not been updated on HERA. This job will fail if people run the latest UWM; it will report "20200825.150000.MOM.res.nc" as not identical.

@jiandewang
Collaborator Author

And anyone running RTs this weekend on hera will fail.

hafs_regional_storm_following_1nest_atm_ocn_wav_mom6 will fail on HERA

I just checked HERA BL /scratch2/NAGAPE/epic/UFS-WM_RT/NEMSfv3gfs/develop-20240813/hafs_regional_storm_following_1nest_atm_ocn_wav_mom6_intel/RESTART

-rw-r--r-- 1 role.epic epic 4197540137 Aug 16 13:41 20200825.150000.MOM.res.nc
-rw-r--r-- 1 role.epic epic 2276048626 Aug 16 13:41 20200825.150000.MOM.res_1.nc

these two were generated ~1.5hr ago so they will be fine

The above is what I checked on Friday around 4pm; those files were the correct ones (I compared them with my own run and they were b4b). But now (Sat. 6:40pm) I see:
-rw-r--r-- 1 role.epic epic 4197540137 Aug 16 20:21 20200825.150000.MOM.res.nc
-rw-r--r-- 1 role.epic epic 2276048626 Aug 16 20:21 20200825.150000.MOM.res_1.nc
Judging from the timestamps, the files have apparently been updated since then, and the new ones are not correct.

@jkbk2004
Collaborator

@jiandewang let me check with develop branch

@jiandewang
Collaborator Author

I compared these two yesterday:
/scratch2/NAGAPE/epic/UFS-WM_RT/NEMSfv3gfs/develop-20240813_/hafs_regional_storm_following_1nest_atm_ocn_wav_mom6_intel/RESTART

/scratch2/NAGAPE/epic/UFS-WM_RT/NEMSfv3gfs/develop-20240813/hafs_regional_storm_following_1nest_atm_ocn_wav_mom6_intel/RESTART

They were not the same, and the BL directory "develop-20240813" contained the correct MOM restart files. But right now when I compare them, they are the same. So somehow the files in "develop-20240813_" were copied to "develop-20240813" in the past half day, which should not have happened.
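A comparison like the one described above can be sketched in Python with a byte-for-byte check of matching files in two directories. This is a hedged illustration, not how the regression test scripts perform their comparison; the function name `compare_restart_dirs` and the demo directory names are hypothetical, and the demo uses temporary directories with dummy contents.

```python
import filecmp
import tempfile
from pathlib import Path


def compare_restart_dirs(dir_a, dir_b, pattern="*.nc"):
    """Map each file in dir_a to True/False (identical or not) or None (missing in dir_b)."""
    results = {}
    for file_a in sorted(Path(dir_a).glob(pattern)):
        file_b = Path(dir_b) / file_a.name
        if not file_b.is_file():
            results[file_a.name] = None
        else:
            # shallow=False compares file contents rather than trusting size and mtime
            results[file_a.name] = filecmp.cmp(file_a, file_b, shallow=False)
    return results


# Demo on temporary directories standing in for two baseline RESTART dirs.
with tempfile.TemporaryDirectory() as tmp:
    dir_a = Path(tmp) / "baseline_a" / "RESTART"
    dir_b = Path(tmp) / "baseline_b" / "RESTART"
    dir_a.mkdir(parents=True)
    dir_b.mkdir(parents=True)
    (dir_a / "MOM.res.nc").write_bytes(b"same")
    (dir_b / "MOM.res.nc").write_bytes(b"same")
    (dir_a / "MOM.res_1.nc").write_bytes(b"old!")
    (dir_b / "MOM.res_1.nc").write_bytes(b"new!")
    report = compare_restart_dirs(dir_a, dir_b)
    print(report)
```

Keeping a dated record of such a report would make it easier to tell later whether a baseline directory changed between two checks.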

@jiandewang
Collaborator Author

@jiandewang let me check with develop branch

@jkbk2004 a clean way to do this is to re-run this job and replace the MOM restart files in the BL dir.

@jkbk2004
Collaborator

@jiandewang let me check with develop branch

@jkbk2004 a clean way to do this is to re-run this job and replace the MOM restart files in the BL dir.

@jiandewang recovered ok.

DavidHuber-NOAA added a commit to DavidHuber-NOAA/ufs-weather-model that referenced this pull request Sep 9, 2024
…r-model into develop

* 'develop' of https://github.com/ufs-community/ufs-weather-model:
  update mom6 to its main repo. 20240824 commit (FMA) (ufs-community#2412)
  Update CMEPS; fix aux history functionality for float variable type; Switch to using Aux history files in atm_ds2s_docn_dice test; Remove IFI tests (was ufs-community#2417) (ufs-community#2395)
  Combination for CCPP-physics ufs-community#213 and ufs-community#218 (H2O scheme refactor and C3/SAS/MYNN fix) (ufs-community#2408)
  Unify CDEPS gfs, cfsr, and gefs datm datamodes + Improve error checking in rt.sh (2388) + Add ability to read increment files on native cubed sphere grid (2304) (ufs-community#2389)
  sync with head of NOAA-EMC UPP develop (ufs-community#2326)
  Allow use of downscaled warmstart files for cpld_control_sfs test (ufs-community#2375)
  update to MOM6 main 20240729 commit (gfdl-to-main-2024-05-31) (ufs-community#2381)
Labels
Baseline Updates Current baselines will be updated. Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
7 participants