update GF for RRFS 2023 HWT SFE #1713

Merged: 19 commits merged on Apr 20, 2023

Conversation

@haiqinli
Contributor

@haiqinli commented Apr 14, 2023

Description

1). An update of the Grell-Freitas (GF) convection scheme for RRFS;
2). An update of the RUC LSM to use BNU soil properties for RRFS;
3). This is an urgent PR for the RRFS 2023 HWT SFE, which will start on May 1st, 2023;
4). Related PRs for fv3atm and ccpp-physics:
fv3atm: NOAA-EMC/fv3atm#644
ccpp-physics: ufs-community/ccpp-physics#64

Input data additions/changes

  • [x] No changes are expected to input data.
  • There will be new input data.
  • Input data will be updated.

Anticipated changes to regression tests:

  • No changes are expected to any regression test.
  • [x] Changes are expected to the following tests:

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • [x] FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Commit Queue Checklist:

  • [x] Link PRs from all sub-components involved
  • [x] Confirm reviews completed in sub-component PRs
  • [x] Add all appropriate labels to this PR.
  • [x] Run full RT suite on either Hera/Cheyenne with both Intel/GNU compilers
  • [x] Add list of any failed regression tests to "Anticipated changes to regression tests" section.

Testing Day Checklist:

  • [x] This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.

Testing Log (for CM's):

  • RDHPCS
    • Intel
      • [x] Hera
      • Orion
      • Jet
      • Gaea
      • Cheyenne
    • GNU
      • [x] Hera
      • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

@grantfirl
Collaborator

@haiqinli Could you add a list of the tests that are expected to fail? I see that you have the new baseline directories listed in NOAA-EMC/fv3atm#644 and that you have committed successful RT logs, but we need to know which RTs fail. I suppose if you committed the RT logs that have a list of failures, that would have worked, but I think that the UFS code managers just want a list of the expected RT failures listed in the description or the comment section here.

@haiqinli
Contributor Author

@grantfirl Sure, I am starting to do a regression test against the original baseline, and will add a list of the tests that are expected to fail when it is done. Thanks.

@grantfirl
Collaborator

@jkbk2004 This PR is a high priority due to it being needed for the RRFS and the HWT spring experiment. Could this be added to the queue after #1669?

@haiqinli
Contributor Author

@grantfirl The following tests are expected to fail on Hera with the Intel compiler. Thanks.

fail_test_057 fail_test_063 fail_test_073 fail_test_085 fail_test_100 fail_test_105 fail_test_112 fail_test_119
fail_test_058 fail_test_065 fail_test_074 fail_test_096 fail_test_101 fail_test_106 fail_test_113 fail_test_120
fail_test_059 fail_test_066 fail_test_075 fail_test_097 fail_test_102 fail_test_107 fail_test_114 fail_test_121
fail_test_060 fail_test_067 fail_test_083 fail_test_098 fail_test_103 fail_test_110 fail_test_115
fail_test_062 fail_test_072 fail_test_084 fail_test_099 fail_test_104 fail_test_111 fail_test_118

@jkbk2004
Collaborator

> @jkbk2004 This PR is a high priority due to it being needed for the RRFS and the HWT spring experiment. Could this be added to the queue after #1669?

@grantfirl Sure, I think we can start working on this PR today. I will keep you posted.

@FernandoAndrade-NOAA
Collaborator

@haiqinli Could you sync your branch with the latest develop? I can get started on jenkins-ci after that. Thank you.

@DeniseWorthen
Collaborator

@haiqinli The list of "fail_test" you provided doesn't tell us which actual test failed. Could you provide the list of the failed tests by name?

@jkbk2004
Collaborator

@haiqinli @grantfirl Since the PR is urgent, can you sync up and respond to @DeniseWorthen's comment? I see about 35 cases with changed results. Those are the cases that use the GF scheme and RUC LSM, right?

@haiqinli
Contributor Author

@jkbk2004 @DeniseWorthen Yes, the cases with GF convection and RUC LSM are changed. I am rerunning the regression test, and will add the detailed cases when it is done. Thanks.

@haiqinli
Contributor Author

@FernandoAndrade-NOAA My branch has been synced with the latest develop. Thanks.

@haiqinli
Contributor Author

@grantfirl @DeniseWorthen @jkbk2004 After syncing all the components, compilation of the regression tests failed (/scratch1/BMC/gsd-fv3-dev/NCEPDEV/stmp4/Haiqin.Li/Haiqin.Li/FV3_RT/rt_33652/compile_001.log.0). Could you have a look and give me some suggestions? Thanks.

@FernandoAndrade-NOAA
Collaborator

FernandoAndrade-NOAA commented Apr 18, 2023

@haiqinli The current hashes used in the ufs weather model for ww3 and upp are:
ww3: 081d5a64f205caac7f920acec6004b1a184f5709
upp: dccb32176930676ef2a258eb65571ab4e3f7e7a4
Please adjust to point to those hashes, thank you.

@haiqinli
Contributor Author

@FernandoAndrade-NOAA Yes, it works after pointing to the correct hashes of upp and WW3. Thanks.

@haiqinli
Contributor Author

@grantfirl @DeniseWorthen @jkbk2004 @FernandoAndrade-NOAA
The following tests are expected to fail on Hera with the Intel compiler. Thanks.
The regression test run directory is /scratch1/BMC/gsd-fv3-dev/NCEPDEV/stmp4/Haiqin.Li/Haiqin.Li/FV3_RT/rt_200304.

hrrr_control_2threads.err.0 rap_flake_debug.err.0 regional_spp_sppt_shum_skeb.err.0
hrrr_control_debug.err.0 rap_lndp_debug.err.0 rrfs_conus13km_hrrr_warm_debug.err.0
hrrr_control_decomp.err.0 rap_noah_debug.err.0 rrfs_conus13km_hrrr_warm.err.0
hrrr_control.err.0 rap_noah_sfcdiff_cires_ugwp_debug.err.0 rrfs_smoke_conus13km_hrrr_warm_2threads.err.0
rap_2threads.err.0 rap_progcld_thompson_debug.err.0 rrfs_smoke_conus13km_hrrr_warm_debug_2threads.err.0
rap_cires_ugwp_debug.err.0 rap_sfcdiff_debug.err.0 rrfs_smoke_conus13km_hrrr_warm_debug.err.0
rap_control_debug.err.0 rap_sfcdiff_decomp.err.0 rrfs_smoke_conus13km_hrrr_warm.err.0
rap_control.err.0 rap_sfcdiff.err.0 rrfs_smoke_conus13km_radar_tten_warm.err.0
rap_decomp.err.0 rap_unified_drag_suite_debug.err.0
rap_diag_debug.err.0 rap_unified_ugwp_debug.err.0

@FernandoAndrade-NOAA
Collaborator

> @FernandoAndrade-NOAA Yes, it works after pointing to the correct hashes of upp and WW3. Thanks.

Looks like WW3 is good to go, but your UPP is still pointing to 87a7542, please point to
dccb32176930676ef2a258eb65571ab4e3f7e7a4
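
A hash check like this one can be sketched in Python. The sketch parses text in the format printed by `git submodule status` (an optional one-character status flag, the commit SHA, the submodule path, and an optional description in parentheses) and reports submodules whose commit differs from the expected value. The function name `find_mismatches`, the submodule paths `WW3` and `upp`, and the sample status text are illustrative assumptions; only the two expected hashes come from this thread.

```python
from __future__ import annotations

# Expected hashes quoted in this thread; the path keys are assumptions.
EXPECTED = {
    "WW3": "081d5a64f205caac7f920acec6004b1a184f5709",
    "upp": "dccb32176930676ef2a258eb65571ab4e3f7e7a4",
}


def find_mismatches(status_text: str, expected: dict[str, str]) -> dict[str, str | None]:
    """Map each mismatching path to the SHA actually found (None if absent)."""
    actual = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Drop the optional status flag ('+', '-', or 'U') in front of the SHA.
        actual[fields[1]] = fields[0].lstrip("+-U")
    return {
        path: actual.get(path)
        for path, wanted in expected.items()
        if actual.get(path) != wanted
    }


if __name__ == "__main__":
    # Hypothetical status text: WW3 matches, upp uses a placeholder SHA.
    sample = (
        " 081d5a64f205caac7f920acec6004b1a184f5709 WW3 (hypothetical-description)\n"
        "+" + "0" * 40 + " upp (hypothetical-description)\n"
    )
    for path, sha in find_mismatches(sample, EXPECTED).items():
        print(f"{path}: found {sha}, expected {EXPECTED[path]}")
```

Working on captured text keeps the sketch independent of where the submodules live. For nested submodules the paths would need to match whatever the status output prints.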

@haiqinli
Contributor Author

@FernandoAndrade-NOAA Thanks. UPP now points to dccb32176930676ef2a258eb65571ab4e3f7e7a4.

@jkbk2004
Collaborator

So, basically all rap, rrfs, and hrrr cases are baseline change cases. I am attaching @haiqinli's Hera log files as pre-test results.
RegressionTests_hera.intel.log
RegressionTests_hera.gnu.log

@FernandoAndrade-NOAA added the following labels on Apr 18, 2023: Baseline Updates (current baselines will be updated), Ready for Commit Queue (the PR is ready for the Commit Queue; all checkboxes in the PR template have been checked), jenkins-ci (Jenkins CI: ORT build/test on docker container)

@FernandoAndrade-NOAA
Collaborator

@BrianCurtis-NOAA I'm running CI for this now.

@BrianCurtis-NOAA
Collaborator

Looks like the FV3 PR was not approved yet.

@jkbk2004
Collaborator

@FernandoAndrade-NOAA I am trying to manually run those failed cases on hera.gnu and cheyenne.intel.

@jkbk2004
Collaborator

I was checking Jet. There is a Slurm input/output issue: "/var/spool/slurmd/job26295598/slurm_script: line 0: echo: write error: Input/output error". I will let all jobs go through and check again. It sounds like we need more time.

@jkbk2004
Collaborator

BTW, I think this is clearly a file system issue on Jet.

@jkbk2004
Collaborator

All tests are done. We can start the merging process.

@jkbk2004
Collaborator

@haiqinli The fv3 PR was merged. Can you update the hash and revert the change in .gitmodules?

@jkbk2004
Collaborator

@haiqinli The correct hash is NOAA-EMC/fv3atm@aed0607
