Update ERA5 data atmosphere mode and bring HAFS related changes (recreated from different branch) #108

Merged
merged 9 commits into from
Jul 26, 2021

Conversation

uturuncoglu
Collaborator

@uturuncoglu uturuncoglu commented Jul 23, 2021

Description of changes

This PR brings the UFS HAFS related changes upstream.

  • There is no longer any distinction between the HAFS fully coupled configuration (FV3+HYCOM) and the DATM+HYCOM configuration in terms of the exchanged fields.
  • The ERA5 data mode is simplified.
  • This PR also fixes the failure that occurred when the component-specific *_modelio:: section, which includes diro and logfile, does not exist. The model was looking for those attributes and failing; it now falls back to defaults (diro = "." and logfile = "d"//shr_string_toLower(compname)//".log") in that case.
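
The fallback described in the last item can be sketched roughly as follows. This is an illustration in Python, not the Fortran code from the PR: the function name `modelio_defaults` and the dict standing in for the parsed *_modelio:: section are hypothetical, and only the two default values come from the description above.

```python
def modelio_defaults(compname, modelio=None):
    """Return (diro, logfile), using defaults when attributes are missing.

    `modelio` is a hypothetical dict standing in for the parsed
    <COMP>_modelio:: section; None means the section does not exist.
    """
    section = modelio or {}
    # Default output directory is the current directory.
    diro = section.get("diro", ".")
    # Default log name mirrors "d"//shr_string_toLower(compname)//".log".
    logfile = section.get("logfile", "d" + compname.lower() + ".log")
    return diro, logfile


# Section missing entirely: both defaults are used.
print(modelio_defaults("ATM"))
# Section present: the configured values win.
print(modelio_defaults("OCN", {"diro": "./logs", "logfile": "ocn.log"}))
```

With the section absent, the call returns the defaults instead of raising an error, which is the behavior change the item describes.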

Specific notes

The testing will be combined with the CMEPS PR (coming soon).

Contributors other than yourself, if any: @binli2337 @BinLiu-NOAA

CMEPS Issues Fixed (include github issue #):

Are there dependencies on other component PRs

  • CIME (list)
  • CMEPS (list)

Are changes expected to change answers?

  • bit for bit
  • different at roundoff level
  • more substantial

Any User Interface Changes (namelist or namelist defaults changes)?

  • Yes
  • No

Testing performed:

  • (required) aux_cdeps

    • machines and compilers: Cheyenne, intel/19.1.1 and mpt/2.22
    • details (e.g. failed tests): Only the SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP test fails, with ERROR: (shr_stream_findBounds) ERROR: rDateIn lt rDatelvd limit true, which seems to be related to the configuration in datm.streams.xml. For reference, the run directory is /glade/scratch/turuncu/SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel.20210723_155750_fc2o4x/run
  • (optional) CESM prealpha test

    • machines and compilers: Cheyenne, intel/19.1.1 and mpt/2.22
    • details (e.g. failed tests): The following tests failed, but we know that these tests also fail with the new ESMF alarm implementation.
  DAE_N2_D_Lh12_Vnuopc.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-DA_multidrv (Overall: FAIL) details:
    FAIL DAE_N2_D_Lh12_Vnuopc.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-DA_multidrv RUN time=1223
  ERC_D_Ln9_Vnuopc.mpasa480z32_mpasa480.FHS94.cheyenne_intel.cam-outfrq3s_usecase (Overall: FAIL) details:
    FAIL ERC_D_Ln9_Vnuopc.mpasa480z32_mpasa480.FHS94.cheyenne_intel.cam-outfrq3s_usecase MODEL_BUILD time=15
  ERP_D_Ln9_Vnuopc.C48_C48_mg17.QPC6.cheyenne_intel.cam-outfrq9s (Overall: FAIL) details:
    FAIL ERP_D_Ln9_Vnuopc.C48_C48_mg17.QPC6.cheyenne_intel.cam-outfrq9s MODEL_BUILD time=105
  ERP_D_Ln9_Vnuopc.f09_f09_mg17.FSD.cheyenne_intel.cam-outfrq9s_contrail (Overall: FAIL) details:
    FAIL ERP_D_Ln9_Vnuopc.f09_f09_mg17.FSD.cheyenne_intel.cam-outfrq9s_contrail RUN time=180
  IRT_N3_PM3_Ld7_Vnuopc.f19_g17.BHIST.cheyenne_intel.allactive-defaultio (Overall: FAIL) details:
    FAIL IRT_N3_PM3_Ld7_Vnuopc.f19_g17.BHIST.cheyenne_intel.allactive-defaultio SHAREDLIB_BUILD time=9
  NCK_Ld5_Vnuopc.f19_g17.B1850G.cheyenne_intel.allactive-cism-test_coupling (Overall: FAIL) details:
    FAIL NCK_Ld5_Vnuopc.f19_g17.B1850G.cheyenne_intel.allactive-cism-test_coupling COMPARE_base_multiinst
  NCK_Vnuopc.f19_g17.B1850.cheyenne_intel.allactive-defaultiomi (Overall: FAIL) details:
    FAIL NCK_Vnuopc.f19_g17.B1850.cheyenne_intel.allactive-defaultiomi COMPARE_base_multiinst
  SMS_Lm13_Vnuopc.f10_f10_mg37.I1850Clm50SpG.cheyenne_intel (Overall: FAIL) details:
    FAIL SMS_Lm13_Vnuopc.f10_f10_mg37.I1850Clm50SpG.cheyenne_intel RUN time=1
  SMS_Ln9_Vnuopc.f19_f19_mg17.FXHIST.cheyenne_intel.cam-outfrq9s_multi (Overall: FAIL) details:
    FAIL SMS_Ln9_Vnuopc.f19_f19_mg17.FXHIST.cheyenne_intel.cam-outfrq9s_multi RUN time=1822
  SMS_Vnuopc_Lm13.f10_f10_mg37.I1850Clm50SpG.cheyenne_intel (Overall: FAIL) details:
    FAIL SMS_Vnuopc_Lm13.f10_f10_mg37.I1850Clm50SpG.cheyenne_intel RUN time=1

Here is a detailed investigation of the errors:
DAE_N2_D_Lh12_Vnuopc.f10_f10_mg37.I2000Clm50BgcCrop: It runs but is killed by the scheduler due to a timeout.
ERC_D_Ln9_Vnuopc.mpasa480z32_mpasa480.FHS94: It fails in the build with the following error.
/glade/work/turuncu/TESTING/CESM_hafs/components/cam/src/dynamics/mpas/dycore/src/framework/mpas_derived_types.F(35): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [ESMF_STUBS]
ERP_D_Ln9_Vnuopc.C48_C48_mg17.QPC6: I checked out the externals with the -o option, as @jedwards4b suggested, and I now have the ../../libraries/FMS/ directory, but the test still fails with the error /glade/work/turuncu/TESTING/CESM_hafs/components/cam/src/dynamics/fv3/atmos_cubed_sphere/tools/fv_mp_mod.F90(75): error #6580: Name in only-list does not exist or is not accessible. [MPP_NODE]
ERP_D_Ln9_Vnuopc.f09_f09_mg17.FSD.cheyenne_intel.cam-outfrq9s_contrail: It gives a floating point error on the atm PEs.
IRT_N3_PM3_Ld7_Vnuopc.f19_g17.BHIST: It could not find a suitable decomposition and fails with the error CAM build-namelist - ERROR: fv_decomp_set failed to find a decomposition.
NCK_Ld5_Vnuopc.f19_g17.B1850G and NCK_Vnuopc.f19_g17.B1850: They run, but there are answer changes and they do not validate.
SMS_Lm13_Vnuopc.f10_f10_mg37.I1850Clm50SpG: It dies with BlockingIOError: [Errno 11] Resource temporarily unavailable: during the run.
SMS_Ln9_Vnuopc.f19_f19_mg17.FXHIST: It runs but is killed by the scheduler due to a timeout.
SMS_Vnuopc_Lm13.f10_f10_mg37.I1850Clm50SpG: Again, it dies with BlockingIOError: [Errno 11] Resource temporarily unavailable: during the run.

Hashes used for testing:

@uturuncoglu
Collaborator Author

@binli2337 I had to update this PR since support/HAFS was protected and I could not fix the issue related to CESM. If you don't mind, could you test this with the UFS application? Thanks.

@uturuncoglu
Collaborator Author

uturuncoglu commented Jul 24, 2021

@jedwards4b @mvertens I updated the PR description with the results of the CDEPS aux tests. Only one test failed, and more information can be found there. It seems to be related to the configuration of the data atmosphere. Let me know if this is not expected.

@uturuncoglu
Collaborator Author

@junwang-noaa @binli2337 This PR also includes the work related to the renaming of CDEPS modules. I have already tested it under the HAFS application by running the HAFS-specific RTs as well as the ones in rt.conf, and all of them were fine, but it would be nice if you could check it again.

@jedwards4b
Contributor

jedwards4b commented Jul 24, 2021

@uturuncoglu I ran the cdeps testlist against my PR #106 and compared it to the July baseline just a couple of days ago, and all tests passed. There are no expected failures.
/glade/scratch/jedwards/SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel.datm-1PT.C.20210722_130230_k6l5y3

I see this difference in nuopc.runconfig:

194c194
< (mine)     start_ymd = 19931202
---
> (yours)     start_ymd = 00010101

This came from components/cdeps/datm/cime_config/testdefs/testmods_dirs/datm/1PT/shell_commands
I think you just need to merge master into your branch.
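
A difference like the one above can also be found programmatically. The following is a minimal sketch that assumes the settings appear as simple `key = value` lines; the helper names and the `stop_n` line in the sample text are illustrative, and only the two `start_ymd` values are taken from the diff above.

```python
def parse_settings(text):
    """Collect `key = value` pairs from runconfig-style text."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(mine, yours):
    """Return {key: (mine, yours)} for every key whose values differ."""
    a, b = parse_settings(mine), parse_settings(yours)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Illustrative excerpts; only the start_ymd values come from the thread.
mine = "stop_n = 5\nstart_ymd = 19931202\n"
yours = "stop_n = 5\nstart_ymd = 00010101\n"
print(diff_settings(mine, yours))
```

Comparing parsed values rather than raw lines ignores differences in whitespace and line position.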

@uturuncoglu
Collaborator Author

@jedwards4b It is strange; I could not find anything in this PR that could lead to this issue. Anyway, let me check again.

@uturuncoglu
Collaborator Author

Okay. I ran that particular test again and it passed without any issue. I am not sure, but my GLADE quota was full and maybe that was the reason. The alpha tests failed because of it. Anyway, I'll continue testing and let you know.

@uturuncoglu
Collaborator Author

@jedwards4b I think that the test defined as the CDEPS aux test is SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel, which is different from SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel.datm-1PT. As you can see, the test names are different. The test with the datm-1PT suffix works without any issue, but the other one fails. Maybe these are different tests, or maybe the test under ../../components/cdeps/cime_config/testdefs/testlist_cdeps.xml is not updated. Let me know what you think.

@junwang-noaa
Contributor

@uturuncoglu Thanks for including those changes and testing them in HAFS. The code changes look good to me.

@uturuncoglu
Collaborator Author

All CESM prealpha tests have been run, and the failed tests are listed in the PR description. It seems that they are not related to this PR.

@binli2337
Contributor

@uturuncoglu @junwang-noaa All cdeps tests in ufs-weather-model are successful.

@uturuncoglu
Collaborator Author

@binli2337 That is great! Thanks for testing this PR.

@uturuncoglu
Collaborator Author

@jedwards4b Do you want me to go ahead and merge this PR, or do you have any idea about the difference between SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel.datm-1PT and SMS_Vnuopc_Ld5_P1.1x1_mexicocityMEX.2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP.cheyenne_intel? Let me know what you think.

@jedwards4b
Contributor

@uturuncoglu You need to merge master into your branch for all of the cdeps tests to pass.

<test compset="2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP" grid="1x1_mexicocityMEX" name="SMS_Vnuopc_Ld5_P1" testmods="datm/1PT">
The test was changed to add this required testmod.
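
To illustrate why the two test names in this thread differ, here is a small sketch that builds a full test name from such a testlist entry. The naming pattern (fields joined by dots, with the `/` in testmods replaced by `-`) is inferred from the names quoted in this conversation, not taken from the CIME source, and `full_test_name` is a hypothetical helper. The element is written as self-closing here so that it parses on its own.

```python
import xml.etree.ElementTree as ET


def full_test_name(test_xml, machine, compiler):
    """Build name.grid.compset.machine_compiler[.testmods] from a <test> entry."""
    elem = ET.fromstring(test_xml)
    parts = [
        elem.get("name"),
        elem.get("grid"),
        elem.get("compset"),
        f"{machine}_{compiler}",
    ]
    testmods = elem.get("testmods")
    if testmods:
        # Assumed convention: "datm/1PT" appears as the suffix "datm-1PT".
        parts.append(testmods.replace("/", "-"))
    return ".".join(parts)


entry = (
    '<test compset="2000_DATM%1PT_SLND_SICE_SOCN_SROF_SGLC_SWAV_SESP" '
    'grid="1x1_mexicocityMEX" name="SMS_Vnuopc_Ld5_P1" testmods="datm/1PT"/>'
)
print(full_test_name(entry, "cheyenne", "intel"))
```

Without the testmods attribute the same helper yields the shorter name that ends in cheyenne_intel, which matches the failing test reported earlier in the thread.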

@uturuncoglu
Collaborator Author

@jedwards4b Thanks. Okay. I'll do it once Cheyenne is back and test again.

@uturuncoglu uturuncoglu merged commit e9e072d into ESCOMP:master Jul 26, 2021
@uturuncoglu uturuncoglu deleted the feature/pr branch July 26, 2021 16:34
6 participants