Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[develop] Split EnsembleStat vx tasks into GenEnsProd and EnsembleStat #809

Merged
merged 45 commits into from
Jun 5, 2023

Conversation

gsketefian
Copy link
Collaborator

@gsketefian gsketefian commented May 23, 2023

DESCRIPTION OF CHANGES:

The latest versions of MET/METplus have introduced a new tool named GenEnsProd that performs many of the ensemble-related calculations that EnsembleStat used to do. Although EnsembleStat can still perform these calculations, that will change in future versions of MET/METplus. Thus, this PR creates new vx tasks that call GenEnsProd to perform the necessary calculations, removing them from the EnsembleStat tasks. Details:

  • Add new METplus config files for GenEnsProd.
  • Modify exregional_run_met_genensprod_or_ensemblestat.sh to enable it to call GenEnsProd (as well as EnsembleStat).
  • Modify the EnsembleStat config files so that EnsembleStat no longer calculates the ensemble statistics that are now the purview of GenEnsProd.
  • Modify the ex-scripts for the ensemble mean and probabilistic tasks so that they look for forecast input from the output of the GenEnsProd tasks, not the output of the EnsembleStat tasks.
  • Modify all the METplus config files so that they are as similar to each other as possible (for possible consolidation later on).
  • For clarity and correctness, rename the workflow configuration files verify.yaml and verify_ensgrid.yaml to verify_det.yaml and verify_ens.yaml, respectively.
  • Add new yaml configuration file named verify_pre.yaml that contains (meta)tasks that are prerequisites for both deterministic and ensemble vx (these tasks were previously included in verify_det.yaml).
  • Modify the default ensemble vx workflow configuration file verify_ens.yaml to include the new GenEnsProd tasks.
  • Modify verify_det.yaml and verify_ens.yaml so that vx task dependencies can be formulated more precisely, e.g. have dependencies on member-specific tasks instead of whole metatasks. This includes splitting metatasks into two or more other ones, moving tasks from one metatask to another, etc.
  • Modify the WE2E configuration files for the four vx tests (MET_verification, MET_verification_only_vx, MET_ensemble_verification, and MET_ensemble_verification_only_vx) to reflect the changes in the default vx workflow configuration files verify_pre.yaml, verify_det.yaml, and verify_ens.yaml.
  • Introduce the new experiment variable VX_FIELDS to enable the user to specify the fields or groups of fields on which to perform verification.
  • Introduce the new experiment variable VX_APCP_ACCUMS_HH to enable the user to specify the accumulation hours for which to perform verification on accumulated precip (APCP).
  • Modify setup.py to remove any [meta]tasks for which fields are not specified in VX_FIELDS.
  • Modify setup.py so that for configuration parameters that are arrays/lists, each value of the list is checked to ensure it is valid (if that variable has a corresponding valid_vals_ array in valid_param_vals.yaml).
  • Make modifications to vx ex-scripts as well as config_defaults.sh to ensure that the vx tests succeed in NCO mode.
  • Make sure that the experiment variable ENS_TIME_LAG_HRS is set and accessed correctly for both deterministic and ensemble configurations. For a deterministic configuration, this is a 1-element array containing the time lag (if any, in units of hours), while for an ensemble configuration it contains NUM_ENS_MEMBERS elements containing the time lag for each element.
  • Add a new WE2E test (MET_ensemble_verification_only_vx_time_lag) that performs deterministic and ensemble verification on a 2-member ensemble with one time-lagged and one non-time-lagged member.
  • Add the existing WE2E test MET_ensemble_verification_only_vx and the new test MET_ensemble_verification_only_vx_time_lag to the list of comprehensive WE2E tests.
  • Include hour 0 in the verification of MRMS and NDAS fields.
  • Fix bugs in exregional_get_obs_[ccpa|mrms|ndas].sh ex-scripts.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

I ran the following vx WE2E tests on Hera with intel, both in community mode and in NCO mode:

MET_verification
MET_verification_only_vx
MET_ensemble_verification
MET_ensemble_verification_only_vx
MET_ensemble_verification_only_vx_time_lag

I also ran the following fundamental suite of tests:

grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_GFS_v16
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16

All tests were successful.

DOCUMENTATION:

Documentation is needed but will be added once the whole set of vx PRs has been merged (see Issue #630).

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

gsketefian added 25 commits May 12, 2023 14:00
… now done in GenEnsProd (to avoid duplication).
…e output of GenEnsProd (instead of EnsembleStat, which no longer generates the necessary outputs).
…e desired dependencies possible; remove unneeded variables; change vx WE2E test config files to be compatible with aforementioned changes.
…hich fields or group of fields on which to perform verification.
…er or not to run deterministic and ensemble verification, respectively.
…lus config files (and in order for the final config files to have the correct values), change the way the variable ENSMEM_INDX is passed into METplus config files from an environment variable to a jinja variable.
1) Replace "mem${ensmem_indx}" with new variable "${ensmem_name}".
2) Remove DOT_ENSMEM_OR_NULL since ensmem_name can now be used instead.
3) Replace SLASH_ENSMEM_SUBDIR_OR_NULL with slash_ensmem_subdir_or_null.
4) Remove unneeded code from exregional_run_met_pb2nc_obs.sh.
@gsketefian gsketefian changed the title [develop] Split EnsembleStat tasks into GenEnsProd and EnsembleStat [develop] Split EnsembleStat vx tasks into GenEnsProd and EnsembleStat May 23, 2023
@gsketefian gsketefian marked this pull request as ready for review May 23, 2023 07:58
…ctionality is replaced by inclusion/exclusion of workflow taskgroup yaml configuration files.
…of ensemble vx tasks in the workflow (by making sure that the names of the ensemble vx tasks are read in from the appropriate yaml file instead of being hard-coded).
…mble and deterministic forecasts) and the way the time lag is calculated for a given ensemble member (or the deterministic forecast).
@gsketefian
Copy link
Collaborator Author

@christinaholtNOAA I think I addressed all of your concerns except the renaming of the vx tasks (Issue #633).
In the interests of getting this PR in so others who are working on further mods to vx can start their work and to not complicate this PR any further, I'd like to address that in a separate PR (also I'd like to consult @michelleharrold, @JeffBeck-NOAA, @willmayfield, and @mkavulich before I rename to make sure we agree). Hope that is ok with you. Thanks.

@JeffBeck-NOAA
Copy link
Collaborator

@christinaholtNOAA I think I addressed all of your concerns except the renaming of the vx tasks (Issue #633). In the interests of getting this PR in so others who are working on further mods to vx can start their work and to not complicate this PR any further, I'd like to address that in a separate PR (also I'd like to consult @michelleharrold, @JeffBeck-NOAA, @willmayfield, and @mkavulich before I rename to make sure we agree). Hope that is ok with you. Thanks.

@gsketefian, thanks for all your work on this PR! I agree that we can tackle the renaming of the vx tasks in a separate PR after we consult with others. I'd personally be fine with just removing the "run" portion of the ex-script and j-job names for all vx tasks.

gsketefian added 10 commits May 26, 2023 13:28
…ation_only_vx_time_lag" to look for the staged forecast data from the default location for the machine; also, change the predefined grid name to the one on which this staged data is provided (this is necessary to form the correct path to the data).
…g any time-lagging) to names of GenEnsProd and EnsembleStat output files.
…GenEnsProd task, not the EnsembleStat task. Also, change order of metatasks so that the one for MRMS comes before the one for NDAS.
Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsketefian These changes look good to me! I was also able to successfully run the new MET_ensemble_verification_only_vx_time_lag test, as well as the rest of the verification WE2E tests on Hera.

@MichaelLueken
Copy link
Collaborator

@gsketefian I will go ahead and submit the automated Jenkins testing on this work now. If @christinaholtNOAA requests additional changes, then the tests can be resubmitted at that time.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label Jun 2, 2023
Copy link
Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsketefian Thanks for addressing my concerns! This looks great.

@MichaelLueken
Copy link
Collaborator

@gsketefian The automated Jenkins tests encountered failures:

Cheyenne GNU - the nco_grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thompson_mynn_lam3km test failed (detailed in issue #806).

Gaea - SRW won't build following the latest maintenance on the machine. Will be turning off Gaea testing in #799 until the UFS-WM has merged the new Gaea modulefile changes.

Hera Intel - the grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta test failed (detailed in issue #805).

I saw that the there were two failures in the Hera GNU test - get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_nemsio_2019061200 and get_from_NOMADS_ics_FV3GFS_lbcs_FV3GFS. These tests shouldn't be failing, so I have resubmitted them. I will let you know if they continue to fail.

I will also run the Jenkins tests manually on Orion, since the machine requires git/2.28.0 to be loaded, but since git/2.28.0 isn't available on Hercules and Hercules and Orion are the same machine, the git/2.28.0 module load is constantly removed from the role-epic-ps account's .bashrc file. I will let you know if there are any failures.

@MichaelLueken
Copy link
Collaborator

@gsketefian Both the manual run of the Jenkins tests on Orion and the rerun of Hera Intel automated Jenkins tests have all successfully passed. I will now move forward with merging this work to develop.

@MichaelLueken MichaelLueken merged commit e3bb9ea into ufs-community:develop Jun 5, 2023
@gsketefian
Copy link
Collaborator Author

@MichaelLueken Thanks for running the tests and merging!

michelleharrold pushed a commit to michelleharrold/ufs-srweather-app that referenced this pull request Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants