[develop] Add -n 1 to allow the use of the service partition #1012

Merged

MichaelLueken
Collaborator

DESCRIPTION OF CHANGES:

Following the Slurm update on Hera and Jet, the service partition is no longer usable within the SRW App. This PR restores it by adding -n 1 to the SCHED_NATIVE_CMD_HPSS variable in the Hera and Jet machine YAML files and by updating the native entry of the get_* tasks in parm/wflow/verify_pre.yaml.
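
For readers unfamiliar with the machine files, the change to SCHED_NATIVE_CMD_HPSS might look roughly like the fragment below. This is a sketch only, not the actual diff: the file path, the enclosing `platform:` key, and the `--export=NONE` flag are assumptions, and only the variable name and the added `-n 1` come from the description above. In Slurm, `-n` is the short form of `--ntasks`, so `-n 1` requests a single task.

```yaml
# Hypothetical excerpt of a machine file (path assumed: ush/machine/hera.yaml).
# Only SCHED_NATIVE_CMD_HPSS and the added "-n 1" are taken from this PR's
# description; the surrounding key and the other flag are assumptions.
platform:
  # "-n 1" asks Slurm for exactly one task (short form of --ntasks=1).
  SCHED_NATIVE_CMD_HPSS: "-n 1 --export=NONE"
```

The same `-n 1` request would also need to reach the `native` entry of the get_* tasks in parm/wflow/verify_pre.yaml, which this PR updates separately.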

Type of change

  • Bug fix (non-breaking change which fixes an issue)

TESTS CONDUCTED:

  • hera.intel
  • jet.intel
  • fundamental test suite

ISSUE:

Fixes #1011

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes

Collaborator

@RatkoVasic-NOAA RatkoVasic-NOAA left a comment

Looks good to me.

Collaborator

@christinaholtNOAA christinaholtNOAA left a comment

LGTM.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label Feb 9, 2024
@MichaelLueken
Collaborator Author

The manually run WE2E coverage tests on Derecho have completed successfully:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
custom_ESGgrid_IndianOcean_6km_20240209072225                      COMPLETE              23.89
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot_20  COMPLETE              37.05
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024020907222  COMPLETE              44.84
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_HRRR_20240209  COMPLETE              28.82
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE              17.61
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024020907223  COMPLETE              40.24
nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_  COMPLETE              24.03
pregen_grid_orog_sfc_climo_20240209072238                          COMPLETE              14.45
specify_template_filenames_20240209072241                          COMPLETE              14.56
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             245.49

@MichaelLueken
Collaborator Author

The SRW App cannot currently run on Gaea C5 following the transition from the F2 to the F5 filesystem, so the Jenkins tests on Gaea C5 had to be aborted. The tests on Hera GNU, Hera Intel, Hercules, Jet, and Orion have passed.

I manually ran the Gaea C5 coverage tests on Hercules to ensure that they continue to run. These tests passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
community_20240209135559                                           COMPLETE              25.63
custom_ESGgrid_NewZealand_3km_20240209135600                       COMPLETE              43.77
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE              23.14
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP_20240209135  COMPLETE              24.64
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR_2024020913  COMPLETE              26.77
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thompson  COMPLETE             286.21
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024020  COMPLETE              21.85
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_20  COMPLETE             284.97
grid_SUBCONUS_Ind_3km_ics_RAP_lbcs_RAP_suite_RRFS_v1beta_plot_202  COMPLETE              10.56
nco_ensemble_20240209135607                                        COMPLETE              75.37
nco_grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thom  COMPLETE             281.62
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            1104.53

Now merging this work.

@MichaelLueken MichaelLueken merged commit 6874ce8 into ufs-community:develop Feb 9, 2024
2 of 4 checks passed
willmayfield added a commit to willmayfield/ufs-srweather-app that referenced this pull request Feb 11, 2024
[mumip_ufs] Add -n 1 to allow the use of the service partition (ufs-community#1012)
@MichaelLueken MichaelLueken deleted the bugfix/service_update branch February 15, 2024 19:36
Development

Successfully merging this pull request may close these issues.

[develop] The service partition no longer works on Hera and Jet following the Slurm updates