Resource adjustments and updated obsproc/prepobs packages for tcvitals bug fix #862
Conversation
- Increase memory request values for some non-exclusive jobs that have been getting Cgroup mem warning messages on WCOSS2.
- Translate additional high res resource values from config.resources.nco.static into config.resources.emc.dyn.
Refs: NOAA-EMC#665, NOAA-EMC#744

- Adjust/increase memory requests for some non-exclusive jobs that were getting the Cgroup mem warning messages on WCOSS2.
- Some additional memory adjustments to wave jobs in resource configs.
Refs: NOAA-EMC#665, NOAA-EMC#744

- Update obsproc_run_ver to 1.0.2-rd
- Update prepobs_run_ver to 1.0.1-rd
New obsproc/prepobs versions include tcvitals bug fix.
Refs NOAA-EMC#665
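For illustration, the version bumps in the last commit might look like the following in one of the versions/*.ver files. This is a sketch that assumes those files are shell fragments made of `export` lines; only the two variable names and their values come from the commit message.

```shell
# Hypothetical excerpt of a versions/*.ver file (file layout is an assumption).
export obsproc_run_ver=1.0.2-rd   # new obsproc package with the tcvitals bug fix
export prepobs_run_ver=1.0.1-rd   # new prepobs package with the tcvitals bug fix
```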
Nothing major, but a few questions.
export npe_node_fcst=32
export npe_node_fcst_gfs=42
export npe_node_fcst_gfs=24
How do the values on L200 and L199 differ from the calculations on L195 and L196?
L195 = 128/3 ≈ 42.67, which rounds up to 43 in the xml
L196 = 128/5 = 25.6, which rounds up to 26 in the xml
Neither value lays nicely across the WCOSS2 nodes.
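The arithmetic in this comment can be sketched as follows. It assumes 128 is the per-node core count on WCOSS2 and that 3 and 5 are the thread counts of the two forecast jobs; neither is stated explicitly in the thread. The helper name `ceil_div` is made up for illustration.

```shell
#!/usr/bin/env bash
# Round-up integer division: ceil(a / b) for positive integers a and b.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

cores_per_node=128   # assumed per-node core count on WCOSS2

for nth in 3 5; do   # assumed thread counts for the two forecast jobs
  tasks=$(ceil_div "${cores_per_node}" "${nth}")
  # Report how many cores the rounded-up task count would need on one node.
  echo "threads=${nth} tasks_per_node=${tasks} cores_needed=$(( tasks * nth ))"
done
```

Under these assumptions the rounded-up counts (43 and 26) would need 129 and 130 cores, more than the 128 available, which is one possible reading of why neither value fits cleanly on a node.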
George V and I did a lot of testing to get the values on L199 and L200. We want users using these values for C768 on WCOSS2. May also want them for lower resolutions too but haven't fully vetted those resolutions on WCOSS2 yet.
This reminded me I forgot to adjust the C768 block in config.fv3.emc.dyn with the tested values for layout_x_gfs and WRTTASK_PER_GROUP_GFS from config.fv3.nco.static. Please see that change in this PR now too.
Ok. So these values are specific to C768.
Maybe then please add a note to that effect.
Done. I added a check for CASE=768 to make sure those values aren't used outside of high res on WCOSS2.
- Update the C768 block in config.fv3.emc.dyn with the tested values from config.fv3.nco.static.
- Will be used on WCOSS2 for C768 but not on R&D machines because of npe_node_max checks.
Refs: NOAA-EMC#665
Looks good. Just add a note about C768 for those calculations and that they will be revised for other resolutions when we get to it.
- Add a check that CASE=768 in the fcst block of config.resources.emc.dyn where specific values of npe_node_fcst are set for WCOSS2.
Refs: NOAA-EMC#665
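A minimal sketch of what such a guard could look like. The actual change is not shown in this thread, so the variable names `machine`, `nth_fcst`, and `nth_fcst_gfs`, the `C768` spelling of CASE, the default values, and the fallback calculation are all assumptions; only the values 32 and 24 are taken from the diff lines quoted in the review.

```shell
#!/usr/bin/env bash
# Hypothetical guard; names and defaults below are illustrative assumptions.
machine=${machine:-WCOSS2}
CASE=${CASE:-C768}
npe_node_max=${npe_node_max:-128}
nth_fcst=${nth_fcst:-3}
nth_fcst_gfs=${nth_fcst_gfs:-5}

# Fallback (assumed): derive tasks per node from cores per node and threads per task.
npe_node_fcst=$(( npe_node_max / nth_fcst ))
npe_node_fcst_gfs=$(( npe_node_max / nth_fcst_gfs ))

# Override with the tested values only for C768 on WCOSS2.
if [ "${machine}" = "WCOSS2" ] && [ "${CASE}" = "C768" ]; then
  npe_node_fcst=32
  npe_node_fcst_gfs=24
fi
export npe_node_fcst npe_node_fcst_gfs

echo "npe_node_fcst=${npe_node_fcst} npe_node_fcst_gfs=${npe_node_fcst_gfs}"
```

Keeping the override behind both conditions means other resolutions and other machines continue to use the calculated values until they have been vetted.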
Description
This PR includes memory updates and new obsproc/prepobs package versions:
- Increase memory requests for non-exclusive jobs that were getting the Cgroup mem limit exceeded warning message on WCOSS2.
- Translate high-resolution values from config.resources.nco.static to config.resources.emc.dyn and from config.fv3.nco.static to config.fv3.emc.dyn.
- Update obsproc_run_ver in versions/*.ver files for WCOSS2, Hera, Orion.
- Update prepobs_run_ver in versions/*.ver files for WCOSS2, Hera, Orion.
New obsproc/prepobs packages have been installed on WCOSS2, Hera, and Orion.
Will merge these changes into the release/gfs.v16.3.0 branch next.
Type of change
How Has This Been Tested?
Refs #665, #744