-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[develop] Introduce DA data preprocessing to workflow #778
[develop] Introduce DA data preprocessing to workflow #778
Conversation
- "archive_internal_dir" must be an empty string for top-level archive files - Rename "file_type" argument to "file_fmt" - Rename "external_model" argument to more generic/correct "data_type" - If only some files can not be un-zipped, do not fail. We'll handle this more gracefully later if it needs to be an error - Remove default location logic for now; couldn't get it working - Re-add explicit disk path requirement; may re-visit later - For fatal error, do not imply that script continues - Remove debug prints
"rap_e" files in addition to "rap".
specifically HPSS fails with some kind of logging error. Needs more investigation. This commit removes some un-needed, un-initialized variables, adds modulefiles for the new task on each platform, and adds some unit tests for retrieve_data.py to test new capabilities. The new task entry has been added to parm/wflow/coldstart.yaml for testing, but it should be removed prior to merging this PR since the task is not integrated with the rest of the workflow yet.
… tasks going forward. Clearly the boundary conditions tasks will need to eventually change also.
- Link obs needed for task_process_bufr - Put obs in $DATA/obs rather than $DATA/gsi - Put in logic for getting "rap_e" files at 00 and 12z cycles; I hope we can extract this logic into the config file later with Jinja - Add "task_process_bufr" entry to warmstart.yaml with a dependency on task get_da_obs - Add "CYCLE_TYPE" variable to config_defaults - Start a new config.rrfs.yaml file
- Set machine-specific paths in machine files, this should have been obvious... - Use RRFS_NA_13km domain for RRFS - Re-organize Jet and Hera machine files for section consistency - Fix grammar error in retrieve_data.py
- Move date variables to J-job level - Keep da_obs in same directory for community and nco mode (since process_bufr does the same) - In process_bufr, look for da_obs in staging directory
- Add default directory to Jet machine file...only available there for now - Add task_process_lightning to warmstart workflow
…ve to Hera. Also fix config.rrfs.yaml to better settings for test case.
…to avoid duplicate error messages, search for RRFS data on disk as well as hpss
Removing the CYCLE_TYPE from config_defaults because it is a run-time, cycle-dependent setting, not an experiment level one.
This is similar to what is done for external model files so that we can use the machine file information about which data stores a platform has access to.
Matching new standard outlined in Issue ufs-community#633
…obs_task Had to manually revert some renaming changes in ush/test_retrieve_data.py
@mkavulich Following the merging of PR #728 this morning, there is now a conflict in |
…ata isnt ready by 00z two days later
…l problems when running all unit tests
@MichaelLueken I have merged in the latest changes from develop, and fixed some problems with the merged unit tests. I can not figure out why the linter is unhappy (I ran the linter on Jet for develop and got 9.74/10, so I don't know what I'm supposed to be aiming for). I have asked @christinaholtNOAA to take a look. |
@mkavulich It looks like the new linting GitHub Action test gave you a score of 9.98/10. I see the following message:
|
@mkavulich may have been the first one to check my work on-prem. I've pinned the GitHub actions to pylint v2.6, which is consistent with what's available in an old version of regional_workflow on Jet/Hera. I need to open an Issue/PR and bump that version to the pylint version available in the appropriate supported environments. This should help folks debug locally before pushing. I'm sorry for the inconvenience. Edit: I opened Issue #787 to document the need for this change. |
Ran the |
Approving based on our talk yesterday that the |
It looks like many of the major platform/compiler combos passed the Jenkins tests and Gaea's lack of Intel compiler module held up the others (best I can tell). Can anyone help us get this PR through the pipeline given that the PR itself looks fine, but not Gaea? |
@christinaholtNOAA - Gaea has been updated, and new regression tests now pass with the newly updated stack for the C3 and C4 partitions, see WM issue-1755 (1) At this point it is a matter of updating the ufs-weather-model repository pointing to the new software stack. |
Okay, but what does that mean for this PR? |
@christinaholtNOAA - Could Gaea possibly be added to the testing process? (still pending updates to the stack location) |
@natalie-perlin Are you suggesting that we update the version of the weather model in this PR to make the Gaea tests pass? That is way outside the scope of the work in this PR, and it would be great if we could find a solution that lets us merge the current PR as-is since the failures are due to factors outside our control and we have the approvals. We're also only adding new capabilities here, not modifying old ones. |
@christinaholtNOAA - Oh, the question was whether any changes in machine file for Gaea are expected, but not implemented due to issues following the upgrade. If no changes are expected for Gaea, and it only holds the PR from being fully passed, then no objections! |
On Cheyenne GNU, the The Gaea tests will fail until PR #799 is merged. On Hera GNU, the On Hera Intel, the Since all of the failures are expected, I will go ahead and merge this work now. |
…community#778) * Construct var_defns components from dictionary. * Bring back config_defaults.yaml * Add support for sourcing yaml file into shell script. * Remove newline for printing config, json config fix. * Make QUILTING a sub-dictionary in predef_grids * Reorganize config_defaults.yaml by task and feature. * Bug fix with QUILTING=true. * Structure a dictionary based on a template dictionary. * Convert all WE2E config files to yaml. * Take care of problematic chars when converting to shell string. * Process only selected keys of config. * Add symlinked yaml config files. * Actually use yaml config files for WE2E tests. * Delete all shell WE2E configs. * Don't check for single quotes in test description. * Make WE2E work with yaml configs. * Make yaml default config format. * Bug fix in run_WE2E script. * Add utility to check validity of yaml config file. * Add config utility interface in ush directory. * Remove unused check_expt_config_vars script. * Add description to default config. * Reorganize source_config. * Add XML as one of the config formats. * Update custom_ESGgrid config. * Bug fix due to update. * Change ensemble seed. * Change POST_OUTPUT group due to merge. * Make xml and ini configs work. * Maintain config structure down to var_defns. * Add function to load structured shell config, put description under metadata * Flatten dicts before importing env now that shell config is structured. * Support python regex for selecting dict keys. * Add capability of sourcing task specific portion of config file. * Access var_defns via env variable. * Make names of tasks consistent with ex- and j- job script names. * Append pid to temp file. * Prettify user config, don't use " in xml texts. * Compare timestamp of csv vs all files instead of directory. * Fixes for some pylint suggestions. * Convert new configs to yaml. * Format python files with black (no functional change). * More readable yaml/json formats by using more data types. Only datetime type is now in quotes. * More readable yaml config files for WE2E and default configs. * Make config_defaults itself more readable. * Correct pyyaml list indentation issue. * Fix indentation in all config files. * Use unquoted WTIME in config_defaults * Cosmotic changes. * Fix due to merge. * Make __init__.py clearer. * Fixes due to merge. * Minor edits of comments. * Remove wcoss_dell_p3 from workflow (ufs-community#810) * remove wcoss_dell_p3 * remove block for tide and gyre * Replace deprecated NCAR python environment with conda on Cheyenne (ufs-community#812) * Fix issue on get_extrn_lbcs when FCST_LEN_HRS>=40 with netcdf (ufs-community#814) * activate b file on hpss for >40h * add a new we2e test for fcst_len_hrs>40 * reduce fcst time for we2e * Convert new test case to yaml. * Fix formatting due to merge. * Convert new test case to yaml. * Fix unittest. * Merge develop * Remove exception logic from __init__.py * Minor change to cmd concat. * Make grid gen methods return dictionary, simplifis code a lot. * Add a comment why we are suppressing yaml import exception. * Minor change to beautify unittest output. * Add status badge for functional tests. * Reorder tasks in config_default and we2e test cases to match order in FV3LAM.xml * Keep single quotes and newlines in we2e test description. * Revert back to not rounding to 10 digits Co-authored-by: Chan-Hoo.Jeon-NOAA <[email protected]> Co-authored-by: Michael Kavulich <[email protected]>
DESCRIPTION OF CHANGES:
This is a set of contributions as part of the RRFS/dev merge. The major capabilities include:
get_da_obs
) that retrieves and stages observation files for the subsequent data processing tasks. Most of this logic is existing consolidated logic that has been moved out of theprocess*
tasks.get_da_obs
,process_radarref
,process_lightning
, andprocess_bufrobs
tasksconfig.rrfs.yaml
file for RRFS-specific options; this will be expanded over time as more capabilities are added.process_obs
) to test these initial observation processing tasks. Note that in order to run this test, the app must be built with the rrfs_utils and gsi builds, which is not part of the default build at this time.run_WE2E_tests.py
--file_type
and--external_model
arguments to--file_fmt
and--data_type
, respectively. These are less confusing option names since both model data and obs can be retrieved with this script.Type of change
TESTS CONDUCTED:
Ran fundamental tests on Hera and Jet, all passed. Ran new "process_obs" test on Hera and Jet that just tests the observation preprocessing tasks, confirmed that all expected tasks ran and completed successfully.
DEPENDENCIES:
None
DOCUMENTATION:
None
ISSUE:
Resolves #675
CHECKLIST
CONTRIBUTORS (optional):
Workflow changes contributed by @christinaholtNOAA.