Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[develop] Add workflow for radar reflectivity processing. #807

Merged
merged 52 commits into from
May 30, 2023

Conversation

christinaholtNOAA
Copy link
Collaborator

DESCRIPTION OF CHANGES:

Add MRMS data to the get_da_obs task, and turn on the radar reflectivity processing task in the workflow for data assimilation pre-processing.

This is an addition of a new feature that should not impact any other features (outside recent RRFS-feature additions)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

none

DOCUMENTATION:

None.

ISSUE:

Issue #782

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

mkavulich and others added 30 commits April 19, 2023 20:27
 - "archive_internal_dir" must be an empty string for top-level archive
   files
 - Rename "file_type" argument to "file_fmt"
 - Rename "external_model" argument to more generic/correct "data_type"
 - If only some files can not be un-zipped, do not fail. We'll handle
   this more gracefully later if it needs to be an error
 - Remove default location logic for now; couldn't get it working
 - Re-add explicit disk path requirement; may re-visit later
 - For fatal error, do not imply that script continues
 - Remove debug prints
specifically HPSS fails with some kind of logging error. Needs more
investigation.

This commit removes some un-needed, un-initialized variables, adds
modulefiles for the new task on each platform, and adds some unit tests
for retrieve_data.py to test new capabilities.

The new task entry has been added to parm/wflow/coldstart.yaml for
testing, but it should be removed prior to merging this PR since the
task is not integrated with the rest of the workflow yet.
… tasks going forward. Clearly the boundary conditions tasks will need to eventually change also.
 - Link obs needed for task_process_bufr
 - Put obs in $DATA/obs rather than $DATA/gsi
 - Put in logic for getting "rap_e" files at 00 and 12z cycles; I hope
we can extract this logic into the config file later with Jinja
 - Add "task_process_bufr" entry to warmstart.yaml with a dependency on
task get_da_obs
 - Add "CYCLE_TYPE" variable to config_defaults
 - Start a new config.rrfs.yaml file
 - Set machine-specific paths in machine files, this should have been
obvious...
 - Use RRFS_NA_13km domain for RRFS
 - Re-organize Jet and Hera machine files for section consistency
 - Fix grammar error in retrieve_data.py
 - Move date variables to J-job level
 - Keep da_obs in same directory for community and nco mode (since
process_bufr does the same)
 - In process_bufr, look for da_obs in staging directory
 - Add default directory to Jet machine file...only available there for
now
 - Add task_process_lightning to warmstart workflow
…ve to Hera. Also fix config.rrfs.yaml to better settings for test case.
…to avoid duplicate error messages, search for RRFS data on disk as well as hpss
Removing the CYCLE_TYPE from config_defaults because it is a run-time,
cycle-dependent setting, not an experiment level one.
This is similar to what is done for external model files so that we can
use the machine file information about which data stores a platform has
access to.
Matching new standard outlined in Issue ufs-community#633
…obs_task

Had to manually revert some renaming changes in ush/test_retrieve_data.py
@MichaelLueken MichaelLueken linked an issue May 22, 2023 that may be closed by this pull request
Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA Overall, these changes look good. I was able to run the process_obs test on both Hera and Jet and the test successfully ran on both machines.

I have noted that the paths on ush/machine/jet.yaml should use /mnt/lfs4, rather than /scratch2.

I also encountered failures running the fundamental tests on both Hera and Jet. The nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_GFS_v16 test fails on both machines in the make_ics/lbcs step on both machines. Removing the expand_source_paths and going back to source_paths in ush/retrieve_data.py allows the test to run smoothly. It's not clear to me why the modifications you made to expand_source_paths are causing the necessary data not to be pulled.

ush/machine/jet.yaml Outdated Show resolved Hide resolved
ush/machine/jet.yaml Outdated Show resolved Hide resolved
ush/retrieve_data.py Show resolved Hide resolved
Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA Thank you very much for finding the issue with the failure of the test so quickly! The fundamental tests were rerun and all successfully passed. As noted previously, the process_obs test was successfully run on Hera and Jet and now that the fundamental tests are passing, I will give my approval to this PR.

@christinaholtNOAA
Copy link
Collaborator Author

@MichaelLueken Thanks for bringing my attention to it and rerunning the tests! I opened a separate PR to more directly address the fix, but wanted to also test it with this PR to make sure it wasn't a problem here.

@christinaholtNOAA
Copy link
Collaborator Author

Is anyone else available to provide a 2nd review for this PR? It would be helpful to get it merged so that the cycled DA work could follow it in.

Copy link
Collaborator

@EdwardSnyder-NOAA EdwardSnyder-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments in the get_mrms_file.sh script.

When retrieving the da data from either NSSLMOSAIC or RAP_OBS_BUFR, these locations on Hera and Jet are the current live stream and not a historical location? Is that correct?

do
min=$( printf %2.2i $((timelevel+min)) )
echo "Looking for data valid:"${YYYY}"-"${MM}"-"${DD}" "${cyc}":"${min}
nsslfiles=${NSSL}/*${mrms}_00.50_${YYYY}${MM}${DD}-${cyc}${min}??.${OBS_SUFFIX}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the '00.50' need to be a wildcard? In line 50, it is a wildcard.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the original logic, I believe this is the level that needs to exist before all the others are gathered.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. That makes sense but wasn't sure. Thanks for clarifying and making those changes. Approving.

for min in ${RADARREFL_MINS[@]}
do
min=$( printf %2.2i $((timelevel+min)) )
echo "Looking for data valid:"${YYYY}"-"${MM}"-"${DD}" "${cyc}":"${min}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is a reason behind all these double quotes, but you could just have one set of double quotes around this string.

# 10 represents a significant number of vertical levels of data
if [ ${numgrib2} -ge 10 ] && [ ! -e filelist_mrms ]; then
cp_vrfy ${NSSL}/${nsslfile1} ${output_path}
ls ${nsslfile1} > ${output_path}/filelist_mrms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you need to add ${NSSL}/ in front of ${nsslfile1} in order for this command to work.

if [ ${numgrib2} -ge 10 ] && [ ! -e filelist_mrms ]; then
cp_vrfy ${NSSL}/${nsslfile1} ${output_path}
ls ${nsslfile1} > ${output_path}/filelist_mrms
echo 'Copying mrms files for ${YYYY}${MM}${DD}-${cyc}${min}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prints out: Copying mrms files for ${YYYY}${MM}${DD}-${cyc}${min}. Replacing the single quotes with double quotes will allow the variables to be substituted.

if [ -s $nsslfile ]; then
echo 'Found '${nsslfile}
nsslfile1=*${mrms}_*_${YYYY}${MM}${DD}-${cyc}${min}*.${OBS_SUFFIX}
numgrib2=${#nsslfiles1}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be nsslfile1

@MichaelLueken
Copy link
Collaborator

@christinaholtNOAA Since potential changes to the ush/get_mrms_files.sh script won't affect current WE2E tests, I will go ahead and submit the automated Jenkins tests at this time. However, this PR won't be merged until the comments made by @EdwardSnyder-NOAA have been resolved.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label May 26, 2023
@christinaholtNOAA christinaholtNOAA self-assigned this May 30, 2023
@MichaelLueken
Copy link
Collaborator

MichaelLueken commented May 30, 2023

@christinaholtNOAA The Jenkins functional tests failed on Hera and due to issues on Orion, the coverage WE2E tests failed there as well. The coverage WE2E tests were run on Hera and Orion. The tests successfully passed. Once you and @EdwardSnyder-NOAA are happy with the changes in ush/get_mrms_files.sh and the conflict in ush/retrieve_data.py has been addressed (likely due to PR #810 having been merged before this work), then I will move forward and merge this update.

@christinaholtNOAA
Copy link
Collaborator Author

Thanks @MichaelLueken. I just finished the fundamental tests after the conflict resolution and addressing @EdwardSnyder-NOAA's comments.

@MichaelLueken
Copy link
Collaborator

@EdwardSnyder-NOAA Please look over @christinaholtNOAA's latest modifications to ush/get_mrms_files.sh. If you are happy with this update, I will go ahead and merge this PR. Thanks!

@MichaelLueken
Copy link
Collaborator

@EdwardSnyder-NOAA Thank you very much for looking back over the changes! @christinaholtNOAA I will now proceed with merging this work.

@MichaelLueken MichaelLueken merged commit 696f080 into ufs-community:develop May 30, 2023
@christinaholtNOAA christinaholtNOAA deleted the add_radar_obs branch July 2, 2024 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Get and process radar data in separate workflow tasks.
5 participants