Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review and revise the warning messages when running the MET unit tests. #1921

Closed
20 tasks done
JohnHalleyGotway opened this issue Sep 16, 2021 · 2 comments · Fixed by #1935 or #1937
Closed
20 tasks done

Review and revise the warning messages when running the MET unit tests. #1921

JohnHalleyGotway opened this issue Sep 16, 2021 · 2 comments · Fixed by #1935 or #1937
Assignees
Labels
component: code cleanup Code cleanup and maintenance issue priority: medium Medium Priority requestor: METplus Team METplus Development Team type: task An actionable item of work
Milestone

Comments

@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Sep 16, 2021

Describe the Task

Recent changes have introduced many new warning messages to the output of MET. This task is to run all of the commands for the unit tests, save the output to a log file, and count up the warning messages. For each warning message, decide whether or not it's useful and should remain.

If not, modify the code to either remove it entirely or change it to a high level debug message instead.

Time Estimate

1 day.

Sub-Issues

Consider breaking the task down into sub-issues.
No sub-issues.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required: @JohnHalleyGotway
  • Select scientist(s) or no scientist required: none needed

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Task Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added component: code cleanup Code cleanup and maintenance issue type: task An actionable item of work priority: medium Medium Priority alert: NEED ACCOUNT KEY Need to assign an account key to this issue requestor: METplus Team METplus Development Team labels Sep 16, 2021
@JohnHalleyGotway JohnHalleyGotway added this to the MET 10.1.0 milestone Sep 16, 2021
@JohnHalleyGotway JohnHalleyGotway self-assigned this Sep 16, 2021
@JohnHalleyGotway
Copy link
Collaborator Author

JohnHalleyGotway commented Sep 16, 2021

I ran the following commands to generate the commands for testing:

grep '\.xml' bin/unit_test.sh | sed -r 's/ /\n/g' | sed -r 's/=/\n/g' | tr -d '"' | grep unit_ > xml_list
for xml in `cat xml_list`; do perl/unit.pl xml/${xml} --cmd >> unit_test_cmds.sh; echo $xml; done
./unit_test_cmds.sh >& unit_test_cmds.log&

Here are the resulting files:
unit_test_cmds.tar.gz
The unit_test_cmds.log file contains 1127 non-empty WARNING messages output lines:

grep "WARNING:" unit_test_cmds.log | grep '[a-z]' | wc -l
1127

They can be categorized like this:

  • 240 for "Skipping duplicate entry for ensemble".
    The 240 warnings result from a single station "LKWF1":
ps:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_050000:1010.59998
ps:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_060000:1009.90002
ps:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_070000:1008.70001
t:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_050000:299.75000
t:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_060000:300.04999
t:LKWF1:26.61000:-80.03000:0:0.00000:20121025_060000:20121025_070000:300.25000

The 6 lines are repeated 20 times, once for each of 20 ensemble members. And double that for 2 calls to gsid2orank with this input data = 240.

Action: Change from "WARNING" to "DEBUG(4)", per @KathrynNewman. We suspect the gsid2orank tool is lightly used. And users can't do anything about the duplicates in the input binary file anyway.

  • 197 for "compute_ss_index".
    There are 3 types of warnings:
  1. WARNING: SSIndexJobInfo::compute_ss_index() -> can't compute the skill score for term 35 (TMP/P400 at 360000 lead).
  2. WARNING: SSIndexJobInfo::compute_ss_index() -> the number of aggregated forecast and reference lines differ (1 != 0) for term 33 (TMP/P400 at 120000 lead).
  3. WARNING: SSIndexJobInfo::compute_ss_index() -> skipping GO_INDEX since the ratio of valid terms 16/48 < 0.5!

Action: Confirmed by @michelleharrold... Leave type 3 as-is. Change types 1 and 2 from WARNING to DEBUG(3). Keep track if types 1 or 2 actually occurred and if so, print a single warning message for each type:

WARNING: SSIndexJobInfo::compute_ss_index() -> cannot compute the skill score for 32 of the 48 terms. Rerun at verbosity level 3 (-v 3) for details.
WARNING: SSIndexJobInfo::compute_ss_index() -> the number of aggregated forecast and reference lines differ for 48 of the 48 terms. Rerun at verbosity level 3 (-v 3) for details.
  • 177 for "bad probability value".
WARNING: bool ProbInfoArray::add() -> bad probability value ... "999"

Clearly NHC is using (or at least was using in 2015) 999 as a missing value.

Action: I confirmed this change with @KathrynNewman. Apply range check logic to the probability values (0 to 100) and anything out of range results in a DEBUG(4) log message instead of a WARNING.

  • 161 about "truncated" strings. Action:
    Action: Howard - Keep this and modify calling m_strncpy with "truncate=true" argument.

  • 141 with "ipdtmpl" about filter GRIB records and 5 with "Using the first of" (these are related).
    These are coming from 2 sources:

  1. For "PRMSL/L0" shows up 2 times from plot_data_plane:
    The input GRIB2 file (model_data/grib2/gefs/enspost_grb2.t00z.prmsl) contains records for 65 times (from 0 to 384, every 6 hours). So it really does find 65 matches!
    Action: Leave the warnings in place as-is, but update unit_plot_data_plane.xml with lead_time = "00"; to eliminate these warnings. That will produce the same result but without the warnings.

  2. For "APCP/L0" shows up 3 times from grid_diag for files gfs.t00z.pgrb2.0p25.f14[1,4,7]. The first command in unit_grid_diag.xml uses data for forecast hours 141, 144, and 147. Requesting "APCP/L0" results in 3 hour precip from f141 and f147, but 6 hour precip from f144! So this is a bad example. We should not be mixing A3 with A6 in the output.
    Action: Switched "APCP/L0" to "RH/Z2" and adjusted the range of the bins. This gets rid of the warning messages but note that it WILL CHANGE THE UNIT TEST OUTPUT.

  • 54 for the "PBL" derivation.
    Action: Howard - Change them to Debug(4)

  • 50 from the "get_obs_data_float" function.
    Action: Howard - change them to Debug(4)

  • 12 for "valid times do not match".
    Action: No action required. These are good warnings and should remain. We could change the unit tests to stop using these mismatched files. But shouldn't do it now.

  • 12 with "the number of U-wind records". Action:

  • 11 for "can't open input ensemble file". Action:

  • 8 for "For case " from Stat-Analysis. Action:

  • 6 for "Skipping duplicate entry for case". Action:

  • 5 from "do_sub_command". Action:

  • 5 about "Resetting interpolation method". Action:

  • 4 for "process_pb_file". Action:

  • 4 from the "get_meta_data_float" function. Action:
    Action Howard - changed them to Debug(4)

  • 4 from "get_nc_var".
    Action Howard - keep the warning (caller chooses the message as an error or a warning)

  • 4 for "The valid time has changed". Action:

  • 2 for the "LittleRHandler". Action:

  • 2 for "Found multiple variables". Action:

  • 2 for "matches for VarInfo". Action:

  • 3 for "was not used" from gen_vx_mask. Action:

  • 3 for "deprecated 10 column input file format". Action:

  • 2 for "not found in file" and 2 for "No matching record found" from GRIB library. Action:

  • 2 for "could not determine the valid time". Action:

  • 2 for "falls outside the configuration file range". Action:

  • 2 for "no matching STAT lines found for job" from Stat-Analysis. Action:

  • 1 for "Can not combine TQ and UV". Action:
    Action Howard - changed them to Debug(4)

  • 1 for "compute_ssvar". Action:

  • 1 for "do_matching". Action:

  • 1 for "enabling NetCDF ensemble mean" from Ensemble-Stat. Action:

  • 1 for "missing Qty variable".
    Action Howard - changed to Debug(1) like other places at the same file (madis2nc.cc)

Here's the grep command to ignore them all:

grep "WARNING:" unit_test_cmds.log | grep '[a-z]' | egrep -v "truncated|get_obs_data_float|get_meta|valid times do not match|PBL|compute_ss_index|bad probability value|do_sub_command|Resetting interpolation method|Skipping duplicate entry|the number of U-wind records|ipdtmpl|Using the first of|LittleRHandler|get_nc_var|could not determine the valid time|falls outside the configuration file range|can't open input ensemble file|For case |The valid time has changed|Found multiple variables|matches for VarInfo|was not used|deprecated 10 column input file format|not found in file|process_pbfile|No matching record found|Can not combine TQ and UV|no matching STAT lines found for job|compute_ssvar|do_matching|enabling NetCDF ensemble mean|missing Qty variable"

@hsoh-u hsoh-u self-assigned this Sep 16, 2021
hsoh-u pushed a commit that referenced this issue Sep 20, 2021
hsoh-u pushed a commit that referenced this issue Sep 20, 2021
@JohnHalleyGotway JohnHalleyGotway linked a pull request Oct 5, 2021 that will close this issue
12 tasks
@JohnHalleyGotway
Copy link
Collaborator Author

JohnHalleyGotway commented Oct 6, 2021

I reran the same tests and captured the log output before/after making many of these changes. Here is the resulting "report.log" file showing the old number of warnings followed by the new number for each warning type. See warnings.tar.gz for details:

     161 ->        0 for "truncated"
      50 ->        0 for "get_obs_data_float"
       4 ->        0 for "get_meta"
      12 ->        9 for "valid times do not match"
      54 ->        0 for "PBL"
     197 ->        0 for "compute_ss_index"
     177 ->        0 for "bad probability value"
       5 ->        5 for "do_sub_command"
       5 ->        5 for "Resetting interpolation method"
     246 ->        6 for "Skipping duplicate entry"
      12 ->       12 for "the number of U-wind records"
     141 ->      132 for "ipdtmpl"
                 These warnings were also greatly reduced after this test was run.
       5 ->        2 for "Using the first of"
       2 ->        2 for "LittleRHandler"
       4 ->        2 for "get_nc_var"
       2 ->        2 for "could not determine the valid time"
       2 ->        0 for "falls outside the configuration file range"
      11 ->        9 for "can't open input ensemble file"
       8 ->        8 for "For case "
       4 ->        4 for "The valid time has changed"
       2 ->        2 for "Found multiple variables"
       2 ->        2 for "matches for VarInfo"
       3 ->        3 for "was not used"
       3 ->        3 for "deprecated 10 column input file format"
       2 ->        2 for "not found in file"
       4 ->        4 for "process_pbfile"
       2 ->        2 for "No matching record found"
       1 ->        0 for "Can not combine TQ and UV"
       2 ->        2 for "no matching STAT lines found for job"
       1 ->        0 for "compute_ssvar"
       1 ->        1 for "do_matching"
       1 ->        1 for "enabling NetCDF ensemble mean"
       1 ->        0 for "missing Qty variable"

@JohnHalleyGotway JohnHalleyGotway linked a pull request Oct 6, 2021 that will close this issue
12 tasks
@JohnHalleyGotway JohnHalleyGotway changed the title Review and revise the warning messages produced by running the MET unit tests. Review and revise the warning messages when running the MET unit tests. Oct 6, 2021
@JohnHalleyGotway JohnHalleyGotway removed the alert: NEED ACCOUNT KEY Need to assign an account key to this issue label Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: code cleanup Code cleanup and maintenance issue priority: medium Medium Priority requestor: METplus Team METplus Development Team type: task An actionable item of work
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants