Feature 1626 tc_gen #1633
Merged
…ry with the oper_technique string. Add genesis_init_diff config entry. Update config_constants.h accordingly and the tc_gen_conf_info.h/.cc to parse the updated config entries.
…not yet complete. Still need to compute categorical MISSES but the current version does compile.
…he unused set_dland() function.
… the best tracks. We need to know the best genesis events in order to count up the forecast misses.
…ss to store genesis pairs. This will be needed to write a matched pair line type.
… ID since the forecast genesis do not have meaningful Storm ID's.
…events for the same storm, but do print a useful Debug(3) log message about it.
…m id and initialization time but NOT require an exact forecast hour match.
…stently report the storm id.
…ses and hits in chronological order.
…that's what Dan H used in his examples.
… the memory myself. This makes the implementation of TrackInfoArray::erase_storm_id() very easy. Replace n_tracks() function with n() in several places.
…load_dland.h/.cc to load_tc_data.h/.cc and add code to read the basin file.
…sages and add lots of details to the tc_gen documentation.
…a, and output_flag to be specified separately for each filter. Also add nc_pairs_flag and genesis_track_points_window config options. Add config constants entries for these options and update tc_gen to handle all of these changes.
…nt to specify how the DataPlane should be initialized.
… line type and instead write the genesis times to the FCST_VALID_BEG/END and OBS_VALID_BEG/END header columns.
…t from very early versions of MET which included the CTP, CFP, and COP line types.
…window, and opt_hit_tdiff. Also update log message to switch from 'lead' to 'forecast hour'.
…ile to include basin name abbreviations.
…ions from the NetCDF basin file.
… basin_mask. Updated the library and application code, and updated the user's guide.
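One of the commits above reimplements TrackInfoArray as a vector so that TrackInfoArray::erase_storm_id() becomes easy to write. A minimal sketch of that idea follows, assuming a simplified record type. `TrackSketch` and `TrackArraySketch` are hypothetical stand-ins, not the actual MET classes, and the real method may differ in signature and behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a track record; the real TrackInfo class
// holds far more state than a storm id.
struct TrackSketch {
    std::string storm_id;
};

// Hypothetical container that delegates storage to std::vector
// instead of managing a raw array by hand.
class TrackArraySketch {
public:
    // Append a copy of the track.
    void add(const TrackSketch &t) { tracks_.push_back(t); }

    // Number of stored tracks.
    std::size_t n() const { return tracks_.size(); }

    // Remove every track whose storm id matches, using the
    // erase-remove idiom. Returns true if anything was removed.
    bool erase_storm_id(const std::string &id) {
        auto new_end = std::remove_if(
            tracks_.begin(), tracks_.end(),
            [&id](const TrackSketch &t) { return t.storm_id == id; });
        const bool removed = (new_end != tracks_.end());
        tracks_.erase(new_end, tracks_.end());
        return removed;
    }

private:
    std::vector<TrackSketch> tracks_;
};
```

With vector storage there is no manual reallocation or element shifting to maintain, which is presumably what made the erase function simple to add.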
JohnHalleyGotway requested review from DanielAdriaansen, KathrynNewman and halperin-erau on January 22, 2021 21:52
This was linked to issues on Jan 22, 2021
KathrynNewman approved these changes on Jan 22, 2021
Looks good - PR approved. Changes tested and reviewed with development team.
JohnHalleyGotway added a commit that referenced this pull request on Jan 24, 2021
* Getting rid of compiler warnings in PB2NC by replacing several instances of the NULL pointer with the nul character (\0) instead.
* Fix typo in config_options.rst.
* Feature 1408 var_name_for_grib_code (#1617)
* #1408 Added get_var_id
* #1408 Check variable name in the configuration to use the variable name instead of grib code
* #1408 Added point2grid_ascii2nc_surfrad_DW_PSP_by_name
* Feature 1580 2d time (#1616)
* #1580 Added get_grid_from_lat_lon_vars
* #1580 Added get_grid_from_lat_lon_vars and support 2D time variable
* #1580 Support int type variable without scale_factor and add_offset attributes
* #1580 Support 2D time variable. Implemented filtering by valid_time
* #1580 Bug fix: read time with dimension 0
* #1580 Support time variable with no dimension
* #1580 Initial release
* #1580 Added point2grid_2D_time
* #1580 Check project attribute for GOES
* #1580 Changed NULL to 0 to avoid compilation warning
* #1580 Added point2grid_2D_time
* #1580 Added "point2grid configuration file" section
* #1580 Changed to_grid for point2grid_NCCF_UK & point2grid_2D_time

Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* feature 1580 nccf (#1619)
* #1580 Correct the precision at _apply_scale_factor
* #1580 Added unit test plot_data_plane_NCCF_time
* #1580 Changed argument type to double at _apply_scale_factor(double)
* Bugfix 1618 develop pb2nc (#1623)

Co-authored-by: Howard Soh <[email protected]>

* Feature 1624 OBS_COMMAND (#1625)
* Per #1627, add grid_data.regrid config option for PlotPointObs and update the tool to do the requested regridding. Still need to update the docs.
* Per #1627, update docs about grid_data.regrid config option for PlotPointObs.
* Per #1627, add another call to plot_point_obs to exercise the new regrid functionality.
* Feature 1624 obs_command second try (#1629)
* Per #1624, define OBS_COMMAND.
* Per #1624, unset the test-specific environment variables after completing the run.
* Per #1624, after PR #1625 merged these changes into develop, they caused 2 unexpected diffs in the NB output. These were caused by environment variables being unset after each test. Updating unit_netcdf.xml and unit_point2grid.xml to define more test-specific environment variables to reproduce previous NB output.
* Organizing NB climatology and point2grid output files into the appropriate directories rather than having them at the top-level directory.
* Update pull_request_template.md
* Update the point2grid unit tests to write their temp files to the point2grid subdirectory instead of the top-level test output directory.
* Update appendixC.rst: split the definition of H_RATE and POD
* Feature 1626 tc_gen (#1633)
* Per #1448, many changes for TC-Gen. Replace the oper_genesis dictionary with the oper_technique string. Add genesis_init_diff config entry. Update config_constants.h accordingly and the tc_gen_conf_info.h/.cc to parse the updated config entries.
* Per #1448, large overhaul of the tc_gen matching logic. This work is not yet complete. Still need to compute categorical MISSES but the current version does compile.
* Per #1448, add GenesisInfoArray::has_storm_id() function and remove the unused set_dland() function.
* Per #1448, more updates. Define the best genesis events while parsing the best tracks. We need to know the best genesis events in order to count up the forecast misses.
* Per #1448, lots more changes for tc_gen. Create a PairDataGenesis class to store genesis pairs. This will be needed to write a matched pair line type.
* Per #1448, minor tweaks to log messages.
* Per #1448, update PairDataGenesis class to store the BEST track Storm ID since the forecast genesis events do not have meaningful Storm IDs.
* Per #1448, in GenesisInfoArray::add(), do NOT store multiple genesis events for the same storm, but do print a useful Debug(3) log message about it.
* Per #1448, update PairDataGenesis::has_case() logic to check the storm id and initialization time but NOT require an exact forecast hour match.
* Per #1448, update the tc_gen log messages to more concisely and consistently report the storm id.
* Per #1448, update the PairDataGenesis logic a bit to have all the misses and hits in chronological order.
* Per #1448, add genesis_init_diff entry.
* Per #1448, set the default genesis_init_diff entry to 48 hours since that's what Dan H used in his examples.
* Per #1448, work on comments and log messages.
* Per #1448, reimplement TrackInfoArray as a vector instead of managing the memory myself. This makes the implementation of TrackInfoArray::erase_storm_id() very easy. Replace n_tracks() function with n() in several places.
* Per #1448, add valid_freq and basin_file config entries. Also rename load_dland.h/.cc to load_tc_data.h/.cc and add code to read the basin file.
* Per #1448, add GenesisInfoArray::erase_storm_id().
* Per #1448, update tc_gen code to handle new config options.
* Per #1448, had my units wrong. Was processing seconds when I thought it was hours!
* Per #1448, making test TC-Gen config file consistent with the default.
* Per #1448, also track the obs valid times.
* Per #1448, switch from tech1/tech2 to dev/ops methods. Update log messages and add lots of details to the tc_gen documentation.
* Per #1430, in tc_gen enable dev_method_flag, ops_method_flag, ci_alpha, and output_flag to be specified separately for each filter. Also add nc_pairs_flag and genesis_track_points_window config options. Add config constants entries for these options and update tc_gen to handle all of these changes.
* Per #1430, consolidate the parse_grid_mask() code a bit to avoid redundancy.
* Per #1430, just cleaning up some messy comments.
* Per #1430, adding hooks for writing NetCDF output file.
* Per #1430, update DataPlane::set_size() function to take a 3rd argument to specify how the DataPlane should be initialized.
* Per #1430, update the nc_pairs_flag options and update the code to parse them.
* Per #1430, update the TrackInfo class to track and report the min/max warm core information.
* Per #1430, current state of development. Still a work in progress. I'm getting runtime segfaults when testing and I still need to NOT overcount the BEST track hits.
* Per #1430, committing changes described by #1430 (comment)
* Per #1430, forgot to rename genesis_match_window to genesis_hit_window as it is in the code.
* Per #1430, changing GenesisInfo to just inherit directly from TrackInfo. Frankly, I should have thought of this a LONG time ago.
* Per #1430, change the default desc setting from NA to ALL and add the best_unique_flag option.
* Per #1430, simplify the logic now that GenesisInfo is derived from TrackInfo. Also support the best_unique_flag config option.
* Per #1430, instead of storing 12 individual DataPlane objects, store them in a map to make writing their output more convenient.
* Per #1430, updating documentation and comments.
* Per #1430, more doc updates.
* Per #1430, update unit test to only write NetCDF counts for the AL_BASIN and not the other filters.
* Per #1430, fix parsing logic for nc_pairs_flag = TRUE.
* Per #1430, fix bug. Check the VxOpt.NcInfo before calling write_nc(), not the top-level one.
* Per #1430, the docker build of tc_gen failed.
* Per #1430, working on DockerHub compilation.
* Per #1430, getting DockerHub build working.
* One more try.
* Per #1597, add hooks for new GENMPR stat line type.
* Per #1597, add config file option and column definitions for the GENMPR line type.
* Per #1597, finish writing the GENMPR line type.
* Per #1597, change the default output grid from a global 5 degree to global 1 degree grid.
* Per #1597, change GENMPR output columns to GEN_TDIFF and INIT_TDIFF since they're reported in HHMMSS format instead of seconds. Also, tweak the config file for the tc-gen unit test.
* Per #1597, have to add GENMPR header columns for Stat-Analysis and test scripts to handle it.
* Per #1597, update Stat-Analysis to handle the GENMPR line type.
* Per #1597, user's guide updates for the GENMPR and NetCDF output file.
* Per #1597, add AGEN_INIT and AGEN_FHR columns.
* Per #1597, add AGEN_INIT and AGEN_FHR columns.
* Per #1597, remove the AGEN_TIME and BGEN_TIME columns from the GENMPR line type and instead write the genesis times to the FCST_VALID_BEG/END and OBS_VALID_BEG/END header columns.
* Remove some unused output column name definitions. These are remnants from very early versions of MET which included the CTP, CFP, and COP line types.
* Per #1597, update config file options to use dev_hit_radius, dev_hit_window, and opt_hit_tdiff. Also update log message to switch from 'lead' to 'forecast hour'.
* Per #1626, add met_regrid_nearest() utility function since I'm calling it twice.
* Per #1626, update the basin_global_tenth_degree.nc basin definition file to include basin name abbreviations.
* Per #1626, update load_tc_data.h/.cc to also read the basin abbreviations from the NetCDF basin file.
* Per #1626, add TC-Gen config file options for init_inc, init_exc, and basin_mask. Updated the library and application code, and updated the user's guide.

Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: j-opatz <[email protected]>
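Two of the commits above touch on time units: one notes that seconds were being processed where hours were intended, and another switches the GEN_TDIFF and INIT_TDIFF columns to HHMMSS format. A minimal sketch of both conversions is below. The helper names `hours_to_seconds` and `seconds_to_hhmmss` are hypothetical and are not claimed to match MET's own utilities, and the leading minus sign for negative offsets is an assumption.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: convert a configuration value given in hours
// (for example a 48 hour genesis_init_diff) into seconds.
long hours_to_seconds(long hours) {
    return hours * 3600L;
}

// Hypothetical helper: format a signed offset in seconds as [-]HHMMSS.
// The sign convention for negative offsets is an assumption.
std::string seconds_to_hhmmss(long seconds) {
    const char *sign = (seconds < 0) ? "-" : "";
    const long s  = std::labs(seconds);
    const long hh = s / 3600;          // whole hours
    const long mm = (s % 3600) / 60;   // remaining whole minutes
    const long ss = s % 60;            // remaining seconds
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%s%02ld%02ld%02ld", sign, hh, mm, ss);
    return std::string(buf);
}
```

Converting to seconds once, at the point where the config value is read, is one way to avoid the hours/seconds mix-up described in the commit log.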
JohnHalleyGotway added a commit that referenced this pull request on Jan 26, 2021
* Per #1597, have to add GENMPR header columns for Stat-Analysis and test scripts to handle it.
* Per #1597, update Stat-Analysis to handle the GENMPR line type.
* Per #1597, user's guide updates for the GENMPR and NetCDF output file.
* Per #1597, add AGEN_INIT and AGEN_FHR columns.
* Per #1597, add AGEN_INIT and AGEN_FHR columns.
* Per #1597, remove the AGEN_TIME and BGEN_TIME columns from the GENMPR line type and instead write the genesis times to the FCST_VALID_BEG/END and OBS_VALID_BEG/END header columns.
* Remove some unused output column name definitions. These are remnants from very early versions of MET which included the CTP, CFP, and COP line types.
* Per #1597, update config file options to use dev_hit_radius, dev_hit_window, and opt_hit_tdiff. Also update log message to switch from 'lead' to 'forecast hour'.
* Per #1626, add met_regrid_nearest() utility function since I'm calling it twice.
* Per #1626, update the basin_global_tenth_degree.nc basin definition file to include basin name abbreviations.
* Per #1626, update load_tc_data.h/.cc to also read the basin abbreviations from the NetCDF basin file.
* Per #1626, add TC-Gen config file options for init_inc, init_exc, and basin_mask. Updated the library and application code, and updated the user's guide.
* Fixing Fortify warnings for 'Poor Style: Variable Never Used' in 6 files.
* Fix Fortify warnings for 'Uninitialized variable' in tc_gen.cc and point2grid.cc.
* Fix Fortify warnings for 'Poor Style: Redundant Initialization' in plot_point_obs.cc and point2grid.cc.
* Feature 1346 valid time attr (#1634)
* #1346 get_att_value_unixtime supports yyyymmdd_hhmmss, too
* #1346 Check valid_time & init_time attributes, too
* #1346 Check valid_time & init_time attributes, too

Co-authored-by: Howard Soh <[email protected]>

* Feature 1473 python errors (#1615)
* Added sample script to read ascii data and create an xarray.
* Disabled use_xarray exit for testing.
* Get attrs from DataArray if using xarray.
* Removed some comments.
* Revised error messages for use with both numpy and xarray.
* Removing commented out code.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: johnhg <[email protected]>

* Feature 1630 zero obs (#1637)
* Per #1630, update ascii2nc to change zero observations from an error (which returns bad status) to a warning message.
* Per #1630, update point2grid to read an empty input file and write fields of 0's or bad data to the output. Change previous error message to warning. Also, update LOTS of warning and error log messages to make them consistent.
* Per #1630, need to initialize the dataplanes before the loop (for when there are no obs) and within each loop iteration (for when there are multiple fields to process).
* Bugfix 1638 develop climo cdf (#1639)
* Per #1638, correct the order of arguments in the call to the normal_cdf() utility function.
* Per #1638, update the logic in derive_climo_prob(). For CDP thresholds, the constant climo probability should be based on the inequality type where less-than-types match the threshold percentile value while greater-than-types are 1.0 minus the threshold percentile.
* Per #1638, update normal_cdf() to initialize the output CDF field using the climo mean field instead of the observation data field. This makes the timestamps consistent for the climo mean, stdev, and cdf variables in the Grid-Stat NetCDF matched pairs output file.
* Update tc_gen.cc

Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: j-opatz <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
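The #1638 commit above states the rule for CDP thresholds in derive_climo_prob(): less-than types use the threshold percentile itself, and greater-than types use 1.0 minus it. A minimal sketch of that rule follows. The enum `IneqSketch`, the function `cdp_climo_prob`, the 0 to 100 percentile scale, and the NaN return for other inequality types are all assumptions for illustration, not MET's actual types or behavior.

```cpp
#include <limits>

// Hypothetical inequality categories; MET's own threshold type enum differs.
enum class IneqSketch { LessThan, LessEqual, GreaterThan, GreaterEqual, Other };

// Constant climatological probability for a CDP threshold.
// Assumes the percentile is given on a 0..100 scale.
double cdp_climo_prob(IneqSketch type, double percentile) {
    const double frac = percentile / 100.0;
    switch (type) {
        case IneqSketch::LessThan:
        case IneqSketch::LessEqual:
            return frac;            // less-than types match the percentile
        case IneqSketch::GreaterThan:
        case IneqSketch::GreaterEqual:
            return 1.0 - frac;      // greater-than types are the complement
        default:
            // Assumption: other inequality types have no defined value here.
            return std::numeric_limits<double>::quiet_NaN();
    }
}
```

Under these assumptions, a greater-than threshold at the 25th percentile yields 0.75, and the matching less-than threshold yields 0.25.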
Pull Request Testing
Describe testing already performed for these changes:
Very large set of changes here. This PR includes changes for 4 different GitHub issues:
- MET #1448 (Refine tc_gen logic to compare forecast genesis events to all BEST track points) to overhaul TC-Gen logic.
- MET #1430 (Add gridded output fields from tc_gen in NetCDF format) to add the NetCDF pair output.
- MET #1597 (Enhance tc_gen to write a matched pair output line type) to add the GENMPR line type.
- MET #1626 (Add TC-Gen configuration file options for init_inc, init_exc, and basin_mask) to add more config options.
Dan H, Kathryn N, and Dan A have been testing these changes in successive feature branches.
This branch is now being built on DockerHub in dtcenter/met:feature_1626_tc_gen
Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:
Dan H, please specifically test the init_inc and init_exc options to confirm that they do what you expect them to do.
Do these changes include sufficient documentation and testing updates? [Yes]
Yes, I updated the TC-Gen chapter of the user's guide and adapted unit_tc_gen.xml to exercise many of the configuration options. No test currently exercises the new init_inc and init_exc logic, which is why I asked Dan H to test it explicitly.
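Since no unit test covers init_inc and init_exc yet, here is a rough sketch of the behavior a reviewer might check. It assumes the semantics that a forecast is kept when its initialization time is not in the exclude list and either the include list is empty or the time appears in it. The function `keep_init_time` and the timestamp strings are hypothetical, and the actual tc_gen logic may differ.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical filter: decide whether a forecast initialization time
// passes the include/exclude lists. Times are compared as plain strings.
bool keep_init_time(const std::string &init,
                    const std::vector<std::string> &init_inc,
                    const std::vector<std::string> &init_exc) {
    auto contains = [&init](const std::vector<std::string> &v) {
        return std::find(v.begin(), v.end(), init) != v.end();
    };
    // Assumption: exclusion takes precedence over inclusion.
    if (contains(init_exc)) return false;
    // Assumption: an empty include list means "keep everything".
    return init_inc.empty() || contains(init_inc);
}
```

Cases worth exercising under these assumptions include a time that appears in both lists, empty lists, and a time that is absent from a non-empty include list.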
Will this PR result in changes to the test suite? [Yes]
If yes, describe the new output and/or changes to the existing output:
New files produced by unit_tc_gen.xml include a _genmpr.txt file and a _pairs.nc file. The existing .stat files also change.
Pull Request Checklist
See the METplus Workflow for details.
Select: Reviewer(s), Project(s), and Milestone