Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing error for floating point percentile thresholds, like >SFP33.3 #1716

Closed
9 of 21 tasks
JohnHalleyGotway opened this issue Mar 15, 2021 · 2 comments · Fixed by #1721 or #1722
Closed
9 of 21 tasks

Fix parsing error for floating point percentile thresholds, like >SFP33.3 #1716

JohnHalleyGotway opened this issue Mar 15, 2021 · 2 comments · Fixed by #1721 or #1722
Assignees
Labels
MET: Library Code requestor: DTC/AF V&V Air Force Verification and Validation Project type: bug Fix something that is not working

Comments

@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Mar 15, 2021

Describe the Problem

This issue arose via met-help:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99059

The config file parsing logic prevents the use of non-integer percentile thresholds. So while a threshold of ">50.0" parses just fine, a threshold of ">SFP50.0" results in a parsing error. This task is to fix the parsing logic to allow for decimal places in percentile thresholds. The rest of the library code is fine and already stores the percentile as a double. The problem lies in the parser.

Expected Behavior

MET should support both integer and floating point percentile threshold values.

Environment

Demonstrated this behavior on my Mac laptop, but should be a problem on all machines.

To Reproduce

I was able to replicate the behavior using this plot_data_plane command where I tried replacing anything greater than the median with a value of 270.

Integers work:

plot_data_plane data/sample_fcst/2005080700/wrfprs_ruc13_12.tm00_G212 plot.ps 'name="TMP"; level="P500"; censor_thresh=[>SFP50]; censor_val=[270];'

Floating point percentiles do not:

plot_data_plane data/sample_fcst/2005080700/wrfprs_ruc13_12.tm00_G212 plot.ps 'name="TMP"; level="P500"; censor_thresh=[>SFP50.0]; censor_val=[270];'
ERROR  :
ERROR  : yyerror() -> syntax error in file "/tmp/met_config_85620_0"
ERROR  :
ERROR  :    line   = 1
ERROR  :
ERROR  :    column = 27
ERROR  :
ERROR  :    text   = "]"
ERROR  :
ERROR  :
ERROR  : name="TMP"; level="P500"; censor_thresh=[>SFP50.0]; censor_val=[270];
ERROR  : __________________________^__________________________________________
ERROR  : 

Relevant Deadlines

Include fix in met-10.0.0.

Funding Source

2712221

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to relevant bugfix version

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added type: bug Fix something that is not working component: library code requestor: DTC/AF V&V Air Force Verification and Validation Project labels Mar 15, 2021
@JohnHalleyGotway JohnHalleyGotway added this to the MET 10.0.0 milestone Mar 15, 2021
@michelleharrold
Copy link

Charge key for this work: 2712221

@JohnHalleyGotway
Copy link
Collaborator Author

Adding @rgbullock to this issue since he can work on this on 3/17/21.

Here's the commands I was using to test:

echo "a = >SFP50.0;" > config
internal_tests/basic/vx_config/test_config config
ERROR  :
ERROR  : yyerror() -> syntax error in file "/tmp/met_config_27428_0"
ERROR  :
ERROR  :    line   = 1
ERROR  :
ERROR  :    column = 3
ERROR  :
ERROR  :    text   = ";"
ERROR  :
ERROR  :
ERROR  : a = >SFP50.0;
ERROR  : __^__________
ERROR  : 

JohnHalleyGotway added a commit that referenced this issue Mar 17, 2021
JohnHalleyGotway added a commit that referenced this issue Mar 18, 2021
JohnHalleyGotway added a commit that referenced this issue Mar 18, 2021
JohnHalleyGotway added a commit that referenced this issue Mar 18, 2021
This was linked to pull requests Mar 18, 2021
JohnHalleyGotway added a commit that referenced this issue Mar 19, 2021
* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.
JohnHalleyGotway added a commit that referenced this issue Mar 19, 2021
* Per #1716, committing the same bugfix to the main_v9.1 branch for inclusion in the 9.1.3 bugfix release.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.
JohnHalleyGotway added a commit that referenced this issue Mar 19, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
@JohnHalleyGotway JohnHalleyGotway changed the title Fix parsing error for non-integer percentile thresholds. Fix parsing error for floating point percentile thresholds, like >SFP33.3 Mar 19, 2021
JohnHalleyGotway added a commit that referenced this issue Mar 20, 2021
* Add debug level 4 message to list out the number of GRIB2 records inventoried. This helps debugging issues with MET potentially not reading all input GRIB2 records on WCOSS.

* Bugfix 1554 main_v9.1 ncdump (#1555)

* Bugfix 1562 main_v9.1 grid_diag (#1563)

* Per #1562, check for bad data values before adding data to the PDF's for grid_diag.

* Per #1562, removing the poly = CONUS.poly mask from GridDiagConfig_TMP. That settting masked a problem in the handling of missing data. Exercising the mask.poly option is tested in another unit test. This will change the output and break the nightly build, but that's good since we'll do more thorough testing.

* Per #1508, change the verbosity in unit_tc_gen.xml from -v 2 to -v 5 to print out some additional log messages that may help in debugging the intermittent file list failure.

* Feature 1572 v9.1.1 (#1573)

* Per #1572, delete the docs/version file as it is not needed here.

* Per #1572, update the version number to 9.1.1.

* Per #1572, list the met-9.1.1 release date as 20201119 for the docs.

* Per #1572, add release notes for the met-9.1.1 verison.

* Per #1572, add release notes for met-9.1.1 version.

* Per #1572, let's try to release met-9.1.1 today 11/18.

* Correct GitHub link.

* Fix small typo in release notes.

* Bugfix 1618 main v91 pb2nc (#1622)

Co-authored-by: Howard Soh <[email protected]>

* Update pull_request_template.md

* Per #1638, apply the same 3 fixes to the main_v9.1 branch to be included in the next bugfix release for met-9.1. (#1640)

* Per #1646, one line fix for cut-and-paste error. (#1648)

* Adding necessary files for ReadTheDocs (#1703)

* Per #1706, update aggr_ecnt_lines() to also add an entry to the skip_ba array of booleans. This is used to keep track of whether or not the point should be excluded from the statistics and is used in the Ensemble-Stat. This same aggregation code in the vx_statistics library is used by Stat-Analysis. Since Stat-Analysis failed to populate that array, it led to an array parsing error in ECNTInfo::set() when trying to compute the sum of the weights. (#1707)

* Per #1694, exactly the same set of bugfix changes but applied to main_v9.1 rather than the develop branch. (#1709)

* Feature 1710 v9.1.2 (#1711)

* Per #1710, update the release notes and version numbers for the 9.1.2 release.

* Per #1710, add a release note about the move to read the docs.

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added blank line for Warning

* #1715 Added a blank line for Error

* Bugfix 1716 main_v9.1 perc_thresh (#1721)

* Per #1716, committing the same bugfix to the main_v9.1 branch for inclusion in the 9.1.3 bugfix release.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Per #1723, update the verison number to 9.1.3. (#1730)

* Update pull_request_template.md

Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: jprestop <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Mar 31, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 2, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 2, 2021
JohnHalleyGotway added a commit that referenced this issue Apr 5, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 9, 2021
* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 12, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 17, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.g

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat outptu.

* Per #1735, print a warning message is the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input in a regular numpy array instead of a masked array.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Apr 27, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.g

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat outptu.

* Per #1735, print a warning message is the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input in a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, udpate the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
JohnHalleyGotway added a commit that referenced this issue May 24, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.g

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat outptu.

* Per #1735, print a warning message is the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input in a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, udpate the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separated ATCF line type for rapid intensitifcation (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS"

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_legnth (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: [email protected] <[email protected]>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URL's in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <[email protected]>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmpe_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added assocaited config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* Gitub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: j-opatz <[email protected]>
Co-authored-by: Daniel Adriaansen <[email protected]>
Co-authored-by: bikegeek <[email protected]>
Co-authored-by: bikegeek <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Jun 1, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.g

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat outptu.

* Per #1735, print a warning message is the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input in a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, udpate the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separated ATCF line type for rapid intensitifcation (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS"

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_legnth (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: [email protected] <[email protected]>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URL's in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <[email protected]>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmpe_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added assocaited config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* Gitub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581renamed met_nc_point_obs to nc_point_obs

* #1581 API ica changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* 1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunatley, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* 1581 cleanup

* 1581 cleanup

* 1581 cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Reanmed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Reanmed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: j-opatz <[email protected]>
Co-authored-by: Daniel Adriaansen <[email protected]>
Co-authored-by: bikegeek <[email protected]>
Co-authored-by: bikegeek <[email protected]>
Co-authored-by: George McCabe <[email protected]>
JohnHalleyGotway added a commit that referenced this issue Jun 13, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate of key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc becaues of not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* 454 Deleted instea dof commenting out

* 454 Deleted instea dof commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of litte ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by it's magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worringly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combined if there are no overlapping beteewn TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node heirarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update docuementation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used for to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt littl_r records.

* Feature GitHub actions (#1742)

* Adding files to build documenation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could be eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Comitting now to test on kiowa.

* Per #1319, small updated to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <[email protected]>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <[email protected]>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.g

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat outptu.

* Per #1735, print a warning message is the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input in a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, udpate the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separated ATCF line type for rapid intensitifcation (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS"

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_legnth (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: [email protected] <[email protected]>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issued caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <[email protected]>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URL's in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <[email protected]>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmpe_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added assocaited config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* Gitub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581renamed met_nc_point_obs to nc_point_obs

* #1581 API ica changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* 1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunatley, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* 1581 cleanup

* 1581 cleanup

* 1581 cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Reanmed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Reanmed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>

* Feature 1792 gen_vx_mask (#1816)

* Per issue #1792, Change -type from optional to required. Set default_mask_type to MaskType_None. Added a check on mask_type to see if it's set and print error message accordingly.

* Update test_gen_vx_mask.sh

* For the first test, added -type poly, since the masking type is now required. SL

* For all of the Poly unit tests added -type poly to the command line. The mask type is now required. SL

* Modified document to indicate that -type string (masking type) is now required on the command line for gen_vx_mask. SL

* Update met/docs/Users_Guide/masking.rst

Co-authored-by: johnhg <[email protected]>

* Update met/src/tools/other/gen_vx_mask/gen_vx_mask.cc

Co-authored-by: johnhg <[email protected]>

Co-authored-by: Seth Linden <[email protected]>
Co-authored-by: johnhg <[email protected]>

* Fix 2 minor formatting errors in the release notes.

* PR #1816 for issue #1792 unexpectedly caused the NB to fail on 20210605. We changed -type from optional to required, but missed adding the -type option in unit_met_test_scripts.xml and unit_ref_config.xml. This is a hotfix to resolve that.

* Feature 1811 anchor links (#1822)

* testing new anchoring link idea.

* testing without the bold asterik

* tinkering with the look

* another attempt

* another attempt #2

* another attempt #3

* another attempt #4

* making sure new anchor works as expected.

* seeing if link will save with spaces instead of dashes

* need underscores to link

* is it fixed?

* testing

* testing 2

* testing 4

* testing 5

* testing 6

* testing 7

* testing 8

* going back to test original problem

* able to link with spaces instead of underscores. Testing if a return is possible to keep under 79 character limit.

* double checking everything is still working.

* DO NOT break ref lines apart, it won't work.

* trying a shorter name.

* continuing to add anchors

* updating lines 1946 thru 2214 with anchors

* updating lines 2214 thru 3371 with anchors

* updating lines 3371 to the end with anchors

* testing anchor

* testing anchor

* testing anchor 3

* testing anchor 4

* testing anchor 45 percent

* testing anchor final half

* fixing typo

* numbering fcst, obs_1 and 2 to create different links.

* finding more anchors that need numbers to keep them separate.

* fixing warnings

* fixing warnings

* fixing typo

* Feature 1749 hss (#1825)

* Per #1749, updating the MET version number from 10.0 to 10.1 prior to adding new columns of output to existing line types.

* Per #1749, adding 10.1 columns to the Makefile.am

* Per #1749, changes for the mechanics of adding the HSS_EC statistic to the MCTS line type. Still need to acutally compute it and make the expected correct value configurable.

* Per #1749, add hss_ec_value as a configurable option for Point-Stat and Grid-Stat. Still need to actually compute it correctly, add it to other test config files, add support to series_analysis/stat_analysis, update the docs, and make writeup corresponding issues for other METplus components.

* Per #1749, fix the column offsets for the HSS_EC columns.

* Per #1749, add correct definition of HSS_EC.

* Per #1749, pass hss_ec_value from the config file into the computation of the MCTS statistics.

* Per #1749, add hss_ec_value entry to all the Grid-Stat config files.

* Per #1749, update the documentation about the HSS_EC statistic.

* Per #1749, add the -hss_ec_value job command option to Stat-Analysis.

* Per #1749, no real code changes here. Just changing to consistent ordering with hss_ec_value preceeding rank_corr_flag.

* Per #1749, update docs for stat_analysis supporting hss_ec_value.

* Per #1749, add HSS_EC to Series-Analysis, but only with a constant hss_ec_value for now.

* Per #1749, add EC_VALUE to the MCTC line type definition.

* Per #1749, move ECvalue from the MCTSInfo class into the ContingencyTable class so that it's available to be included in the MCTC output line type.

* Per #1749, update point_stat, grid_stat, and series_analysis to accomodate the move of ECvalue from the MCTSInfo class to the ContingencyTable class.

* Per #1749, update library code to write EC_VALUE to the MCTC line type and update the User's Guide docs.

* Per #1749, update stat_analysis code for the addition of EC_VALUE in the MCTC line type.

* Per #1749, write EC_VALUE to the MCTC output line type.

* Per #1749, store the ec_value that was actually used to compute the stats.

* Per #1749, parsing EC_VALUE from the MCTC line type.

* Per #1749, move the MCTC EC_VALUE column to the end of the line, as requested by METdatadb.

* Per #932, need to write MCTS HSS_EC value to temp file during the bootstrapping process.

* Added new reference for Ou 2016

* Layout correction

* Added generalized HSS, removed word from HSS_EC

* Per #1749, change the hss_ec_value config entry to match new conventions.

Co-authored-by: j-opatz <[email protected]>

* Feature 1826 v10.1.0_beta1 (#1828)

* Per #1826, add update the version in the docs to 10.1.0-beta1 and add release notes for this development version.

* Per #1826, change the beta1 release date to 6/11 so that I can do it today.

* Revoming Randy and David from the email notification list for nightly run scripts.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: George McCabe <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: j-opatz <[email protected]>
Co-authored-by: Daniel Adriaansen <[email protected]>
Co-authored-by: bikegeek <[email protected]>
Co-authored-by: bikegeek <[email protected]>
Co-authored-by: George McCabe <[email protected]>
Co-authored-by: Seth Linden <[email protected]>
Co-authored-by: Seth Linden <[email protected]>
Co-authored-by: lisagoodrich <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MET: Library Code requestor: DTC/AF V&V Air Force Verification and Validation Project type: bug Fix something that is not working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants