Bugfix 1737 main_v9.1 little_r #1738

JohnHalleyGotway · 2021-03-30T16:35:52Z

Pull Request Testing

Describe testing already performed for these changes:

Manually ran with data from cheyenne that contains corrupted records and confirmed that the updated code prints warning messages about bad records but still writes output for all good records.

Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

I added a unit test for the develop branch but not the main_v9.1 branch. Recommend doing the PR review for both main_v9.1 and develop at the same time.

Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
Added warning log message:

WARNING: LittleRHandler::_readObservations() -> the 18 entry of the little_r report on line 1 does not match the timestring regular expression "[0-9]\{14\}":
WARNING: "

Do these changes include sufficient testing updates? [Yes]
Unit test updates are included for the develop branch, not main_v9.1.
Will this PR result in changes to the test suite? [No]

If yes, describe the new output and/or changes to the existing output:

Pull Request Checklist

See the METplus Workflow for details.

Complete the PR definition above.
Ensure the PR title matches the feature or bugfix branch name.
Define the PR metadata, as permissions allow.
Select: Reviewer(s), Project(s), and Milestone
After submitting the PR, select Linked Issues with the original issue number.
After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
Close the linked issue and delete your feature or bugfix branch from GitHub.

jprestop

I looked at the files that were changed, which looked good. I checked the run_unit_test.log file at kiowa:/d1/projects/MET/MET_pull_requests/met-10.0.0_beta5/bugfix_1737 for any failed tests and there were none. I approve this request.

* Start on write netcdf pickle alternative.
* Write dataplane array.
* Start on read of netcdf as pickle alternative.
* Create attribute variables.
* Use global attributes for met_info attrs.
* Add grid structure.
* Read metadata back into met_info.attrs.
* Convert grid.nx and grid.ny to int.
* Rename _name key to name.
* Removed pickle write.
* Fixed write_pickle_dataplane to work for both numpy and xarray.
* Use items() to iterate over key, value attrs.
* Write temporary text file.
* Renamed scripts.
* Changed script names in Makefile.am.
* Replaced pickle with tmp_nc.
* Fixed wrapper script names.
* Test for attrs in met_in.met_data.
* Initial version of read_tmp_point module.
* Added read_tmp_point.py to install list.
* Start on Python3_Script::read_tmp_point.
* Write MPR tmp ascii file.
* Renamed to read_tmp_ascii to use for point point and MPR.
* Renamed to read_tmp_ascii to use for point point and MPR.
* Define Python3_Script::import_read_tmp_ascii_py.
* Call Python3_Script::import_read_tmp_ascii_py.
* Append MET_BASE/wrappers to sys.path.
* Finished implementation of Python3_Script::import_read_tmp_ascii_py.
* Call Python3_Script::read_tmp_ascii in python_handler.
* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.
* Return PyObject* from Python3_Script::run.
* Restored call to run_python_string for now.
* Per #1429, enhance error message from DataLine::get_item(). (#1682)
* Feature 1429 tc_log second try (#1686)
* Per #1429, enhance error message from DataLine::get_item().
* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.
* Feature 1588 ps_log (#1687)
* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.
* Per #1588, switch very detailed interpolation details from debug level 4 to 5.
* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.
* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.
* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because of not evenly spaced
* #1454 Moved NC attribute name to nc_utils.h
* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance
* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance
* #1454 Corrected data.delta_lon
* #1454 Change bact to use diff instead of absolute value of diff
* 454 Deleted instead of commenting out
* 454 Deleted instead of commenting out
* Feature 1684 bss and 1685 single reference model (#1689)
* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.
* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.
* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.
* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.
* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.
* Per #1684, during development, I ran across and then updated this log message.
* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).
* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.
* Per #1684, just fixing the format of this log message.
* Per #1684, add a STATLine::get_offset() member function.
* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.
* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.
* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.
* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.
* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.
* Per #1684, remove debug print statements.
* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.
* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference: For a climo distribution defined by mean and stdev: DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble. For a single deterministic reference: DEBUG 3: Computing ensemble statistics relative to the climatological mean.
* Per #1691, add met-10.0.0-beta4 release notes. (#1692)
* Updated Python documentation
* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.
* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.
* Per #1694, just switching to consistent variable name.
* Just consistent spacing.
* Added python3_script::import_read_tmp_ascii.
* Restored read_tmp_ascii call.
* Added lookup into ascii module.
* Adding files for ReadTheDocs
* Adding .yaml file for ReadTheDocs
* Updated path to requirements.txt file
* Updated path to conf.py file
* Removing ReadTheDocs files and working in separate branch
* Return PyObject* from read_tmp_ascii.
* Put point_data in global namespace.
* Remove temporary ascii file.
* Added tmp_ascii_path.
* Removed read_obs_from_pickle.
* Trying different options for formats (#1702)
* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)
* Feature 1471 python_grid (#1704)
* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.
* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.
* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.
* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.
* Per #1471, document python grid changes in appendix F.
* Per #1471, just spacing.
* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <[email protected]>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced in a place where it's not defined.
* Add *.dSYM to the .gitignore files in the src and internal_tests directories.
* Replaced tmp netcdf _name attribute with name_str.
* Append user script path to system path.
* Revert "Feature 1319 no pickle" (#1717)
* Fixed typos, added content, and modified release date format
* #1715 Initial release
* #1715 Do not combine if there are no overlapping between TQZ and UV records
* #1715 Added pb2nc_compute_pbl_cape
* #1715 Added pb2nc_compute_pbl_cape
* #1715 Reduced obs_bufr_var. Removed pb_report_type
* #1715 Added a blank line for Error/Warning
* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)
* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)
* Bugfix 1716 develop perc_thresh (#1722)
* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.
* Per #1716, no code changes, just consistent formatting.
* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.
* Update pull_request_template.md
* Feature 1733 exc (#1734)
* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.
* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.
* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.
* Per #1773, update the user's guide with the new config and job command options.
* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.
* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND my string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.
* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.
* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.
* Bugfix 1737 develop little_r (#1739)
* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.
* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: Julie.Prestopnik <[email protected]>
Co-authored-by: David Fillmore <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>

Per #1737, use a regular expression to make sure the 18th element of …

af4f49c

…the header line contains a valid timestamp. If not, skip all observations for that record.
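
For readers following along, here is a rough Python sketch of the check this commit describes. It is an illustration only, not the C++ code in LittleRHandler::_readObservations(). The names `has_valid_timestamp` and `keep_good_reports` are hypothetical, and so are the input layout (each report already parsed into a list of header entries plus its observations), the wording of the warning text, and the choice to require that the whole entry match the pattern.

```python
import re

# Python spelling of the pattern quoted in the warning message. The logged
# form "[0-9]\{14\}" escapes the braces in POSIX basic-regex style.
TIMESTAMP_RE = re.compile(r"[0-9]{14}")
TIMESTAMP_INDEX = 17  # the 18th header entry, counted from zero


def has_valid_timestamp(header_entries):
    """Return True if the 18th header entry consists of exactly 14 digits."""
    if len(header_entries) <= TIMESTAMP_INDEX:
        return False
    # Assumption: the entire entry must match, not just a substring of it.
    return TIMESTAMP_RE.fullmatch(header_entries[TIMESTAMP_INDEX]) is not None


def keep_good_reports(reports):
    """Drop reports with a bad header timestamp and collect warnings.

    `reports` is a hypothetical pre-parsed layout: an iterable of
    (line_number, header_entries, observations) tuples.
    """
    kept, warnings = [], []
    for line_number, header_entries, observations in reports:
        if has_valid_timestamp(header_entries):
            kept.append((header_entries, observations))
        else:
            # Skip every observation in this report, but keep processing
            # the remaining reports so good records are still written.
            warnings.append(
                f"WARNING: skipping report on line {line_number}: "
                "the 18th header entry is not a 14-digit time string"
            )
    return kept, warnings
```

The point of the sketch is that a bad header costs only its own report: the loop records a warning and moves on, which matches the behavior described in the testing notes (warnings for bad records, output for all good ones).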

JohnHalleyGotway added this to the met-9.1.4 (bugfix) milestone Mar 30, 2021

JohnHalleyGotway requested a review from jprestop March 30, 2021 16:35

JohnHalleyGotway linked an issue Mar 30, 2021 that may be closed by this pull request

Fix ascii2nc to handle bad records in little_r format. #1737

Closed

21 tasks

jprestop approved these changes Mar 30, 2021

JohnHalleyGotway merged commit 408f52e into main_v9.1 Mar 30, 2021

JohnHalleyGotway deleted the bugfix_1737_main_v9.1_little_r branch March 30, 2021 22:55

JohnHalleyGotway mentioned this pull request Mar 31, 2021

Update develop-ref after #1734 and #1738 #1740

Merged

10 tasks
