Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sqlite null fix #673

Merged
merged 3 commits into from
Nov 9, 2023
Merged

Sqlite null fix #673

merged 3 commits into from
Nov 9, 2023

Conversation

hellkite500
Copy link
Contributor

Fix handling of null/nan enteries in hydrofabric.

Changes

  • Allow sqlite iterator to read column types per iteratorion
  • Check for string "null" in alt_ids when referencing feature by alternative id property

Testing

  1. Tested locally, and via local use of the partition generator to generate partitions of huc01 hydrofabric.

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Target Environment support

  • Linux
  • MacOS

@hellkite500 hellkite500 requested a review from program-- November 9, 2023 00:19
else if( returncode == SQLITE_ROW){
//Update column dtypes for the next row
for (int i = 0; i < this->column_count && i < column_types.size(); i++) {
this->column_types[i] = (sqlite3_column_type(this->ptr(), i));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be done for every row returned by the iterator, or just once for the first row we get? Why can/would column types differ from row to row?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the value in a column is NaN, then sqlite3_column_type will report that as a 5 or NULL type. So if the first row contains NaN but other rows have valid data, then all types are reported as NULL. (This is how we discovered this issue...)

Re-checking for each row ensures we get a non-null type if there is valid data in the column in that row.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For additional context, sqlite3 has table definitions that include a schema (i.e. PRAGMA table_info(...);), however, when interacting with rows, sqlite does not refer to the types in the table definition, but instead rely on row-local affinity (or something of that nature) to determine types -- so columns do not have a type associated wholly. Hence, the need to check each row.

The other alternative to this (but a bit more involved) is to pull the types from the table definition when an iterator is created. This solution of pulling types on each iteration is a bit quicker to get into the codebase though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the context that was missing. Nels, could you leave a synopsis of that as a comment in the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@PhilMiller PhilMiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a typo in your second commit message - 'alt_it' vs 'alt_id', that would make this commit harder to find by searching later

@hellkite500
Copy link
Contributor Author

You've got a typo in your second commit message - 'alt_it' vs 'alt_id', that would make this commit harder to find by searching later

Third time's a charm, should be fixed now...

@donaldwj donaldwj merged commit 0a5e1f8 into NOAA-OWP:master Nov 9, 2023
19 checks passed
program-- added a commit to program--/ngen that referenced this pull request Nov 9, 2023
hellkite500 pushed a commit that referenced this pull request Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants