[python] Fix readback of some older-style metadata #3653

Merged
merged 2 commits into from
Jan 31, 2025

@johnkerl johnkerl commented Jan 31, 2025

Issue and/or context: sc-62798

The bug was introduced in TileDB-SOMA 1.15.4 via PR #3607, which was the backport of #3558 to the release-1.15 branch.

Changes:

Metadata were reading back incorrectly when their values' tdb_type was TILEDB_STRING_ASCII (11) rather than TILEDB_STRING_UTF8 (12).

Note, however, that I ran the following test today:

  • Write arrays using TileDB-SOMA 1.14.5, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, and 1.15.5, pip-installing each version in turn
  • Outer-loop over those same versions, pip-installing each one
    • Inner-loop over the arrays from above
    • Try using version y to read metadata written with version x
    • In no case could I get a failure
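
The write/read matrix described above can be sketched as a plan of (reader, writer) pairs. This is a minimal illustration, not the script used for the test: the helper names read_plan and pip_command are hypothetical, and the sketch only enumerates the combinations rather than installing anything or reading any arrays.

```python
from itertools import product

# Versions listed in the description above.
VERSIONS = ["1.14.5", "1.15.0", "1.15.1", "1.15.2", "1.15.3", "1.15.4", "1.15.5"]


def read_plan(versions):
    """Return (reader, writer) pairs: outer loop over readers, inner over writers."""
    return list(product(versions, repeat=2))


def pip_command(version):
    """Build the install command for one version (hypothetical helper)."""
    return f"pip install tiledbsoma=={version}"


plan = read_plan(VERSIONS)
```

Each pair corresponds to one "version y reads what version x wrote" check, so seven versions give 49 combinations in total.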

Another test:

  • I read various old arrays I'd created 1, 3, 6 months ago, and could not repro the bug

In short, this occurs for some arrays but not all arrays. (All arrays tested in-house read back fine, with tiledbsoma 1.15.5).

Notes for Reviewer:

Uncertain how to test, given failure to repro with any of my historical or current data.

One idea:

  • Hack tiledbsoma (in sandbox) to write metadata with value-type TILEDB_STRING_ASCII
  • Create some data
  • Save that off in the repo as canned data
  • Run unit tests on that
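
A unit test over canned data could compare the metadata read back against a stored set of expected values. The following is a hedged sketch of such a comparison helper only; the function name metadata_mismatches and the example keys are hypothetical, and it assumes the metadata read from the canned array behaves like an ordinary mapping.

```python
def metadata_mismatches(expected, actual):
    """Return {key: (expected, actual)} for each key that is missing or differs.

    Values must match in both value and type, so that e.g. a bytes value
    read back in place of a str is reported.
    """
    missing = object()  # sentinel distinguishing "absent" from a stored None
    out = {}
    for key, want in expected.items():
        got = actual.get(key, missing)
        if got is missing:
            out[key] = (want, None)
        elif got != want or type(got) is not type(want):
            out[key] = (want, got)
    return out


# Illustrative values only; not taken from a real array.
expected = {"note": "hello", "origin": "canned"}
readback = {"note": "h", "origin": "canned"}
problems = metadata_mismatches(expected, readback)
```

In a real test, readback would come from opening the canned array with the version under test, and the test would assert that the returned dictionary is empty.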

@johnkerl johnkerl marked this pull request as ready for review January 31, 2025 20:19
@johnkerl johnkerl requested a review from jp-dark January 31, 2025 20:21

codecov bot commented Jan 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.30%. Comparing base (a42d6b5) to head (a67d998).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3653      +/-   ##
==========================================
+ Coverage   86.25%   86.30%   +0.04%     
==========================================
  Files          55       55              
  Lines        6381     6381              
==========================================
+ Hits         5504     5507       +3     
+ Misses        877      874       -3     
Flag Coverage Δ
python 86.30% <ø> (+0.04%) ⬆️

Components Coverage Δ
python_api 86.30% <ø> (+0.04%) ⬆️
libtiledbsoma ∅ <ø> (∅)

johnkerl commented Jan 31, 2025

I also validated that the alternative fix

         } else {
-            py::dtype value_type = tdb_to_np_dtype(tdb_type, 1);
+            py::dtype value_type = tdb_to_np_dtype(tdb_type, value_num);
             results[py::str(key)] = py::array(value_type, value_num, value)
                                         .attr("item")(0);
         }

resolves the reported symptom.

(Note that the hard-coded 1 predates #3558.)
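
To see why the element count passed when building the dtype can matter, here is a small NumPy-only sketch. It assumes an ASCII value is viewed through a fixed-width bytes dtype ("S1" versus "S<value_num>"); the actual mapping performed by tdb_to_np_dtype is not shown in this thread, and the sketch does not reproduce the pybind11 py::array call or claim to match the exact reported symptom.

```python
import numpy as np

raw = b"hello"        # stand-in for an ASCII metadata value (illustrative)
value_num = len(raw)  # number of bytes in the value

# Width-1 dtype: the buffer is seen as value_num one-byte elements,
# so item(0) yields only the first byte.
as_chars = np.frombuffer(raw, dtype="S1")
first_only = as_chars.item(0)

# Width-value_num dtype: the buffer is seen as a single element,
# so item(0) yields the whole value.
as_string = np.frombuffer(raw, dtype=f"S{value_num}")
whole = as_string.item(0)
```

Under these assumptions, the width-1 view exposes a single character per element, while the width-value_num view returns the full string from item(0).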

@@ -235,7 +235,7 @@ py::dict meta(std::map<std::string, MetadataValue> metadata_mapping) {
             results[py::str(key)] = py::array(value_type, value_num, value)
                                         .attr("item")(0);
         } else {
-            py::dtype value_type = tdb_to_np_dtype(tdb_type, 1);
+            py::dtype value_type = tdb_to_np_dtype(tdb_type, value_num);

Either of these two lines fixes the symptom. As noted in the description field, though, we need more unit-test coverage here.

@johnkerl johnkerl merged commit 75e7e9a into main Jan 31, 2025
12 checks passed
@johnkerl johnkerl deleted the kerl/unimeta branch January 31, 2025 21:20
github-actions bot pushed a commit that referenced this pull request Jan 31, 2025
* [python] Fix readback of some older-style metadata

* other fix; why not both?
johnkerl added a commit that referenced this pull request Jan 31, 2025
* [python] Fix readback of some older-style metadata

* other fix; why not both?

Co-authored-by: John Kerl <[email protected]>