[python] Dense subarray-write full corner handling #3271

Open
johnkerl opened this issue Nov 1, 2024 · 0 comments

Follow-on work from #2407.

With core 2.27 we have current-domain support for dense arrays, and thereby we can make the core domain (soma maxdomain) larger.

PR #3269 fixed a case for core 2.27 (found at TileDB-Inc/centralized-tiledb-nightlies#25) for this kind of write and readback:

[#####..................]
 ^^^^^                   data written
 ^^^^^^^^^^^^^^^^^^^^^^^ domain

That test case already existed, and it now passes with core 2.26 as well as core 2.27.

However, there are test cases which were never written, and which do not work (with core 2.26, 2.27, or otherwise), like this:

[.....#####.............]
      ^^^^^              data written
 ^^^^^^^^^^^^^^^^^^^^^^^ domain

[..................#####]
                   ^^^^^ data written
 ^^^^^^^^^^^^^^^^^^^^^^^ domain

Here's why. Say you have a dense 2-D array with shape (0,99), you wrote data in (40,43), and all you want to do is read it back. Everything's fine in core and libtiledbsoma except this bit:
https://github.com/single-cell-data/TileDB-SOMA/blob/1.15.0rc3/apis/python/src/tiledbsoma/_dense_nd_array.py#L224-L236
This is because the code there (written by yours truly many months ago) only looks at the upper bound of the non-empty domain, not the lower bound. So core gives back an array of length 4 (as it should, with values at 40,41,42,43) but here we mishandle it:
https://github.com/single-cell-data/TileDB-SOMA/blob/1.15.0rc3/apis/python/src/tiledbsoma/_dense_nd_array.py#L275

  • The arrow_table.column("soma_data") is fine (length 4)
  • The .to_numpy() on that is fine (length 4)
  • It's only the .reshape(target_shape) that's wrong because target_shape is 44
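
The mismatch in the bullets above can be modeled in a few lines of plain Python. This is a hedged sketch, not the code in `_dense_nd_array.py`: the helper names `shape_from_upper_only` and `shape_from_both_bounds` are hypothetical, and the "upper + 1" rule is an assumption inferred from the description above (upper bound 43 giving a target shape of 44). NumPy's `reshape` raises a `ValueError` when the element count does not match the requested shape, which is what the length check here stands in for.

```python
from math import prod


def shape_from_upper_only(non_empty_domain):
    """Hypothetical model of the reported behavior: size each
    dimension as (upper + 1), ignoring the lower bound."""
    return tuple(hi + 1 for (_lo, hi) in non_empty_domain)


def shape_from_both_bounds(non_empty_domain):
    """Hypothetical alternative: size each dimension as
    (upper - lower + 1), so interior and end-of-domain writes work."""
    return tuple(hi - lo + 1 for (lo, hi) in non_empty_domain)


# The example from this issue: data written at 40..43 inclusive.
non_empty_domain = [(40, 43)]
values_read = [0.0] * 4  # stand-in for the 4 cells that core returns

upper_only = shape_from_upper_only(non_empty_domain)
both_bounds = shape_from_both_bounds(non_empty_domain)

# A reshape can only succeed if the element counts agree.
print(upper_only, prod(upper_only) == len(values_read))    # (44,) False
print(both_bounds, prod(both_bounds) == len(values_read))  # (4,) True
```

When the write starts at index 0, both helpers agree, which is consistent with the first diagram's case passing while the other two do not. Whether the actual fix should derive the shape this way or take a different route is left open here.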

See also #3272 for the R side, which needs different work.
