-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update for zarr 2.18.0 and zarr 2.18.1 #195
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## dev #195 +/- ##
=======================================
Coverage 86.05% 86.05%
=======================================
Files 5 5
Lines 1162 1162
Branches 287 287
=======================================
Hits 1000 1000
Misses 107 107
Partials 55 55 ☔ View full report in Codecov by Sentry. |
Using the |
@oruebel There are also some deprecation warnings for zarr 3.0. I will add filters in this PR, but I'll also make a ticket for those deprecations to be resolved to whatever zarr will be pivoting to. |
Review Notes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for taking care of this.
Motivation
What was the reasoning behind this change? Please explain the changes briefly.
This PR is to update hdmf-zarr for changes that came n 2.18.0 and new changes in 2.18.0
2.18.0
This is in response to the failing workflows using the latest version of zarr, specifically the failing test test_fsspec_streaming.
The issue is how we are reading zarr scalar arrays. We are currently trying to directly index and access the scalar. From looking at the documentation, zarr accessed a scalar array as
z[:]
, which seems to solve the problem. I also saw thatz[()]
was another syntax that works.Why?
Not completely clear.
When we access a scalar array in numpy we do so with notation
[()]
. Supposedly, Zarr used to support indexing scalar arrays asz[0]
, but that was updated to numpy standard earlier in one of the zarr version 2.#. Maybe this was a functionality that finally got flushed out fully.2.18.1
To ensure that the data being assigned is in a format that the Zarr array can handle efficiently, Zarr arrays seem to require that we set with numpy arrays, i.e.,
dset[:] = np.array(data)
.Another weird behavior is when setting a dataset of references, the current method puts the entire dataset in each index vs matching the individual reference dictionary to the index (which is what we expect).
The solution to both is make sure the data is in an array first.
How to test the behavior?
Checklist
ruff
from the source directory.