key space for chunk manifest #71

d-v-b · 2024-04-03T18:59:56Z

If I understand correctly, the keys of the chunk manifest are chunk ids, which are strings like "0.0.0". If a zarr array has a chunk size of [10, 10, 10], then we can think of "0.0.0" as denoting the cube of indices [[0...9], [0...9], [0...9]], which in turn denotes the region of the virtual array that is backed by the data in that chunk.
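To make that mapping concrete, here is a minimal sketch of going from a chunk id string to the region it denotes, assuming regular (equal-sized) chunks and a "." separator. The function name `chunk_region` is hypothetical and is not part of zarr-python or VirtualiZarr; it also ignores truncation of the last chunk at the array edge.

```python
def chunk_region(chunk_id: str, chunk_shape: tuple[int, ...], sep: str = ".") -> tuple[slice, ...]:
    """Return the half-open index region denoted by a chunk id (hypothetical helper)."""
    idx = [int(part) for part in chunk_id.split(sep)]
    if len(idx) != len(chunk_shape):
        raise ValueError("chunk id and chunk shape must have the same number of dimensions")
    # Chunk i along a dimension with chunk length c covers indices [i*c, (i+1)*c).
    return tuple(slice(i * c, (i + 1) * c) for i, c in zip(idx, chunk_shape))


print(chunk_region("0.0.0", (10, 10, 10)))
print(chunk_region("1.0.2", (10, 10, 10)))
```

Here `slice(0, 10)` is the half-open equivalent of the inclusive range [0...9] used above.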

If a Zarr array is sliced arbitrarily, then some of its chunks might be sliced before being inserted into the Zarr array. In that case the region associated with a (chunk, slice) combination is not the full cube of indices associated with the string chunk id, but a subset of those indices (and the space of possible subsets is determined by the slicing operations that are allowed).

So I wonder what would break in virtualizarr if the name of a chunk were changed from the chunk ID string to a sliceable expression of the output region associated with that chunk (and if the slicing operation applied to the chunk were also expressed in the value associated with that key). This would define an abstract representation of the relationship between output regions and chunks that could be used internally in Zarr for lazy slicing, which I would like a lot :) If it's not workable to make this change, it would be great to know why so we can avoid it.
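As a rough illustration of the idea, here is a sketch of a manifest keyed by output region rather than by chunk id string, with the slice applied to each chunk stored in the value. Everything here (`Region`, `ChunkRef`, `entries_for`, the file names) is hypothetical and only meant to show the shape of the data structure; it is not how VirtualiZarr or zarr-python represent manifests.

```python
from dataclasses import dataclass

# Half-open (start, stop) bounds per dimension. Plain tuples are used as keys
# because slice objects are not hashable before Python 3.12.
Region = tuple[tuple[int, int], ...]


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical manifest value: where a chunk lives and which part of it is used."""
    path: str
    selection: Region  # in chunk-local coordinates


manifest: dict[Region, ChunkRef] = {
    # A whole 10x10x10 chunk fills the output region [0:10, 0:10, 0:10].
    ((0, 10), (0, 10), (0, 10)): ChunkRef("a.nc", ((0, 10), (0, 10), (0, 10))),
    # Only rows 2..6 of another chunk are used, filling output rows 10..14.
    ((10, 15), (0, 10), (0, 10)): ChunkRef("b.nc", ((2, 7), (0, 10), (0, 10))),
}


def overlaps(a: Region, b: Region) -> bool:
    """True if two half-open regions intersect in every dimension."""
    return all(a0 < b1 and b0 < a1 for (a0, a1), (b0, b1) in zip(a, b))


def entries_for(manifest: dict[Region, ChunkRef], region: Region) -> dict[Region, ChunkRef]:
    """Select the manifest entries needed to serve a requested output region."""
    return {key: ref for key, ref in manifest.items() if overlaps(key, region)}


hits = entries_for(manifest, ((12, 14), (0, 5), (0, 5)))
print(hits)
```

With keys like these, a lazy slice of the array reduces to intersecting the requested region with the keys, without assuming that every key spans a full chunk.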


TomNicholas · 2024-04-29T19:47:06Z

I'm still thinking about this suggestion @d-v-b, but one problem is that VirtualiZarr's array type should really expose a .chunks property that supports variable-length chunks (see #38).

Your statement "If a zarr array has a chunk size of [10, 10, 10], then we can think of "0.0.0" as denoting the cube of indices [[0...9], [0...9], [0...9]]" is only true if you assume chunks are all the same length.
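To illustrate the difference, here is a small sketch of how the region for a chunk id could be computed when chunk lengths vary along each dimension. It assumes the chunk lengths are given as a tuple of tuples (one tuple of lengths per dimension, in the style of dask's `.chunks`); the function name `chunk_region_variable` is hypothetical.

```python
from itertools import accumulate


def chunk_region_variable(chunk_id: str, chunks: tuple[tuple[int, ...], ...], sep: str = ".") -> tuple[slice, ...]:
    """Return the half-open region for a chunk id given per-dimension chunk lengths."""
    idx = [int(part) for part in chunk_id.split(sep)]
    if len(idx) != len(chunks):
        raise ValueError("chunk id and chunks must have the same number of dimensions")
    region = []
    for i, lengths in zip(idx, chunks):
        # Chunk boundaries are the cumulative sums of the chunk lengths.
        bounds = [0, *accumulate(lengths)]
        region.append(slice(bounds[i], bounds[i + 1]))
    return tuple(region)


# Lengths 10, 5, 20 along the first dimension and 10, 10 along the second.
chunks = ((10, 5, 20), (10, 10))
print(chunk_region_variable("1.0", chunks))
print(chunk_region_variable("2.1", chunks))
```

With variable-length chunks the id alone no longer determines the region; the full list of chunk lengths is needed, which is why the fixed-size reading of "0.0.0" does not generalise.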

TomNicholas added the "zarr-python" label (Relevant to zarr-python upstream) on May 16, 2024

TomNicholas mentioned this issue Jun 28, 2024: Incrementally-populated Zarr Arrays zarr-developers/zarr-specs#300 (Open)

TomNicholas mentioned this issue Jul 11, 2024: Chunk-aligned indexing #183 (Open)

TomNicholas mentioned this issue Jul 26, 2024: Splitting out lazy indexing layer and backends layer as zarr-python features pydata/xarray#9281 (Open)
