Poor performance when slicing with steps #843
@cgohlke, are there any updates on this?
It doesn't look like it. Timings are basically the same, maybe slightly worse:
Hi @cgohlke! Thanks for reporting this issue, and sorry that no one has replied to you yet. It sounds like you are suggesting an optimization so that strided indexing pulls only the chunks needed for the desired selection, rather than every chunk in the entire range. This sounds reasonable to me. Some of the relevant code is here: Line 1151 in 9995939
Right now a lot of Zarr dev effort is focused on a big refactor (#877), which should dramatically improve slicing performance. Hopefully this optimization can be included in that effort. In the meantime, if you want to work on implementing this feature yourself, we would do our best to support you in a PR. Thanks for your valuable feedback.
This stops chunks from being read unnecessarily when a slice selection with a step is used. Previously, all chunks spanning the start-end range were read regardless of whether they contained any selected elements. Fixes zarr-developers#843.
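The idea behind that change can be sketched in a few lines of plain Python. This is only an illustration of the chunk-selection arithmetic along one dimension, not the code from the actual fix; `chunks_needed` is a hypothetical helper and not part of zarr's API.

```python
def chunks_needed(sel, dim_len, chunk_len):
    """Chunk indices along one dimension that hold at least one selected element.

    Hypothetical helper for illustration only; not part of zarr's API.
    """
    picked = range(*sel.indices(dim_len))  # element indices the slice selects
    if len(picked) == 0:
        return []
    if abs(picked.step) >= chunk_len:
        # At most one selected element per chunk: map each element to its chunk.
        return sorted({i // chunk_len for i in picked})
    # Step smaller than a chunk: every chunk between the first and the last
    # selected element contains at least one selected element.
    lo = min(picked[0], picked[-1])
    hi = max(picked[0], picked[-1])
    return list(range(lo // chunk_len, hi // chunk_len + 1))


# The case from this issue: 149568 rows, one row per chunk, every 512th row.
every_512th = chunks_needed(slice(None, None, 512), 149568, 1)
print(len(every_512th))  # 293

# A smaller case: elements 0, 25, 50, 75 fall in chunks 0, 2, 5, 7 only.
print(chunks_needed(slice(0, 100, 25), 100, 10))  # [0, 2, 5, 7]
```

Reading every chunk between the first and last selected element would touch 8 chunks in the smaller case and all 149568 in the larger one, which is the behavior reported here.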
Minimal, reproducible code sample, a copy-pastable example if possible
Output:
Problem description
Consider the minimal Zarr store posted above, which models the storage of a ~27 GB RGB image as individual scanlines/chunks in a TIFF file (`shape=(149568, 66369, 3)`, `chunks=(1, 66369, 3)`, `dtype='uint8'`). The store does no I/O and returns a pre-allocated chunk in `__getitem__` to keep the overhead from the store at a minimum.
Slicing the whole array with `z[:]` takes ~9.3 seconds. OK.
Slicing the first 293 chunks with `z[:293]` takes ~0.02 s. OK.
Slicing every 512th chunk with `z[::512]`, 293 chunks total, takes ~2.2 s. That's a hundred times what could be expected. It turns out that all chunks are read/accessed, not just the 293 chunks needed. This issue becomes significant when `__getitem__` performs real work, I/O or decoding.
Even when slicing only a single chunk via step size with `z[::149568]`, all chunks are read.
This behavior can be reproduced with a zarr MemoryStore.
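For reference, the figures quoted above can be checked with plain Python arithmetic. This is a sketch that only counts rows and bytes from the stated shape and chunk layout; it does not touch zarr, and the names `rows_selected` and `needed` are illustrative.

```python
# Shape and chunk layout as stated in the issue.
n_rows, row_len, n_channels = 149568, 66369, 3
chunk_rows = 1  # one scanline per chunk, so rows selected == chunks needed

# Uncompressed size of the uint8 array in bytes.
nbytes = n_rows * row_len * n_channels


def rows_selected(sel):
    """Number of rows a slice selects along the first axis."""
    return len(range(*sel.indices(n_rows)))


needed = {
    "z[:]": rows_selected(slice(None)),
    "z[:293]": rows_selected(slice(None, 293)),
    "z[::512]": rows_selected(slice(None, None, 512)),
    "z[::149568]": rows_selected(slice(None, None, 149568)),
}

print(f"array size: {nbytes / 2**30:.1f} GiB")
for label, count in needed.items():
    print(f"{label:12s} needs {count} of {n_rows} chunks")
```

With one row per chunk, `z[::512]` needs the same number of chunks as `z[:293]`, so a comparable runtime would be the natural expectation.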
Is this behavior expected? Can the store be improved to work more efficiently with zarr slicing?
I tried two workarounds:
1. Manually indexing the chunks via a list, `zobj.oindex[list(range(0, 149568, 512))]`. This only reads the expected number of chunks, but it is a burden and does not improve the performance in this simple case.
2. Using dask to slice the store with steps (`dask.array.from_zarr(store)[::512]`) performs better than zarr and only reads the expected number of chunks. However, the overhead of dask+zarr is generally much larger than zarr alone (performance, memory, and dependencies).