Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ngff zarr): add multi channel support #480

Merged
merged 12 commits into from
May 17, 2022

Conversation

PaulHax
Copy link
Collaborator

@PaulHax PaulHax commented May 4, 2022

Somehow these hacks work for all the datasets I've loaded except for the 2 lowest resolution scales in oeway's astronaut where the chunks span components.

@PaulHax PaulHax requested a review from thewtex May 4, 2022 22:07
@thewtex
Copy link
Member

thewtex commented May 5, 2022

const chunkSize = toDimensionArray(['c', 'x', 'y', 'z'], info.chunkSize)

Instead of ['c', 'x', 'y', 'z'], this should be info.dims now, correct?

@PaulHax
Copy link
Collaborator Author

PaulHax commented May 6, 2022

Works on oeway's astronaut now. Some fishy stuff with the x chunk index to shift down x rows:

pixelArray[
                cc +
                  zz * pixelStrides.z +
                  (yy + x) * pixelStrides.y +
                  xx * pixelStrides.x
              ] =

@PaulHax PaulHax force-pushed the zarr-channels branch 2 times, most recently from 4a11f39 to 54457d8 Compare May 6, 2022 16:43
@oeway
Copy link
Collaborator

oeway commented May 6, 2022

Thanks for working on this! Looking forward it! FYI: I am currently making a browser file system that can handle large files backed by indexeddb. Would be nice to test using pyodide/jupyterlite + zarr-python to read in-browser large files, then pipe the data as a zarr store to itk-vtk-viewer!

}

const chunkSize = Object.fromEntries(info.chunkSize)
const chunkStrides = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk strides for each dimension will be based on the input MultiscaleSpatialImage dims and their order: maybe a slice here on the dims array starting on the dim index and a reduce based on the chunkSize.

@PaulHax PaulHax force-pushed the zarr-channels branch 2 times, most recently from ced60eb to 0822171 Compare May 9, 2022 13:16
@PaulHax PaulHax marked this pull request as ready for review May 9, 2022 13:16
Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦖 🌮 🎉

Looks awesome!

Steps for optimization noted inline that I think we should add before merging.

Comment on lines 99 to 119
for (let cc = itStart.c; cc < itEnd.c; cc++) {
for (let zz = itStart.z; zz < itEnd.z; zz++) {
for (let yy = itStart.y; yy < itEnd.y; yy++) {
for (let xx = itStart.x; xx < itEnd.x; xx++) {
pixelArray[
cc +
xx * pixelStrides.x +
yy * pixelStrides.y +
zz * pixelStrides.z
] =
chunk[
// subtract x * chunkSize.x from xx to start at beginning of chunk despite itStart.x
(xx - x * chunkSize.x) * chunkStrides.x +
(yy - y * chunkSize.y) * chunkStrides.y +
(zz - z * chunkSize.z) * chunkStrides.z +
(cc - c * chunkSize.c) * chunkStrides.c
]
} // column
} // row
} // slice
} // component
} // chunk
Copy link
Member

@thewtex thewtex May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimizations to add to this critical piece:

[ ] 1. Three versions of the loop based on the dim order. a) tczyx, b) tzyxc c) other. a) for most OME-NGFF, b for the interleaved channel c) in the wild, e.g. it is also common to have tzcyz. c) can be like it is now. a) and b) will further be optimized in the following steps.

[ ] 2. For the conditions where we can use them in the inner-most loop, use .subarray and .set for getting and setting the contiguous region.

[ ] 3. Cache the offsets in the intermediate loops.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added caching of offsets. Do we have any example images of non tczyx dim ordering we would like to support? I have not tested any...

@thewtex
Copy link
Member

thewtex commented May 12, 2022

Looking good. Can we base and add tests for the new datasets?

@PaulHax
Copy link
Collaborator Author

PaulHax commented May 12, 2022

Looking good. Can we base and add tests for the new datasets?

Added tests for a selection of the working images.

Some issues:

  • All the first_instar_brains: .zattrs 'axes' metadata order does not match the array's .zattrs "_ARRAY_DIMENSIONS". I'll change code to use _ARRAY_DIMENSIONS.
  • first_instar_brain_czyx.zarr/.zattrs file is empty {}
  • Most of the idr datasets have their dtype endianness the way we don't support, >u2 and >f4. Not sure what to do there.

@PaulHax PaulHax force-pushed the zarr-channels branch 2 times, most recently from 152d4fe to 761e8b7 Compare May 13, 2022 16:48
@thewtex
Copy link
Member

thewtex commented May 13, 2022

For the first two, we should fix the data generator. Could you please create issues on the multi-scale-spatial-image repository, and assign it to me?

For the endianness issue, we should transform an chunk array from big to little before loading into the itk.Image. Could you please look for a JS library to do that and apply when required?

Chrome --no-sandbox
Thread --dockered flag through Karma to toggle some tests
Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good.

['f8', 'getFloat64'],
])

export const ElementGetter = (dtype, buffer) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For efficiency, let's have this operate over the chunk TypedArray instead of the element

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where to do this optimization? Current in worker flow: Dataview(chunkBuffer).getX(index, endiness) -> IJKTypedArray.

@PaulHax PaulHax force-pushed the zarr-channels branch 3 times, most recently from b10d70c to 0e2b82d Compare May 16, 2022 15:53
@PaulHax
Copy link
Collaborator Author

PaulHax commented May 16, 2022

[ ] 2. For the conditions where we can use them in the inner-most loop, use .subarray and .set for getting and setting the contiguous region.

Implemented the above with guidance from the old code. Good speedup with first_instar_brain_zyxc.zarr highest resolution taking the worker ~11ms vs old way of ~26ms.

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turbo-tacular!~ 🚀 👨‍🎤

Please take a look at a comment inline, follow-up as needed, and merge this puppy. 🥺

dataset,
}) => {
const dims =
pixelArrayAttrs?._ARRAY_DIMENSIONS ?? // xarray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just change the order here -- first try / trust axes, then _ARRAY_DIMENSIONS?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that many NGFFs don't have a foo.zarr/s0/.zattrs where _ARRAY_DIMENSIONS lives. To avoid 404s the code is not fetching the scale level .zattrs, so no _ARRAY_DIMENSIONS to fall back to.

@thewtex thewtex merged commit aef6541 into Kitware:master May 17, 2022
@oeway
Copy link
Collaborator

oeway commented May 18, 2022

Hi @PaulHax @thewtex Awesome to see this PR get merged! Well done, can't wait to try it.

It looks like the master branch CI is failing and that it explains why I cannot already show the multi-channel zarr images.

@github-actions
Copy link

🎉 This PR is included in version 12.3.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@thewtex thewtex mentioned this pull request May 18, 2022
@thewtex
Copy link
Member

thewtex commented May 18, 2022

Hi @oeway CI was given some ❤️ , v12.3.0 is now out with these improvements, please give it a whirl!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants