Navigating image stack #478

Open
jakirkham opened this issue Sep 23, 2017 · 7 comments
@jakirkham
I'm trying to figure out to what degree this is possible with datashader. Since I haven't really used this library yet, this question may be pretty naive. Thanks in advance for any advice.

I'm interested in taking a stack of arrays (larger than memory) and navigating through them in a Jupyter notebook with a slider. It would be nice to see the intensity range in a colorbar as well. The arrays themselves are usually float32, though the dtype might sometimes differ (e.g. int16). The values in the arrays are simply intensities. I would want to normalize the view based on the maximum and minimum values in the stack. I can pretty easily represent this stack of images using a Dask Array that either pulls from a computation or, if necessary, pulls from a large file on disk. I could also provide something that acts like a NumPy array reading directly from disk.

To what extent is something like this possible? Are there any useful examples that might be relevant for this use case? What sort of issues might one encounter trying to handle this use case?
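For context, the kind of larger-than-memory stack described above can be assembled lazily with dask before any plotting enters the picture. A minimal sketch, where `read_frame` is a hypothetical loader (the shapes and counts are made up for illustration):

```python
import dask
import dask.array as da
import numpy as np

def read_frame(i):
    # Hypothetical loader: in practice this would read frame i from
    # disk or pull it from a computation
    return np.random.random((256, 256)).astype('float32')

# Each frame becomes one lazy dask chunk; nothing is read until needed
frames = [da.from_delayed(dask.delayed(read_frame)(i),
                          shape=(256, 256), dtype='float32')
          for i in range(50)]
stack = da.stack(frames, axis=0)  # lazy (50, 256, 256) array

# Global min/max for normalizing the view, computed in one pass
vmin, vmax = da.compute(stack.min(), stack.max())
```

`da.compute` evaluates both reductions in a single traversal of the stack, which matters when each frame read is expensive.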


jbednar commented Sep 24, 2017

This should be a good use case for datashader+holoviews+xarray+dask. I'd personally like to try it out for a large stack of microscope images scanned from a volume of tissue, particularly for EM images where each individual image is large as well. So far, our raster-image testing has only used geo-located data, which tends to have only a dozen or so co-registered layers, not a deep z-stack. There have been lots of changes over the past few months that should make such usage vastly more practical:

  • datashader is now supported by holoviews, which makes it easy to navigate multidimensional spaces using widgets
  • datashader's regridding support for rasters has recently been completely overhauled for performance, though I think it still needs major API improvements
  • the most recent dev release of Bokeh (0.12.10, from conda install -c bokeh/label/dev bokeh), provides binary array transport for the notebook, making it much, much faster to transfer image data from Python into the browser
  • xarray has improved its file-reading support, making it simpler to get image data into a dask.array data structure, and in response datashader now accepts xarray data directly
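On the xarray point: the key property is that a chunked DataArray is backed by a dask array, so downstream consumers pull slices on demand. A minimal sketch of that pattern, using an in-memory array as a stand-in for data that would normally come from something like `xr.open_dataset(path, chunks={'z': 1})`:

```python
import numpy as np
import xarray as xr

# Stand-in for data on disk; with real files you would open the
# dataset with a chunks= argument to get the same lazy behaviour
data = np.random.random((4, 16, 16)).astype('float32')
xrstack = xr.DataArray(data, dims=['z', 'y', 'x'], name='Intensity')

# .chunk() rewraps the values in a dask array, one chunk per z-slice
lazy = xrstack.chunk({'z': 1})
```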

So there is good reason to think that this should now be a practical way to work with large image stacks. Unfortunately, I haven't personally tested this case, and I would suspect that there will still be some practical problems when trying to get that set up. @philippjfr, maybe you could sketch the outlines of some HoloViews-based, Datashader-backed code that you think could work for such a case? The basic idea should be trivial; it's just trying to make sure that the data doesn't all get read into memory or copied that I'd be worried about.


philippjfr commented Sep 24, 2017

Yes, it does sound like a good use case, and I agree with the approach you have outlined. Depending on the size of each individual image, I'm not even sure that datashader is necessary. I'll briefly sketch out two approaches. Here is one where you define a very simple function to load the data as a NumPy array, which will be lazily executed as you drag the slider:

import holoviews as hv
from holoviews.operation.datashader import regrid
hv.extension('bokeh')

def load_image(z):
    arr = ...  # Load data for a particular z-value from disk as a NumPy array
    # left, bottom, right, top are the spatial extents of the image
    return hv.Image(arr, bounds=(left, bottom, right, top), vdims='Intensity')

# Define DynamicMap with z-dimension to slide through
image_stack = hv.DynamicMap(load_image, kdims=['z'])

# Apply regridding in case data is large and set a global Intensity range 
regridded = regrid(image_stack).redim.range(Intensity=(0, 1000))
display_obj = regridded.opts(plot={'Image': dict(width=400, height=400, colorbar=True)})

Another approach is to use xarray to wrap and label your dask arrays (note that this depends on a datashader PR I just opened, #479):

import dask.array as da
import numpy as np
import xarray as xr
import holoviews as hv

from holoviews.operation.datashader import regrid

hv.extension('bokeh')

# Load DataArray in some way
sizes = (1000, 1000, 100)
darr = da.random.random(sizes, chunks=100)

# Wrap in xarray DataArray and label coordinates
dims = ['z', 'y', 'x']
coords = {d: np.arange(s) for d, s in zip(dims, sizes)}
xrstack = xr.DataArray(darr, dims=dims, coords=coords)

# Wrap in HoloViews Dataset
ds = hv.Dataset(xrstack)

# Convert to stack of images with x/y-coordinates along axes
image_stack = ds.to(hv.Image, ['x', 'y'], dynamic=True)

# Apply regridding if each image is large
regridded = regrid(image_stack)

# Set a global Intensity range
regridded = regridded.redim.range(Intensity=(0, 1000))

# Set plot options
display_obj = regridded.opts(plot={'Image': dict(colorbar=True, width=400, height=400)})

Once you have display_obj, you can display it directly in the notebook, or use hv.renderer('bokeh').server_doc(display_obj) to turn the whole thing into a Bokeh server app.


jbednar commented Sep 24, 2017

Sounds good, thanks! I've merged that PR, which looked uncontroversial to me. Is there a dataset that we could use to make this a public example? I'd like to try it out, myself, particularly for the case where each image is very large and we're only looking at a small patch of it at a time.

@jakirkham (Author)

Thanks very much for the feedback. Sorry I've been slow replying on this front. I'm in the very early stages of playing with holoviews and datashader. That said, this should work nicely based on my early prototyping, which borrows heavily from the comments here. :)

Most of the image data I'm playing with tends to have smaller frames ATM (roughly 1000x1000), though lots of them (on the order of 10,000 per stack); some larger frame-size data may well crop up in the future. Some good public data of this sort can be found at Neurofinder, though it is all calcium imaging data.

Since you mentioned EM data, I would suggest taking a look at the data hosted in our local DVID instance. This is all public data AFAIK. Maybe @stuarteberg can point you to some of interest. There is a viewer in the web interface to look at various portions, and some open source software to retrieve data from DVID. You can also use the web API to grab pieces and stitch them together; @mrocklin did this with Dask on some older, now unavailable(?), data in this gist.

Hope that helps. Please let me know if you have any other questions.

VolkerH commented Mar 27, 2018

Before I embark on this journey myself, I just wanted to ask whether anyone has done further work in this direction?


jbednar commented Mar 27, 2018

Yes! To quote from issue holoviz/geoviews#144 :

You can see an example of navigating an EM stack at pangeo.pydata.org (launch a server and select scikit-image-example.ipynb from the examples). If you try that example on your own machine, you should use the very latest HoloViews alpha release 1.10.0a2 (from conda install -c ioam/label/dev holoviews), which has massive speed improvements for this case.

@underchemist
I'm attempting to follow the examples by @philippjfr, except with hv.RGB, and I'm running into issues. I have multiple jpeg (3-channel) images in a directory that I've loaded into an xarray with dimensions z, x, y, channel, where z indexes the images. I want to be able to display the images and have a slider to navigate through the z dimension. I get an error when converting:

xrstack = ... # load images
ds = hv.Dataset(xrstack)
image_stack = ds.to(hv.RGB, ['x', 'y'], dynamic=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-80b5bac31eea> in <module>
----> 1 image_stack = ds.to(hv.RGB, ['x', 'y'], dynamic=True)
/data/CondaEnvs/ya006948/envs/sbda/lib/python3.7/site-packages/holoviews/core/data/__init__.py in __call__(self, new_type, kdims, vdims, groupby, sort, **kwargs)
    141                 (max_d is not None and len(dims) > max_d)):
    142                 raise ValueError("%s %s must be between length %s and %s." %
--> 143                                  (type_name, dim_type, min_d, max_d))
    144 
    145         if groupby is None:
ValueError: RGB vdims must be between length 3 and 4.

From looking at the docs for hv.RGB, it seems like input in the form of an NxMx(3 or 4) array should be accepted. If this isn't supported when declared through ds.to(...), how can I reshape my dataset into the form hv.RGB expects?
