Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: pickle serialization for itk images (issue 1091) #2829

Merged

Conversation

GenevieveBuckley
Copy link
Contributor

This PR adds pickle support to ITK images. We want this so that ITK images can be serialized for use with Dask arrays.

Closes #1091

cc @thewtex

PR Checklist

  • No API changes were made (or the changes have been approved)
  • No major design changes were made (or the changes have been approved)
  • Added test (or behavior not changed)
  • Updated API documentation (or API not changed)
  • Added license to new files (if any)
  • Added Python wrapping to new files (if any) as described in ITK Software Guide Section 9.5
  • Added ITK examples for all new major features (if any)

Refer to the ITK Software Guide for
further development details if necessary.

@github-actions github-actions bot added the area:Python wrapping Python bindings for a class label Oct 21, 2021
@GenevieveBuckley
Copy link
Contributor Author

GenevieveBuckley commented Oct 21, 2021

This is not going to pass the CI checks right now. There's a few more steps:

  • 1. We need to add the functions itkimage_to_json & itkimage_to_json (plus their helper functions _image_to_type & _type_to_image). Question: Where do these need to live?
Click to expand details to see what python code we need to add:
import os
import six
from functools import reduce
import numpy as np
try:
    import zstandard as zstd
except ImportError:
    import zstd

import itk


def _image_to_type(itkimage):  # noqa: C901
    component = itk.template(itkimage)[1][0]
    if component == itk.UL:
        if os.name == 'nt':
            return 'uint32_t', 1
        else:
            return 'uint64_t', 1
    mangle = None
    pixelType = 1
    if component == itk.SL:
        if os.name == 'nt':
            return 'int32_t', 1,
        else:
            return 'int64_t', 1,
    if component in (itk.SC, itk.UC, itk.SS, itk.US, itk.SI, itk.UI, itk.F,
            itk.D, itk.B):
        mangle = component
    elif component in [i[1] for i in itk.Vector.items()]:
        mangle = itk.template(component)[1][0]
        pixelType = 5
    elif component == itk.complex[itk.F]:
        # complex float
        return 'float', 10
    elif component == itk.complex[itk.D]:
        # complex float
        return 'double', 10
    elif component in [i[1] for i in itk.CovariantVector.items()]:
        # CovariantVector
        mangle = itk.template(component)[1][0]
        pixelType = 7
    elif component in [i[1] for i in itk.Offset.items()]:
        # Offset
        return 'int64_t', 4
    elif component in [i[1] for i in itk.FixedArray.items()]:
        # FixedArray
        mangle = itk.template(component)[1][0]
        pixelType = 11
    elif component in [i[1] for i in itk.RGBAPixel.items()]:
        # RGBA
        mangle = itk.template(component)[1][0]
        pixelType = 3
    elif component in [i[1] for i in itk.RGBPixel.items()]:
        # RGB
        mangle = itk.template(component)[1][0]
        pixelType = 2
    elif component in [i[1] for i in itk.SymmetricSecondRankTensor.items()]:
        # SymmetricSecondRankTensor
        mangle = itk.template(component)[1][0]
        pixelType = 8
    else:
        raise RuntimeError('Unrecognized component type: {0}'.format(str(component)))
    _python_to_js = {
        itk.SC: 'int8_t',
        itk.UC: 'uint8_t',
        itk.SS: 'int16_t',
        itk.US: 'uint16_t',
        itk.SI: 'int32_t',
        itk.UI: 'uint32_t',
        itk.F: 'float',
        itk.D: 'double',
        itk.B: 'uint8_t'
    }
    return _python_to_js[mangle], pixelType


def itkimage_to_json(itkimage, manager=None):
    """Serialize a Python itk.Image object.
    Attributes of this dictionary are to be passed to the JavaScript itkimage
    constructor.
    """
    if itkimage is None:
        return None
    else:
        direction = itkimage.GetDirection()
        directionMatrix = direction.GetVnlMatrix()
        directionList = []
        dimension = itkimage.GetImageDimension()
        pixel_arr = itk.array_view_from_image(itkimage)
        componentType, pixelType = _image_to_type(itkimage)
        if 'int64' in componentType:
            # JavaScript does not yet support 64-bit integers well
            if componentType == 'uint64_t':
                pixel_arr = pixel_arr.astype(np.uint32)
                componentType = 'uint32_t'
            else:
                pixel_arr = pixel_arr.astype(np.int32)
                componentType = 'int32_t'
        compressor = zstd.ZstdCompressor(level=3)
        compressed = compressor.compress(pixel_arr.data)
        pixel_arr_compressed = memoryview(compressed)
        for col in range(dimension):
            for row in range(dimension):
                directionList.append(directionMatrix.get(row, col))
        imageType = dict(
            dimension=dimension,
            componentType=componentType,
            pixelType=pixelType,
            components=itkimage.GetNumberOfComponentsPerPixel()
        )
        return dict(
            imageType=imageType,
            origin=tuple(itkimage.GetOrigin()),
            spacing=tuple(itkimage.GetSpacing()),
            size=tuple(itkimage.GetBufferedRegion().GetSize()),
            direction={'data': directionList,
                       'rows': dimension,
                       'columns': dimension},
            compressedData=compressed
        )


def _type_to_image(jstype):
    _pixelType_to_prefix = {
        1: '',
        2: 'RGB',
        3: 'RGBA',
        4: 'O',
        5: 'V',
        7: 'CV',
        8: 'SSRT',
        11: 'FA'
    }
    pixelType = jstype['pixelType']
    dimension = jstype['dimension']
    if pixelType == 10:
        if jstype['componentType'] == 'float':
            return itk.Image[itk.complex, itk.F], np.float32
        else:
            return itk.Image[itk.complex, itk.D], np.float64

    def _long_type():
        if os.name == 'nt':
            return 'LL'
        else:
            return 'L'
    prefix = _pixelType_to_prefix[pixelType]
    _js_to_python = {
        'int8_t': 'SC',
        'uint8_t': 'UC',
        'int16_t': 'SS',
        'uint16_t': 'US',
        'int32_t': 'SI',
        'uint32_t': 'UI',
        'int64_t': 'S' + _long_type(),
        'uint64_t': 'U' + _long_type(),
        'float': 'F',
        'double': 'D'
    }
    _js_to_numpy_dtype = {
        'int8_t': np.int8,
        'uint8_t': np.uint8,
        'int16_t': np.int16,
        'uint16_t': np.uint16,
        'int32_t': np.int32,
        'uint32_t': np.uint32,
        'int64_t': np.int64,
        'uint64_t': np.uint64,
        'float': np.float32,
        'double': np.float64
    }
    dtype = _js_to_numpy_dtype[jstype['componentType']]
    if pixelType != 4:
        prefix += _js_to_python[jstype['componentType']]
    if pixelType not in (1, 2, 3, 10):
        prefix += str(dimension)
    prefix += str(dimension)
    return getattr(itk.Image, prefix), dtype


def itkimage_from_json(js, manager=None):
    """Deserialize a Javascript itk.js Image object."""
    if js is None:
        return None
    else:
        ImageType, dtype = _type_to_image(js['imageType'])
        decompressor = zstd.ZstdDecompressor()
        if six.PY2:
            asBytes = js['compressedData'].tobytes()
            pixelBufferArrayCompressed = np.frombuffer(asBytes, dtype=np.uint8)
        else:
            pixelBufferArrayCompressed = np.frombuffer(js['compressedData'],
                                                    dtype=np.uint8)
        pixelCount = reduce(lambda x, y: x * y, js['size'], 1)
        numberOfBytes = pixelCount * \
            js['imageType']['components'] * np.dtype(dtype).itemsize
        pixelBufferArray = \
            np.frombuffer(decompressor.decompress(pixelBufferArrayCompressed,
                                                numberOfBytes),
                        dtype=dtype)
        pixelBufferArray.shape = js['size'][::-1]
        # Workaround for GetImageFromArray required until 5.0.1
        # and https://github.com/numpy/numpy/pull/11739
        pixelBufferArrayCopyToBeRemoved = pixelBufferArray.copy()
        # image = itk.PyBuffer[ImageType].GetImageFromArray(pixelBufferArray)
        image = itk.PyBuffer[ImageType].GetImageFromArray(
            pixelBufferArrayCopyToBeRemoved)
        Dimension = image.GetImageDimension()
        image.SetOrigin(js['origin'])
        image.SetSpacing(js['spacing'])
        direction = image.GetDirection()
        directionMatrix = direction.GetVnlMatrix()
        directionJs = js['direction']['data']
        for col in range(Dimension):
            for row in range(Dimension):
                directionMatrix.put(
                    row, col, directionJs[col + row * Dimension])
        return image
  • 2. Add a test like pickle.loads(pickle.dumps(itk_image)) Question: Where should the test live?

I see a "Testing" folder in the ITK repo, but I haven't narrowed it down any further. Also, I don't see tests written in the python language, so I'm not totally sure how we'd go about testing a python specific thing like pickle.
I've found a folder called Wrapping/Generators/Python/Tests/. Not entirely sure which file within that is the best fit yet, but I might be able to work that out. Maybe somewhere in extras.py, somewhere around here, perhaps

It looks like ITK uses unittest for python tests.

@thewtex
Copy link
Member

thewtex commented Oct 21, 2021

We need to add the functions itkimage_to_json & itkimage_to_json

These could go in extras.py with the other conversion functions:

https://github.com/InsightSoftwareConsortium/ITK/blob/master/Wrapping/Generators/Python/itk/support/extras.py

To follow the naming convention there,

itkimage_to_json -> json_from_image
itkimage_from_json -> image_from_json

their helper functions _image_to_type & _type_to_image)

These could go in helpers.py:

https://github.com/InsightSoftwareConsortium/ITK/blob/master/Wrapping/Generators/Python/itk/support/helpers.py

Add a test like pickle.loads(pickle.dumps(itk_image)) Question: Where should the test live?
I see a "Testing" folder in the ITK repo, but I haven't narrowed it down any further. Also, I don't see tests written in the python language, so I'm not totally sure how we'd go about testing a python specific thing like pickle.
I've found a folder called Wrapping/Generators/Python/Tests/. Not entirely sure which file within that is the best fit yet, but I might be able to work that out. Maybe somewhere in extras.py, somewhere around here, perhaps

Yes, https://github.com/InsightSoftwareConsortium/ITK/blob/master/Wrapping/Generators/Python/Tests/extras.py

It looks like ITK uses unittest for python tests.

They are mostly simple assert's, but unittest could be used.

@GenevieveBuckley thank you!

@thewtex thewtex self-requested a review October 21, 2021 11:31
@github-actions github-actions bot added the type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct label Oct 21, 2021
@GenevieveBuckley GenevieveBuckley marked this pull request as ready for review October 21, 2021 23:46
@GenevieveBuckley
Copy link
Contributor Author

I'm a little bit worried I don't see the tests FAIL at commit eb18f29. They should be failing here, because the code is still incomplete and I've added this test:

# Test serialization with pickle
assert (pickle.loads(pickle.dumps(image)) == image).all()

Thoughts:

  • Maybe image is not the right kind of ITK image?
  • Maybe the CI isn't running that particular check (seems super unlikely)

I'm going to go ahead and push the rest of my commits, but I'm hoping these links to the CI builds will let you see the build for this moment in time, when we expect to see test failures and don't:

@GenevieveBuckley
Copy link
Contributor Author

Things to check in the review:

  1. Are my imports correct (I think they are, but I don't have a local ITK dev env setup, so I can't check this easily)
  2. Do the tests behave as expected (locally & on CI)

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GenevieveBuckley looking awesome! 🏅

Nice work on the test!

The CI is currently failing on:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/vsts/work/1/s-build/Wrapping/Generators/Python/itk/__init__.py", line 28, in <module>
    from itk.support.extras import *
  File "/home/vsts/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 22, in <module>
    import six
ModuleNotFoundError: No module named 'six'
itkTestDriver: Process exited with return value: 1

A few comments inline on how to address this, the import's etc.

Wrapping/Generators/Python/PyBase/pyBase.i Show resolved Hide resolved
Wrapping/Generators/Python/PyBase/pyBase.i Outdated Show resolved Hide resolved
Wrapping/Generators/Python/Tests/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
@GenevieveBuckley
Copy link
Contributor Author

Minor comments addressed, two things remaining:

  1. use the numpy direction array directly, and
  2. Remove zstandard compression/decompression

@thewtex
Copy link
Member

thewtex commented Oct 27, 2021

Current ITK.macOS.Python test failure:

Traceback (most recent call last):
  File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 167, in <module>
    assert (pickle.loads(pickle.dumps(image)) == image).all()
  File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkImagePython.py", line 4444, in __getstate__
    state = itk.dict_from_image(self)
  File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 812, in dict_from_image
    import zstandard as zstd
ModuleNotFoundError: No module named 'zstandard'
itkTestDriver: Process exited with return value: 1

@thewtex thewtex added this to the ITK 5.3.0 milestone Oct 28, 2021
import itk

ImageType, dtype = type_to_image(image_dict['imageType'])
image = itk.PyBuffer[ImageType].GetImageFromArray(image_dict['data'])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right way to do it.

Not sure whether we'll run into this problem (stackoverflow post) or not, but if we do there is a suggestion over there on how to fix it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we want / need to roughtly do it this way, but GetImageFromArray -> GetImageViewFromArray

image.SetOrigin(image_dict['origin'])
image.SetSpacing(image_dict['spacing'])
direction = image_dict['direction']['data']
image.SetDirection(direction)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done this because of your suggestion to use the numpy direction array in the other function, so I imagine this function also needed to be updated. I've pulled the direction numpy array out of the dictionary and then used the .SetDirection() method.

It feels like logic is duplicated between image_from_dict and setstate. Is that right, or do I misunderstand what needs to happen where?

            def __setstate__(self, state):
                """Set object state, necessary for serialization with pickle."""
                import itk
                deserialized = itk.image_from_dict(state)
                self.__dict__['this'] = deserialized
                self.SetOrigin(state['origin'])
                self.SetSpacing(state['spacing'])
                direction = np.asarray(self.GetDirection())
                self.SetDirection(direction)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.dict['this'] = deserialized

That's cool! If it works, then, yes, maybe that is all we need.

@GenevieveBuckley
Copy link
Contributor Author

Minor comments addressed, two things remaining:

1. use the numpy direction array directly, and

2. Remove zstandard compression/decompression

I think I've done these two things now. We'll see what the CI tests say.

I have not moved zstandard compression into __getstate__/__setstate__, right now it's been completely removed. It's my understanding that you want to try this out without any compression, is that correct @thewtex?

@GenevieveBuckley
Copy link
Contributor Author

Ok, I think now we just need to fix the test.

2021-10-28T06:49:19.4210320Z 150: RuntimeError: Size mismatch of image and Buffer.
2021-10-28T06:49:19.4221430Z 150: 
2021-10-28T06:49:19.4311360Z 150: The above exception was the direct cause of the following exception:
2021-10-28T06:49:19.4312300Z 150: 
2021-10-28T06:49:19.4323390Z 150: Traceback (most recent call last):
2021-10-28T06:49:19.4415490Z 150:   File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 167, in <module>
2021-10-28T06:49:19.4426640Z 150:     assert (pickle.loads(pickle.dumps(image)) == image).all()
2021-10-28T06:49:19.4519520Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkImagePython.py", line 4450, in __setstate__
2021-10-28T06:49:19.4577890Z 150:     deserialized = itk.image_from_dict(state)
2021-10-28T06:49:19.4606180Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 848, in image_from_dict
2021-10-28T06:49:19.4614160Z 150:     image = itk.PyBuffer[ImageType].GetImageFromArray(image_dict['data'])
2021-10-28T06:49:19.4636070Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkPyBufferPython.py", line 3051, in GetImageFromArray
2021-10-28T06:49:19.4656660Z 150:     imageView = itkPyBufferIRGBUC2.GetImageViewFromArray(ndarr, is_vector)
2021-10-28T06:49:19.4688430Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkPyBufferPython.py", line 3021, in GetImageViewFromArray
2021-10-28T06:49:19.4745730Z 150:     imgview = itkPyBufferIRGBUC2._GetImageViewFromArray(ndarr, ndarr.shape[::-1], 1)
2021-10-28T06:49:19.4819430Z 150: SystemError: <built-in function itkPyBufferIRGBUC2__GetImageViewFromArray> returned a result with an error set
2021-10-28T06:49:19.4829650Z 169: Loading ITKPyBase... done
2021-10-28T06:49:19.4909310Z 167: Loading ITKImageGrid... done
2021-10-28T06:49:19.5010890Z 150: itkTestDriver: Process exited with return value: 1
2021-10-28T06:49:19.5024220Z 167/248 Test #150: PythonExtrasTest ...................................................***Failed   18.35 sec

I think this might be happening at line 848 of support/extras.py.
Possibilities:

  1. It's the thing mentioned in this stackoverflow post, where python floats are generally 64 bit, whereas the itk float is 32-bit.
  2. I've mucked something up when removing the decompression part of this code (there was a lot of stuff happening around getting the buffer that I thought was only important to decompression, but perhaps I was wrong)
  3. ...I'm still a bit worried about what looks like code duplication here, but now that I'm typing this out it seems much less likely to be relevant for this particular problem.

I'll try possibility 1 first. The suggestion on stackoverflow is to add dtype = numpy.float32 to the numpy arrays, so I'll give that a go and see what happens.

@GenevieveBuckley
Copy link
Contributor Author

GenevieveBuckley commented Oct 28, 2021

Thinking out loud here.
It looks like lines 817-824 should already be handling the 32 bit type potential issue. Should we remove those? It doesn't seem like a good idea to take them out.

We have other numpy arrays for the direction, but that's not the line that's in the traceback. I don't think there's much point changing those arrays to be 32 bit, but it would be easy enough to do.

Ooh, just found this thread with a suggestion from Matt to use modifiedImage = itk.GetImageViewFromArray(F) instead of a memory buffer. I'll try that now!

@GenevieveBuckley
Copy link
Contributor Author

ITK.macOS.Python is failing. It seems that type of origin is not what's expected when we try to deserialze the image.

2021-10-28T11:27:23.5183950Z 150: Traceback (most recent call last):
2021-10-28T11:27:23.5261720Z 150:   File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 167, in <module>
2021-10-28T11:27:23.5262880Z 150:     assert (pickle.loads(pickle.dumps(image)) == image).all()
2021-10-28T11:27:23.5289120Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkImagePython.py", line 4450, in __setstate__
2021-10-28T11:27:23.5363790Z 150:     deserialized = itk.image_from_dict(state)
2021-10-28T11:27:23.5393980Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 830, in image_from_dict
2021-10-28T11:27:23.5469000Z 150:     image.SetOrigin(image_dict['origin'])
2021-10-28T11:27:23.5470150Z 150: TypeError: Expecting an itkPointD3, an int, a float, a sequence of int or a sequence of float.
2021-10-28T11:27:23.5497180Z 150: Additional information:
2021-10-28T11:27:23.5572060Z 150: Wrong number or type of arguments for overloaded function 'itkImageBase3_SetOrigin'.
2021-10-28T11:27:23.5599950Z 150:   Possible C/C++ prototypes are:
2021-10-28T11:27:23.5673070Z 150:     itkImageBase3::SetOrigin(itkPointD3 const)
2021-10-28T11:27:23.5703360Z 150:     itkImageBase3::SetOrigin(double const *)
2021-10-28T11:27:23.5775900Z 150:     itkImageBase3::SetOrigin(float const *)

All the other tests, including ITK.Linux.Python are passing, which is good. I'm not sure why linux can run this test fine but it fails on Mac, that does seem odd.

We're using the itk .getOrigin method, and then wrapping the result in a tuple before adding it to the dictionary that gets pickled. A tuple of values certainly seems like it should fit the definition "a sequence of int or a sequence of float". Maybe the tuple needs to be unpacked somehow on the other end? That would also seem strange, since this code works well in the other context we borrowed it from.

@GenevieveBuckley
Copy link
Contributor Author

I worked on this some more this afternoon, and I still don't understand why that test is failing. Because I was just reusing whatever the last image variable was, my best guess is currently that maybe it had a NaN or something weird as its origin. So I've updated the test to construct an itk image especially for pickling/unpickling, and we'll see if that helps. Otherwise, I'm a bit stumped.

Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
import itk

ImageType, dtype = type_to_image(image_dict['imageType'])
image = itk.PyBuffer[ImageType].GetImageFromArray(image_dict['data'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we want / need to roughtly do it this way, but GetImageFromArray -> GetImageViewFromArray

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've done these two things now. We'll see what the CI tests say.

Awesome, thanks, Genevieve!

I have not moved zstandard compression into getstate/setstate, right now it's been completely removed. It's my understanding that you want to try this out without any compression, is that correct @thewtex?

Yes, we could possibly add the compression in a subsequent step. For distributed computing with Dask distributed, the compute / data transfer may make it worthwhile (?)

Regarding the size / typing errors, we do need to use:

    ImageType = helpers.image_type_from_wasm_type(image_dict['imageType'])
    image = itk.PyBuffer[ImageType].GetImageViewFromArray(image_dict['data'])

Where type_to_image is renamed to image_type_from_wasm_type and it only returns the ImageType.

The reason for this is that a np.ndarray does not contain all the typing information as an imageType. The dtype corresponds to the itk componentType. However, the imageType also defines how many spatial dimensions there are, if it is multi-component image, how many components there are and what those components mean, i.e. are they RGB values or values of a co-variant vector or values of a contravariant vector, etc.

Additionally, ideally we want this serialization format to correspond to what is used for interfacing with WebAssembly, described here:

Wrapping/Generators/Python/PyBase/pyBase.i Show resolved Hide resolved
@GenevieveBuckley
Copy link
Contributor Author

Have tried to address review comments, waiting on CI checks now.

@GenevieveBuckley
Copy link
Contributor Author

ITK doesn't seem to recognize the array type, I'm getting messages to say it's not wrapped. We're using a uint8 test array. I've just adjusted the test to try the default integer type (int64) just to see what happens.

The recommended solution is to set ITK_WRAP_${type}. I'm not entirely sure:

  1. Where to find this in the code - searching for "ITK_WRAP_" across the repository returns 4,800 results
  2. I'm not sure how many extra different types we'd need to wrap. When I run itk.PyBuffer.GetTypes() it says that itkImageUC2 is in the list. So I have to assume that it's the extra uint8 bit at the end that is causing the difference and also needs wrapping? If that's the case, there might be many other array dtypes that also need the same treatment (...but also, the ITK message says that we don't want to just wrap everything, because then that makes the package too big. So I don't know how to choose what's needed.)
2021-11-01T04:22:29.5169190Z 150: Traceback (most recent call last):
2021-11-01T04:22:29.5245040Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/template_class.py", line 525, in __getitem__
2021-11-01T04:22:29.5276550Z 150:     this_item = self.__template__[key]
2021-11-01T04:22:29.5287930Z 150: KeyError: (<class 'itk.itkImagePython.itkImageUC2'>, <class 'numpy.uint8'>)
2021-11-01T04:22:29.5293920Z 150: 
2021-11-01T04:22:29.5349140Z 150: During handling of the above exception, another exception occurred:
2021-11-01T04:22:29.5378830Z 150: 
2021-11-01T04:22:29.5451240Z 150: Traceback (most recent call last):
2021-11-01T04:22:29.5480370Z 150:   File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 176, in <module>
2021-11-01T04:22:29.5552620Z 150:     assert (pickle.loads(pickle.dumps(image)) == image).all()
2021-11-01T04:22:29.5582710Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkImagePython.py", line 4451, in __setstate__
2021-11-01T04:22:29.5654270Z 150:     deserialized = itk.image_from_dict(state)
2021-11-01T04:22:29.5684970Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 844, in image_from_dict
2021-11-01T04:22:29.5754860Z 150:     image = itk.PyBuffer[ImageType].GetImageViewFromArray(image_dict['data'])
2021-11-01T04:22:29.5787390Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/template_class.py", line 529, in __getitem__
2021-11-01T04:22:29.5858280Z 150:     raise itk.TemplateTypeError(self, key)
2021-11-01T04:22:29.5890140Z 150: itk.support.extras.TemplateTypeError: itk.PyBuffer is not wrapped for input type `itk.Image[itk.UC,2], uint8`.
Full details - click to expand!:
2021-11-01T04:22:29.5169190Z 150: Traceback (most recent call last):
2021-11-01T04:22:29.5245040Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/template_class.py", line 525, in __getitem__
2021-11-01T04:22:29.5276550Z 150:     this_item = self.__template__[key]
2021-11-01T04:22:29.5287930Z 150: KeyError: (<class 'itk.itkImagePython.itkImageUC2'>, <class 'numpy.uint8'>)
2021-11-01T04:22:29.5293920Z 150: 
2021-11-01T04:22:29.5349140Z 150: During handling of the above exception, another exception occurred:
2021-11-01T04:22:29.5378830Z 150: 
2021-11-01T04:22:29.5451240Z 150: Traceback (most recent call last):
2021-11-01T04:22:29.5480370Z 150:   File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 176, in <module>
2021-11-01T04:22:29.5552620Z 150:     assert (pickle.loads(pickle.dumps(image)) == image).all()
2021-11-01T04:22:29.5582710Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/itkImagePython.py", line 4451, in __setstate__
2021-11-01T04:22:29.5654270Z 150:     deserialized = itk.image_from_dict(state)
2021-11-01T04:22:29.5684970Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/extras.py", line 844, in image_from_dict
2021-11-01T04:22:29.5754860Z 150:     image = itk.PyBuffer[ImageType].GetImageViewFromArray(image_dict['data'])
2021-11-01T04:22:29.5787390Z 150:   File "/Users/runner/work/1/s-build/Wrapping/Generators/Python/itk/support/template_class.py", line 529, in __getitem__
2021-11-01T04:22:29.5858280Z 150:     raise itk.TemplateTypeError(self, key)
2021-11-01T04:22:29.5890140Z 150: itk.support.extras.TemplateTypeError: itk.PyBuffer is not wrapped for input type `itk.Image[itk.UC,2], uint8`.
2021-11-01T04:22:29.5962970Z 150: 
2021-11-01T04:22:29.5995970Z 150: To limit the size of the package, only a limited number of
2021-11-01T04:22:29.6064550Z 150: types are available in ITK Python. To print the supported
2021-11-01T04:22:29.6097690Z 150: types, run the following command in your python environment:
2021-11-01T04:22:29.6113420Z 150: 
2021-11-01T04:22:29.6171850Z 150:     itk.PyBuffer.GetTypes()
2021-11-01T04:22:29.6214970Z 150: 
2021-11-01T04:22:29.6273310Z 150: Possible solutions:
2021-11-01T04:22:29.6299720Z 150: * If you are an application user:
2021-11-01T04:22:29.6317900Z 150: ** Convert your input image into a supported format (see below).
2021-11-01T04:22:29.6376120Z 150: ** Contact developer to report the issue.
2021-11-01T04:22:29.6402390Z 150: * If you are an application developer, force input images to be
2021-11-01T04:22:29.6419710Z 150: loaded in a supported pixel type.
2021-11-01T04:22:29.6478930Z 150: 
2021-11-01T04:22:29.6504980Z 150:     e.g.: instance = itk.PyBuffer[itk.Image[itk.SS,2]].New(my_input)
2021-11-01T04:22:29.6521580Z 150: 
2021-11-01T04:22:29.6581470Z 150: * (Advanced) If you are an application developer, build ITK Python yourself and
2021-11-01T04:22:29.6606990Z 150: turned to `ON` the corresponding CMake option to wrap the pixel type or image
2021-11-01T04:22:29.6623300Z 150: dimension you need. When configuring ITK with CMake, you can set
2021-11-01T04:22:29.6683270Z 150: `ITK_WRAP_${type}` (replace ${type} with appropriate pixel type such as
2021-11-01T04:22:29.6708610Z 150: `double`). If you need to support images with 4 or 5 dimensions, you can add
2021-11-01T04:22:29.6724810Z 150: these dimensions to the list of dimensions in the CMake variable
2021-11-01T04:22:29.6784960Z 150: `ITK_WRAP_IMAGE_DIMS`.
2021-11-01T04:22:29.6810720Z 150: 
2021-11-01T04:22:29.6824150Z 150: Supported input types:
2021-11-01T04:22:29.6866850Z 150: 
2021-11-01T04:22:29.6886940Z 150: itk.Image[itk.SS,2]
2021-11-01T04:22:29.6928350Z 150: itk.Image[itk.SS,3]
2021-11-01T04:22:29.6968720Z 150: itk.Image[itk.UC,2]
2021-11-01T04:22:29.6988820Z 150: itk.Image[itk.RGBPixel[itk.UC],2]
2021-11-01T04:22:29.7031540Z 150: itk.Image[itk.RGBAPixel[itk.UC],2]
2021-11-01T04:22:29.7070310Z 150: itk.Image[itk.UC,3]
2021-11-01T04:22:29.7090230Z 150: itk.Image[itk.RGBPixel[itk.UC],3]
2021-11-01T04:22:29.7128190Z 150: itk.Image[itk.RGBAPixel[itk.UC],3]
2021-11-01T04:22:29.7132950Z 150: itk.Image[itk.F,2]
2021-11-01T04:22:29.7193510Z 150: itk.Image[itk.SymmetricSecondRankTensor[itk.F,2],2]
2021-11-01T04:22:29.7229970Z 150: itk.Image[itk.Vector[itk.F,2],2]
2021-11-01T04:22:29.7236270Z 150: itk.Image[itk.CovariantVector[itk.F,2],2]
2021-11-01T04:22:29.7295040Z 150: itk.Image[itk.Vector[itk.F,3],2]
2021-11-01T04:22:29.7334710Z 150: itk.Image[itk.CovariantVector[itk.F,3],2]
2021-11-01T04:22:29.7338640Z 150: itk.Image[itk.Vector[itk.F,4],2]
2021-11-01T04:22:29.7396520Z 150: itk.Image[itk.CovariantVector[itk.F,4],2]
2021-11-01T04:22:29.7436410Z 150: itk.Image[itk.F,3]
2021-11-01T04:22:29.7439990Z 150: itk.Image[itk.SymmetricSecondRankTensor[itk.F,3],3]
2021-11-01T04:22:29.7498000Z 150: itk.Image[itk.Vector[itk.F,2],3]
2021-11-01T04:22:29.7537830Z 150: itk.Image[itk.CovariantVector[itk.F,2],3]
2021-11-01T04:22:29.7541550Z 150: itk.Image[itk.Vector[itk.F,3],3]
2021-11-01T04:22:29.7599130Z 150: itk.Image[itk.CovariantVector[itk.F,3],3]
2021-11-01T04:22:29.7639200Z 150: itk.Image[itk.Vector[itk.F,4],3]
2021-11-01T04:22:29.7642800Z 150: itk.Image[itk.CovariantVector[itk.F,4],3]
2021-11-01T04:22:29.7701950Z 150: itk.Image[itk.D,2]
2021-11-01T04:22:29.7740640Z 150: itk.Image[itk.SymmetricSecondRankTensor[itk.D,2],2]
2021-11-01T04:22:29.7802860Z 150: itk.Image[itk.Vector[itk.D,2],2]
2021-11-01T04:22:29.7842220Z 150: itk.Image[itk.CovariantVector[itk.D,2],2]
2021-11-01T04:22:29.7844230Z 150: itk.Image[itk.Vector[itk.D,3],2]
2021-11-01T04:22:29.7904210Z 150: itk.Image[itk.CovariantVector[itk.D,3],2]
2021-11-01T04:22:29.7943630Z 150: itk.Image[itk.Vector[itk.D,4],2]
2021-11-01T04:22:29.7945530Z 150: itk.Image[itk.CovariantVector[itk.D,4],2]
2021-11-01T04:22:29.8005560Z 150: itk.Image[itk.D,3]
2021-11-01T04:22:29.8044950Z 150: itk.Image[itk.SymmetricSecondRankTensor[itk.D,3],3]
2021-11-01T04:22:29.8046770Z 150: itk.Image[itk.Vector[itk.D,2],3]
2021-11-01T04:22:29.8106810Z 150: itk.Image[itk.CovariantVector[itk.D,2],3]
2021-11-01T04:22:29.8146350Z 150: itk.Image[itk.Vector[itk.D,3],3]
2021-11-01T04:22:29.8147830Z 150: itk.Image[itk.CovariantVector[itk.D,3],3]
2021-11-01T04:22:29.8208360Z 150: itk.Image[itk.Vector[itk.D,4],3]
2021-11-01T04:22:29.8249980Z 150: itk.Image[itk.CovariantVector[itk.D,4],3]
2021-11-01T04:22:29.8315330Z 150: itk.Image[itk.UI,2]
2021-11-01T04:22:29.8358090Z 150: itk.Image[itk.UI,3]
2021-11-01T04:22:29.8397900Z 150: itk.Image[itk.UL,2]
2021-11-01T04:22:29.8452810Z 150: itk.Image[itk.UL,3]
2021-11-01T04:22:29.8507560Z 150: itk.Image[itk.ULL,2]
2021-11-01T04:22:29.8537860Z 150: itk.Image[itk.ULL,3]
2021-11-01T04:22:29.8591970Z 150: itk.Image[itk.SI,2]
2021-11-01T04:22:29.8611040Z 150: itk.Image[itk.SI,3]
2021-11-01T04:22:29.8639990Z 150: itk.VectorImage[itk.SS,2]
2021-11-01T04:22:29.8693050Z 150: itk.VectorImage[itk.UC,2]
2021-11-01T04:22:29.8712380Z 150: itk.VectorImage[itk.F,2]
2021-11-01T04:22:29.8741220Z 150: itk.VectorImage[itk.SS,3]
2021-11-01T04:22:29.8794420Z 150: itk.VectorImage[itk.UC,3]
2021-11-01T04:22:29.8813630Z 150: itk.VectorImage[itk.F,3]

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The

Wrapping/Generators/Python/itk/support/helpers.py Outdated Show resolved Hide resolved
@thewtex
Copy link
Member

thewtex commented Nov 1, 2021

ITK doesn't seem to recognize the array type, I'm getting messages to say it's not wrapped.

A few comments inline that should address this.

@GenevieveBuckley
Copy link
Contributor Author

Thanks for helping push this along Matt, I really appreciate it

@dzenanz
Copy link
Member

dzenanz commented Nov 2, 2021

Maybe the Mac test failure is due to different compiler defaults. Does the test fail with clang on Linux?

@thewtex
Copy link
Member

thewtex commented Nov 2, 2021

@GenevieveBuckley thank you so much for your persistence and efforts! 🧗

The error on mac does look odd. I will test locally and follow-up.

@GenevieveBuckley
Copy link
Contributor Author

The error on mac does look odd. I will test locally and follow-up.

Thanks Matt!

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor issues noted inline after testing locally, a squash, and we will be good to rock 🚀 👨‍🎤

Wrapping/Generators/Python/itk/support/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/Tests/extras.py Outdated Show resolved Hide resolved
Wrapping/Generators/Python/itk/support/helpers.py Outdated Show resolved Hide resolved
@thewtex
Copy link
Member

thewtex commented Nov 4, 2021

ITK.macOS.Python error:

  File "/Users/runner/work/1/s/Wrapping/Generators/Python/Tests/extras.py", line 176, in <module>
    assert (pickle.loads(pickle.dumps(image)) == image).all()
AttributeError: 'bool' object has no attribute 'all'

Can be addressed by changing

    assert (pickle.loads(pickle.dumps(image)) == image).all()

to:

serialize_deserialize = pickle.loads(pickle.dumps(image))
# verify_input_information checks origin, spacing, direction consistency
comparison = itk.comparison_image_filter(image, serialize_deserialize, verify_input_information=True)
assert np.sum(comparison) == 0.0  

@thewtex
Copy link
Member

thewtex commented Nov 5, 2021

/azp run ITK.macOS.Python

@thewtex
Copy link
Member

thewtex commented Nov 7, 2021

💚

@GenevieveBuckley GenevieveBuckley changed the title WIP: Add pickle support for ITK images, needed for Dask arrays ENH: Add pickle support for ITK images, needed for Dask arrays Nov 7, 2021
@github-actions github-actions bot added the type:Enhancement Improvement of existing methods or implementation label Nov 7, 2021
@GenevieveBuckley GenevieveBuckley changed the title ENH: Add pickle support for ITK images, needed for Dask arrays ENH: pickle serialization for itk images (issue 1091) Nov 7, 2021
@GenevieveBuckley
Copy link
Contributor Author

GenevieveBuckley commented Nov 7, 2021

Finally the tests are all passing! 🎉
I've updated the PR title, and squashed all the commits down to one (which has just re-triggered the CI, so I sure hope nothing got lost in that process). With a bit of luck, it should be ready for approval soon. It's ready now 🎉

itk.SS: 'int16_t',
itk.US: 'uint16_t',
itk.SI: 'int32_t',
itk.UI: 'uint32_t',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add the long, long long support here as in _long_type() in image_type_from_wasm_type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment suggests that maybe this was not quite what you had in mind?

The _long_type() function behaves differently on posix vs windows machines, do we need to treat them differently here too? I'm sorry, I don't have a Windows machine handy, or I'd be able to test this out for myself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on Visual Studio and 64-bit Windows, long is still 32-bits. GCC and clang have long as 64-bits on 64-bit machines, so we do need to treat long differently. I saw an if os.nt as a distinguisher already used in the code to treat long differently.

Copy link
Member

@thewtex thewtex Nov 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This size ambiguity of long is a reason why we use the explicitly sized type names in the WebAssembly serialization interface. Recently NumPy has moved to prefer more explicit use of primative type sizes, e.g. np.float64.

Copy link
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo!! @GenevieveBuckley you rock! 💯 🍾 🎇

@thewtex thewtex merged commit ed2a07e into InsightSoftwareConsortium:master Nov 11, 2021
@GenevieveBuckley
Copy link
Contributor Author

Hooray! 😄

It would be good to do another Dask & ITK blogpost, to let people know, so they can try it out. Would you be interested in that?

I'm not familiar with ITK's release schedule, but this looks most likely to be included from the next release candidate (version 5.3rc2 and above).

@thewtex
Copy link
Member

thewtex commented Nov 11, 2021

It would be good to do another Dask & ITK blogpost, to let people know, so they can try it out. Would you be interested in that?

Yes, absolutely!

Through this process, an interesting question came up about zstandard compression with Dask distributed. Since we are leveraging the NumPy array serialization, I am wondering if we could monkey-patch the NumPy serialization, which could help performance beyond ITK (or not)? 💭 🤔

I'm not familiar with ITK's release schedule, but this looks most likely to be included from the next release candidate (version 5.3rc2 and above).

Version 5.3rc2 is coming out now, but this will be available in 5.3rc3, coming out in the next week or two. We can test before the 5.3.0 release.

@jakirkham
Copy link
Member

FWIW Distributed does its own compression, which could use any of several compressors (including Zstandard). So this may already do what you want without needing to add this to pickle.

@GenevieveBuckley
Copy link
Contributor Author

GenevieveBuckley commented Nov 12, 2021

Version 5.3rc2 is coming out now, but this will be available in 5.3rc3, coming out in the next week or two. We can test before the 5.3.0 release

Ok, good. John suggests we should re-run the code from the blogpost, which should tell us if this is still a problem or not.

I'm mid-way through attempting to build & install itk locally, but not completely confident I'll be able to do that well. So I can either wait for the v5.3rc03 pre-release, or someone with a local itk development version could do that sooner.

@GenevieveBuckley
Copy link
Contributor Author

And because @thewtex doesn't seem to be on the Dask slack, I'll share a copy of this relevant comment by @jakirkham:

As to any other remaining pieces, would trust that blogpost more than my own memory at this point

We might want to recheck this GIL issue ( #1134 ). There is no repro in the issue, but maybe just rerunning the code in the original blog will identify it

This issue ( #1151 ) is also related to the blog. Basically let’s us drop map_blocks and just use ITK on Dask directly. Though it involves changes in NumPy (like in C or Cython). So Idk if that’s something worth tackling or not ( numpy/numpy#14020 ). It would be very useful, but it could also get into the weeds a bit

So to summarize maybe check the GIL issue and then it should be good for a new blogpost (ideally with simpler code than we had before)

Blogpost link: https://blog.dask.org/2019/08/09/image-itk

@jakirkham
Copy link
Member

To clarify this:

As to any other remaining pieces, would trust that blogpost more than my own memory at this point
We might want to recheck this GIL issue ( #1134 ). There is no repro in the issue, but maybe just rerunning the code in the original blog will identify it

I think Matt R. & I were seeing lock-ups when trying to run things in parallel with ITK and Dask. So the question would be when running this code today with a recent ITK version do they still occur? Matt M. suggests it may be fixed ( #1134 (comment) ). Either confirming the issue is fixed or still present would be useful. Including a repro if still present would also be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Python wrapping Python bindings for a class type:Enhancement Improvement of existing methods or implementation type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support itk.Image use in Dask Array map_blocks
4 participants