
Meta_dict and lazy resampling in apply transform #6595

Open
tangy5 opened this issue Jun 9, 2023 · 24 comments

Labels: bug Something isn't working

Comments
@tangy5
Contributor

tangy5 commented Jun 9, 2023

Hi Team, I'm not sure if this is a bug or expected behavior.

When I set the env var USE_META_DICT=1, some transforms fail due to the "lazy" argument. If USE_META_DICT=0 or unset, everything works normally.

Currently, this causes some functions to fail whenever USE_META_DICT is set to true (1).

Here is a code snippet for reproducing the issue:

  1. Run export USE_META_DICT=1.
  2. Create a Python script lazy_meta_dict.py with the following code, then run python lazy_meta_dict.py:
from functools import partial

import numpy as np
import torch
from monai.data import CacheDataset, create_test_image_3d
from monai.transforms import (
    AddChanneld,
    Compose,
    CropForegroundd,
    DivisiblePadd,
)

def get_data(num_examples, input_size, data_type=np.asarray, include_label=True):
    custom_create_test_image_3d = partial(
        create_test_image_3d, *input_size, rad_max=7, num_seg_classes=1, num_objs=1
    )
    data = []
    for _ in range(num_examples):
        im, label = custom_create_test_image_3d()
        d = {}
        d["image"] = data_type(im)
        d["image_meta_dict"] = {"affine": np.eye(4)}  # legacy-style meta_dict entry, used when USE_META_DICT=1
        if include_label:
            d["label"] = data_type(label)
            d["label_meta_dict"] = {"affine": np.eye(4)}
        d["label_transforms"] = []  # legacy key for tracking applied transforms
        data.append(d)
    return data[0] if num_examples == 1 else data

def test_epistemic_scoring():
    input_size = (20, 20, 20)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    keys = ["image", "label"]
    num_training_ims = 10
    train_data = get_data(num_training_ims, input_size)
    # print("Hey!!:{}".format(CropForegroundd.__dict__.items()))
    transforms = Compose(
        [
            AddChanneld(keys),
            CropForegroundd(keys, source_key="image"),
            DivisiblePadd(keys, 4),
        ]
    )

    train_ds = CacheDataset(train_data, transforms)

if __name__ == "__main__":
    test_epistemic_scoring()

Expected error:

  File "/home/yucheng/anaconda3/lib/python3.9/site-packages/monai/transforms/transform.py", line 145, in apply_transform
    return _apply_transform(transform, data, unpack_items, lazy, overrides, log_stats)
  File "/home/yucheng/anaconda3/lib/python3.9/site-packages/monai/transforms/transform.py", line 102, in _apply_transform
    return transform(data, lazy=lazy) if isinstance(transform, LazyTrait) else transform(data)
TypeError: wrapper() got an unexpected keyword argument 'lazy'
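
For context, a minimal sketch (hypothetical names, not MONAI code) of why a fixed-signature hook wrapper raises this: the executor calls transform(data, lazy=lazy) whenever the transform is a LazyTrait, but the meta-dict wrapper only accepts (inst, data), so the extra keyword is rejected.

def attach_post_hook(call, hook):
    # hypothetical stand-in for the USE_META_DICT hook machinery
    def wrapper(inst, data):  # fixed signature: no *args/**kwargs
        return hook(inst, call(inst, data))
    return wrapper

class T:
    def __call__(self, data, lazy=None):  # a lazy-aware transform
        return data

T.__call__ = attach_post_hook(T.__call__, lambda inst, d: d)  # wrap like USE_META_DICT does
T()({"image": 0}, lazy=False)  # TypeError: wrapper() got an unexpected keyword argument 'lazy'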
@tangy5 tangy5 changed the title from "Meta_dict and lazy resamling in apply transform" to "Meta_dict and lazy resampling in apply transform" on Jun 9, 2023
@KumoLiu
Contributor

KumoLiu commented Jun 9, 2023

Seems we missed passing lazy in the wrapper.

if config.USE_META_DICT:
    # call_update after MapTransform.__call__
    cls.__call__ = transforms.attach_hook(cls.__call__, MapTransform.call_update, "post")  # type: ignore

def wrapper(inst, data):

@wyli
Contributor

wyli commented Jun 9, 2023

Thanks, I can replicate this issue in 1.2.0 (but not 1.2.0rc7); it was introduced by #6537. The root cause is that the lazy resampling refactoring changed the API of the __call__ method of MapTransform subclasses, for example in monai/transforms/spatial/dictionary.py:

[Screenshot: __call__ signatures of MapTransform subclasses in monai/transforms/spatial/dictionary.py, each taking a lazy keyword in addition to data]
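(A hedged reconstruction of what the screenshot shows, since the image isn't reproduced here: the subclass signatures now look roughly like def __call__(self, data, lazy=None), i.e. they accept an optional lazy keyword in addition to data.)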

they are inconsistent with the base class MapTransform data assumptions that we've been using all the time:

def __call__(self, data):
    """
    ``data`` often comes from an iteration over an iterable,
    such as :py:class:`torch.utils.data.Dataset`.
    To simplify the input validations, this method assumes:

    - ``data`` is a Python dictionary,
    - ``data[key]`` is a Numpy ndarray, PyTorch Tensor or string, where ``key`` is an element
      of ``self.keys``, the data shape can be:

      #. string data without shape, `LoadImaged` transform expects file paths,
      #. most of the pre-/post-processing transforms expect: ``(num_channels, spatial_dim_1[, spatial_dim_2, ...])``,
         except for example: `AddChanneld` expects (spatial_dim_1[, spatial_dim_2, ...])

    - the channel dimension is often not omitted even if number of channels is one.

@Nic-Ma @ericspod we may have to release 1.2.1 to fix these. Let's discuss this in today's dev meeting. cc @mmodat

@wyli wyli added the bug Something isn't working label Jun 9, 2023
@atbenmurray
Contributor

@tangy5 please reach out to me so I can understand the problem further. I'm fresh to this as of the meeting.

@atbenmurray
Contributor

@tangy5 let's schedule a quick chat on Monday, if you are free, so we can figure out the best way to resolve it.

@tangy5
Contributor Author

tangy5 commented Jun 9, 2023

@tangy5 let's schedule a quick chat on Monday, if you are free, so we can figure out the best way to resolve it.

Thank you so much. Currently, this is not a blocker for other MONAI-related platforms such as MONAI Label and MONAI Toolkit, but we should discuss whether to keep it compatible with meta_dict. If we want to deprecate meta_dict, we can tell users to use MetaTensor instead; if users are using meta_dict with monai==1.2.0, we can tell them to switch to MetaTensor or downgrade monai.

In both cases, it should work. Thank you! I'm available for a chat on Monday if a discussion is needed.

@Nic-Ma
Contributor

Nic-Ma commented Jun 10, 2023

@atbenmurray ,

Please feel free to organize the discussion meeting.
I think @wyli's suggestion is one of the proper solutions: maybe we should keep the data: Dict input simple and make lazy an item in the data, similar to how we add new key: value pairs to the data: https://github.com/Project-MONAI/MONAI/blob/dev/monai/transforms/io/dictionary.py#L178.
CC @ericspod @wyli for more discussion.

Thanks.
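
A minimal sketch of the suggestion above (hypothetical names, not an agreed MONAI API): the executor writes the flag into the dictionary and each dict transform pops it back out, so __call__(self, data) keeps a single parameter.

def apply_transform_sketch(transform, data, lazy=False):
    data = dict(data)  # shallow copy so the caller's dict is untouched
    data["lazy"] = lazy  # hypothetical reserved key carrying the executor's flag
    return transform(data)

class PadSketch:
    def __call__(self, data):
        lazy = data.pop("lazy", False)  # read and remove the flag from the dict
        # apply the pad eagerly, or record it as a pending lazy operation
        return data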

@atbenmurray
Contributor

Hi @Nic-Ma, I'd like to look at other options as well.

It seems very restrictive to force dictionary-based transforms to accept only one call parameter. The restriction isn't applied to non-dictionary transforms, so we end up having to maintain two different mechanisms for essentially the same piece of functionality. It's almost always better to have one mechanism instead of two where possible.

I'm trying to understand why it isn't simply a case of extending wrapper to take args and kwargs. That is a pretty normal pattern for such a function; see the torch internals for many examples of using args and kwargs to forward general arguments to a call site after doing their own behind-the-scenes work, as in the sketch below.
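
A hedged sketch of that forwarding pattern (illustrative, not the exact diff in PR #6598):

import functools

def attach_post_hook(call, hook):
    @functools.wraps(call)
    def wrapper(inst, data, *args, **kwargs):
        # forward any extra positional/keyword arguments (e.g. lazy=...)
        return hook(inst, call(inst, data, *args, **kwargs))
    return wrapper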

atbenmurray added a commit to atbenmurray/MONAI that referenced this issue Jun 10, 2023
@atbenmurray
Contributor

I've put together a draft PR #6598 that solves the problem. It might need tweaking, of course.

@Nic-Ma
Contributor

Nic-Ma commented Jun 12, 2023

Hi @atbenmurray ,

I think the initial reason for accepting only one call parameter in the dict transforms is this:
We designed the dict transforms to be chained, and didn't plan to chain the array transforms; to keep the dict transforms consistent, with the same input and output throughout a chain, a single dict argument is clearer.
CC @wyli @ericspod to add more if I missed anything.

Thanks.

@atbenmurray
Contributor

It would make more sense if we also applied the same restriction to array transforms, but we don't. Allowing additional arguments for array transforms but not for dictionary transforms means we need different code paths elsewhere depending on whether we are handling array transforms or dictionary transforms. In general, I think we should minimize such discrepancies where we can.

Have you had a look at the PR? It is a really simple and small fix and seems to do away with the need to restrict dictionary call in this way.

@wyli
Contributor

wyli commented Jun 12, 2023

I think the initial reason for accepting only one call parameter in the dict transforms is this: We designed the dict transforms to be chained, and didn't plan to chain the array transforms; to keep the dict transforms consistent, with the same input and output throughout a chain, a single dict argument is clearer. CC @wyli @ericspod to add more if I missed anything.

Thanks.

Yes, and another benefit is that this also allows subclass implementations to dynamically determine additional arguments based on the values of specific data keys.
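
For illustration (a hypothetical transform, not MONAI code), a dict transform with a single data parameter can still vary its behavior per sample by reading switches from the data itself:

class MaybeCropSketch:
    def __call__(self, data):
        if data.get("skip_crop", False):  # behavior driven by a data key, not a call-site keyword
            return data
        # ... crop data["image"] here ...
        return data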

@wyli
Contributor

wyli commented Jun 12, 2023

This PR #6598 looks simple at first glance, but it in fact worsens the overall transform API inconsistency that I commented on earlier in this thread (#6595 (comment)).

@atbenmurray
Contributor

It is the case that array transforms get chained, however. I don't know how many users do it, but we certainly make it possible for them and even show them how to do it.

@atbenmurray
Contributor

This PR #6598 looks simple at first glance, but it in fact worsens the overall transform API inconsistency that I commented on earlier in this thread (#6595 (comment)).

Do you mean that MapTransform has a __call__ function that takes data? Transform also has a __call__ function that takes data, and we don't restrict subclasses of Transform from adding optional parameters.

@wyli
Contributor

wyli commented Jun 12, 2023

Do you mean that MapTransform has a __call__ function that takes data? Transform also has a __call__ function that takes data, and we don't restrict subclasses of Transform from adding optional parameters.

Yes, we treat this as a pattern across all the applications we (the core devs) have built, as the documentation suggests. We don't explicitly check for the inconsistency because it is not a fatal error at this base-class level for third-party applications.

If you propose changes to this, please feel free to raise a new feature request; through the feature request we can investigate the impact and come up with a development plan if it proves useful.

@atbenmurray
Contributor

My point is that we aren't crossing any red lines by relaxing the restriction on MapTransform-derived transforms.

@wyli
Contributor

wyli commented Jun 12, 2023

My point is that we aren't crossing any red lines by relaxing the restriction on MapTransform-derived transforms.

Could you please define 'crossing any red lines'? It seems to me that the comment doesn't help address or solve this technical issue.

@atbenmurray
Contributor

'Crossing any red lines' just means changing the design in a way that breaks critical assumptions. To rephrase:

My point is that we aren't breaking the design by relaxing the restriction on MapTransform-derived transforms.

@wyli
Contributor

wyli commented Jun 12, 2023

'Crossing any red lines' just means changing the design in a way that breaks critical assumptions. To rephrase:

My point is that we aren't breaking the design by relaxing the restriction on MapTransform-derived transforms.

The change is the root cause of the bug reported in this ticket, which means certain types of v1.1 use cases are also affected; users upgrading from v1.1 to v1.2 are not expecting this, nor is it documented. So in my understanding it's critical. For this type of base API, I believe we should always be very careful when changing the assumptions, even if a change looks trivial at first glance.

@atbenmurray
Contributor

If, going forward, MapTransform has a contract that one cannot add any optional parameters to __call__, that should be clearly stated in its documentation. Neither @ericspod, @Nic-Ma, nor I spotted it. I don't think the restriction is necessary, however, as the *args, **kwargs forwarding fix in the PR solves the problem cleanly.

@atbenmurray
Contributor

We also have some holes in the unit tests if this was able to get through undetected. If we want to go with the PR, I can add additional tests before review; a sketch of the kind of test I have in mind follows.
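
A hedged sketch of such a regression test (not taken from PR #6598; the transform and shapes are chosen for illustration):

import os
os.environ["USE_META_DICT"] = "1"  # must be set before monai is imported

import numpy as np
from monai.transforms import Compose, DivisiblePadd

def test_lazy_kwarg_with_meta_dict():
    data = {"image": np.zeros((1, 20, 20, 20)), "image_meta_dict": {"affine": np.eye(4)}}
    out = Compose([DivisiblePadd(["image"], 4)])(data)  # previously raised the TypeError here
    assert out["image"].shape[1:] == (20, 20, 20)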

@wyli
Contributor

wyli commented Jun 12, 2023

I don't think the restriction is necessary, however, as the *args, **kwargs forwarding fix in the PR solves the problem cleanly.

Nic and I have tried to list the reasons why the seemingly simple fix is not ideal; it's unfortunate that this statement keeps appearing without convincing evidence. As I understand it, this thread is moving towards unnecessary debate, so I'll gratefully stop making further comments here.

@atbenmurray
Contributor

@tangy5 I think I've understood everything I need to from the code. If I understand correctly, you don't have any transforms that you specifically need to adapt. It's primarily an issue of tests failing. Is this correct?

@tangy5
Contributor Author

tangy5 commented Jun 12, 2023

@tangy5 I think I've understood everything I need to from the code. If I understand correctly, you don't have any transforms that you specifically need to adapt. It's primarily an issue of tests failing. Is this correct?

@atbenmurray, thanks. Currently this is not impacting the subprojects, as we modified the tests. For users, we will recommend using MetaTensor instead. We can think about MONAI itself, and whether meta_dict and the lazy arg can be made compatible.

@atbenmurray atbenmurray self-assigned this Oct 20, 2023