Fix v2 transforms in spawn mp context #8067
Conversation
test/datasets_utils.py (outdated)
if isinstance(dataset, VOCDetection):
    assert wrapped_sample[0][0].size == (321, 123)
The point of this test is just to show that the fix works; it obviously won't stay as-is.
We should really add transforms to this test though. @pmeier, do you have any suggestion on how you'd prefer to do that? Otherwise, I'll just do something similar for all datasets, i.e. check that the images in wrapped_sample are of a specific expected size.
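Purely as an illustration of that idea (not the final test code), a hedged sketch where a v2 transform is attached at construction time and the wrapped sample's image size is checked; `make_dataset` is a hypothetical factory, and the `(321, 123)` size just mirrors the temporary assertion above.

```python
from torchvision.datasets import wrap_dataset_for_transforms_v2
from torchvision.transforms import v2


def check_wrapped_sample_size(make_dataset):
    # `make_dataset` is a hypothetical factory that builds a dataset with the
    # given v2 transforms. Resize takes (H, W), while PIL reports size as
    # (W, H), hence the (321, 123) expectation below.
    dataset = make_dataset(transforms=v2.Resize((123, 321)))
    wrapped = wrap_dataset_for_transforms_v2(dataset)
    image = wrapped[0][0]
    assert image.size == (321, 123)
```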
dataset.transform = self.transform
dataset.transforms = self.transforms
dataset.target_transform = self.target_transform
return wrap_dataset_for_transforms_v2, (dataset, self._target_keys)
TBH, I don't really understand why we needed to have `__reduce__` in the first place. I understand it's needed to support pickle, but my understanding stops there. The whole logic seems really strange, i.e. calling `wrap_dataset_for_transforms_v2()` yet again, which is how the instance being pickled was created in the first place anyway. And now on top of that we add the "field resetting" logic, i.e. we just undo what we did in `__init__`.

> I understand it's needed to support pickle, but my understanding stops here.

@pmeier, can you remind me of the details? Do you think we could support pickleability of these datasets in a different way that wouldn't require this "fix"?
By default, `pickle` does not support dynamic types like the one we create in

wrapped_dataset_cls = type(f"Wrapped{type(dataset).__name__}", (VisionDatasetTVTensorWrapper, type(dataset)), {})

Thus, we implement the `__reduce__` method and tell it how to construct the object from its parts. Now only the items on L212 need to be pickled. While unpickling, the dynamic type is created anew and we thus circumvent the issue.

We have the dynamic type in the first place to support `isinstance` checks. There was also a different option in #7239 (comment) that would work without a dynamic type. I think that could potentially work without your fix, but I would need to test. However, that option also has its drawbacks (see the discussion in the original PR).
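To make the mechanism concrete, here is a minimal, generic sketch of the pattern described above (standalone names, not the actual torchvision wrapper): pickling an instance of a class created with `type(...)` fails by default because `pickle` cannot look the class up by name, whereas a `__reduce__` that rebuilds the object through the factory function sidesteps that lookup.

```python
import pickle


class WrapperBase:
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __reduce__(self):
        # Recreate the instance by calling the factory again on unpickling,
        # so pickle never needs to look up the dynamically created class.
        return wrap, (self.wrapped,)


def wrap(obj):
    # Dynamic subclass so that isinstance(wrapper, type(obj)) keeps working.
    cls = type(f"Wrapped{type(obj).__name__}", (WrapperBase, type(obj)), {})
    return cls(obj)


class MyDataset:
    pass


wrapper = wrap(MyDataset())
# Without __reduce__, this would raise a PicklingError (attribute lookup of
# "WrappedMyDataset" fails); with it, the dynamic type is simply created anew.
roundtripped = pickle.loads(pickle.dumps(wrapper))
assert isinstance(roundtripped, MyDataset)
```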
@@ -198,8 +199,19 @@ def __getitem__(self, idx):
    def __len__(self):
        return len(self._dataset)

    # TODO: maybe we should use __getstate__ and __setstate__ instead of __reduce__, as recommended in the docs.
See the discussion just above this line.
We can try, but I think it is not possible. The "state" in `__getstate__` and `__setstate__` is the second return value of `__reduce__` below. And while we can recreate a `VisionDatasetTVTensorWrapper` from that state, `pickle` does not know how to create the dynamic type. I'll give it a shot.
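For what it's worth, a small generic sketch of why the state hooks alone wouldn't be enough: even with `__getstate__`/`__setstate__` defined, `pickle` still records the instance's class by module and qualified name and has to import it while unpickling, which fails for a class that only exists at runtime (illustrative names, not torchvision code).

```python
import pickle


class Base:
    def __getstate__(self):
        return self.__dict__

    def __setstate__(self, state):
        self.__dict__.update(state)


def make_dynamic_instance():
    # The class only exists inside this function, similar to the
    # Wrapped{Dataset} classes created by wrap_dataset_for_transforms_v2.
    Dynamic = type("Dynamic", (Base,), {})
    return Dynamic()


obj = make_dynamic_instance()
obj.value = 1

try:
    pickle.dumps(obj)
except pickle.PicklingError as exc:
    # Fails while locating the class by name, before the state hooks matter.
    print(f"pickling failed as expected: {exc}")
```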
@@ -1005,9 +1014,11 @@ def inject_fake_data(self, tmpdir, config):
        )
        return num_videos_per_class * len(classes)

    @pytest.mark.xfail(reason="FIXME")
This fails; I think it's because Kinetics doesn't convert its `transform` attribute into a `transforms` attribute, but I haven't double-checked. If that's OK, I'd like to merge this PR right now to get it behind us and investigate that separately just after.
OK with investigating later. Will have a look.
Thanks!
test/datasets_utils.py (outdated)
# https://github.com/pytorch/vision/issues/8066
# Implicitly, this also checks that the wrapped datasets are pickleable.

# To save CI/test time, we only check on macOS where "spawn" is the default
Spawn is also the default on Windows. Since macOS CI is by far the costliest, should we just use Windows here?
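For reference, a minimal sketch of the kind of check being discussed, with the "spawn" start method requested explicitly so it does not depend on the platform default (`wrapped_dataset` is a placeholder for the v2-wrapped dataset under test; this is illustrative, not the PR's actual test code):

```python
from torch.utils.data import DataLoader


def check_wrapped_dataset_in_spawn_workers(wrapped_dataset):
    # Force the "spawn" start method regardless of platform, so the workers
    # must pickle the wrapped dataset -- the code path this PR fixes.
    # `list` is used as a pass-through collate_fn because it is picklable.
    loader = DataLoader(
        wrapped_dataset,
        num_workers=2,
        multiprocessing_context="spawn",
        collate_fn=list,
    )
    for _ in loader:
        pass
```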
Thanks for the review! The failure seems to be unrelated. There was a failure on the 3.10 Windows job, but the dataset involved isn't touched by this PR, and the other Windows jobs are executing the new test and running fine. Let's hope it's just a one-off; I'll merge.
Reviewed By: vmoens
Differential Revision: D50789087
fbshipit-source-id: 4feccf22ac58fa8b7028dd64c85dd247800e466d
Fixes #8066
cc @vfdev-5