To extend torchvision for video #855
Comments
Thanks for opening the issue! Adding support for video data is planned, and it will be integrated in the next major release of torchvision. This also involves the transforms.
Hi!
It's currently in a private branch. I'm working on some other things now, and I'll get back to video once I've finished those tasks, hopefully by the end of the week.
@fmassa Is there any news on this yet? Using these transforms on video clips would be very useful for us right now.
@kateiyas You can download a pip package called flerken (under development), which contains a framework for pytorch as well as torchvision adapted for video. All the torchvision transforms are there (only the main Compose class has been rewritten).
@JuanFMontesinos Thanks, but a few small adaptations to this package fulfilled my needs: https://github.com/hassony2/torch_videovision
I hope all of this will be integrated into pytorch soon.
We are running a bit late with the work on video. It won't be in the 0.3 release, but in the next one.
@fmassa What is the rough timeline that you have in mind for releasing the major changes in the
The next release, with video support, is planned for the end of July. The first PR adding video reading has already been merged in #1039.
@JuanFMontesinos you might also be interested in https://torchvideo.readthedocs.io/en/latest/
@willprice thanks! It looks really nice.
TorchVision 0.4 with support for video has been released, see https://github.com/pytorch/vision/releases/tag/v0.4.0
We still need to adapt the default transforms to support video data, but that might be a breaking change so we currently have them in the
An initial set of transforms for video has been added in #1353.
What is the current state of video transformations? What I've understood so far is that there was a plan to add them. Currently _transforms_video.py and _functional_video.py remain private. From this I assume that these files won't live long and will be discarded soon.
@Greeser with the upcoming release, almost all torchvision transformations will work on tensors of shape
Motivation
I've realized that, the way torchvision is coded, it is not possible to store a transformation's random parameters so that the same transformation can be applied several times. Video requires the same transformation to be applied to the whole sequence.
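To make the problem concrete, here is a minimal, dependency-free sketch (not torchvision code). Frames are plain nested lists, and `random_crop_params` and `crop` are hypothetical helpers written for this illustration. Drawing the crop window once per frame moves it from frame to frame, whereas drawing it once per clip keeps the sequence spatially consistent.

```python
import random


def random_crop_params(width, height, crop_w, crop_h, rng):
    """Draw the top-left corner of a crop window (hypothetical helper)."""
    left = rng.randint(0, width - crop_w)
    top = rng.randint(0, height - crop_h)
    return left, top


def crop(frame, left, top, crop_w, crop_h):
    """Crop a frame stored as a list of rows."""
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]


rng = random.Random(0)

# A toy clip: 4 identical 8x8 frames whose "pixels" are their (x, y) coordinates.
clip = [[[(x, y) for x in range(8)] for y in range(8)] for _ in range(4)]

# Per-frame sampling: every frame draws its own window, so crops may not line up.
per_frame = [crop(f, *random_crop_params(8, 8, 4, 4, rng), 4, 4) for f in clip]

# Shared sampling: draw the window once and reuse it for the whole clip.
left, top = random_crop_params(8, 8, 4, 4, rng)
shared = [crop(f, left, top, 4, 4) for f in clip]
```

Because every pixel stores its own coordinates, `shared[i][0][0]` is `(left, top)` for every frame, while the corners in `per_frame` depend on each independent draw.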
Proposed changes
I propose to restructure the code with minor changes such that:
A base transformation class (template) is created, providing get_params and reset_params methods:
get_params provides the needed parameters when necessary, while reset_params acts as both parameter initializer and resetter.
The compose class is modified to deal with lists/tuples of frames, so that when the list is exhausted the parameters are reset:
Random parameters and image parameters are set as object attributes. As some parameters require image features to be computed, they are initialized as None and computed/stored with the 1st frame:
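The three points above might look roughly like the following plain-Python sketch. This is an illustration under stated assumptions, not the code originally attached to the issue and not torchvision's API: the class names `VideoTransform`, `RandomCrop` and `VideoCompose`, the `apply` method, and the nested-list frame format are all hypothetical. Only `get_params` and `reset_params` are names taken from the proposal.

```python
import random


class VideoTransform:
    """Hypothetical base class: parameters are drawn once, then reused until reset."""

    def __init__(self):
        self.reset_params()

    def reset_params(self):
        # Acts as both initializer and resetter for the stored parameters.
        self.params = None

    def get_params(self, frame):
        raise NotImplementedError

    def apply(self, frame, params):
        raise NotImplementedError

    def __call__(self, frame):
        # Parameters depend on the frame size, so compute them on the 1st frame.
        if self.params is None:
            self.params = self.get_params(frame)
        return self.apply(frame, self.params)


class RandomCrop(VideoTransform):
    """Square random crop on frames stored as lists of rows (illustrative only)."""

    def __init__(self, size, rng=None):
        self.size = size
        self.rng = rng or random.Random()
        super().__init__()

    def get_params(self, frame):
        height, width = len(frame), len(frame[0])
        top = self.rng.randint(0, height - self.size)
        left = self.rng.randint(0, width - self.size)
        return top, left

    def apply(self, frame, params):
        top, left = params
        return [row[left:left + self.size] for row in frame[top:top + self.size]]


class VideoCompose:
    """Applies every transform to every frame, then resets the parameters."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, frames):
        out = []
        for frame in frames:
            for t in self.transforms:
                frame = t(frame)
            out.append(frame)
        # The list is exhausted: reset so the next clip draws fresh parameters.
        for t in self.transforms:
            t.reset_params()
        return out


# Usage: three identical 6x8 frames whose "pixels" are their (x, y) coordinates.
clip = [[[(x, y) for x in range(8)] for y in range(6)] for _ in range(3)]
crop_t = RandomCrop(4, rng=random.Random(1))
pipeline = VideoCompose([crop_t])
out = pipeline(clip)
```

Every frame in `out` is cropped with the same window, and `crop_t.params` is back to `None` after the call, so a second clip would get a new random window.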
Example 1:
Example 2: