
Avoid patching LightningModule methods during training #6030

Closed
awaelchli opened this issue Feb 17, 2021 · 15 comments · Fixed by #9764
@awaelchli
Contributor

awaelchli commented Feb 17, 2021

🚀 Feature

Can we implement the dataloaders without 🐒-patching the methods in LightningModule?

Motivation

Currently, the trainer patches the LightningModule's dataloader methods when a DataModule is also used.
https://github.com/PyTorchLightning/pytorch-lightning/blob/5157ba55095a6a9f93ec1976aac877c87b00158f/pytorch_lightning/trainer/connectors/data_connector.py#L115

A datamodule's dataloader methods take precedence over the ones defined in the LightningModule, but the LightningModule itself should not be altered. The user does not know that this happens, and after training is complete, they may wish to continue using the model instance.
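
For context, the attachment amounts to roughly the following (a simplified sketch, not the actual source):

# Simplified sketch of what "attaching" a datamodule currently does: the
# datamodule's hooks are written over the LightningModule's methods, so the
# user's model object is silently mutated.
def attach_datamodule_by_patching(model, datamodule):
    if datamodule is None:
        return
    for name in ("train_dataloader", "val_dataloader", "test_dataloader"):
        hook = getattr(datamodule, name, None)
        if hook is not None:
            setattr(model, name, hook)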

Pitch

Store the dataloader references in the trainer (or data connector) directly, without "attaching" them to the user's model.
This would also enable type inference, as mentioned by @gianscarpe.
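
A rough sketch of this direction, with hypothetical names (not the actual Lightning API): the trainer-side connector keeps its own references and the loops query it, so the model object is never touched.

from typing import Callable, Iterable, Optional


class _DataConnector:
    """Hypothetical sketch: the trainer keeps the dataloader sources itself
    instead of writing them onto the user's LightningModule."""

    def __init__(self) -> None:
        self._train_dataloader_source: Optional[Callable[[], Iterable]] = None

    def attach(self, model, datamodule=None) -> None:
        # prefer the datamodule's hook when a datamodule is provided, otherwise
        # fall back to the hook on the model -- but never mutate `model`
        source = getattr(datamodule, "train_dataloader", None) if datamodule else None
        self._train_dataloader_source = source or model.train_dataloader

    def request_train_dataloader(self) -> Iterable:
        # the training loop asks the connector, not the (possibly patched) model
        return self._train_dataloader_source()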

Alternatives

Keep it as is, but users will not be happy.
It is also harder to debug the way it is right now.

@awaelchli awaelchli added feature Is an improvement or enhancement help wanted Open to be worked on refactor labels Feb 17, 2021
@gianscarpe
Contributor

Hi @awaelchli, I could work on this, as I already explored datamodules for an issue in my own project related to before_batch_transfer and mypy type checking :)

@awaelchli
Contributor Author

Aha!! Awesome, you are welcome to work on this. Ping me if you encounter any troubles along the way.
cc @PyTorchLightning/core-contributors :)

@gianscarpe
Contributor

Hi @awaelchli, I'm working on the issue and I opened a draft PR #6103. I have some questions:

  • If I pass a datamodule with only the train dataloader implemented, while the model has the val and test dataloaders implemented, do we "combine" them? From the tests it seems the answer should be "yes", since, for example, predict mode is tested using a ClassifierModel with train_dataloader implemented and a datamodule with only predict_dataloader implemented.
  • If I pass a datamodule with train_dataloader and the model has the same method implemented, which one should be considered? I believe the datamodule implementation.

@awaelchli
Contributor Author

If I pass a datamodule with only the train dataloader implemented, while the model has the val and test dataloaders implemented, do we "combine" them?

Yes, it looks like combining is the current behavior. Makes sense. Even better would be to log which ones are used.

If I pass a datamodule with train_dataloader and the model has the same method implemented, which one should be considered? I believe the datamodule implementation.

The datamodule and any dataloaders passed to fit have precedence over the methods defined in the model.
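
In other words, something along these lines (hypothetical helper, for illustration only):

def resolve_train_dataloader(fit_dataloaders, datamodule, model):
    """Hypothetical helper spelling out the precedence described above."""
    # 1. dataloaders passed directly to trainer.fit(...) win
    if fit_dataloaders is not None:
        return fit_dataloaders
    # 2. then the datamodule's hook, if a datamodule was passed and defines one
    if datamodule is not None and getattr(datamodule, "train_dataloader", None) is not None:
        return datamodule.train_dataloader()
    # 3. finally, the hook defined on the LightningModule itself
    return model.train_dataloader()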

@carmocca carmocca added this to the 1.3 milestone Feb 21, 2021
@carmocca carmocca modified the milestones: v1.3, v1.4 Apr 26, 2021
@ananthsub
Contributor

@awaelchli @carmocca Why do we need to bind the dataloader functions to the model at all? Why can't we set these as attributes of the trainer and then delete the corresponding attributes at the end of the call? Or just call the right one of model/datamodule/dataloader inside the training/evaluation/predict loop?

@awaelchli
Contributor Author

Why do we need to bind the dataloader functions to the model at all?

I don't know why. It was there from the start. I don't see a reason why it should be necessary.

Why can't we set these as attributes of the trainer?

We can, and that is exactly what we propose in this issue.

If @gianscarpe doesn't have time to do it, I can try to get this started in 1.4. It will be a relatively wide-reaching refactor.

@kaushikb11
Contributor

@awaelchli Adding you as an assignee as well! 🚀

@gianscarpe
Contributor

Hi @awaelchli, I had just forgotten about this PR. I started working on it; if you want, we can work together :)

@awaelchli
Contributor Author

Yes please, that would be awesome. Feel free to kick it off; I will be happy to help finish it, and I can also help with the failing tests.

@justusschock
Member

@justusschock do you think it is possible to avoid patching altogether? There are good reasons (#6030) not to do it from a user's perspective. While I believe your PR solves the major issue, there could still be problems when the user wants to call their dataloader method to produce a fresh dataloader, for example in a callback (nothing is stopping them from doing so).

Some work has started here #7522

Originally posted by @awaelchli in #8885 (comment)

@justusschock
Member

@awaelchli @justusschock I think we need a central point for where dataloaders come from. Patching the dataloader methods onto the model uses the model as the source of truth, but the side effects are visible to the end user. Rather, we could create an internal DataHolder that pools the objects that have the DataHooks available. This would also codify the priority/precedence across datamodules, the LightningModule, and dataloaders passed directly to the trainer.

It will also raise the question: what happens when both the datamodule and the lightning module have these hooks implemented? https://github.com/PyTorchLightning/pytorch-lightning/blob/e0605472306d6b95bf2616ab88f8c29f4498402e/pytorch_lightning/core/hooks.py#L455-L807

do we raise an exception? do we run only the datamodule's, if available? do we run both and if so, in what order?

cc @ninginthecloud as another data area we should explore

Originally posted by @ananthsub in #8885 (comment)

@justusschock
Member

@awaelchli I agree with @ananthsub that this is definitely possible and should be done!

Rather, we could create an internal DataHolder that pools the objects that have the DataHooks available. This would also codify the priority/precedence across datamodules, the LightningModule, and dataloaders passed directly to the trainer.

I like that idea. Should this be separate from the DataConnector?

It will also raise the question: what happens when both the datamodule and the lightning module have these hooks implemented?
do we raise an exception? do we run only the datamodule's, if available? do we run both and if so, in what order?

Personally, I would give priority to whatever was passed explicitly. So if the module has a loader implemented, but a loader or a datamodule with that loader is passed to the entrypoint, I would only use the latter. I wouldn't raise an exception, but a warning.

One point is that we don't know the precedence otherwise. In validation it would be fine to chain them, but in training it isn't that easy, and I think we should be consistent here between training and validation.

Another point is that this is also the current behaviour, i.e. by patching the model we already ignore the loaders from the model when a loader is given explicitly.
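
As a minimal sketch of that rule (a hypothetical DataHolder, not an agreed API), the explicitly passed datamodule wins and a warning is emitted when both define the hook:

import warnings


class DataHolder:
    """Hypothetical container pooling the objects that provide the DataHooks."""

    def __init__(self, model, datamodule=None):
        self.model = model
        self.datamodule = datamodule

    def get_hook(self, name):
        dm_hook = getattr(self.datamodule, name, None) if self.datamodule else None
        model_hook = getattr(self.model, name, None)
        if dm_hook is not None and model_hook is not None:
            # both implement the hook: warn, then prefer the explicitly passed datamodule
            warnings.warn(
                f"Both the datamodule and the model define `{name}`; "
                "using the datamodule implementation."
            )
        return dm_hook if dm_hook is not None else model_hook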

@ananthsub
Contributor

fyi @ninginthecloud

@zzzwen

zzzwen commented Sep 8, 2021

One more data point: this patching "solution" also makes test mocking hard.

Say I create a mock data module.

from unittest.mock import MagicMock
import pytorch_lightning as pl

mock_data_module = MagicMock(
    spec=pl.LightningDataModule,
    wraps=pl.LightningDataModule,
)

This line will fail:
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/data_connector.py#L225

because the parent's and the instance's method code is the same:
https://github.com/PyTorchLightning/pytorch-lightning/blob/a079d7fccc0a9be25b40296f2a348c4b4f40c8cf/pytorch_lightning/utilities/model_helpers.py#L70-L71
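
Roughly, the check compares code objects, and a mock that wraps the class hands back the class's own method, so the comparison always matches (a simplified, hypothetical recreation, not the actual source):

from unittest.mock import Mock


def looks_overridden(method_name, instance, parent):
    """Simplified, hypothetical recreation of an is_overridden-style check."""
    instance_attr = getattr(instance, method_name, None)
    # a Mock(wraps=cls) resolves child attributes to the wrapped class's own methods
    if isinstance(instance_attr, Mock) and instance_attr._mock_wraps is not None:
        instance_attr = instance_attr._mock_wraps
    parent_attr = getattr(parent, method_name, None)
    if instance_attr is None or parent_attr is None:
        return False
    # identical code objects mean the hook was never actually overridden
    return instance_attr.__code__ != parent_attr.__code__

With the mock above, train_dataloader unwraps to pl.LightningDataModule.train_dataloader, so both sides of the comparison point at the same code object and the hook is reported as not overridden.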

Therefore, train_dataloader will not be attached to the lightning_module, and it will fail the validator:

https://github.com/PyTorchLightning/pytorch-lightning/blob/a079d7fccc0a9be25b40296f2a348c4b4f40c8cf/pytorch_lightning/trainer/trainer.py#L938

@tchaton tchaton added the let's do it! approved to implement label Sep 10, 2021
@awaelchli
Contributor Author

@zzzwen I completely agree. If we had the possibility to mock these methods, it would simplify and harden a bunch of dataloader tests.
