Call DataModule hooks implicitly in trainer #2755

nateraw · 2020-07-29T19:56:35Z

What does this PR do?

You can now pass a datamodule directly to trainer.fit/trainer.test, and the trainer will call dm.prepare_data and dm.setup for you if you haven't already called them.

dm = MyDataModule()
model = CoolSystem()
trainer = Trainer()
trainer.fit(model, dm)
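
The "if you haven't already" behaviour can be sketched roughly as follows. This is a minimal illustration, not the Lightning implementation: SketchDataModule and fit are hypothetical stand-ins, and the has_setup_fit flag name is an assumption (only has_prepared_data appears in this PR's diff).

```python
class SketchDataModule:
    """Hypothetical stand-in for a DataModule; records which hooks ran."""

    def __init__(self):
        self.has_prepared_data = False
        self.has_setup_fit = False  # assumed flag name, for illustration only
        self.calls = []

    def prepare_data(self):
        self.calls.append("prepare_data")
        self.has_prepared_data = True

    def setup(self, stage=None):
        self.calls.append(f"setup({stage})")
        self.has_setup_fit = True


def fit(model, datamodule):
    """Hypothetical stand-in for trainer.fit: run only the hooks not yet run."""
    if not datamodule.has_prepared_data:
        datamodule.prepare_data()
    if not datamodule.has_setup_fit:
        datamodule.setup("fit")
    # Training itself would start here.


dm = SketchDataModule()
dm.prepare_data()               # the user already called this hook manually
fit(model=None, datamodule=dm)  # so fit only needs to call setup
```

In this sketch, prepare_data runs once even though both the user and fit could have triggered it.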

Fixes #2751 and #2742


codecov · 2020-07-29T20:16:22Z

Codecov Report

Merging #2755 into master will increase coverage by 0%.
The diff coverage is 97%.

@@          Coverage Diff           @@
##           master   #2755   +/-   ##
======================================
  Coverage      91%     91%           
======================================
  Files          76      76           
  Lines        6787    6819   +32     
======================================
+ Hits         6150    6187   +37     
+ Misses        637     632    -5

williamFalcon · 2020-07-30T01:16:22Z

needs setup with step arg

docs/source/datamodules.rst

nateraw · 2020-07-30T07:53:32Z

pytorch_lightning/core/datamodule.py

@@ -155,14 +234,14 @@ def prepare_data(self):
        """

    @abstractmethod
-    def setup(self, *args, **kwargs):
+    def setup(self, stage: Optional[str] = None):


Is def setup(self, stage: Optional[str] = None): the way we want this to look?

I added this so that stage can default to None, which lets people call dm.setup() with no arguments. Called this way, it behaves as if both the fit and test setup had been run, so the trainer won't be confused when you call trainer.test after trainer.fit.

I documented this usage heavily in the docs.
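
A minimal sketch of the pattern described above, assuming the stage values are the strings "fit" and "test". StageAwareDataModule and its toy datasets are hypothetical and do not depend on Lightning.

```python
from typing import Optional


class StageAwareDataModule:
    """Hypothetical stand-in showing a setup() whose stage defaults to None."""

    def __init__(self):
        self.train_data = None
        self.val_data = None
        self.test_data = None

    def setup(self, stage: Optional[str] = None):
        # stage=None means "set up everything", so a bare dm.setup() works.
        if stage == "fit" or stage is None:
            full = list(range(10))  # toy data in place of a real dataset
            self.train_data, self.val_data = full[:8], full[8:]
        if stage == "test" or stage is None:
            self.test_data = list(range(10, 13))


dm_all = StageAwareDataModule()
dm_all.setup()            # prepares both the fit and the test splits

dm_fit = StageAwareDataModule()
dm_fit.setup("fit")       # prepares only the fit splits
```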

pytorch_lightning/trainer/trainer.py

nateraw · 2020-07-30T22:03:39Z

needs setup with step arg

@williamFalcon All done and updated docs.

williamFalcon · 2020-07-30T22:14:45Z

pytorch_lightning/trainer/trainer.py

+
+            # If datamodule.prepare_data() has not been called yet, call it
+            if self.is_overridden('prepare_data', datamodule) and not datamodule.has_prepared_data:
+                datamodule.prepare_data()


This is the wrong place to call this.
It needs to be called when we call the prepare_data hook; otherwise the timing will be wrong.
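
The thread does not spell out what "timing" means here. One plausible reading, assumed in this sketch, is that the trainer's prepare_data step is gated (for example, so that only one process downloads data), and the datamodule's prepare_data should run inside that same gated step. RecordingDataModule, run_prepare_data_hook, and the rank-0 gate are all hypothetical.

```python
class RecordingDataModule:
    """Hypothetical stand-in that counts prepare_data calls."""

    def __init__(self):
        self.has_prepared_data = False
        self.prepare_calls = 0

    def prepare_data(self):
        self.prepare_calls += 1
        self.has_prepared_data = True


def run_prepare_data_hook(datamodule, can_prepare_data: bool) -> None:
    """Hypothetical hook step: the datamodule call sits behind the same gate."""
    if not can_prepare_data:
        return  # this process must not download or write data
    if not datamodule.has_prepared_data:
        datamodule.prepare_data()


# Simulate four processes; the gate (an assumption) lets only rank 0 prepare.
prepare_counts = []
for local_rank in range(4):
    dm = RecordingDataModule()
    run_prepare_data_hook(dm, can_prepare_data=(local_rank == 0))
    prepare_counts.append(dm.prepare_calls)
```

Calling datamodule.prepare_data() outside this step would bypass the gate, which is one way the timing could go wrong.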

ananyahjha93 · 2020-07-30T22:38:14Z

docs/source/datamodules.rst

+                # self.dims = tuple(self.mnist_test[0][0].shape)
+
+        def train_dataloader(self):
+            return DataLoader(self.mnist_train, batch_size=32)


maybe add batch_size as a param?
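
A sketch of what this suggestion could look like, with batch_size taken in the constructor instead of being hard-coded. BatchedDataModule is hypothetical and uses a plain generator so the example runs without PyTorch; in the docs example the body would presumably become DataLoader(self.mnist_train, batch_size=self.batch_size).

```python
class BatchedDataModule:
    """Hypothetical stand-in with a configurable batch size."""

    def __init__(self, batch_size: int = 32):
        self.batch_size = batch_size
        self.train_items = list(range(100))  # toy data in place of MNIST

    def train_dataloader(self):
        # Yield consecutive slices of at most batch_size items.
        for start in range(0, len(self.train_items), self.batch_size):
            yield self.train_items[start:start + self.batch_size]


batches = list(BatchedDataModule(batch_size=40).train_dataloader())
batch_sizes = [len(batch) for batch in batches]
```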

ananyahjha93

get pickle test cases to pass

pep8speaks · 2020-07-31T00:08:46Z

Hello @nateraw! Thanks for updating this PR.

In the file tests/core/test_datamodules.py:

Line 53:9: E116 unexpected indentation (comment)
Line 58:9: E116 unexpected indentation (comment)
Line 63:13: E116 unexpected indentation (comment)

Comment last updated at 2020-08-02 00:01:39 UTC

mergify · 2020-08-01T00:36:34Z

This pull request is now in conflict... :(

Borda

Why were most of the tests in the docs converted to untested samples?

Borda · 2020-08-02T09:52:24Z

docs/source/datamodules.rst

@@ -11,33 +11,63 @@ Data preparation in PyTorch follows 5 steps:
 A DataModule is simply a collection of a train_dataloader, val_dataloader(s), test_dataloader(s) along with the
 matching transforms and data processing/downloads steps required.

+.. code-block:: python


Why not test code?

There are a ton of unnecessary doctests.

What do you mean by unnecessary: that there is no need for examples? Otherwise, all examples should be tested to make sure they stay aligned with the actual code base.
cc: @awaelchli

Agreed, this code block would be perfect for a doctest, and it is as simple as adding .. testcode. Even if we don't make any assertions here, Python will parse the code and run the import statements. It would help us keep the docs up to date with the API.

Yeah, we do not need anything extra, but it checks the syntax and, potentially, the number of passed arguments or kwargs.
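
To illustrate the point about executed examples, here is a small sketch using Python's standard doctest module rather than the Sphinx .. testcode directive discussed above. describe_setup is a hypothetical function; if its signature changed, the embedded examples would fail instead of silently going stale.

```python
import doctest
from typing import Optional


def describe_setup(stage: Optional[str] = None) -> str:
    """Hypothetical helper, present only so there is something to document.

    >>> describe_setup()
    'fit and test'
    >>> describe_setup("fit")
    'fit'
    """
    return "fit and test" if stage is None else stage


# Collect and run the examples embedded in the docstring above.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(describe_setup, name="describe_setup",
                        globs={"describe_setup": describe_setup}):
    runner.run(test)

# summarize() returns a (failed, attempted) result for the examples that ran.
results = runner.summarize(verbose=False)
```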

mergify bot requested a review from a team July 29, 2020 19:57

nateraw linked an issue Jul 29, 2020 that may be closed by this pull request

[DataModule] prepare_data() and setup() not called #2742

Closed

nateraw added the ready PRs ready to be merged label Jul 29, 2020

nateraw requested review from ananyahjha93, Borda and williamFalcon July 29, 2020 21:17

Borda removed the ready PRs ready to be merged label Jul 29, 2020

Borda approved these changes Jul 29, 2020

mergify bot requested a review from a team July 29, 2020 21:51

Borda added bug Something isn't working ci Continuous Integration ready PRs ready to be merged labels Jul 29, 2020

nateraw mentioned this pull request Jul 29, 2020

Use Lightning DataModules Lightning-Universe/lightning-bolts#130

Merged

4 tasks

nateraw removed the ready PRs ready to be merged label Jul 30, 2020

nateraw force-pushed the dm-hook-fixes branch from a870ea7 to 7b49a56 Compare July 30, 2020 07:50

nateraw commented Jul 30, 2020

nateraw force-pushed the dm-hook-fixes branch from 7839df4 to bcc8720 Compare July 30, 2020 21:28

williamFalcon reviewed Jul 30, 2020

mergify bot requested a review from a team July 30, 2020 22:15

nateraw changed the title ~~Call DataModule hooks implicitly in trainer~~ [WIP] Call DataModule hooks implicitly in trainer Jul 30, 2020

ananyahjha93 approved these changes Jul 30, 2020

mergify bot requested a review from a team July 30, 2020 23:51

ananyahjha93 suggested changes Jul 30, 2020

mergify bot requested a review from a team July 30, 2020 23:53

nateraw force-pushed the dm-hook-fixes branch from 9c10e67 to a82178b Compare July 31, 2020 00:08

ananyahjha93 approved these changes Jul 31, 2020

mergify bot requested a review from a team July 31, 2020 22:14

nateraw force-pushed the dm-hook-fixes branch from 06d42fc to 8562b3a Compare August 1, 2020 00:35

nateraw and others added 15 commits July 31, 2020 18:55

✨ call dm hooks in trainer implicitly

8dc681a

✅ update tests

5064b7d

📝 remove unused stage arg from dm docs

0e43c0b

✅ update tests

8550fb3

✅ update tests

94c1eb1

🚧 include stage in datamodule.setup

05a16d7

📝 docs

d55bcd7

📝 docs

981378c

added more dm tests

9331b60

added more dm tests

6be261b

🐛 call dm.setup everywhere

5233ac7

🔥 pickle tests now implied by accelerator tests

cb1b848

🎨 set dm as attr of trainer

1b77442

🐛 .

f059cc4

🚧 wip

a3be9e7

nateraw force-pushed the dm-hook-fixes branch from 8562b3a to a3be9e7 Compare August 1, 2020 00:56

williamFalcon added 6 commits August 1, 2020 19:20

add can prepare test

ecc2875

add can prepare test

c54ac9d

verified setup in fit

2bdb10e

fixed setup call

8107ef9

fixed setup call

57e942a

fixed setup call

c138110

williamFalcon changed the title ~~[WIP] Call DataModule hooks implicitly in trainer~~ Call DataModule hooks implicitly in trainer Aug 2, 2020

williamFalcon merged commit 036bcea into Lightning-AI:master Aug 2, 2020

Borda reviewed Aug 2, 2020

mergify bot requested a review from a team August 2, 2020 09:58

AtomScott mentioned this pull request Aug 25, 2020

Rename train_dataloader or update docs for trainer.fit()? #3146

Closed
