-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Call DataModule hooks implicitly in trainer #2755
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2755 +/- ##
======================================
Coverage 91% 91%
======================================
Files 76 76
Lines 6787 6819 +32
======================================
+ Hits 6150 6187 +37
+ Misses 637 632 -5 |
needs setup with step arg |
@@ -155,14 +234,14 @@ def prepare_data(self): | |||
""" | |||
|
|||
@abstractmethod | |||
def setup(self, *args, **kwargs): | |||
def setup(self, stage: Optional[str] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is def setup(self, stage: Optional[str] = None):
the way we want this to look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this to allow for people to set None as default value of stage
, which will let them call dm.setup()
. It will act as if you've setup both fit
and test
related setups if you do it this way, which won't confuse trainer when you go to call trainer.test
after trainer.fit
.
I documented this usage heavily in the docs.
@williamFalcon All done and updated docs. |
pytorch_lightning/trainer/trainer.py
Outdated
|
||
# If datamodule.prepare_data() has not been called yet, call it | ||
if self.is_overridden('prepare_data', datamodule) and not datamodule.has_prepared_data: | ||
datamodule.prepare_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the wrong place to call this.
this needs to be called when we call the prepare_data hook. otherwise the timing will be wrong
# self.dims = tuple(self.mnist_test[0][0].shape) | ||
|
||
def train_dataloader(self): | ||
return DataLoader(self.mnist_train, batch_size=32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add batch_size as a param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get pickle test cases to pass
Hello @nateraw! Thanks for updating this PR.
Comment last updated at 2020-08-02 00:01:39 UTC |
This pull request is now in conflict... :( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why most of the tests in docs were converted to samples without testing?
@@ -11,33 +11,63 @@ Data preparation in PyTorch follows 5 steps: | |||
A DataModule is simply a collection of a train_dataloader, val_dataloader(s), test_dataloader(s) along with the | |||
matching transforms and data processing/downloads steps required. | |||
|
|||
.. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not test code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are a ton of unnecessary doctests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you mean by unnecessary, that there is no need for examples? Otherwise all examples shall be tested that they are aligned with actual code base...
cc: @awaelchli
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, this code block would be perfect for a doctest, and it is as simple as adding .. testcode
. Even if we don't make any assertions here, Python will parse the code and run the import statements. It would help us keep the docs up-to-date with the api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we do not need anything extra, but it checks syntax and eventually nb of passed arguments or kwargs
What does this PR do?
You can now just pass datamodule to trainer.fit/trainer.test and we'll call
dm.setup
anddm.prepare_data
if you haven't already.Fixes #2751 and #2742
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃