Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce that the optimizer closure is executed when optimizer_step is overridden #9360

Merged
merged 8 commits into from
Sep 8, 2021

Conversation

carmocca
Copy link
Contributor

@carmocca carmocca commented Sep 7, 2021

What does this PR do?

Part of #9349

This has the advantage that the ClosureResult is no longer Optional.

Does your PR introduce any breaking changes? If yes, please list them.

optimizer_step now requires that the closure is executed. Not doing it would skip the entire training_step, zero_grad, and backward. We consider this to be a bug so it's better to explicitly fail. Not doing it could also break progress tracking and fault tolerance.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

@carmocca carmocca added feature Is an improvement or enhancement refactor labels Sep 7, 2021
@carmocca carmocca added this to the v1.5 milestone Sep 7, 2021
@carmocca carmocca self-assigned this Sep 7, 2021
@carmocca carmocca requested a review from awaelchli September 7, 2021 16:44
@codecov
Copy link

codecov bot commented Sep 7, 2021

Codecov Report

Merging #9360 (ad0cc5a) into master (ca679cd) will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #9360   +/-   ##
======================================
  Coverage      92%     92%           
======================================
  Files         179     179           
  Lines       14910   14978   +68     
======================================
+ Hits        13760   13826   +66     
- Misses       1150    1152    +2     

@mergify mergify bot added the has conflicts label Sep 8, 2021
@mergify mergify bot removed the has conflicts label Sep 8, 2021
@mergify mergify bot added the ready PRs ready to be merged label Sep 8, 2021
@tchaton tchaton enabled auto-merge (squash) September 8, 2021 09:19
Copy link
Contributor

@tchaton tchaton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@tchaton tchaton merged commit 15d9430 into master Sep 8, 2021
@tchaton tchaton deleted the refactor/enforce-closure-execution branch September 8, 2021 10:24
@daniellepintz
Copy link
Contributor

@carmocca it looks like this PR introduced a test failure in tests/utilities/test_parsing.py::test_is_picklable

https://github.com/PyTorchLightning/pytorch-lightning/runs/3543485388

@carmocca
Copy link
Contributor Author

carmocca commented Sep 9, 2021

@carmocca it looks like this PR introduced a test failure in tests/utilities/test_parsing.py::test_is_picklable

I can't think of a reason why the changes in this PR would make that test fail. Since 1.10 testing comes from nightly, it might be caused by a change in PyTorch and this was the first PR to see it.

result = self._result
self._result = None # free memory
if self._result is None:
raise MisconfigurationException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carmocca FB had a problem here where they got the error message but not calling the closure was actually intentional. Are we ok with downgrading this to a warning message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can downgrade this to a warning if necessary, however, if there isn't a reason for allowing this, I still believe this should be an error.

We can make this a warning and a deprecation and then convert this back to an error two minor releases later.

@ananthsub what do you think of this solution? Do you have an argument for why skipping the closure should be allowed?

@carmocca carmocca mentioned this pull request Oct 18, 2021
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement ready PRs ready to be merged refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants