Fix HorovodStrategy Teardown() Edge Case #11751

Closed
speediedan opened this issue Feb 5, 2022 · 1 comment · Fixed by #11752
Labels: bug (Something isn't working), strategy: horovod (removed)

@speediedan
Contributor

speediedan commented Feb 5, 2022

🐛 Bug

HorovodStrategy.teardown() fails with an AttributeError if an exception is raised before HorovodStrategy._exit_stack is set:

    def teardown(self) -> None:
        super().teardown()
>       self._exit_stack.__exit__(None, None, None)
E       AttributeError: 'NoneType' object has no attribute '__exit__'

pytorch_lightning/strategies/horovod.py:200: AttributeError

Notice below that self._exit_stack isn't set until after the callback setup() hooks have run, so an exception raised in one of those hooks leaves it as None:

https://github.com/PyTorchLightning/pytorch-lightning/blob/58324b5197aef20eb7acb577f953c0fae7c2dc05/pytorch_lightning/strategies/horovod.py#L79-L84

To Reproduce

Test to reproduce

Expected behavior

If an exception is thrown in any callback setup() hook, the Horovod teardown should still proceed without error.
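To illustrate the expected behavior, here is a minimal sketch that does not use Horovod or Lightning. `SketchStrategy` and `run` are hypothetical stand-ins (not the real `HorovodStrategy` or trainer), and the `None` guard in `teardown()` is only one way this could be handled; it is not necessarily what the linked PR does.

```python
from contextlib import ExitStack
from typing import Optional


class SketchStrategy:
    """Hypothetical stand-in for a strategy; not the real HorovodStrategy."""

    def __init__(self) -> None:
        # Stays None until setup() runs, mirroring the situation in this issue.
        self._exit_stack: Optional[ExitStack] = None
        self.closed = False

    def _mark_closed(self) -> None:
        self.closed = True

    def setup(self) -> None:
        self._exit_stack = ExitStack()
        self._exit_stack.__enter__()
        # Register a cleanup callback so we can observe that teardown unwound the stack.
        self._exit_stack.callback(self._mark_closed)

    def teardown(self) -> None:
        # Guard: only unwind the stack if setup() got far enough to create it.
        if self._exit_stack is not None:
            self._exit_stack.__exit__(None, None, None)
            self._exit_stack = None


def run(strategy: SketchStrategy, fail_in_callback_setup: bool) -> Optional[Exception]:
    """Assumed ordering: callback setup hooks first, then strategy setup; teardown always runs."""
    error: Optional[Exception] = None
    try:
        if fail_in_callback_setup:
            # Simulates a callback setup() hook raising before the strategy is set up.
            raise RuntimeError("callback setup() failed")
        strategy.setup()
    except RuntimeError as exc:
        error = exc
    finally:
        strategy.teardown()
    return error
```

With the guard in place, `run(SketchStrategy(), fail_in_callback_setup=True)` returns the original `RuntimeError` instead of having it masked by an `AttributeError` raised from `teardown()`.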

Environment

  • CUDA:
    • GPU:
      • GeForce RTX 2070 SUPER
      • GeForce RTX 2070
    • available: True
    • version: 11.3
  • Packages:
    • numpy: 1.21.2
    • pyTorch_debug: False
    • pyTorch_version: 1.10.1
    • pytorch-lightning: 1.6.0dev
    • tqdm: 4.62.3
  • System:

Additional context

I'll be uploading a PR to address this bug shortly.

cc @awaelchli

@awaelchli
Contributor

@speediedan Good that you found this, thanks for sending the PR!
