
Fix printing momentum for non-deepspeed optimizer #464

Merged · 2 commits · Oct 7, 2020

Conversation

@Zixxy (Contributor) commented Oct 7, 2020

Fix printing momentum for non-deepspeed optimizer

@tjruwase (Contributor) commented Oct 7, 2020

@Zixxy Thanks for creating this PR. Please consider the following comments:

  1. Which optimizer (or use case) motivated this PR?

  2. 'betas' is not from DeepSpeed, but rather from the Adam-style optimizers of Torch: Adam, AdamW, Adamax, etc.

  3. It would be great to support optimizers that name their momentum something other than 'betas', but the current fix will break optimizers that use 'betas' yet are not DeepSpeed optimizers. How about changing get_mom() to take an optional string parameter for the momentum name, e.g. def get_mom(self, name=None) or def get_mom(self, name='betas')? (A rough sketch follows below.)

Will (3) work for your scenario?
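
For reference, a minimal sketch of what (3) could look like. The class, attribute names, and method body below are illustrative assumptions, not DeepSpeed's actual engine code:

```python
import torch

# Hypothetical engine-like wrapper: get_mom() takes the name of the
# param-group key that stores the momentum value, defaulting to the
# Adam-style 'betas'. This mirrors suggestion (3), not the real engine.
class EngineSketch:
    def __init__(self, optimizer):
        self.optimizer = optimizer

    def get_mom(self, name='betas'):
        # Return the momentum-like entry of every param group under `name`.
        return [group[name] for group in self.optimizer.param_groups]

params = [torch.nn.Parameter(torch.zeros(1))]
print(EngineSketch(torch.optim.Adam(params, lr=1e-3)).get_mom())
# -> [(0.9, 0.999)]
print(EngineSketch(torch.optim.SGD(params, lr=1e-3, momentum=0.9)).get_mom('momentum'))
# -> [0.9]
```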

fix momentum access for Adam
@Zixxy (Contributor, Author) commented Oct 7, 2020

  1. torch.optim.SGD / torch.optim.RMSprop
  2. Ah right, fixed.
  3. It is a fix for your _report_progress function: https://github.com/microsoft/DeepSpeed/blob/2efea6944616c7be9e35874adc37dbaf150ea05e/deepspeed/runtime/engine.py#L1007. To call get_mom(name="betas") there, I would have to determine the optimizer type inside that function. Let me know if you would prefer that anyway.
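
To make the failure mode concrete, here is a small, self-contained illustration (not the actual PR diff) of why a hard-coded 'betas' lookup breaks for these optimizers: Adam-style optimizers store their momentum terms under the 'betas' key of each param group, while SGD and RMSprop use a 'momentum' key.

```python
import torch

params = [torch.nn.Parameter(torch.zeros(1))]
adam = torch.optim.Adam(params, lr=1e-3)
sgd = torch.optim.SGD(params, lr=1e-3, momentum=0.9)

def momentum_for_report(optimizer):
    # Read whichever key the optimizer's param groups actually carry;
    # group['betas'] alone would raise a KeyError for SGD/RMSprop.
    return [
        group['betas'] if 'betas' in group else group['momentum']
        for group in optimizer.param_groups
    ]

print(momentum_for_report(adam))  # [(0.9, 0.999)]
print(momentum_for_report(sgd))   # [0.9]
```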

@tjruwase (Contributor) commented Oct 7, 2020

Oh I see, you are fixing the call within the engine code. For some reason, I was thinking of the client calling engine.get_mom(). In that case, your fix works just fine. Thanks so much.

It is exciting to see that optimizers other than Adam and Lamb seem to work out-of-the-box with DeepSpeed.

@tjruwase tjruwase merged commit c39a76f into deepspeedai:master Oct 7, 2020