[Enhance] Learning rate in log can show the base learning rate of optimizer #1019
Conversation
Codecov Report

Attention: Patch coverage is

@@ Coverage Diff @@
##             main    #1019   +/- ##
=======================================
  Coverage        ?   77.72%
=======================================
  Files           ?      139
  Lines           ?    11500
  Branches        ?     2333
=======================================
  Hits            ?     8938
  Misses          ?     2151
  Partials        ?      411

Flags with carried forward coverage won't be shown.
@HAOCHENYE, I have implemented an update to the optimizer parameter groups to include a new parameter. However, this update has a significant impact on existing unit tests, particularly in the
Hi @HAOCHENYE, we have applied a fix that passes all tests locally. Could you please review this PR?
Hi @HAOCHENYE, we have run distributed training to check whether this pull request introduces any backward-compatibility issues, and found no breaking changes that would affect downstream repositories. We tested the current PR using MMSegmentation with 4 GPUs; the attached logs provide more details about the testing process.
Hi @HAOCHENYE, I have merged your refine branch into this PR; please let me know if there is anything else we can improve.
Hi @AkideLiu, should we also update the logic to get the lr in
@zhouzaida sure, I have updated the logic to get the lr in
Co-authored-by: Zaida Zhou <[email protected]>
Hi @AkideLiu, thanks for your contribution. This PR will be merged after it passes CI.
Hi @HAOCHENYE @zhouzaida, we appreciate your efforts and support during the development of this PR, and we hope the mmengine project has a great future.
Fix for issue #482
This is Akide. We are a five-member course group (COMP SCI 4023 - Software Process Improvement) from The University of Adelaide, aiming to complete this issue and contribute. We plan to kick off this issue this week. If you have any further ideas, please don't hesitate to contact us.
Motivation
For example, the config of optim_wrapper is:
The log records the learning rate of model.backbone (2e-5) rather than the base learning rate of the optimizer (2e-4).
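The original config snippet did not survive extraction. A hypothetical optim_wrapper config consistent with the numbers above (base lr 2e-4, backbone scaled to 2e-5) might look like the following; the optimizer type and key names are illustrative, not taken from the PR:

```python
# Hypothetical optim_wrapper config (values illustrative): the base
# learning rate is 2e-4, and paramwise_cfg scales the backbone by
# lr_mult=0.1, producing the 2e-5 that ends up in the training log.
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='AdamW', lr=2e-4),
    paramwise_cfg=dict(
        custom_keys={'backbone': dict(lr_mult=0.1)},
    ),
)

base_lr = optim_wrapper['optimizer']['lr']
lr_mult = optim_wrapper['paramwise_cfg']['custom_keys']['backbone']['lr_mult']
# base lr vs. the scaled backbone lr that the log currently records
print(base_lr, base_lr * lr_mult)
```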
Modification
We have done an initial investigation and identified the following call chain that appears related to the issue:

- runner.message_hub.update_scalar
- RuntimeInfoHook::before_train_iter
- runner.optim_wrapper.get_lr()
- OptimWrapper::get_lr

In OptimWrapper::get_lr, an unsorted list of learning rates is created from the optimizer parameter groups. The core issue can potentially be resolved by sorting the learning rate list according to paramwise_cfg. At this stage, we plan to open a preliminary pull request that sorts the learning rate list without conditional checking. We believe it is essential to define rules that automatically identify the base learning rate among the optimizer parameter groups.
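The identification idea above can be sketched without any framework dependency. In the snippet below a plain dict stands in for a PyTorch optimizer, whose real `param_groups` and `defaults` attributes have the same shape; the function name `get_lr_info` and the `base_lr` key are illustrative, not the merged mmengine API:

```python
# Illustrative sketch only: a plain dict mimics a PyTorch optimizer's
# `param_groups` / `defaults` structure.
optimizer_state = {
    'defaults': {'lr': 2e-4},              # the base learning rate
    'param_groups': [
        {'name': 'backbone', 'lr': 2e-5},  # scaled-down group logged today
        {'name': 'head', 'lr': 2e-4},
    ],
}

def get_lr_info(state):
    """Return per-group lrs plus the base lr so a logger can pick either."""
    return {
        'lr': [g['lr'] for g in state['param_groups']],
        'base_lr': [state['defaults']['lr']] * len(state['param_groups']),
    }

info = get_lr_info(optimizer_state)
print(info['lr'])       # per-group learning rates, backbone first
print(info['base_lr'])  # base learning rate, repeated once per group
```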
The output shows that:
Please let us know if you have any concerns or suggestions. We appreciate your input and look forward to collaborating on this matter.
BC-breaking (Optional)
Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist