bug of resume model: #1130
Comments
meta is the information to be saved in the checkpoint, rather than self.meta.
I know that, but the trouble is that self.meta is initialized in base_runner.resume, like this: mmcv/mmcv/runner/base_runner.py Line 362 in 59ed0dd
but it is never updated after each epoch/iter. For example, when I resume a model from epoch=4, iter=200 and save it one epoch later, after this line, mmcv/mmcv/runner/iter_based_runner.py Line 203 in 54ece10
meta is overwritten with the values in self.meta, so the saved information is always epoch=4, iter=200. It will always be the meta that was initialized at resume. So the following line may be the simplest way to fix it.
I think you are right. I had actually trained my model for 7 epochs, but after resuming from the checkpoint it went back to the earlier epoch 4.
@luckycaicai @ailias, thanks for your feedback. This issue will be resolved by PR #1108.
Original code:
mmcv/mmcv/runner/iter_based_runner.py
Line 203 in 54ece10
Fix to:
self.meta.update(meta)
As well as,
mmcv/mmcv/runner/epoch_based_runner.py
Line 159 in 54ece10
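
The stale-metadata behavior described in this thread can be reproduced with plain dictionaries. The sketch below is a toy illustration only, not mmcv's actual runner code: the helper names `merge_stale_wins` and `merge_fresh_wins`, the `seed` key, and the post-resume counter values are all assumptions made for the example. It only shows how the direction of `dict.update` decides whether the values restored at resume or the current progress counters end up in the saved meta.

```python
# Toy illustration only; this is not mmcv's actual runner code.
# All helper names and values below are hypothetical.


def merge_stale_wins(fresh_meta, runner_meta):
    """Mimics `meta.update(self.meta)`: runner_meta overwrites fresh values."""
    merged = dict(fresh_meta)
    merged.update(runner_meta)
    return merged


def merge_fresh_wins(fresh_meta, runner_meta):
    """Mimics `self.meta.update(meta)`: fresh values overwrite runner_meta."""
    merged = dict(runner_meta)
    merged.update(fresh_meta)
    return merged


# State restored at resume (epoch=4, iter=200) and never updated afterwards.
runner_meta = {"epoch": 4, "iter": 200, "seed": 0}

# Progress counters computed at save time, one epoch later (assumed values).
fresh_meta = {"epoch": 5, "iter": 250}

print(merge_stale_wins(fresh_meta, runner_meta))
# {'epoch': 4, 'iter': 200, 'seed': 0}

print(merge_fresh_wins(fresh_meta, runner_meta))
# {'epoch': 5, 'iter': 250, 'seed': 0}
```

In the first merge the checkpoint would keep reporting epoch=4, iter=200 no matter how long training continues, which matches the symptom reported above. In the second, keys that exist only in the runner's meta (here `seed`) are preserved while the progress counters stay current.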