Loss goes nan when training dual swin-base #39

seanzhuh · 2021-08-25T02:22:22Z

Hi, I've ported your code to my own codebase. As far as I can tell, your modifications to the original mmdetection are confined to three files (please correct me if I'm wrong):

CBNetV2/mmdet/models/backbones/cbnet.py
CBNetV2/mmdet/models/necks/cbnet_fpn.py
CBNetV2/mmdet/models/detectors/two_stage.py

I copied these files directly into my own version of mmdetection. However, the loss goes to NaN from epoch 17 onward. The cause appears to be gradient overflow: the AMP loss scaler keeps shrinking the loss scale until it reaches zero, and the resulting division by zero produces NaN.

I use the same default AMP settings as you do, and I can't figure out what's wrong. Could you help me?
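To make the failure mode described above concrete, here is a minimal pure-Python sketch of dynamic loss scaling. It assumes the common policy of halving the scale on every overflow and doubling it after a run of clean steps. The class `DynamicLossScaler` and the helper `steps_until_scale_is_zero` are hypothetical names for illustration only; this is not the scaler used by this repository, apex, or mmcv, and real implementations may clamp the scale to a minimum value.

```python
import math


class DynamicLossScaler:
    """Toy dynamic loss scaler (hypothetical; for illustration only)."""

    def __init__(self, init_scale=2.0 ** 16, backoff=0.5, growth=2.0,
                 growth_interval=2000):
        self.scale = init_scale
        self.backoff = backoff                  # factor applied on overflow
        self.growth = growth                    # factor applied after clean steps
        self.growth_interval = growth_interval  # clean steps needed before growing
        self._good_steps = 0

    def update(self, grads):
        """Return True if the optimizer step should run, False if it is skipped."""
        overflow = any(not math.isfinite(g) for g in grads)
        if overflow:
            # Inf/NaN gradients: skip the step and shrink the scale.
            self.scale *= self.backoff
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            self.scale *= self.growth
        return True


def steps_until_scale_is_zero(scaler, max_steps=5000):
    """Overflow on every step and count steps until the scale underflows to 0.0."""
    for step in range(1, max_steps + 1):
        scaler.update([float("inf")])
        if scaler.scale == 0.0:
            return step
    return None


if __name__ == "__main__":
    scaler = DynamicLossScaler()
    print("steps until scale == 0.0:", steps_until_scale_is_zero(scaler))
    # Once the scale is 0.0, unscaling gradients divides by zero. In
    # floating-point tensor math that yields inf or nan instead of raising.
```

If every step overflows, the scale never recovers, so a scale that collapses toward zero is usually a symptom rather than the root cause. Logging the scale and the first parameter whose gradient becomes non-finite may help narrow down where the overflow starts.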

fuweifu-vtoo · 2021-09-16T03:40:07Z

I've run into the same problem!
Have you solved it?

seanzhuh · 2021-09-16T15:57:19Z

Nope. I've also tried porting CBNetV2 to MMDetection v2.12.0, but the loss still goes to NaN at exactly the 17th epoch.

