Gradient overflow #30
Comments
Hi @datar001, by default we trained on 4-8 V100 GPUs, and we have not tried training on a single 1080Ti. This warning is expected because training uses mixed precision (fp16).
One more piece of information: in our past experience, training stays stable as long as the loss scale stays above 1.
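To illustrate why the warning shows up and why the loss scale is the number to watch, here is a minimal sketch of dynamic loss scaling in plain Python. It is not this repo's code or any library's API: the `DynamicLossScaler` class, its parameters (`init_scale`, `backoff`, `growth`, `growth_interval`), and the printed message are hypothetical, chosen only to show the usual pattern of skipping a step and shrinking the scale when a gradient is inf/NaN.

```python
import math


class DynamicLossScaler:
    """Toy dynamic loss scaler (hypothetical; for illustration only)."""

    def __init__(self, init_scale=2.0**16, backoff=0.5, growth=2.0,
                 growth_interval=2000):
        self.scale = init_scale              # factor the loss is multiplied by
        self.backoff = backoff               # multiplier applied after an overflow
        self.growth = growth                 # multiplier applied after a clean streak
        self.growth_interval = growth_interval
        self._good_steps = 0                 # consecutive steps without overflow

    def update(self, grads):
        """Return True if the optimizer step should run, False if it is skipped."""
        overflow = any(math.isinf(g) or math.isnan(g) for g in grads)
        if overflow:
            # Scaled fp16 gradients overflowed: skip this step and shrink the scale.
            self.scale *= self.backoff
            self._good_steps = 0
            print(f"overflow: step skipped, loss scale reduced to {self.scale}")
            return False
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            # A long enough clean streak: try a larger scale again.
            self.scale *= self.growth
        return True


# Small values so the behaviour is visible in a few steps.
scaler = DynamicLossScaler(init_scale=8.0, growth_interval=2)
history = []
for grads in [[float("inf")], [0.1], [0.2], [float("nan")], [0.3]]:
    stepped = scaler.update(grads)
    history.append((stepped, scaler.scale))
```

Under these assumptions, an occasional overflow only costs one skipped step and the scale recovers afterwards. A scale that keeps shrinking toward 1 or below would suggest that the gradients are overflowing on nearly every step.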
@datar001 Have you solved this problem? I have the same problem when running on 8 RTX 3090 GPUs.
I ran into the same problem on 2x 2080 Ti and on 2x P6000.
Hi, why does this repo output "Gradient overflow"? I am running the msrrvt_qa task with 1 and 2 1080Ti GPU(s). Can this repo be run on a single GPU (1080Ti)? Thanks!