AutoBatch cudnn.benchmark=True fix #9448

Merged 4 commits on Sep 16, 2022

@glenn-jocher glenn-jocher commented Sep 16, 2022

May resolve #9287

Signed-off-by: Glenn Jocher [email protected]

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Changed the conditions under which AutoBatch determines the batch size and disabled the cuDNN benchmark by default.

📊 Key Changes

  • Added a condition to return the default batch size when torch.backends.cudnn.benchmark is enabled.
  • Commented out the line that sets torch.backends.cudnn.benchmark to True by default, referencing an issue in the YOLOv5 repository.
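
The first change above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `choose_batch_size`, its parameters, and the log message are hypothetical, and the benchmark flag is passed in as a plain boolean so the sketch runs without PyTorch installed. In real use the flag would be read from `torch.backends.cudnn.benchmark`.

```python
from typing import Callable


def choose_batch_size(
    cudnn_benchmark: bool,
    default_batch_size: int,
    search: Callable[[], int],
) -> int:
    """Return the default batch size when cuDNN benchmarking is on.

    All names here are hypothetical. `search` stands in for whatever
    routine estimates a batch size from available GPU memory.
    """
    if cudnn_benchmark:
        # Skip the automatic search entirely and keep the default.
        print(f"cudnn.benchmark is enabled, using default batch size {default_batch_size}")
        return default_batch_size
    # Benchmarking is off, so the automatic estimate is used.
    return search()


# Hypothetical usage with PyTorch (assumes torch is installed):
#   import torch
#   bs = choose_batch_size(torch.backends.cudnn.benchmark, 16, my_search)
```

Passing the flag in explicitly keeps the guard easy to test; the early return means the search routine never runs while benchmarking is active.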

🎯 Purpose & Impact

  • The new condition leaves the batch size at its default whenever cuDNN benchmarking is enabled, rather than letting AutoBatch change it. This keeps training stable and avoids performance degradation in some cases.
  • Disabling the cuDNN benchmark by default sidesteps the AutoBatch problems reported in the referenced issue. The change favors model and training stability over the speedup that benchmarking may, but is not guaranteed to, provide.
  • Training speed may change, but runs should be more predictable and stable, especially when reproducing results across different runs.

@glenn-jocher

Fix successfully tested in Colab on T4 GPU:

Screenshot 2022-09-16 at 21 33 47

@glenn-jocher glenn-jocher merged commit 5e1a955 into master Sep 16, 2022
@glenn-jocher glenn-jocher deleted the glenn-jocher-patch-1 branch September 16, 2022 19:46
Development

Successfully merging this pull request may close these issues.

AutoBatch: CUDA anomaly detected