How to disable SyncBN for single GPU training #255
Comments
You can change configs like this and use correct corresponding
To reduce GPU memory usage, I would suggest trying the fp16 configs or another dataset (like KITTI, which needs smaller models). I cannot guarantee that setting samples_per_gpu=1 is enough in your case, but feel free to give it a try.
Hi. Thanks for the quick reply! I replaced the file configs/_base_/models/hv_pointpillars_fpn_nus.py with the one in your reply. I believe all norm operations are now using naïve sync:
However, I am still seeing the same issue. The log of the training call is below:
Please note that I have installed the most recent NVIDIA driver, CUDA and PyTorch. The data preparation phase on NuScenes went OK (it took a very long time to complete). For some reason, I see the following in the log:
2021-01-05 11:50:51,677 - mmdet - INFO - Environment info:
sys.platform: linux
TorchVision: 0.8.2
I think you may have misunderstood my suggestion. I mean you need to replace all the lines using
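A minimal sketch of what such a replacement could look like, assuming the norm layers are selected through `norm_cfg` dictionaries whose `type` names a sync variant. The type names in `SYNC_TO_PLAIN`, the helper `to_plain_bn`, and the example `model` keys are assumptions for illustration and are not taken from the reply above; check them against the config you are actually editing.

```python
import copy

# Assumed mapping from sync norm type names to plain BatchNorm type names.
SYNC_TO_PLAIN = {
    "SyncBN": "BN",
    "naiveSyncBN1d": "BN1d",
    "naiveSyncBN2d": "BN2d",
}


def to_plain_bn(cfg):
    """Return a deep copy of cfg with every sync norm `type` swapped for plain BN."""
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            norm_type = node.get("type")
            if isinstance(norm_type, str) and norm_type in SYNC_TO_PLAIN:
                node["type"] = SYNC_TO_PLAIN[norm_type]
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for value in node:
                _walk(value)

    _walk(cfg)
    return cfg


# Hypothetical model config fragment, used only to exercise the helper.
model = dict(
    pts_voxel_encoder=dict(
        norm_cfg=dict(type="naiveSyncBN1d", eps=1e-3, momentum=0.01)),
    pts_backbone=dict(
        norm_cfg=dict(type="naiveSyncBN2d", eps=1e-3, momentum=0.01)),
)
single_gpu_model = to_plain_bn(model)
```

The helper works on a copy, so the original dictionary is left untouched. Editing the `norm_cfg` lines by hand in the config file achieves the same result.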
Hi. Thanks for the clarifications! I was able to start training on my single-GPU system. I'm closing this topic. Thanks again for the quick response and best regards, Yaniv
Hi. This is a follow-up to #29. I have a single GPU and I would like to train PointPillars on NuScenes using (for example)
python tools/train.py --no-validate --gpus 1 configs/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py
I'm getting the error
"Default process group is not initialized"
AssertionError: Default process group is not initialized
as in issue 29.
Could you please provide some details on how to disable SyncBN and use BN instead?
I have an additional question: is there a way to reduce GPU memory usage during training, so I can use a GTX 1080 with 8 GB? Will setting samples_per_gpu=1, workers_per_gpu=1 be sufficient?
Thanks
Yaniv