iou3d failed when inference with gpu:1 #65
Comments
How did you switch to gpu:1? From your description, it seems that either your model or some part of the data (some tensors) was loaded onto gpu:0. Perhaps you can set CUDA_VISIBLE_DEVICES when running inference on different GPUs. |
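As a rough illustration of the CUDA_VISIBLE_DEVICES suggestion, the sketch below launches a command in a child process that can only see one physical GPU. The helper names `env_for_gpu` and `run_on_gpu` are hypothetical and not part of mmdet3d, and the command in the usage block is a stand-in for whatever test script is actually being run.

```python
import os
import subprocess
import sys


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # With a single visible device, CUDA renumbers it as device 0 inside the
    # process, so code that implicitly uses the default device lands on it.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def run_on_gpu(gpu_index, command):
    """Run `command` (a list of arguments) in a child process limited to one GPU."""
    return subprocess.run(command, env=env_for_gpu(gpu_index), check=True)


if __name__ == "__main__":
    # Hypothetical stand-in for the real command: the child echoes the variable.
    run_on_gpu(1, [sys.executable, "-c",
                   "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"])
```

The variable has to be set before the process initializes CUDA, which is why the sketch sets it in the child's environment rather than changing it midway through a running script.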
Training with command Set CUDA_VISIBLE_DEVICES at script |
I have just tried to reproduce your case, but everything works fine on my machine. I both tried Have you ever succeeded in running inference on the same GPU? Besides, your environment information and a detailed error traceback may help us debug. |
When I train a model with command I read your suggestion then export And I tried your way: export I'm not sure whether something goes wrong during training that makes it train the model on gpu:1 and then evaluate it on gpu:0 (which is the default GPU if CUDA_VISIBLE_DEVICES is not modified). Environment:
TorchVision: 0.6.1 Error traceback: Config: |
This indicates some bug exists in the code that is not device agnostic. We will create a PR to fix this bug ASAP. |
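One way to look for device-agnostic problems from the Python side is to check that every tensor involved reports the device you expect. The helper below is a minimal sketch, assuming tensor-like objects expose a `.device` attribute as `torch.Tensor` does. `find_device_mismatches` and `FakeTensor` are hypothetical names used only for illustration.

```python
def find_device_mismatches(named_objects, expected_device):
    """Return {name: device_string} for objects whose `.device` differs from expected."""
    mismatches = {}
    for name, obj in named_objects.items():
        device = getattr(obj, "device", None)
        if device is None:
            continue  # not a tensor-like object, nothing to compare
        # Compare as strings, so "cuda:1" must be spelled with its index.
        if str(device) != str(expected_device):
            mismatches[name] = str(device)
    return mismatches


class FakeTensor:
    """Stand-in for a tensor so the sketch runs without a GPU."""

    def __init__(self, device):
        self.device = device


if __name__ == "__main__":
    inputs = {"boxes": FakeTensor("cuda:1"), "scores": FakeTensor("cuda:0"), "count": 3}
    print(find_device_mismatches(inputs, "cuda:1"))
```

A check like this only sees tensors that are visible from Python. It would not detect memory that a C++/CUDA extension allocates on the wrong device internally.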
This bug is a little subtle. It is caused by incorrect memory allocation when creating new tensors in the iou3d.cpp file, an operation borrowed from another codebase. We have fixed it by setting specific CUDA device IDs in the procedure. You can refer to the commit for more details. Please feel free to reopen this issue if you have any other questions. |
Describe the bug
Training on a single GPU with the default GPU (gpu:0) works fine.
After switching to gpu:1, inference reports
an illegal memory access was encountered mmdet3d/ops/iou3d/src/iou3d.cpp 121
while training still works.
Reproduction
Environment
Run python mmdet3d/utils/collect_env.py to collect the necessary environment information and paste it here.
Also include other environment variables that may be related ($PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
Error traceback
If applicable, paste the error traceback here.
Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!