CUDA error: device-side assert triggered #6
Thank you for your interest in our work. May I ask whether you have set the correct data path and training GPUs?
Thank you for your prompt response. I have modified the data path and GPUs. The error above occurs when I specify only one GPU. When I use multiple GPUs, the error is the following: Traceback (most recent call last): -- Process 0 terminated with the following error:
May I ask whether you have compiled the
I only compiled pointops2 according to your instructions. For one GPU, For multi GPU,
The error may be caused by the KPConv provided by torch-points3d. Did you install it successfully? Can you double-check that torch-points3d works properly?
Thanks, I have checked it. When I use multiple GPUs, this error does not appear. The KPConv provided by torch-points3d works well.
Can you run it successfully now? If you use one GPU, remember to add
I have solved the single-GPU bug by modifying this. But I still get the error at line 291 with both one GPU and multiple GPUs.
Hi, how did you modify it?
model = torch.nn.DataParallel(model.cuda()) ----> model = model.cuda()
But that way we can't use multi-GPU training. I am also getting this error when using model = torch.nn.DataParallel(model.cuda()). Any suggestions?
Hi, authors,
Thanks a lot for your awesome work.
I ran into this error. Have you seen it before?
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mmvc/Congcong/Stratified-Transformer/model/stratified_transformer.py", line 438, in forward
feats = layer(feats, xyz, batch, neighbor_idx)
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mmvc/Congcong/Stratified-Transformer/model/stratified_transformer.py", line 357, in forward
feats = self.kpconv(xyz, xyz, neighbor_idx, feats)
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch_points3d/modules/KPConv/kernels.py", line 83, in forward
new_feat = KPConv_ops(
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch_points3d/modules/KPConv/convolution_ops.py", line 95, in KPConv_ops
neighborhood_features = gather(features, neighbors_indices)
File "/home/mmvc/anaconda3/envs/pytorch19/lib/python3.8/site-packages/torch_points3d/core/common_modules/gathering.py", line 10, in gather
idx[idx == -1] = x.shape[0] - 1 # Shadow point
RuntimeError: CUDA error: device-side assert triggered
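A note for anyone debugging this traceback: CUDA kernels run asynchronously, so the Python line reported for a device-side assert is not necessarily the operation that actually failed. In indexing code, such an assert is commonly caused by an out-of-range index. The sketch below is only a suggestion, not the authors' fix. It sets CUDA_LAUNCH_BLOCKING=1 so the error surfaces at the failing call, and defines a hypothetical helper, check_index_range, that validates the neighbor indices against the number of rows being gathered from. The helper name, its arguments, and the call site are assumptions. The value -1 is treated as valid only because the gather line in the traceback remaps it to the shadow point.

```python
import os

# Set this before CUDA is initialised (simplest: before `import torch`),
# so kernels launch synchronously and the traceback points at the failing op.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def check_index_range(idx_min, idx_max, num_rows, name="neighbor_idx"):
    """Raise ValueError if any index falls outside [-1, num_rows - 1].

    Hypothetical helper: -1 is accepted because gather() in the traceback
    remaps it to the last row (the shadow point).
    """
    if idx_min < -1 or idx_max >= num_rows:
        raise ValueError(
            f"{name} out of range: min={idx_min}, max={idx_max}, "
            f"valid range is [-1, {num_rows - 1}]"
        )


# Assumed call site, just before the tensor is indexed; `x` stands for
# whatever tensor is passed to gather() and `idx` for its index tensor:
# check_index_range(int(idx.min()), int(idx.max()), x.shape[0])
```

If the check fails, the indices were built for a different number of points than the feature tensor holds. Running the same batch on the CPU can also help, since out-of-range indexing there typically raises a readable IndexError instead of a device-side assert.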