Error when training S3DIS #98

Open
leenamx opened this issue Apr 11, 2024 · 0 comments

leenamx commented Apr 11, 2024

Hi, I encountered an error when training on the S3DIS dataset.

I don't know how to resolve it. I would really appreciate it if you could help me!

[screenshot 20240411163747 attached]

[04/11 08:36:36 main-logger]: #Model parameters: 8022970
[04/11 08:36:42 main-logger]: augmentation all
[04/11 08:36:42 main-logger]: jitter_sigma: 0.005, jitter_clip: 0.02
Totally 204 samples in train set.
[04/11 08:36:42 main-logger]: train_data samples: '6120'
Totally 67 samples in val set.
[04/11 08:36:42 main-logger]: scheduler: MultiStep. scheduler_update: epoch. milestones: [60, 80], gamma: 0.1
[04/11 08:36:42 main-logger]: lr: [0.006, 0.0006000000000000001]
WARNING [04/11 08:36:48 main-logger]: batch_size shortened from 2 to 1, points from 157383 to 80000
WARNING [04/11 08:36:48 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:49 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:49 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:50 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:50 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:52 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:55 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:55 main-logger]: batch_size shortened from 2 to 1, points from 143502 to 80000
WARNING [04/11 08:36:55 main-logger]: batch_size shortened from 2 to 1, points from 149464 to 80000
WARNING [04/11 08:36:56 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:57 main-logger]: batch_size shortened from 2 to 1, points from 145628 to 65628
WARNING [04/11 08:36:57 main-logger]: batch_size shortened from 2 to 1, points from 154192 to 74192
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [39,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [40,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [41,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [43,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [50,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [51,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [52,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [56,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [58,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [62,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [131,0,0], thread: [63,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
WARNING [04/11 08:36:57 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:57 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
WARNING [04/11 08:36:58 main-logger]: batch_size shortened from 2 to 1, points from 160000 to 80000
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from createEvent at /opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/cuda/CUDAEvent.h:174 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x4d (0x7f1f8ffa21bd in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: + 0xaca8ba (0x7f1d81d5c8ba in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #2: + 0x2ecb98 (0x7f1dc696fb98 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #3: c10::TensorImpl::release_resources() + 0x175 (0x7f1f8ff88fb5 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #4: + 0x1db509 (0x7f1dc685e509 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: + 0x4c634c (0x7f1dc6b4934c in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: THPVariable_subclass_dealloc(_object*) + 0x292 (0x7f1dc6b49652 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)

frame #25: __libc_start_main + 0xe7 (0x7f1fad437c87 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted
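
For reference, the PyTorch message above suggests re-running with CUDA_LAUNCH_BLOCKING=1 so the device-side assert is reported at the op that actually failed. This particular scatter/gather assertion is commonly triggered by an index (for example, a class label or neighbor index) that falls outside the range the kernel expects. Below is a minimal sanity-check sketch, assuming a standard PyTorch training loop with integer labels and an ignore index of 255; the helper name and parameters are illustrative, not from this repository.

```python
import os

# CUDA_LAUNCH_BLOCKING makes kernels run synchronously, so the device-side
# assert surfaces at the op that actually failed (set before torch
# initializes CUDA).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch


def check_label_range(labels: torch.Tensor, num_classes: int,
                      ignore_index: int = 255) -> None:
    """Raise if any label would index past the classifier output.

    Out-of-range (or negative) class ids are a frequent trigger of the
    'idx_dim >= 0 && idx_dim < index_size' scatter/gather assertion.
    """
    valid = labels[labels != ignore_index]
    if valid.numel() > 0 and (valid.min() < 0 or valid.max() >= num_classes):
        raise ValueError(
            f"labels out of range: min={valid.min().item()}, "
            f"max={valid.max().item()}, num_classes={num_classes}"
        )


# Hypothetical usage inside the training loop, before the forward pass
# (S3DIS semantic segmentation uses 13 classes):
# check_label_range(target, num_classes=13)
```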

@X-Lai
