[BUG] raft::cuda_error thrown from experimental RF backend #3107
Labels: bug
Comments
hcho3 added the "? - Needs Triage" and "bug" labels and removed the "? - Needs Triage" label on Oct 31, 2020
Backtrace from GDB:
JohnZed added a commit to JohnZed/cuml that referenced this issue on Nov 5, 2020
dantegd pushed a commit that referenced this issue on Nov 12, 2020
* Patch and test for RF crash #3107
* Cleanups of RF regression fixes
* Add failing tests to RF regression
* Expand experimental backend testing and align pointers
* Expand python RF regression test
* Updates based on review feedback
* Update changelog
* Add classification tests
* Review comments and style fixes for RF
Describe the bug
The experimental RF backend crashes with a raft::cuda_error exception.
Output:
[W] [19:16:40.163584] Using experimental backend for growing trees
[W] [19:16:43.017118] Using experimental backend for growing trees
[W] [19:16:45.835559] Using experimental backend for growing trees
[W] [19:16:48.644952] Using experimental backend for growing trees
[W] [19:16:51.465764] Using experimental backend for growing trees
[W] [19:16:54.285171] Using experimental backend for growing trees
[W] [19:16:57.105698] Using experimental backend for growing trees
[W] [19:16:59.935863] Using experimental backend for growing trees
[W] [19:17:02.781407] Using experimental backend for growing trees
[W] [19:17:05.623277] Using experimental backend for growing trees
terminate called after throwing an instance of 'raft::cuda_error'
what(): CUDA error encountered at: file=/home/phcho/Desktop/cuml/cpp/build/raft/src/raft/cpp/include/raft/mr/host/allocator.hpp line=48: call='cudaFreeHost(p)', Reason=cudaErrorIllegalAddress:an illegal memory access was encountered
Aborted (core dumped)
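For triage, it can help to split the what() message above into its fields. Note that CUDA can report a failure from earlier asynchronous GPU work at a later, unrelated API call, so the cudaFreeHost(p) call named in the message may only be where the illegal access surfaced, not where it happened. The snippet below is a minimal sketch, not part of cuML or raft: the helper name parse_cuda_error and the regular expression are assumptions inferred from the single message quoted in this issue, and other raft versions may format the text differently.

```python
import re

# Pattern inferred from the one message quoted in this issue (an assumption,
# not a documented raft format).
CUDA_ERROR_RE = re.compile(
    r"file=(?P<file>\S+)\s+line=(?P<line>\d+):\s+"
    r"call='(?P<call>[^']*)',\s*"
    r"Reason=(?P<code>\w+):(?P<detail>.*)"
)


def parse_cuda_error(message):
    """Hypothetical helper: return the message fields as a dict, or None."""
    match = CUDA_ERROR_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    fields["detail"] = fields["detail"].strip()
    return fields


# The what() text from the output above, split only for readability.
message = (
    "CUDA error encountered at: "
    "file=/home/phcho/Desktop/cuml/cpp/build/raft/src/raft/cpp/include/raft/mr/host/allocator.hpp "
    "line=48: call='cudaFreeHost(p)', "
    "Reason=cudaErrorIllegalAddress:an illegal memory access was encountered"
)

info = parse_cuda_error(message)
print(info["call"], info["code"], info["line"])
```

The "call" field identifies the API call that returned the error and the "code" field gives the CUDA error name. Finding the faulting kernel itself would still need the GDB backtrace or a memory-checking tool.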
Steps/Code to reproduce bug
Expected behavior
The program should not crash.
Environment details (please complete the following information):
Conda environment detail ("conda list")