[BUG] COMPUTE-SANITIZER due to "invalid argument" on CUDA API call to cudaFreeHost #2039

Closed
NvTimLiu opened this issue May 14, 2024 · 1 comment · Fixed by rapidsai/cudf#15753
Labels
bug Something isn't working

Comments

@NvTimLiu
Collaborator

Describe the bug

The nightly build failed under COMPUTE-SANITIZER: cudaErrorInvalidValue ("invalid argument") on the CUDA API call to cudaFreeHost.

Details from sanitizer_for_pid_20945.log:

========= COMPUTE-SANITIZER
========= Program hit cudaErrorInvalidValue (error 1) due to "invalid argument" on CUDA API call to cudaFreeHost.
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame: [0x480aa6]
=========                in /usr/lib64/libcuda.so.1
=========     Host Frame: [0x36c6bde]
=========                in /tmp/cudf8889764934804329139.so
=========     Host Frame:Java_ai_rapids_cudf_Cuda_freePinned [0xd9a8e1]
=========                in /tmp/cudf8889764934804329139.so
=========     Host Frame: [0xffffffffe7e80b26]
=========                in 
========= 
========= Program hit cudaErrorInvalidValue (error 1) due to "invalid argument" on CUDA API call to cudaGetLastError.
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame: [0x480aa6]
=========                in /usr/lib64/libcuda.so.1
=========     Host Frame: [0x36be2c4]
=========                in /tmp/cudf8889764934804329139.so
=========     Host Frame:cudf::detail::throw_cuda_error(cudaError, char const*, unsigned int) [0xceb78a]
=========                in /tmp/cudf8889764934804329139.so
=========     Host Frame:Java_ai_rapids_cudf_Cuda_freePinned [0xd9a907]
=========                in /tmp/cudf8889764934804329139.so
=========     Host Frame: [0xffffffffe7e80b26]
=========                in 
========= 
========= ERROR SUMMARY: 2 errors
NvTimLiu added the "bug" and "? - Needs Triage" labels on May 14, 2024
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue May 15, 2024
Fixes NVIDIA/spark-rapids-jni#2039.  CudaTest#testCudaException causes the compute-sanitizer to fail the test because it (correctly) flags an invalid argument being passed to a CUDA runtime call.  Updated the tagging for the test to avoid running it under the compute-sanitizer.

Authors:
  - Jason Lowe (https://github.com/jlowe)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Gera Shegalov (https://github.com/gerashegalov)

URL: #15753
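
For readers unfamiliar with tag-based test exclusion, here is a minimal, self-contained sketch of the idea described in the commit message: a test that intentionally provokes a CUDA error is tagged, and the sanitizer run filters that tag out. Everything in it is an assumption made for illustration: `DemoTag` is a hypothetical stand-in for a test framework's tag annotation, `DemoCudaTest` and its empty method bodies are placeholders rather than the real cudf tests, and the tag name `noSanitizer` is assumed, not taken from this issue. The actual change is in rapidsai/cudf#15753.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a test framework's tag annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DemoTag {
    String value();
}

// Hypothetical test class; the method bodies are placeholders, not the real cudf tests.
class DemoCudaTest {
    public void testGetCudaRuntimeInfo() { }

    // Intentionally triggers a CUDA error in the real scenario, so it is
    // tagged for exclusion when running under the sanitizer (tag name assumed).
    @DemoTag("noSanitizer")
    public void testCudaException() { }
}

class SanitizerTagDemo {
    // Returns the sorted names of the test methods that would run
    // when methods carrying any of the excluded tags are skipped.
    static List<String> selectTests(Class<?> testClass, Set<String> excludedTags) {
        List<String> selected = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            if (!m.getName().startsWith("test")) {
                continue; // ignore helpers and synthetic methods
            }
            DemoTag tag = m.getAnnotation(DemoTag.class);
            if (tag != null && excludedTags.contains(tag.value())) {
                continue; // tagged for exclusion in this run
            }
            selected.add(m.getName());
        }
        Collections.sort(selected); // reflection order is unspecified
        return selected;
    }

    public static void main(String[] args) {
        System.out.println("normal run:    " + selectTests(DemoCudaTest.class, Set.of()));
        System.out.println("sanitizer run: " + selectTests(DemoCudaTest.class, Set.of("noSanitizer")));
    }
}
```

The test still runs in a normal build, where the expected exception is asserted. Only the sanitizer run skips it, because there the invalid argument is deliberate and compute-sanitizer is correct to report it.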
@jlowe
Member

jlowe commented May 21, 2024

Fixed by rapidsai/cudf#15753

@jlowe jlowe closed this as completed May 21, 2024