[LTO] Unit test testCudaCheck failing for aarch64 architecture #40834
Comments
A new Issue was created by @aandvalenzuela Andrea Valenzuela. @Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign heterogeneous |
Not sure why it succeeds in non-LTO IBs:
CMSSW_13_1_X_2023-02-20-2300 for |
I took a look, and it seems that the variables at cmssw/HeterogeneousCore/CUDAUtilities/interface/cudaCheck.h (lines 45 to 49 in 918a7fc) stay uninitialized after the cuGetErrorName() and cuGetErrorString() calls. OK, I didn't really check that, but their values in abortOnCudaError() are garbage, and initializing them to nullptr seems to fix the crash.
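For illustration, here is a minimal sketch of the defensive pattern described above. It is not the code from cudaCheck.h nor the actual fix: `mockGetErrorName()` and `describeError()` are hypothetical names, and the mock simply assumes a lookup that leaves its output pointer untouched when it fails, which is the situation suspected here rather than documented driver behaviour.

```cpp
#include <string>

// Hypothetical stand-in for a driver-style lookup such as cuGetErrorName().
// Assumption for this sketch: on failure it returns a non-zero status and
// does NOT write to *out.
enum MockResult { MOCK_SUCCESS = 0, MOCK_ERROR_INVALID_VALUE = 1 };

inline MockResult mockGetErrorName(int code, const char** out) {
  if (code == 0) {
    *out = "MOCK_SUCCESS";
    return MOCK_SUCCESS;
  }
  return MOCK_ERROR_INVALID_VALUE;  // *out is left as the caller set it
}

// Build an error description without ever reading an indeterminate pointer.
inline std::string describeError(int code) {
  const char* name = nullptr;  // initialized, so a failed lookup leaves nullptr
  mockGetErrorName(code, &name);
  if (name != nullptr) {
    return std::string(name);
  }
  // Fallback when the lookup did not provide a name.
  return "unknown error " + std::to_string(code);
}
```

With `name` initialized to nullptr, a failed lookup falls through to the fallback string instead of dereferencing garbage, which matches the observation that initializing the pointers makes the crash go away.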
|
Fixed in #40840 |
Thanks @makortel! |
Hello,
Test testCudaCheck (module HeterogeneousCore/CUDAUtilities) has been failing since 1st of Feb in LTO IBs (aarch64 architecture only) due to a segmentation violation when checking the driver API: see stacktrace.
The test was added on 1st of Feb via #40619 and has succeeded since then on amd64, but not on aarch64. Tests are supposed to pass on machines with and without GPUs (#40619 (comment)). I can reproduce the issue on all our aarch64 machines.
Thanks,
Andrea.