Update to CUDA 12.6 #97
jakirkham
commented
Aug 3, 2024
- Bump CUDA version to 12.6
e368ff3 to 1a0b9ae
With CUDA 12.6 am seeing the following test failure on CI:
___________________ test_duplicate_symbols_cubin_and_fatbin ____________________
device_functions_cubin = ('test_device_functions.cubin', b"\x7fELF\x02\x01\x013\x07\x00\x00\x00\x00\x00\x00\x00\x01\x00\xbe\x00x\x00\x00\x00\x0...x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\t\x00\x00\x18\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")
device_functions_fatbin = ('test_device_functions.fatbin', b'P\xedU\xba\x01\x00\x10\x00\x00\t\x00\x00\x00\x00\x00\x00\x02\x00\x01\x01@\x00\x00\x...x1e\x08@\x00\x1f\x08@\x00\x00\x1f\xa4@\x00\x05\x1e\t@\x00\x1a\t@\x00P\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
gpu_arch_flag = '-arch=sm_70'
    def test_duplicate_symbols_cubin_and_fatbin(
        device_functions_cubin, device_functions_fatbin, gpu_arch_flag
    ):
        # This link errors because the cubin and the fatbin contain the same
        # symbols.
        nvjitlinker = NvJitLinker(gpu_arch_flag)
        name, cubin = device_functions_cubin
        nvjitlinker.add_cubin(cubin, name)
        name, fatbin = device_functions_fatbin
>       with pytest.raises(NvJitLinkError, match="NVJITLINK_ERROR_INVALID_INPUT error"):
E       Failed: DID NOT RAISE <class 'pynvjitlink.api.NvJitLinkError'>
test_pynvjitlink_api.py:90: Failed
Seems fine to me.
The CI failure seems to come from a test where the underlying lib is correctly erroring, but I think the error isn't being translated into an NvJitLinkError correctly somehow. Looking into it.
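For readers unfamiliar with the pattern being discussed, here is a minimal sketch of how a status code returned by a native library call might be translated into a Python exception. Everything in it is an assumption for illustration: `Status`, `LinkError`, and `check_status` are hypothetical names, and the numeric values are placeholders rather than pynvjitlink's or nvJitLink's actual definitions.

```python
# Hypothetical sketch: the names and status values below are illustrative
# assumptions, not pynvjitlink's or nvJitLink's actual definitions.
from enum import IntEnum


class Status(IntEnum):
    """Stand-in for the result codes a native linker call might return."""

    SUCCESS = 0
    ERROR_INVALID_INPUT = 3


class LinkError(RuntimeError):
    """Stand-in for a Python-level exception such as NvJitLinkError."""


def check_status(status: int, error_log: str = "") -> None:
    """Raise LinkError for any non-success status, naming the code."""
    if status == Status.SUCCESS:
        return
    try:
        name = Status(status).name
    except ValueError:
        # Status value not covered by the enum above.
        name = f"UNKNOWN_STATUS_{status}"
    message = f"{name} error"
    if error_log:
        message += f"\n{error_log}"
    raise LinkError(message)


def fake_native_add(duplicate_symbols: bool) -> int:
    """Pretend native call: reports invalid input on duplicate symbols."""
    return Status.ERROR_INVALID_INPUT if duplicate_symbols else Status.SUCCESS


if __name__ == "__main__":
    check_status(fake_native_add(duplicate_symbols=False))  # no exception
    try:
        check_status(fake_native_add(duplicate_symbols=True), "duplicate symbol")
    except LinkError as exc:
        print(f"translated: {exc}")
```

With a wrapper like this, a test that matches on the message text only passes if the native call actually returns a failing status. If the native call starts returning success for the same input, no exception is raised and a check such as `pytest.raises` reports `DID NOT RAISE`.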
Fixed the package hashing error noted above: #97 (comment)
This uses a build-time generated variant file, as noted below
Have separated the bulk of these changes into PR: #101
As that is now in, will rebase this so it contains only the CUDA 12.6 update
@brandon-b-miller @gmarkall Would one of you be able to follow up on this failure?
This appears to be a change in upstream nvJitLink behaviour
Thanks for the assist Bradley! 🙏
* Fix building tests in multi-gpu environment (#98)
* Change cmake.verbose = true to build.verbose = true (#99)
* Use build-system.requires to set scikit-build-core minimum version (#100)
* Set CUDA version in one file (and use everywhere else) (#101)
* Drop Python 3.9 support (#102)
* Use CI workflow branch 'branch-24.10' again (#105)
* Use conda strict channel priority. (#109)
* Update to CUDA 12.6 (#97)