-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Build only compute
for the newest arch in CMAKE_CUDA_ARCHITECTURES
#706
[REVIEW] Build only compute
for the newest arch in CMAKE_CUDA_ARCHITECTURES
#706
Conversation
84a6a7b
to
468a8e5
Compare
@harrism any chance you could review before we merge? I'm not super familiar with nvcc gencode options 😄 |
# newest arch only to build that way while the rest built only for sm. | ||
list(SORT CMAKE_CUDA_ARCHITECTURES ORDER ASCENDING) | ||
list(POP_BACK CMAKE_CUDA_ARCHITECTURES latest_arch) | ||
list(TRANSFORM CMAKE_CUDA_ARCHITECTURES APPEND "-real") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is -real
a cmake command? What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-real
and -virtual
are special keywords that can be used with CMAKE_CUDA_ARCHITECTURES
to provide abstractions around different CUDA compilers code generation API.
For nvcc:
input | compiler invocation |
---|---|
80 | --generate-code=arch=compute_80,code=[sm_80,compute_80] |
80-virtual | --generate-code=arch=compute_80,code=compute_80 |
80-real | --generate-code=arch=compute_80,code=sm_80 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I see the output of this command when I see above now.CMAKE_CUDA_ARCHITECTURES
is unset?
We want SASS for all architectures we support, right? If we only include SASS ("-real"/) for 80, then users with anything but Ampere GPUs will experience looooong load/import times due to PTX-JIT to their present architecture. We do need to include PTX, but only for those who have GPUs we don't officially support (e.g. forward compatibility).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I had it backwards. The -real is appended to all but the last entry. I thought it was only being appended to the last entry. All good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want SASS for all architectures we support, right? If we only include SASS ("-real"/) for 80, then users with anything but Ampere GPUs will experience looooong load/import times
You are correct. The code above is sneaky, as what we do is remove the 'newest' and only apply -real
to any existing values. So input 70,80
becomes 70-real, 80
and input 80
becomes 80
@gpucibot merge |
Based on #706 as they both modify `cmake/Modules/SetGPUArchs.cmake` This brings rmm's handling of `CMAKE_CUDA_ARCHITECTURES` to match that is proposed for cudf in rapidsai/cudf#7391 Authors: - Robert Maynard (@robertmaynard) Approvers: - Mark Harris (@harrism) - Keith Kraus (@kkraus14) URL: #709
RMM doesn't need to build
sm
andcompute
for each architecture, but only for the newest arch.