Currently, when NATIVE architectures are specified but no local GPUs are detected, rapids_cuda_set_architectures falls back to the list of supported architectures. It does this by passing that list to rapids_cuda_detect_architectures, which then uses it as the fallback output. The result is that when native arch detection fails, NATIVE is equivalent to RAPIDS, except that the latest virtual architecture is not built as it is for RAPIDS. This behavior seems confusing. I assume that falling back to all supported GPU architectures was an intentional design decision for rapids-cmake, since detection should only fail on CPU-only machines, and those are very likely being used to build packages for redistribution (e.g. our CI). If so, I would expect the fallback to also produce what we consider to be the default build option for RAPIDS.
Should we change NATIVE to use the RAPIDS behavior when native detection fails?
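To make the difference concrete, here is a toy Python model of the two outputs as I understand them. This is only a sketch: the architecture list and the function names are made up for illustration and are not the rapids-cmake implementation. It relies on the CMake convention that an entry like 90-real builds SASS only, while a bare 90 builds both SASS and PTX.

```python
# Toy model only: the architecture list and function names are assumptions,
# not the rapids-cmake implementation.
SUPPORTED_ARCHS = ["70", "75", "80", "86", "90"]  # hypothetical supported list


def rapids_mode(supported):
    """RAPIDS: SASS for every entry, plus PTX for the newest one."""
    # A bare entry such as "90" asks CMake for both SASS and PTX.
    return [f"{arch}-real" for arch in supported[:-1]] + [supported[-1]]


def native_mode(detected, supported):
    """NATIVE: SASS for detected GPUs, else SASS-only for the supported list."""
    archs = detected if detected else supported
    return [f"{arch}-real" for arch in archs]


if __name__ == "__main__":
    print("RAPIDS:         ", ";".join(rapids_mode(SUPPORTED_ARCHS)))
    print("NATIVE fallback:", ";".join(native_mode([], SUPPORTED_ARCHS)))
```

Under these assumptions the two lists differ only in the last entry: RAPIDS ends in a bare 90, while the NATIVE fallback ends in 90-real.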
NATIVE never produces any form of SASS. When no GPU is detected on the machine the goal was to fall back to generate SASS for all supported GPUs so the code will run.
I don't think we should expect that NATIVE will ever produce the same as RAPIDS when no CUDA driver / GPU exists.
Consider that going forward we might want RAPIDS to start generating 90a code. That wouldn't be needed for NATIVE in fallback mode, since 90 is sufficient for SASS execution.
> NATIVE never produces any form of SASS. When no GPU is detected on the machine the goal was to fall back to generate SASS for all supported GPUs so the code will run.
I assume you mean that NATIVE never produces any form of PTX? It's not clear to me that the fallback to "build all supported SASS" is necessarily a better choice than matching the RAPIDS behavior. I get your point with the 90a example, but conversely, if the goal is to make it "so the code will run", wouldn't you also want to produce some PTX in case you end up on a newer architecture than anything in the list of supported architectures? That's why we include PTX when generating with RAPIDS.
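A rough way to see the forward-compatibility point is the sketch below. It assumes the usual CUDA rules: SASS runs on devices with the same major compute capability and an equal or higher minor, while PTX can be JIT-compiled for any device at or above its compute capability. The helper names are hypothetical, the architecture lists are illustrative, and architecture-specific targets like 90a are deliberately not handled.

```python
def parse_cc(arch):
    """Split an architecture such as "86" or "100" into (major, minor)."""
    return int(arch[:-1]), int(arch[-1])


def can_run(entries, device):
    """Return True if any entry yields code that can run on `device`.

    Entries use CMake-style suffixes: "-real" (SASS only),
    "-virtual" (PTX only), or no suffix (both).
    """
    dev_major, dev_minor = parse_cc(device)
    for entry in entries:
        arch, _, kind = entry.partition("-")
        major, minor = parse_cc(arch)
        has_sass = kind in ("", "real")
        has_ptx = kind in ("", "virtual")
        # SASS: binary compatible within one major version only.
        if has_sass and major == dev_major and minor <= dev_minor:
            return True
        # PTX: can be JIT-compiled for any newer-or-equal device.
        if has_ptx and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


# Illustrative lists, not the real rapids-cmake output.
native_fallback = ["70-real", "75-real", "80-real", "86-real", "90-real"]
rapids = ["70-real", "75-real", "80-real", "86-real", "90"]
```

With these lists, a hypothetical device at compute capability 10.0 is covered only by the RAPIDS list (through the 90 PTX), whereas a device at 8.9 is covered by both through the 86 SASS.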
So I think we can close this and move forward with deprecating NATIVE in 24.08.
In any case, I was also thinking about the switch to native as well when writing up this issue, and I agree that probably makes this issue moot so I'm fine closing.