Support CUDA 12.2 #2092

Merged
merged 27 commits into from
Feb 12, 2024


jameslamb
Member

@jameslamb jameslamb commented Jan 11, 2024

Description

  • switches to CUDA 12.2.2 for building conda packages and wheels
  • adds new tests running against CUDA 12.2.2

Notes for Reviewers

This is part of ongoing work to build and test packages against CUDA 12.2.2 across all of RAPIDS.

For more details see:

Planning a second round of PRs to revert these references back to a proper branch-24.{nn} release branch of shared-workflows once rapidsai/shared-workflows#166 is merged.

I intentionally did not add a CUDA 12.2 environment for ANN benchmarks, as I assumed that would be more involved and because it isn't strictly necessary to support building and publishing packages that support CUDA 12.2.

raft/dependencies.yaml

Lines 23 to 26 in 93a504e

bench_ann:
  output: conda
  matrix:
    cuda: ["11.8", "12.0"]

(created with rapids-reviser)

@jameslamb jameslamb changed the title add CUDA 12.2 support for conda packages and wheels WIP: use CUDA 12.2 for building and testing wheels Jan 11, 2024
@jameslamb jameslamb changed the title WIP: use CUDA 12.2 for building and testing wheels WIP: add CUDA 12.2 support for conda packages and wheels Jan 11, 2024
@cjnolet cjnolet added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 11, 2024
@bdice
Contributor

bdice commented Jan 11, 2024

https://github.com/rapidsai/raft/actions/runs/7494925981/job/20403970360?pr=2092#step:7:644

-- CPM: Using local package CCCL@2.2.0

I suspect this is causing issues due to buggy install rules which we've patched in rapids-cmake but aren't present in the CUDA 12.2 local copy that this build is finding.

We probably need to do something to prevent RAPIDS from using the CUDA 12.2 CCCL version 2.2.0, since it doesn't have the CMake patches we need. Always downloading is the only solution I can think of, unless there's a safe way for us to force patching a local package. I think this probably needs something like CPM_DOWNLOAD_CCCL? cc: @robertmaynard @vyasr
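To spot this kind of problem in other builds, one option is to scan the CI log for packages that CPM resolved from the local system instead of downloading. The sketch below is a minimal illustration, assuming the log contains lines of the form `-- CPM: Using local package <name>@<version>`; the function name `find_local_packages` and the sample log are hypothetical, and other CPM versions may word the message differently.

```python
import re

# Assumed log format: "-- CPM: Using local package NAME@VERSION".
# Matching is case-insensitive because the capitalization may vary.
LOCAL_PKG = re.compile(
    r"CPM: using local package (?P<name>[^@\s]+)@(?P<version>\S+)",
    re.IGNORECASE,
)


def find_local_packages(log_text):
    """Return (name, version) pairs for packages CPM found locally."""
    return [(m.group("name"), m.group("version")) for m in LOCAL_PKG.finditer(log_text)]


# Hypothetical log excerpt, for illustration only.
sample_log = """\
-- CPM: Using local package CCCL@2.2.0
-- Configuring done
"""

for name, version in find_local_packages(sample_log):
    print(f"local package picked up: {name} {version}")
```

Any package reported this way bypasses patches that are only applied to downloaded sources, which is the failure mode suspected here.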

@bdice
Contributor

bdice commented Jan 11, 2024

It looks like force-downloading CCCL has fixed the build here. I opened a rapids-cmake PR to always download CCCL: rapidsai/rapids-cmake#522

@vyasr
Contributor

vyasr commented Jan 12, 2024

The diagnosis is right. I'm not sure of the fix though. I'll comment on the rapids-cmake PR.

@jakirkham jakirkham added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 13, 2024
@bdice
Contributor

bdice commented Jan 19, 2024

While working on #2102, I noticed an issue. In the libraft-headers-only recipe, we currently rely on cuda-cudart-dev in host to add some desired run-exports for cuda-cudart. This is going to need some rethinking for CUDA 12.2, since the run-exports we've been using to our advantage in CUDA 12.0 are not going to be correct for CUDA Enhanced Compatibility for CUDA 12.2. You can see that here, where CI is pulling cuda-version=12.2 for the 12.0 CI job. https://github.com/rapidsai/raft/actions/runs/7562776933/job/20636164208?pr=2092#step:7:345 We're not testing what we thought we were testing (it's really 12.2 libraries and not 12.0 libraries).

We will also need to ignore the strong run export of cuda-version >={{ cuda_version }} from cuda-nvcc, in my understanding. https://github.com/conda-forge/cuda-nvcc-feedstock/blob/717380326f9e9599c77fac378b19adf894380b3c/recipe/meta.yaml#L47

I'm not sure if we want to treat this as a CUDA packaging bug or as a bug in libraft's recipe (it applies to all of RAPIDS). We had discussed "assuming CUDA Enhanced Compatibility by default" and considered the possibility of loosening the CUDA compiler pin to 12.* in conda-forge rather than migrating for each minor version. cc: @jakirkham
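The difference between the two pinning strategies can be illustrated with a deliberately simplified model. This is not how the conda solver works; it only compares version tuples, and the function names are hypothetical. It contrasts a strong run export of the form `cuda-version >=X.Y` with a looser major-version pin such as `12.*`.

```python
def parse(version):
    """Turn '12.2' into (12, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies_strong_export(env_version, build_version):
    """Model of a pin like `cuda-version >=build_version`."""
    return parse(env_version) >= parse(build_version)


def satisfies_major_pin(env_version, build_version):
    """Model of a looser pin like `12.*`: only the major version must match."""
    return parse(env_version)[0] == parse(build_version)[0]


# A package built with CUDA 12.2, installed into a CUDA 12.0 environment.
print(satisfies_strong_export("12.0", "12.2"))  # False: the solver must pull 12.2 instead
print(satisfies_major_pin("12.0", "12.2"))      # True: 12.0 is acceptable under 12.*
```

Under the strong export, a job that asks for 12.0 can only be satisfied by upgrading to 12.2, which matches the behavior seen in the CI log linked above.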

@jameslamb jameslamb changed the base branch from branch-24.02 to branch-24.04 January 22, 2024 15:25
@github-actions github-actions bot added the cpp label Jan 22, 2024
@github-actions github-actions bot removed the cpp label Jan 24, 2024
@jakirkham
Member

Looks like this is now green! 🥳

@jameslamb jameslamb changed the title WIP: add CUDA 12.2 support for conda packages and wheels (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels Jan 25, 2024
@jameslamb jameslamb marked this pull request as ready for review January 25, 2024 22:51
@jameslamb jameslamb requested a review from a team as a code owner January 25, 2024 22:51
@jameslamb jameslamb changed the title (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels Support CUDA 12.2 Jan 25, 2024
@bdice bdice removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Feb 10, 2024
@jakirkham
Member

One of the jobs failed during git clone, so it needs a restart. All the others passed.

@bdice
Contributor

bdice commented Feb 11, 2024

All CI is passing and CI logs look good. We need an ops review from @AyodeAwe / @raydouglass to proceed. I'll trigger the /merge for now.

@bdice
Contributor

bdice commented Feb 11, 2024

/merge

@rapids-bot rapids-bot bot merged commit bf850a9 into rapidsai:branch-24.04 Feb 12, 2024
62 checks passed
rapids-bot bot pushed a commit that referenced this pull request Feb 20, 2024
Follow-up to #2092

For all GitHub Actions configs, replaces uses of the `test-cuda-12.2` branch on `shared-workflows`
with `branch-24.04`, now that rapidsai/shared-workflows#166 has been merged.
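
The replacement described above could be scripted roughly as follows. This is a minimal sketch, not the tooling actually used for this change; the function name `retarget_shared_workflows` and the sample workflow line are illustrative assumptions.

```python
import re


def retarget_shared_workflows(text, old="test-cuda-12.2", new="branch-24.04"):
    """Point `rapidsai/shared-workflows/...@old` references at `new`."""
    pattern = re.compile(
        r"(rapidsai/shared-workflows/\S+?@)" + re.escape(old) + r"(?=\s|$)"
    )
    # A function replacement avoids treating `new` as a regex template.
    return pattern.sub(lambda m: m.group(1) + new, text)


# Illustrative workflow line; real files contain many such `uses:` entries.
before = "    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@test-cuda-12.2\n"
print(retarget_shared_workflows(before), end="")
```

References to other repositories or other branches are left untouched, since the pattern only matches the `rapidsai/shared-workflows` prefix followed by the old branch name.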

### Notes for Reviewers

This is part of ongoing work to build and test packages against CUDA 12.2 across all of RAPIDS.

For more details see:

* rapidsai/build-planning#7

*(created with `rapids-reviser`)*

Authors:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #2189
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change