Generalize GHA select statements (to avoid hard-coding versions) #25

Closed
jakirkham opened this issue Dec 9, 2023 · 8 comments

@jakirkham
Member

As only one Dask-CUDA build is needed, we filter down to a single architecture, Python version, and CUDA version in GHA so that the package is built only once:

https://github.com/rapidsai/dask-cuda/blob/1eecb1b2ac79ae9aaff9c26d0a3c93dd57f859f3/.github/workflows/build.yaml#L69-L70

However, the selection logic currently hard-codes each of these versions, which means it can go stale as new versions are added and old ones are dropped, potentially resulting in the build being lost altogether (maybe even silently).
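For illustration, the hard-coded selection being described looks roughly like the jq filter below (the exact versions and wording in the linked workflow may differ; these values are placeholders):

```jq
# Illustrative only: select a single hard-coded (arch, Python, CUDA) combination.
# If "3.10" or "12.0.1" ever drop out of the support matrix, this silently matches nothing.
map(select(.ARCH == "amd64" and .PY_VER == "3.10" and .CUDA_VER == "12.0.1"))
```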

To avoid pinning to a specific version, @ajschmidt8 made several suggestions in this thread: rapidsai/dask-cuda#1294 (comment)

Filing this to track for follow-up

@jakirkham
Member Author

Also some relevant discussion in this thread: rapidsai/cudf#15174 (comment)

@bdice
Contributor

bdice commented Feb 28, 2024

I worked through a bit of jq and I think I have a decent solution.


Case 1: Pure Python, for each CUDA major version

For pure Python packages that need a build for each CUDA major version, like dask-cudf-cu11 / dask-cudf-cu12, we select the highest (Python, CUDA) pair from each CUDA major version group.

map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))

This was originally posted here: rapidsai/cudf#15174 (comment)
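As a quick local sanity check, the expression can be run over a toy matrix with jq (the entries below are made up for illustration; real matrix entries also carry fields like LINUX_VER, gpu, and driver):

```bash
# Made-up matrix spanning two CUDA major versions and two architectures.
echo '[
  {"ARCH": "amd64", "PY_VER": "3.9",  "CUDA_VER": "11.8.0"},
  {"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "11.8.0"},
  {"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.0.1"},
  {"ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "12.0.1"}
]' | jq 'map(select(.ARCH == "amd64"))
  | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0])
  | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))'
# Keeps one entry per CUDA major version:
# Python 3.10 + CUDA 11.8.0, and Python 3.10 + CUDA 12.0.1 (both amd64).
```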

Case 2: Pure Python, no CUDA dependency

For jobs that only need a single matrix entry (no dependence on a specific Python or CUDA version), we should use the highest (Python, CUDA) pair.

map(select(.ARCH == "amd64")) | max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]) | [.]

Optionally, you can drop the map(select(.ARCH == "amd64")) if you want jobs for both amd64 and arm64.

These expressions assume that we do not care what OS value (LINUX_VER), GPU hardware (gpu), or driver (driver) is used in the resulting jobs.
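The single-entry case can be checked the same way; with a similar made-up matrix, only the highest (Python, CUDA) combination survives, and the trailing [.] wraps the winner back into a one-element array:

```bash
# Made-up matrix; Case 2 reduces it to exactly one entry.
echo '[
  {"ARCH": "amd64", "PY_VER": "3.9",  "CUDA_VER": "11.8.0"},
  {"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.0.1"},
  {"ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "12.0.1"}
]' | jq 'map(select(.ARCH == "amd64"))
  | max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))])
  | [.]'
# -> [{"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.0.1"}]
```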

@jameslamb
Member

@bdice I tried the "Pure Python, no CUDA dependency" example in rapidsai/rapids-dask-dependency#28

Worked perfectly, thank you 😊

vyasr pushed a commit to rapidsai/rapids-dask-dependency that referenced this issue Feb 29, 2024
Contributes to rapidsai/build-planning#25.
Contributes to rapidsai/build-planning#3.

This project uses a `jq` filter in its GitHub Actions configuration to
select exactly 1 combination of `(architecture, Python version, CUDA
version)` for each of its CI jobs.

This PR proposes removing string literals referencing specific versions,
so that the configuration won't have to be updated in the future as
RAPIDS changes its supported matrix of Python and CUDA versions.

Credit for this to @bdice / @ajschmidt8 :
rapidsai/build-planning#25 (comment).
I'm just clicking the buttons 😁
@jakirkham
Member Author

Also needed in cuxfilter

https://github.com/rapidsai/cuxfilter/blob/cfab998ffc4f348f649cd839bea5697fd8aeef02/.github/workflows/pr.yaml#L67

@bdice
Contributor

bdice commented Feb 29, 2024

I did cudf as well. rapidsai/cudf#15191

There are some hard-coded references remaining, but they're not used in quite the same way as the cases we've addressed above.

Docker needs something like an "if latest CUDA" condition, but there's no matrix filter there, so I'm not sure there's a good solution.
https://github.com/rapidsai/docker/blob/32a60e107c3ff777c7f2ddc2ac4d11a3669c3f83/.github/workflows/build-image.yml#L145

cuGraph has some PyG tests that only run on CUDA 11. I don't think these are worth modifying, since they intentionally pin to an older version. There is some open work needed to upgrade to PyTorch 2, which I think may open up the ability to use CUDA 12 here.
https://github.com/rapidsai/cugraph/blob/ac65b17ee3e9b85368f266da1a6a3b8e5717e292/.github/workflows/pr.yaml#L165
https://github.com/rapidsai/cugraph/blob/ac65b17ee3e9b85368f266da1a6a3b8e5717e292/.github/workflows/test.yaml#L79

rapids-bot bot pushed a commit to rapidsai/cuxfilter that referenced this issue Feb 29, 2024
To eliminate hard-coding, generalize the GHA workflow logic to select one build for testing. This should simplify future cuxfilter updates.

xref: rapidsai/build-planning#25

Authors:
  - https://github.com/jakirkham
  - Ajay Thorve (https://github.com/AjayThorve)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Ajay Thorve (https://github.com/AjayThorve)

URL: #575
@jakirkham
Member Author

Thanks Bradley! 🙏

Yeah, I think we can close this once cuDF is fixed

rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this issue Feb 29, 2024
To eliminate hard-coding, generalize the GHA workflow logic to select one build for testing. This should simplify future Dask-CUDA updates.

xref: rapidsai/build-planning#25

Authors:
  - https://github.com/jakirkham

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1318
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Mar 5, 2024
To eliminate hard-coding, generalize the GHA workflow logic to select one build for testing. This should simplify future updates.

This is a follow-up to #15174.

xref: rapidsai/build-planning#25

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Jake Awe (https://github.com/AyodeAwe)
  - https://github.com/jakirkham

URL: #15191
@bdice
Contributor

bdice commented Mar 9, 2024

cuDF's PR is merged: rapidsai/cudf#15191

This should be safe to close now.

@bdice bdice closed this as completed Mar 9, 2024
younseojava pushed a commit to ROCm/dask-cuda-rocm that referenced this issue Apr 16, 2024
To eliminate hard-coding, generalize the GHA workflow logic to select one build for testing. This should simplify future Dask-CUDA updates.

xref: rapidsai/build-planning#25

Authors:
  - https://github.com/jakirkham

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#1318