Re-enable cucim and xgboost in CUDA 12 rapids builds. #669
Conversation
This reverts commit cc272c4.
I also enabled testing of conda packages. I think that even if we can't import GPU libraries, we should at least be able to ensure that the packages contained in the metapackages can be installed together.
Thanks Bradley! 🙏
Made a few minor comments below
Co-authored-by: jakirkham <[email protected]>
We're going to have to disable tests for
Updating this thread: sounds like we need PR ( rapidsai/cudf#13769 ) to fix a couple more of these.
That PR has been merged and the packages have been uploaded.
…ration into enable-cuda-12-cucim-xgboost
Thanks Bradley! 🙏
Think we may need --use-local to pick up locally built packages when testing.
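A hypothetical sketch of where that flag would go (the recipe path conda/recipes/rapids and the surrounding command are assumptions for illustration, not taken from this diff): without --use-local, the test environment for the metapackage cannot resolve the rapids-xgboost package that was just built into the local conda-bld channel.

  # Hypothetical sketch; the recipe path is an assumption for illustration.
  # --use-local adds the local conda-bld output as a channel, so packages
  # built earlier in the same CI job can satisfy the metapackage's requirements.
  rapids-mamba-retry mambabuild \
    --use-local \
    --variant-config-files "${CONDA_CONFIG_FILE}" \
    conda/recipes/rapids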
Co-authored-by: jakirkham <[email protected]>
Seeing this on CI:
Think we need to set the environment variable CONDA_OVERRIDE_CUDA.
…ration into enable-cuda-12-cucim-xgboost
Should we add this to both of these (as we've done with other recipes like cuDF)?

  test:
    requires:
      - cuda-version ={{ cuda_version }}

Added suggestions below.
Co-authored-by: jakirkham <[email protected]>
We'll need to disable the import tests because this runs on CPU, but just solving the environment with no imports is better than no testing.
Looks like we can't actually run the tests on the runner we're using because it doesn't have a working CUDA installation, but at least testing that the environment solves is helpful.
Jinx
@@ -6,20 +6,21 @@ set -euo pipefail
source rapids-env-update

CONDA_CONFIG_FILE="conda/recipes/versions.yaml"
export CONDA_OVERRIDE_CUDA="${RAPIDS_CUDA_VERSION}"
We should probably consider the implications here. Today, I think rapids is installable even on CPU-only machines. The new rapids-xgboost package design requires __cuda to install. Keeping CPU-only installs working is important for cases like HPC systems with CPU login nodes and GPU worker nodes that use the same environment.
Yeah, they can also install by setting CONDA_OVERRIDE_CUDA to some value. In any event, this requirement is coming from libxgboost, so we could move it to just rapids-xgboost if we prefer.
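As a rough illustration of that workaround (the channel list and version pin below are placeholders, not taken from this PR): CONDA_OVERRIDE_CUDA tells the solver to assume the __cuda virtual package is present, so a package that requires __cuda can still be installed on a machine without a GPU, e.g. an HPC login node.

  # Illustrative only; channels and the version pin are placeholders.
  # Pretend CUDA 12.0 is available so the solver can satisfy run
  # requirements on the __cuda virtual package on a CPU-only machine.
  CONDA_OVERRIDE_CUDA="12.0" conda install -c rapidsai -c conda-forge rapids=23.08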
Regarding "the new rapids-xgboost package design": what is the relevant new change? Is it in libxgboost, or in how rapids-xgboost is packaged?
Okay. Well, maybe this requirement already existed. I'm not sure.
The question is whether __cuda should be a hard requirement for installation; it's coming from the xgboost-related packages. I'm not sure if it was that way for the old xgboost packages we shipped in 23.06 or not. Regardless, it feels funny that no other RAPIDS package has this requirement besides xgboost.
Dropping this in PR ( #673 ), which pulls in the new xgboost packages.
imports:              # [linux64]
  - cucim             # [linux64]
  - cudf              # [linux64]
  - cudf_kafka        # [linux64]
  - cugraph           # [linux64]
  - cuml              # [linux64]
  {% if cuda_major == "11" %}
  - cusignal          # [linux64]
  {% endif %}
  - cuspatial         # [linux64]
  - custreamz         # [linux64]
  - cuxfilter         # [linux64]
  - dask_cuda         # [linux64]
  - dask_cudf         # [linux64]
  - pylibcugraph      # [linux64]
  - rmm               # [linux64]
We could run these through pkgutil.find_loader in a run_test.py script in the recipe. This would let us test for their existence without needing to import them (and thus not need a GPU to test).
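A minimal sketch of that idea, assuming it lives in the recipe's test script (the comment proposes a run_test.py; an equivalent shell loop is shown here, with an abbreviated package list):

  # Hypothetical test-script sketch: verify each package exists on disk
  # without importing it, so no GPU or CUDA driver is needed at test time.
  for pkg in cucim cudf cuml rmm; do
    python -c "import pkgutil, sys; sys.exit(pkgutil.find_loader('${pkg}') is None)" \
      || { echo "package not found: ${pkg}"; exit 1; }
  done
  echo "all packages found"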
I don't think this is necessary for this PR, maybe file an issue or PR with this proposal later on. I feel comfortable with the current level of testing, which is higher than what we had before.
Co-authored-by: jakirkham <[email protected]>
LGTM. Thanks all! 🙏
Had a few comments above, but none of them are blocking
Potentially blocking issue: the xgboost builds require __cuda.

Previous release installation did not require __cuda; that install command succeeds on a CPU-only system. The proposed changes here would require __cuda; the equivalent install command fails on CPU-only systems (this is testing the current nightly packages).

I think this issue is blocking for the 23.08 release, but not necessarily for this PR. I'd be fine with merging this PR and discussing this issue separately. I can revisit this tomorrow with others.
After discussion offline, we concluded the XGBoost issue is non-blocking for this PR, so it should be good to merge. We are evaluating options to fix the XGBoost packages so they do not require __cuda.
  --variant-config-files "${CONDA_CONFIG_FILE}" \
  conda/recipes/rapids-xgboost

rapids-logger "Build rapids"

rapids-mamba-retry mambabuild \
  --no-test \
  --use-local \
Just want to mention that for other repos, we use --channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}": https://github.com/rapidsai/cudf/blob/abb59c83128f956c7edcb4d7744cb0faecf0026c/ci/build_python.sh#L18-L39

RAPIDS_CONDA_BLD_OUTPUT_DIR is set in our CI images: https://github.com/rapidsai/ci-imgs/blob/75cad918c44c6e00480001b24cb764e1b43fa0a5/Dockerfile#L108-L113

It looks like --use-local works; I'm just pointing this out in case we want consistency.
Yeah, --use-local will check the same things. Please see this list of paths that Conda checks when --use-local is set.
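For comparison, a hypothetical sketch of the --channel variant used in other repos (flags besides the channel are copied from the script shown above; the recipe path is an assumption):

  # Hypothetical alternative to --use-local: point explicitly at the conda-bld
  # output directory that the CI images expose as RAPIDS_CONDA_BLD_OUTPUT_DIR.
  rapids-mamba-retry mambabuild \
    --no-test \
    --channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
    --variant-config-files "${CONDA_CONFIG_FILE}" \
    conda/recipes/rapids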
/merge
PR #664 temporarily disabled CUDA 12 packages for cucim and xgboost in rapids. This re-enables those. This reverts commit cc272c4.
This can be merged once the following issues are closed: