Refactor cudf_kafka to use skbuild #14292

Merged: 33 commits merged into rapidsai:branch-23.12 on Nov 14, 2023

Conversation

@jdye64 jdye64 commented Oct 17, 2023

Description

Refactor the currently outdated cudf_kafka build setup to use skbuild instead.

@github-actions github-actions bot added Python Affects Python cuDF API. CMake CMake build issue labels Oct 17, 2023
@bdice bdice added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 17, 2023

bdice commented Oct 18, 2023

bdice commented Oct 19, 2023

Let's also fix the missing README.md for cudf_kafka in this PR.

SetuptoolsWarning: File '/opt/conda/conda-bld/work/python/cudf_kafka/README.md' cannot be found

edit: Done in 37dc381.

rapids-bot bot pushed a commit that referenced this pull request Oct 19, 2023
…or new CI containers (#14296)

The aws-sdk-cpp pinning introduced in #14173 causes problems because newer builds of libarrow require a newer version of aws-sdk-cpp. Even though we restrict to libarrow 12.0.1, this restriction is insufficient to create solvable environments because the conda (mamba) solver doesn't seem to consistently reach far enough back into the history of builds to pull the last build that was compatible with the aws-sdk-cpp version that we need. For now, the safest way for us to avoid this problem is to downgrade to arrow 12.0.0, for which all conda package builds are pinned to the older version of aws-sdk-cpp that does not have the bug in question.

Separately, while the above issue was encountered we also got new builds of our CI images [that removed system installs of CTK packages from CUDA 12 images](rapidsai/ci-imgs#77). This change was made because for CUDA 12 we can get all the necessary pieces of the CTK from conda-forge. However, it turns out that the cudf_kafka builds were implicitly relying on system CTK packages, and the cudf_kafka build is in fact not fully compatible with conda-forge CTK packages because it is not using CMake via scikit-build (nor any other more sophisticated library discovery mechanism like pkg-config) and therefore does not know how to find conda-forge CTK headers/libraries. This PR introduces a set of temporary patches to get around this limitation. These patches are not a long-term fix, and are only put in place assuming that #14292 is merged in the near future before we cut a 23.12 release.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: #14296
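
To illustrate the discovery gap described in the commit message above: a build that does not go through CMake has to be told explicitly where the toolkit headers and libraries live. The following is a minimal sketch, not the patch from #14296. It assumes that the conda-forge CUDA 12 packages place headers and libraries under `$CONDA_PREFIX/targets/<arch>-linux/`, and the helper name `conda_ctk_dirs` is hypothetical.

```python
import os
from pathlib import Path


def conda_ctk_dirs(prefix=None, arch="x86_64"):
    """Return (include_dirs, library_dirs) candidates for a conda environment.

    Assumption: conda-forge CUDA 12 packages install the toolkit under
    <prefix>/targets/<arch>-linux/ rather than a system location.
    Only directories that actually exist are returned.
    """
    prefix = prefix or os.environ.get("CONDA_PREFIX")
    if not prefix:
        # No conda environment is active, so there is nothing to suggest.
        return [], []
    root = Path(prefix)
    target = root / "targets" / f"{arch}-linux"
    include_candidates = [root / "include", target / "include"]
    library_candidates = [root / "lib", target / "lib"]
    include_dirs = [str(p) for p in include_candidates if p.is_dir()]
    library_dirs = [str(p) for p in library_candidates if p.is_dir()]
    return include_dirs, library_dirs
```

A plain setuptools extension could pass the two returned lists as its `include_dirs` and `library_dirs`. Hard-coding paths like this is the kind of workaround that a CMake-based build makes unnecessary.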
@bdice bdice requested review from vyasr and removed request for charlesbluca and davidwendt November 9, 2023 21:50

bdice commented Nov 9, 2023

This PR requested a lot of reviewers because it touches a lot of files in fairly small ways. I think we would be fine with a single review from @vyasr and an ops review, since it's mostly build system changes that Vyas is familiar with (or explicitly requested during conversations about this PR).

bdice commented Nov 9, 2023

@divyegala or @robertmaynard, your reviews would also be welcome here if you have time.

@vyasr vyasr left a comment

praise: Few things left to clean up but overall this looks great!

Resolved review threads:
- build.sh (2, outdated)
- dependencies.yaml (1, outdated)
- python/cudf_kafka/CMakeLists.txt (3, outdated)
- python/cudf_kafka/LICENSE (1)
- python/cudf_kafka/README.md (1)

bdice commented Nov 13, 2023

@vyasr and I discussed this offline and concluded that building the cudf_kafka Python package should fail if the corresponding libcudf_kafka C++ package cannot be found. This greatly simplifies the code (see 27841c7). The change makes it impossible to build pure wheels of cudf_kafka, but we aren't shipping those today -- and such a wheel would have to build and ship all of libcudf (and libcudf_kafka), which is undesirable. We only ship conda packages, and this won't affect conda.
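
The decision above amounts to a fail-fast dependency check. In the PR this presumably lives in the CMake configuration; the Python sketch below only illustrates the idea. The function name `require_shared_library`, the file name `libcudf_kafka.so`, and the search-directory argument are assumptions for illustration, not code from 27841c7.

```python
from pathlib import Path


class MissingLibraryError(RuntimeError):
    """Raised when a required prebuilt C++ library cannot be located."""


def require_shared_library(name, search_dirs):
    """Return the path to lib<name>.so, or raise if it is not in search_dirs.

    There is deliberately no fallback that builds the library from source:
    the build stops with an error instead.
    """
    filename = f"lib{name}.so"
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(d) for d in search_dirs) or "<no directories>"
    raise MissingLibraryError(f"{filename} not found in: {searched}")
```

Calling `require_shared_library("cudf_kafka", ["/some/prefix/lib"])` either returns the library path or aborts with a clear message. Dropping the build-from-source fallback is what removes the extra code paths mentioned above.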

vyasr commented Nov 14, 2023

I pushed one small fix that should hopefully get this passing CI.

@vyasr vyasr left a comment

Much happier with this simplified version. The remaining test failures are unrelated to this PR and will be resolved once we can merge #14399. Nice work you two!

@bdice bdice left a comment

I was heavily involved in the development of this PR so my review is a bit unfair, but I did a round of self-review and all looks fine to me. I think we will be fine to merge this with a review from @rapidsai/ops.

@bdice bdice added breaking Breaking change and removed non-breaking Non-breaking change labels Nov 14, 2023

bdice commented Nov 14, 2023

I did want to mark this PR as breaking, especially because of the packaging fixes for libcudf_kafka -- we moved some headers around (so that the include paths look like include/cudf_kafka/... instead of include/include/cudf_kafka/...). However, I don't expect any significant breakage for downstream use cases since the cudf_kafka Python package is the only tool I know of that builds on the C++ libcudf_kafka library.

bdice commented Nov 14, 2023

/merge

@rapids-bot rapids-bot bot merged commit 7f3fba1 into rapidsai:branch-23.12 Nov 14, 2023
62 checks passed
Labels
breaking Breaking change CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
5 participants