Downgrade to Arrow 12.0.0 for aws-sdk-cpp and fix cudf_kafka builds for new CI containers (#14296)

The aws-sdk-cpp pinning introduced in #14173 causes problems because newer builds of libarrow require a newer version of aws-sdk-cpp. Even though we restrict to libarrow 12.0.1, that restriction is insufficient to produce solvable environments because the conda (mamba) solver does not consistently reach far enough back into the build history to pull the last 12.0.1 build that was compatible with the aws-sdk-cpp version we need. For now, the safest way to avoid this problem is to downgrade to Arrow 12.0.0, for which all conda package builds are pinned to the older aws-sdk-cpp version that does not have the bug in question.
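As an illustration of the fix, a dry-run solve with the downgraded pin should now succeed where the 12.0.1 pin did not. This is a hedged sketch; the channel list and environment name are assumptions, not part of this PR:

    # Verify that the Arrow 12.0.0 pin and the aws-sdk-cpp restriction
    # from #14173 are jointly solvable (channels assumed):
    conda create --dry-run -n arrow-pin-check \
        -c rapidsai-nightly -c conda-forge \
        "libarrow==12.0.0.*" "aws-sdk-cpp<1.11"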

Separately, while the above issue was being worked around, we also received new builds of our CI images [that removed system installs of CTK packages from CUDA 12 images](rapidsai/ci-imgs#77). This change was made because for CUDA 12 we can get all the necessary pieces of the CTK from conda-forge. However, it turns out that the cudf_kafka builds were implicitly relying on system CTK packages, and the cudf_kafka build is in fact not fully compatible with conda-forge CTK packages: it does not use CMake via scikit-build (nor any other library discovery mechanism such as pkg-config) and therefore does not know how to find conda-forge CTK headers and libraries. This PR introduces a set of temporary patches to get around this limitation. These patches are not a long-term fix and are only put in place on the assumption that #14292 will be merged in the near future, before we cut a 23.12 release.
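To sketch the failure mode (paths assume the standard conda-forge CTK layout): the CUDA 12 conda packages install headers under the environment's targets directory rather than /usr/local/cuda, so a setup.py that only consults CUDA_HOME never finds them unless that variable is pointed there explicitly:

    # Where conda-forge CTK packages place CUDA headers (assumed layout):
    ls "$CONDA_PREFIX/targets/x86_64-linux/include/cuda.h"
    # The legacy setup.py searches only $CUDA_HOME, so the workaround is:
    export CUDA_HOME="$CONDA_PREFIX/targets/x86_64-linux/"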

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: #14296
vyasr authored Oct 19, 2023
1 parent 7aa7579 commit d36904b
Showing 11 changed files with 47 additions and 21 deletions.
4 changes: 2 additions & 2 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -40,7 +40,7 @@ dependencies:
 - hypothesis
 - identify>=2.5.20
 - ipython
-- libarrow==12.0.1.*
+- libarrow==12.0.0.*
 - libcufile-dev=1.4.0.31
 - libcufile=1.4.0.31
 - libcurand-dev=10.3.0.86
@@ -69,7 +69,7 @@ dependencies:
 - pre-commit
 - protobuf>=4.21,<5
 - ptxcompiler
-- pyarrow==12.0.1.*
+- pyarrow==12.0.0.*
 - pydata-sphinx-theme
 - pyorc
 - pytest
5 changes: 3 additions & 2 deletions conda/environments/all_cuda-120_arch-x86_64.yaml
@@ -17,6 +17,7 @@ dependencies:
 - cachetools
 - cmake>=3.26.4
 - cuda-cudart-dev
+- cuda-gdb
 - cuda-nvcc
 - cuda-nvrtc-dev
 - cuda-nvtx-dev
@@ -41,7 +42,7 @@ dependencies:
 - hypothesis
 - identify>=2.5.20
 - ipython
-- libarrow==12.0.1.*
+- libarrow==12.0.0.*
 - libcufile-dev
 - libcurand-dev
 - libkvikio==23.12.*
@@ -66,7 +67,7 @@ dependencies:
 - pip
 - pre-commit
 - protobuf>=4.21,<5
-- pyarrow==12.0.1.*
+- pyarrow==12.0.0.*
 - pydata-sphinx-theme
 - pyorc
 - pytest
2 changes: 1 addition & 1 deletion conda/recipes/cudf/meta.yaml
@@ -61,7 +61,7 @@ requirements:
 - scikit-build >=0.13.1
 - setuptools
 - dlpack >=0.5,<0.6.0a0
-- pyarrow =12
+- pyarrow =12.0.0
 - libcudf ={{ version }}
 - rmm ={{ minor_version }}
 {% if cuda_major == "11" %}
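Note that conda's `=` operator is a fuzzy prefix match, so `pyarrow =12` previously admitted any 12.x build while `=12.0.0` narrows it to the 12.0.0 series. The available matches can be inspected with a search (illustrative command, channel assumed):

    conda search "pyarrow=12.0.0" -c conda-forge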
14 changes: 13 additions & 1 deletion conda/recipes/cudf_kafka/build.sh
@@ -1,4 +1,16 @@
-# Copyright (c) 2020-2022, NVIDIA CORPORATION.
+# Copyright (c) 2020-2023, NVIDIA CORPORATION.
 
 # This assumes the script is executed from the root of the repo directory
+# Need to set CUDA_HOME inside conda environments because the hacked together
+# setup.py for cudf-kafka searches that way.
+# TODO: Remove after https://github.com/rapidsai/cudf/pull/14292 updates
+# cudf_kafka to use scikit-build
+CUDA_MAJOR=${RAPIDS_CUDA_VERSION%%.*}
+if [[ ${CUDA_MAJOR} == "12" ]]; then
+    target_name="x86_64-linux"
+    if [[ ! $(arch) == "x86_64" ]]; then
+        target_name="sbsa-linux"
+    fi
+    export CUDA_HOME="${PREFIX}/targets/${target_name}/"
+fi
 ./build.sh -v cudf_kafka
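Two shell details in the patch above are worth noting: `${RAPIDS_CUDA_VERSION%%.*}` strips the longest suffix starting at the first dot, leaving only the major version, and `arch` distinguishes x86 from ARM hosts, which is why the sbsa-linux branch exists. A standalone illustration:

    RAPIDS_CUDA_VERSION="12.2.140"      # example value, not from CI
    echo "${RAPIDS_CUDA_VERSION%%.*}"   # prints: 12
    arch                                # x86_64 on x86 hosts, aarch64 on ARM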
11 changes: 11 additions & 0 deletions conda/recipes/cudf_kafka/meta.yaml
@@ -33,6 +33,9 @@ build:
 - SCCACHE_S3_KEY_PREFIX=cudf-kafka-linux64 # [linux64]
 - SCCACHE_S3_USE_SSL
 - SCCACHE_S3_NO_CREDENTIALS
+# TODO: Remove after https://github.com/rapidsai/cudf/pull/14292 updates
+# cudf_kafka to use scikit-build
+- RAPIDS_CUDA_VERSION
 
 requirements:
 build:
@@ -41,13 +44,21 @@ requirements:
 - {{ compiler('cxx') }}
 - ninja
 - sysroot_{{ target_platform }} {{ sysroot_version }}
+# TODO: Remove after https://github.com/rapidsai/cudf/pull/14292 updates
+# cudf_kafka to use scikit-build
+{% if cuda_major == "12" %}
+- cuda-gdb
+{% endif %}
 host:
 - python
 - cython >=3.0.0
 - cuda-version ={{ cuda_version }}
 - cudf ={{ version }}
 - libcudf_kafka ={{ version }}
 - setuptools
+{% if cuda_major == "12" %}
+- cuda-cudart-dev
+{% endif %}
 run:
 - python
 - {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}
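conda-build only forwards environment variables that are explicitly listed under script_env, which is why RAPIDS_CUDA_VERSION must be added there for build.sh to see it at all. A build script can fail fast if the variable is missing (a sketch, not part of this PR):

    : "${RAPIDS_CUDA_VERSION:?must be set via script_env in meta.yaml}"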
4 changes: 2 additions & 2 deletions conda/recipes/libcudf/conda_build_config.yaml
@@ -25,8 +25,8 @@ gtest_version:
 aws_sdk_cpp_version:
 - "<1.11"
 
-libarrow_version:
-- "=12"
+libarrow:
+- "==12.0.0"
 
 dlpack_version:
 - ">=0.5,<0.6.0a0"
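Keys in conda_build_config.yaml are exposed to meta.yaml as Jinja variables of the same name, so this rename must land together with the `{{ libarrow_version }}` → `{{ libarrow }}` updates below. Rendering the recipe makes the substitution visible (illustrative command):

    conda render conda/recipes/libcudf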
4 changes: 2 additions & 2 deletions conda/recipes/libcudf/meta.yaml
@@ -65,7 +65,7 @@ requirements:
 {% endif %}
 - cuda-version ={{ cuda_version }}
 - nvcomp {{ nvcomp_version }}
-- libarrow {{ libarrow_version }}
+- libarrow {{ libarrow }}
 - dlpack {{ dlpack_version }}
 - librdkafka {{ librdkafka_version }}
 - fmt {{ fmt_version }}
@@ -104,7 +104,7 @@ outputs:
 - nvcomp {{ nvcomp_version }}
 - librmm ={{ minor_version }}
 - libkvikio ={{ minor_version }}
-- libarrow {{ libarrow_version }}
+- libarrow {{ libarrow }}
 - dlpack {{ dlpack_version }}
 - gtest {{ gtest_version }}
 - gmock {{ gtest_version }}
2 changes: 1 addition & 1 deletion cpp/cmake/thirdparty/get_arrow.cmake
@@ -411,7 +411,7 @@ if(NOT DEFINED CUDF_VERSION_Arrow)
   set(CUDF_VERSION_Arrow
     # This version must be kept in sync with the libarrow version pinned for builds in
     # dependencies.yaml.
-    12.0.1
+    12.0.0
     CACHE STRING "The version of Arrow to find (or build)"
   )
 endif()
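Since CUDF_VERSION_Arrow is a CACHE STRING, it can also be overridden at configure time without editing the file; a hedged sketch (source and build directory layout assumed):

    cmake -S cpp -B cpp/build -DCUDF_VERSION_Arrow=12.0.0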
16 changes: 9 additions & 7 deletions dependencies.yaml
@@ -62,6 +62,7 @@ files:
 includes:
 - cudatoolkit
 - docs
+- libarrow_run
 - py_version
 py_build_cudf:
 output: pyproject
@@ -225,7 +226,7 @@ dependencies:
 - &gmock gmock>=1.13.0
 # Hard pin the patch version used during the build. This must be kept
 # in sync with the version pinned in get_arrow.cmake.
-- libarrow==12.0.1.*
+- &libarrow libarrow==12.0.0.*
 - librdkafka>=1.9.0,<1.10.0a0
 # Align nvcomp version with rapids-cmake
 - nvcomp==2.6.1
@@ -243,7 +244,7 @@ dependencies:
 - cython>=3.0.0
 # Hard pin the patch version used during the build. This must be kept
 # in sync with the version pinned in get_arrow.cmake.
-- pyarrow==12.0.1.*
+- &pyarrow pyarrow==12.0.0.*
 # TODO: Pin to numpy<1.25 until cudf requires pandas 2
 - &numpy numpy>=1.21,<1.25
 build_python:
@@ -260,16 +261,14 @@ dependencies:
 - protoc-wheel
 libarrow_run:
 common:
-- output_types: [conda, requirements]
+- output_types: conda
 packages:
-# Allow runtime version to float up to minor version
-- libarrow==12.*
+- *libarrow
 pyarrow_run:
 common:
 - output_types: [conda, requirements, pyproject]
 packages:
-# Allow runtime version to float up to minor version
-- pyarrow==12.*
+- *pyarrow
 cudatoolkit:
 specific:
 - output_types: conda
@@ -282,6 +281,9 @@ dependencies:
 - cuda-nvrtc-dev
 - cuda-nvtx-dev
 - libcurand-dev
+# TODO: Remove after https://github.com/rapidsai/cudf/pull/14292 updates
+# cudf_kafka to use scikit-build
+- cuda-gdb
 - matrix:
 cuda: "11.8"
 packages:
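The `&libarrow`/`*libarrow` and `&pyarrow`/`*pyarrow` pairs above are YAML anchors and aliases: the run-time entries now reuse the exact build-time pin instead of a looser `12.*` range, so the two can no longer drift apart. The expansion is easy to confirm (assumes PyYAML is available):

    python -c 'import yaml; print(yaml.safe_load("""
    build:
      - &libarrow libarrow==12.0.0.*
    run:
      - *libarrow
    """))'
    # -> {'build': ['libarrow==12.0.0.*'], 'run': ['libarrow==12.0.0.*']}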
4 changes: 2 additions & 2 deletions python/cudf/pyproject.toml
@@ -8,7 +8,7 @@ requires = [
 "ninja",
 "numpy>=1.21,<1.25",
 "protoc-wheel",
-"pyarrow==12.0.1.*",
+"pyarrow==12.0.0.*",
 "rmm==23.12.*",
 "scikit-build>=0.13.1",
 "setuptools",
@@ -38,7 +38,7 @@ dependencies = [
 "pandas>=1.3,<1.6.0dev0",
 "protobuf>=4.21,<5",
 "ptxcompiler",
-"pyarrow==12.*",
+"pyarrow==12.0.0.*",
 "rmm==23.12.*",
 "typing_extensions>=4.0.0",
 ] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
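Under PEP 440, the tightened wildcard `==12.0.0.*` matches only the 12.0.0 patch series, whereas the old `==12.*` admitted any 12.x release. The packaging library (assumed installed) confirms this:

    python -c 'from packaging.specifiers import SpecifierSet as S; sp = S("==12.0.0.*"); print("12.0.0" in sp, "12.0.1" in sp)'
    # -> True False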
2 changes: 1 addition & 1 deletion python/cudf_kafka/pyproject.toml
@@ -5,7 +5,7 @@
 requires = [
 "cython>=3.0.0",
 "numpy>=1.21,<1.25",
-"pyarrow==12.0.1.*",
+"pyarrow==12.0.0.*",
 "setuptools",
 "wheel",
 ] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
