Skip to content

Commit

Permalink
Add conda recipe for cudf-polars (#17037)
Browse files Browse the repository at this point in the history
This PR adds conda recipes for `cudf-polars`. This is needed to get `cudf-polars` into RAPIDS Docker containers and the `rapids` metapackage.

Closes #16816.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - James Lamb (https://github.com/jameslamb)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #17037
  • Loading branch information
bdice authored Oct 17, 2024
1 parent e493340 commit 9980997
Show file tree
Hide file tree
Showing 5 changed files with 88 additions and 3 deletions.
5 changes: 5 additions & 0 deletions ci/build_python.sh
Original file line number Diff line number Diff line change
Expand Up @@ -52,5 +52,10 @@ RAPIDS_PACKAGE_VERSION=$(head -1 ./VERSION) rapids-conda-retry mambabuild \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
conda/recipes/custreamz

RAPIDS_PACKAGE_VERSION=$(head -1 ./VERSION) rapids-conda-retry mambabuild \
--no-test \
--channel "${CPP_CHANNEL}" \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
conda/recipes/cudf-polars

rapids-upload-conda-to-s3 python
19 changes: 17 additions & 2 deletions ci/test_python_other.sh
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,8 @@ rapids-mamba-retry install \
--channel "${PYTHON_CHANNEL}" \
"dask-cudf=${RAPIDS_VERSION}" \
"cudf_kafka=${RAPIDS_VERSION}" \
"custreamz=${RAPIDS_VERSION}"
"custreamz=${RAPIDS_VERSION}" \
"cudf-polars=${RAPIDS_VERSION}"

rapids-logger "Check GPU usage"
nvidia-smi
Expand All @@ -37,7 +38,7 @@ rapids-logger "pytest dask_cudf (legacy)"
DASK_DATAFRAME__QUERY_PLANNING=False ./ci/run_dask_cudf_pytests.sh \
--junitxml="${RAPIDS_TESTS_DIR}/junit-dask-cudf-legacy.xml" \
--numprocesses=8 \
--dist=loadscope \
--dist=worksteal \
.

rapids-logger "pytest cudf_kafka"
Expand All @@ -54,5 +55,19 @@ rapids-logger "pytest custreamz"
--cov-report=xml:"${RAPIDS_COVERAGE_DIR}/custreamz-coverage.xml" \
--cov-report=term

# Note that cudf-polars uses rmm.mr.CudaAsyncMemoryResource() which allocates
# half the available memory. This doesn't play well with multiple workers, so
# we keep --numprocesses=1 for now. This should be resolved by
# https://github.com/rapidsai/cudf/issues/16723.
rapids-logger "pytest cudf-polars"
./ci/run_cudf_polars_pytests.sh \
--junitxml="${RAPIDS_TESTS_DIR}/junit-cudf-polars.xml" \
--numprocesses=1 \
--dist=worksteal \
--cov-config=./pyproject.toml \
--cov=cudf_polars \
--cov-report=xml:"${RAPIDS_COVERAGE_DIR}/cudf-polars-coverage.xml" \
--cov-report=term

rapids-logger "Test script exiting with value: $EXITCODE"
exit ${EXITCODE}
4 changes: 4 additions & 0 deletions conda/recipes/cudf-polars/build.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Copyright (c) 2024, NVIDIA CORPORATION.

# This assumes the script is executed from the root of the repo directory
./build.sh cudf_polars
61 changes: 61 additions & 0 deletions conda/recipes/cudf-polars/meta.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
# Copyright (c) 2024, NVIDIA CORPORATION.

{% set version = environ['RAPIDS_PACKAGE_VERSION'].lstrip('v') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
{% set py_version = environ['CONDA_PY'] %}
{% set cuda_version = '.'.join(environ['RAPIDS_CUDA_VERSION'].split('.')[:2]) %}
{% set cuda_major = cuda_version.split('.')[0] %}
{% set date_string = environ['RAPIDS_DATE_STRING'] %}

package:
name: cudf-polars
version: {{ version }}

source:
path: ../../..

build:
number: {{ GIT_DESCRIBE_NUMBER }}
string: cuda{{ cuda_major }}_py{{ py_version }}_{{ date_string }}_{{ GIT_DESCRIBE_HASH }}_{{ GIT_DESCRIBE_NUMBER }}
script_env:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_SESSION_TOKEN
- CMAKE_C_COMPILER_LAUNCHER
- CMAKE_CUDA_COMPILER_LAUNCHER
- CMAKE_CXX_COMPILER_LAUNCHER
- CMAKE_GENERATOR
- PARALLEL_LEVEL
- SCCACHE_BUCKET
- SCCACHE_IDLE_TIMEOUT
- SCCACHE_REGION
- SCCACHE_S3_KEY_PREFIX=cudf-polars-aarch64 # [aarch64]
- SCCACHE_S3_KEY_PREFIX=cudf-polars-linux64 # [linux64]
- SCCACHE_S3_USE_SSL
- SCCACHE_S3_NO_CREDENTIALS

requirements:
host:
- python
- rapids-build-backend >=0.3.0,<0.4.0.dev0
- setuptools
- cuda-version ={{ cuda_version }}
run:
- python
- pylibcudf ={{ version }}
- polars >=1.8,<1.9
- {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}

test:
requires:
- cuda-version ={{ cuda_version }}
imports:
- cudf_polars


about:
home: https://rapids.ai/
license: Apache-2.0
license_family: APACHE
license_file: LICENSE
summary: cudf-polars library
2 changes: 1 addition & 1 deletion python/cudf_polars/tests/expressions/test_agg.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ def test_bool_agg(agg, request):
assert_gpu_result_equal(q, check_exact=False)


@pytest.mark.parametrize("cum_agg", expr.UnaryFunction._supported_cum_aggs)
@pytest.mark.parametrize("cum_agg", sorted(expr.UnaryFunction._supported_cum_aggs))
def test_cum_agg_reverse_unsupported(cum_agg):
df = pl.LazyFrame({"a": [1, 2, 3]})
expr = getattr(pl.col("a"), cum_agg)(reverse=True)
Expand Down

0 comments on commit 9980997

Please sign in to comment.