CI: Test against old versions of key dependencies #16570

Merged: 33 commits, Sep 4, 2024
Showing changes from 16 commits

Commits (33)
749752d
CI: Test against old versions of key dependencies
seberg Aug 15, 2024
c4f8547
Use constraints for pip (incidentally fixing cupy install issues)
seberg Aug 16, 2024
3a416aa
Remove fsspec from pin for now (could constraint to 2023.5.0)
seberg Aug 16, 2024
b22d111
Consolidate pip installs, fix constraints and use for cudf-pandas tests
seberg Aug 16, 2024
224f944
TST: xfail test that clearly requires newer pandas versions
seberg Aug 16, 2024
b926cd3
TST: Reinstate (most) test version checks removed in 15100
seberg Aug 16, 2024
1d6c037
Update ci/test_python_common.sh
seberg Aug 19, 2024
e97d6fe
Merge branch 'branch-24.10' into test-oldest
seberg Aug 19, 2024
d84b0e3
Clean up pylibcudf merge conflict
seberg Aug 19, 2024
b5c17a2
Merge branch 'branch-24.10' into test-oldest
seberg Aug 19, 2024
c267b1e
Merge branch 'branch-24.10' into test-oldest
seberg Aug 21, 2024
062f601
Use oldest deps also in cudf_pandas_scripts/pandas-tests/run.sh
seberg Aug 21, 2024
4fff8b5
TST: More (heavy handed) test fixups to ignore old pandas failures
seberg Aug 21, 2024
3420a8c
More heavy handed test fixes
seberg Aug 21, 2024
caccd02
TST: Apply cudf_pandas_test fixes from gh-16595
seberg Aug 21, 2024
7f308df
Merge branch 'branch-24.10' into test-oldest
mroeschke Aug 21, 2024
5a165e2
Skip 2 more unit tests for older pandas versions
mroeschke Aug 22, 2024
f740c67
Merge remote-tracking branch 'upstream/branch-24.10' into test-oldest
mroeschke Aug 23, 2024
273e18d
Check for min numba
mroeschke Aug 23, 2024
b8fcbe7
Merge remote-tracking branch 'upstream/branch-24.10' into test-oldest
mroeschke Aug 28, 2024
3a709a0
Make pyarrow pin 14 instead of 16
mroeschke Aug 28, 2024
5c794ae
Update ci/test_wheel_cudf.sh
mroeschke Aug 28, 2024
c7452fb
Merge remote-tracking branch 'upstream/branch-24.10' into test-oldest
mroeschke Aug 28, 2024
d09c47f
Change some xfails to skip
mroeschke Aug 28, 2024
f9eb3ab
Merge branch 'test-oldest' of https://github.com/seberg/cudf into tes…
mroeschke Aug 28, 2024
769f37b
Merge remote-tracking branch 'upstream/branch-24.10' into test-oldest
mroeschke Aug 29, 2024
d701297
skip test_p2p_shuffle for min pyarrow version
mroeschke Aug 29, 2024
a994001
Merge branch 'branch-24.10' into test-oldest
galipremsagar Aug 30, 2024
309dd5b
Merge branch 'branch-24.10' into test-oldest
mroeschke Aug 30, 2024
54b646e
Merge branch 'branch-24.10' into test-oldest
mroeschke Sep 3, 2024
3928ba1
Merge remote-tracking branch 'upstream/branch-24.10' into test-oldest
mroeschke Sep 3, 2024
e34a02c
Remove oldest dependency version checking from pandas unit test script
mroeschke Sep 4, 2024
d157f1e
Merge branch 'test-oldest' of https://github.com/seberg/cudf into tes…
mroeschke Sep 4, 2024
12 changes: 12 additions & 0 deletions ci/cudf_pandas_scripts/pandas-tests/run.sh
@@ -16,8 +16,20 @@ RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

echo "" > ./constraints.txt
if [[ $RAPIDS_DEPENDENCIES == "oldest" ]]; then
    # `test_python` constraints are for `[test]` not `[cudf-pandas-tests]`
    rapids-dependency-file-generator \
        --output requirements \
        --file-key test_python \
        --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
      | tee ./constraints.txt
fi
Contributor

Do we think this is going to be a common pattern that's worth extracting into a gha-tool `generate-oldest-constraints $file_key` or similar? @jameslamb WDYT? It occurs many times in this PR and I assume we'll do something similar in other repos.

Member

Yeah I think that's worth doing!

The only part that will vary from repo-to-repo is --file-key, everything else here is mechanical and a great candidate for a gha-tools script. I could put one up in a day or two, unless you or @seberg want to do it.

Member

Would suggest one of the two of you pick this up. Sebastian is out for a bit

Contributor

If you could tackle this that would be great @jameslamb. No rush. @mroeschke if you get everything else here finalized and have it ready to merge, I'm fine merging this PR with this code in and refactoring in a follow-up when James has the gha-tool ready.

Member

Put up a proposal in rapidsai/gha-tools#114
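
To make concrete what such a helper would encapsulate, here is a minimal Python sketch of the repeated block. It is not the script proposed in rapidsai/gha-tools#114; the function names `build_matrix` and `oldest_constraints_command` are hypothetical, and the sketch only assembles the command line that the CI scripts in this PR pass to `rapids-dependency-file-generator`, without running it.

```python
from typing import List, Optional


def build_matrix(cuda_version: str, arch: str, py_version: str, dependencies: str) -> str:
    """Assemble the `--matrix` string used by the CI scripts in this PR."""
    # Equivalent of bash `${RAPIDS_CUDA_VERSION%.*}`: drop the last
    # dot-separated component, e.g. "12.5.1" becomes "12.5".
    cuda_major_minor = cuda_version.rsplit(".", 1)[0]
    return (
        f"cuda={cuda_major_minor};arch={arch};"
        f"py={py_version};dependencies={dependencies}"
    )


def oldest_constraints_command(
    file_key: str,
    cuda_version: str,
    arch: str,
    py_version: str,
    dependencies: str,
) -> Optional[List[str]]:
    """Return the generator command, or None when no constraints are needed.

    Hypothetical helper: only `file_key` varies between the scripts; the
    rest mirrors the shell block repeated throughout this PR.
    """
    if dependencies != "oldest":
        # The scripts leave constraints.txt empty in this case.
        return None
    return [
        "rapids-dependency-file-generator",
        "--output", "requirements",
        "--file-key", file_key,
        "--matrix", build_matrix(cuda_version, arch, py_version, dependencies),
    ]


command = oldest_constraints_command("py_test_cudf", "12.5.1", "x86_64", "3.11", "oldest")
print(command)
```

Keeping the `dependencies != "oldest"` check inside the helper means every caller still gets an empty constraints file by default, which matches the `echo "" > ./constraints.txt` line in the scripts.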


# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
    -v \
    --constraint ./constraints.txt \
    "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]" \
    "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"

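The `"$(echo ./dist/...*.whl)[test,pandas-tests]"` idiom in the script above expands the wheel glob first and only then appends the extras, because pip needs a concrete file path in front of `[extra]`. The Python sketch below shows the same two-step idea. The function name `wheel_requirement` and the wheel file name are hypothetical, and unlike the shell version it raises when the glob does not match exactly one file.

```python
import glob
import os
import tempfile
from typing import Sequence


def wheel_requirement(dist_dir: str, prefix: str, extras: Sequence[str] = ()) -> str:
    """Resolve `<dist_dir>/<prefix>*.whl` to one path and append pip extras."""
    matches = sorted(glob.glob(os.path.join(dist_dir, f"{prefix}*.whl")))
    if len(matches) != 1:
        # Stricter than the shell idiom, which would pass the unexpanded
        # pattern or several paths through to pip.
        raise FileNotFoundError(
            f"expected exactly one wheel for {prefix!r}, found {len(matches)}"
        )
    extras_suffix = f"[{','.join(extras)}]" if extras else ""
    return matches[0] + extras_suffix


# Usage with a made-up wheel file in a temporary directory.
with tempfile.TemporaryDirectory() as dist:
    wheel_name = "cudf_cu12-24.10.0-py3-none-any.whl"  # hypothetical file name
    open(os.path.join(dist, wheel_name), "w").close()
    requirement = wheel_requirement(dist, "cudf_cu12", extras=("test", "pandas-tests"))
    print(requirement)
```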
13 changes: 12 additions & 1 deletion ci/cudf_pandas_scripts/run_tests.sh
@@ -41,8 +41,19 @@ else
    RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
    RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

    # echo to expand wildcard before adding `[extra]` requires for pip
    echo "" > ./constraints.txt
    if [[ $RAPIDS_DEPENDENCIES == "oldest" ]]; then
        # `test_python` constraints are for `[test]` not `[cudf-pandas-tests]`
        rapids-dependency-file-generator \
            --output requirements \
            --file-key test_python \
            --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
          | tee ./constraints.txt
    fi

    python -m pip install \
        -v \
        --constraint ./constraints.txt \
        "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,cudf-pandas-tests]" \
        "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"
fi
3 changes: 2 additions & 1 deletion ci/test_python_common.sh
@@ -14,7 +14,8 @@ ENV_YAML_DIR="$(mktemp -d)"
 rapids-dependency-file-generator \
   --output conda \
   --file-key test_python \
-  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee "${ENV_YAML_DIR}/env.yaml"
+  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
+    | tee "${ENV_YAML_DIR}/env.yaml"

 rapids-mamba-retry env create --yes -f "${ENV_YAML_DIR}/env.yaml" -n test

14 changes: 14 additions & 0 deletions ci/test_wheel_cudf.sh
@@ -9,8 +9,22 @@ RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

rapids-logger "Install cudf, pylibcudf, and test requirements"

# Constraint to minimum dependency versions if job is set up as "oldest"
echo "" > ./constraints.txt
if [[ $RAPIDS_DEPENDENCIES == "oldest" ]]; then
    rapids-dependency-file-generator \
        --output requirements \
        --file-key py_test_cudf \
        --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
      | tee ./constraints.txt
fi

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
    -v \
    --constraint ./constraints.txt \
    "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
    "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

11 changes: 11 additions & 0 deletions ci/test_wheel_cudf_polars.sh
@@ -24,9 +24,20 @@ RAPIDS_PY_WHEEL_NAME="cudf_polars_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

rapids-logger "Installing cudf_polars and its dependencies"
# Constraint to minimum dependency versions if job is set up as "oldest"
echo "" > ./constraints.txt
if [[ $RAPIDS_DEPENDENCIES == "oldest" ]]; then
    rapids-dependency-file-generator \
        --output requirements \
        --file-key py_test_cudf_polars \
        --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
      | tee ./constraints.txt
fi

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
    -v \
    --constraint ./constraints.txt \
    "$(echo ./dist/cudf_polars_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
    "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"

13 changes: 13 additions & 0 deletions ci/test_wheel_dask_cudf.sh
@@ -10,8 +10,21 @@ RAPIDS_PY_WHEEL_NAME="dask_cudf_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

rapids-logger "Install dask_cudf, cudf, pylibcudf, and test requirements"
# Constraint to minimum dependency versions if job is set up as "oldest"
echo "" > ./constraints.txt
if [[ $RAPIDS_DEPENDENCIES == "oldest" ]]; then
    rapids-dependency-file-generator \
        --output requirements \
        --file-key py_test_dask_cudf \
        --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
      | tee ./constraints.txt
fi

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
    -v \
    --constraint ./constraints.txt \
    "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)" \
    "$(echo ./dist/dask_cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
    "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"
22 changes: 22 additions & 0 deletions dependencies.yaml
@@ -707,6 +707,28 @@ dependencies:
          - pytest<8
          - pytest-cov
          - pytest-xdist
    specific:
      # Define additional constraints for testing with oldest dependencies.
      - output_types: [conda, requirements]
        matrices:
          - matrix: {dependencies: "oldest"}
            packages:
              - numba==0.57.*
              - numpy==1.23.*
              - pandas==2.0.*
              - pyarrow==16.1.0
              - cupy==12.0.0 # ignored as pip constraint
          - matrix:
            packages:
      - output_types: requirements
        # Using --constraints for pip install, so we list cupy multiple times
        matrices:
          - matrix: {dependencies: "oldest"}
            packages:
              - cupy-cuda11x==12.0.0
              - cupy-cuda12x==12.0.0
          - matrix:
            packages:
  test_python_pylibcudf:
    common:
      - output_types: [conda, requirements, pyproject]
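The `matrices` entries added to `dependencies.yaml` above pair a `{dependencies: "oldest"}` selector with an empty fallback `matrix:`. The Python sketch below is a simplified model of how such a list could be resolved. It assumes first-match-wins semantics with the empty matrix acting as a catch-all, and it is not the implementation used by `rapids-dependency-file-generator`. The function name `select_packages` is hypothetical; the package pins are copied from the diff.

```python
from typing import Dict, List, Optional

# The first `specific` entry from the diff above, written as plain data.
OLDEST_MATRICES: List[Dict[str, Optional[object]]] = [
    {
        "matrix": {"dependencies": "oldest"},
        "packages": [
            "numba==0.57.*",
            "numpy==1.23.*",
            "pandas==2.0.*",
            "pyarrow==16.1.0",
            "cupy==12.0.0",
        ],
    },
    # Fallback: an empty matrix with no packages.
    {"matrix": None, "packages": None},
]


def select_packages(matrices: List[dict], requested: Dict[str, str]) -> List[str]:
    """Return the packages of the first entry whose matrix matches `requested`.

    Assumption: an entry matches when every key/value pair in its matrix
    also appears in the requested matrix; an empty matrix matches anything.
    """
    for entry in matrices:
        matrix = entry.get("matrix") or {}
        if all(requested.get(key) == value for key, value in matrix.items()):
            return list(entry.get("packages") or [])
    return []


oldest = select_packages(OLDEST_MATRICES, {"cuda": "12.5", "dependencies": "oldest"})
default = select_packages(OLDEST_MATRICES, {"cuda": "12.5", "dependencies": "latest"})
print(oldest)
print(default)
```

Under this model, any job whose `dependencies` value is not `oldest` falls through to the empty entry and gets no extra pins, which is consistent with the CI scripts writing an empty `constraints.txt` by default.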
4 changes: 4 additions & 0 deletions python/cudf/cudf/tests/indexes/test_interval.py
@@ -149,6 +149,10 @@ def test_interval_range_periods_basic_dtype(start_t, end_t, periods_t):
    assert_eq(pindex, gindex)


@pytest.mark.skipif(
    PANDAS_VERSION < PANDAS_CURRENT_SUPPORTED_VERSION,
    reason="Does not warn on older versions of pandas",
)
def test_interval_range_periods_warnings():
    start_val, end_val, periods_val = 0, 4, 1.0

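Several tests in this PR are gated on `PANDAS_VERSION < PANDAS_CURRENT_SUPPORTED_VERSION`. The standard-library sketch below illustrates why such a gate compares parsed versions rather than raw strings. It does not reproduce `cudf.core._compat`; the helper names `release_tuple` and `should_skip`, and the value `"2.2.2"` for the current supported version, are assumptions made for the example.

```python
import re
from typing import Tuple


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Parse the leading numeric release segment, e.g. "2.0.3" -> (2, 0, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"not a version string: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


# Assumed value for illustration only.
CURRENT_SUPPORTED = release_tuple("2.2.2")


def should_skip(installed_pandas: str) -> bool:
    """Mirror the skipif condition: skip when pandas is older than supported."""
    return release_tuple(installed_pandas) < CURRENT_SUPPORTED


for installed in ("2.0.3", "2.2.2", "2.2.3"):
    print(installed, should_skip(installed))

# Plain string comparison would order these two versions incorrectly.
print("2.10.0" < "2.9.1", release_tuple("2.10.0") < release_tuple("2.9.1"))
```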
@@ -23,6 +23,7 @@
import pytest

import cudf
from cudf.core._compat import PANDAS_CURRENT_SUPPORTED_VERSION, PANDAS_VERSION
from cudf.testing import assert_eq
from cudf.testing.dataset_generator import rand_dataframe

Expand Down Expand Up @@ -302,6 +303,10 @@ def get_days_from_epoch(date: datetime.date | None) -> int | None:
@pytest.mark.parametrize("namespace", [None, "root_ns"])
@pytest.mark.parametrize("nullable", [True, False])
@pytest.mark.parametrize("prepend_null", [True, False])
@pytest.mark.skipif(
    PANDAS_VERSION < PANDAS_CURRENT_SUPPORTED_VERSION,
    reason="Fails in older versions of pandas (datetime(9999, ...) too large)",
)
def test_can_parse_avro_date_logical_type(namespace, nullable, prepend_null):
    avro_type = {"logicalType": "date", "type": "int"}
    if nullable:
51 changes: 48 additions & 3 deletions python/cudf/cudf/tests/test_binops.py
@@ -13,7 +13,11 @@

 import cudf
 from cudf import Index, Series
-from cudf.core._compat import PANDAS_CURRENT_SUPPORTED_VERSION, PANDAS_VERSION
+from cudf.core._compat import (
+    PANDAS_CURRENT_SUPPORTED_VERSION,
+    PANDAS_GE_220,
+    PANDAS_VERSION,
+)
 from cudf.core.buffer.spill_manager import get_global_manager
 from cudf.testing import _utils as utils, assert_eq
 from cudf.utils.dtypes import (
@@ -1781,6 +1785,24 @@ def test_datetime_dateoffset_binaryop(
             reason="https://github.com/pandas-dev/pandas/issues/57448",
         )
     )
+    request.applymarker(
+        pytest.mark.xfail(
+            not PANDAS_GE_220
+            and dtype in {"datetime64[ms]", "datetime64[s]"}
+            and frequency in ("microseconds", "nanoseconds")
+            and n_periods != 0,
+            reason="https://github.com/pandas-dev/pandas/pull/55595",
+        )
+    )
+    request.applymarker(
+        pytest.mark.xfail(
+            not PANDAS_GE_220
+            and dtype == "datetime64[us]"
+            and frequency == "nanoseconds"
+            and n_periods != 0,
+            reason="https://github.com/pandas-dev/pandas/pull/55595",
+        )
+    )

     date_col = [
         f"2000-01-01 00:00:{components}",
@@ -1834,7 +1856,11 @@ def test_datetime_dateoffset_binaryop(
     "ignore:Discarding nonzero nanoseconds:UserWarning"
 )
 @pytest.mark.parametrize("op", [operator.add, operator.sub])
-def test_datetime_dateoffset_binaryop_multiple(date_col, kwargs, op):
+@pytest.mark.skipif(
+    PANDAS_VERSION < PANDAS_CURRENT_SUPPORTED_VERSION,
+    reason="Fails in older versions of pandas",
+)
+def test_datetime_dateoffset_binaryop_multiple(request, date_col, kwargs, op):
     gsr = cudf.Series(date_col, dtype="datetime64[ns]")
     psr = gsr.to_pandas()

@@ -1871,8 +1897,27 @@ def test_datetime_dateoffset_binaryop_multiple(date_col, kwargs, op):
     ],
 )
 def test_datetime_dateoffset_binaryop_reflected(
-    n_periods, frequency, dtype, components
+    request, n_periods, frequency, dtype, components
 ):
+    request.applymarker(
+        pytest.mark.xfail(
+            not PANDAS_GE_220
+            and dtype in {"datetime64[ms]", "datetime64[s]"}
+            and frequency in ("microseconds", "nanoseconds")
+            and n_periods != 0,
+            reason="https://github.com/pandas-dev/pandas/pull/55595",
+        )
+    )
+    request.applymarker(
+        pytest.mark.xfail(
+            not PANDAS_GE_220
+            and dtype == "datetime64[us]"
+            and frequency == "nanoseconds"
+            and n_periods != 0,
+            reason="https://github.com/pandas-dev/pandas/pull/55595",
+        )
+    )
+
     date_col = [
         f"2000-01-01 00:00:{components}",
         f"2000-01-31 00:00:{components}",
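The two `request.applymarker(pytest.mark.xfail(...))` calls added to the dateoffset tests above each carry their own condition, and a test is expected to fail when either one holds. The sketch below restates both conditions as a single pure function so the combinations are easier to read. The name `expect_xfail` is hypothetical, and the `pandas_ge_220` argument stands in for cudf's `PANDAS_GE_220` flag.

```python
def expect_xfail(pandas_ge_220: bool, dtype: str, frequency: str, n_periods: int) -> bool:
    """Return True when either xfail condition from the diff applies."""
    # Both markers require an older pandas and a non-zero offset.
    if pandas_ge_220 or n_periods == 0:
        return False
    # First marker: second/millisecond data with a finer-grained offset.
    coarse_resolution = dtype in {"datetime64[ms]", "datetime64[s]"} and frequency in (
        "microseconds",
        "nanoseconds",
    )
    # Second marker: microsecond data with a nanosecond offset.
    microsecond_resolution = dtype == "datetime64[us]" and frequency == "nanoseconds"
    return coarse_resolution or microsecond_resolution


cases = [
    (False, "datetime64[ms]", "nanoseconds", 1),
    (False, "datetime64[us]", "nanoseconds", 1),
    (False, "datetime64[us]", "microseconds", 1),
    (False, "datetime64[s]", "microseconds", 0),
    (True, "datetime64[ms]", "nanoseconds", 1),
]
for case in cases:
    print(case, expect_xfail(*case))
```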
5 changes: 5 additions & 0 deletions python/cudf/cudf/tests/test_categorical.py
@@ -11,6 +11,7 @@
import pytest

import cudf
from cudf.core._compat import PANDAS_CURRENT_SUPPORTED_VERSION, PANDAS_VERSION
from cudf.testing import assert_eq
from cudf.testing._utils import NUMERIC_TYPES, assert_exceptions_equal

@@ -858,6 +859,10 @@ def test_cat_from_scalar(scalar):
    assert_eq(ps, gs)


@pytest.mark.skipif(
    PANDAS_VERSION < PANDAS_CURRENT_SUPPORTED_VERSION,
    reason="Does not warn on older versions of pandas",
)
def test_cat_groupby_fillna():
    ps = pd.Series(["a", "b", "c"], dtype="category")
    gs = cudf.from_pandas(ps)
99 changes: 65 additions & 34 deletions python/cudf/cudf/tests/test_concat.py
@@ -9,6 +9,7 @@
import pytest

import cudf
from cudf.core._compat import PANDAS_GE_220
from cudf.core.dtypes import Decimal32Dtype, Decimal64Dtype, Decimal128Dtype
from cudf.testing import assert_eq
from cudf.testing._utils import assert_exceptions_equal, expect_warning_if
@@ -451,45 +452,75 @@ def test_concat_mixed_input():
         [pd.Series([1, 2, 3]), pd.DataFrame({"a": []})],
         [pd.Series([], dtype="float64"), pd.DataFrame({"a": []})],
         [pd.Series([], dtype="float64"), pd.DataFrame({"a": [1, 2]})],
-        [
-            pd.Series([1, 2, 3.0, 1.2], name="abc"),
-            pd.DataFrame({"a": [1, 2]}),
-        ],
-        [
-            pd.Series(
-                [1, 2, 3.0, 1.2], name="abc", index=[100, 110, 120, 130]
-            ),
-            pd.DataFrame({"a": [1, 2]}),
-        ],
-        [
-            pd.Series(
-                [1, 2, 3.0, 1.2], name="abc", index=["a", "b", "c", "d"]
+        pytest.param(
+            [
+                pd.Series([1, 2, 3.0, 1.2], name="abc"),
+                pd.DataFrame({"a": [1, 2]}),
+            ],
+            marks=pytest.mark.xfail(
+                not PANDAS_GE_220,
+                reason="https://github.com/pandas-dev/pandas/pull/56365",
             ),
-            pd.DataFrame({"a": [1, 2]}, index=["a", "b"]),
-        ],
-        [
-            pd.Series(
-                [1, 2, 3.0, 1.2, 8, 100],
-                name="New name",
-                index=["a", "b", "c", "d", "e", "f"],
+        ),
+        pytest.param(
+            [
+                pd.Series(
+                    [1, 2, 3.0, 1.2], name="abc", index=[100, 110, 120, 130]
+                ),
+                pd.DataFrame({"a": [1, 2]}),
+            ],
+            marks=pytest.mark.xfail(
+                not PANDAS_GE_220,
+                reason="https://github.com/pandas-dev/pandas/pull/56365",
             ),
-            pd.DataFrame(
-                {"a": [1, 2, 4, 10, 11, 12]},
-                index=["a", "b", "c", "d", "e", "f"],
+        ),
+        pytest.param(
+            [
+                pd.Series(
+                    [1, 2, 3.0, 1.2], name="abc", index=["a", "b", "c", "d"]
+                ),
+                pd.DataFrame({"a": [1, 2]}, index=["a", "b"]),
+            ],
+            marks=pytest.mark.xfail(
+                not PANDAS_GE_220,
+                reason="https://github.com/pandas-dev/pandas/pull/56365",
             ),
-        ],
-        [
-            pd.Series(
-                [1, 2, 3.0, 1.2, 8, 100],
-                name="New name",
-                index=["a", "b", "c", "d", "e", "f"],
+        ),
+        pytest.param(
+            [
+                pd.Series(
+                    [1, 2, 3.0, 1.2, 8, 100],
+                    name="New name",
+                    index=["a", "b", "c", "d", "e", "f"],
+                ),
+                pd.DataFrame(
+                    {"a": [1, 2, 4, 10, 11, 12]},
+                    index=["a", "b", "c", "d", "e", "f"],
+                ),
+            ],
+            marks=pytest.mark.xfail(
+                not PANDAS_GE_220,
+                reason="https://github.com/pandas-dev/pandas/pull/56365",
             ),
-            pd.DataFrame(
-                {"a": [1, 2, 4, 10, 11, 12]},
-                index=["a", "b", "c", "d", "e", "f"],
+        ),
+        pytest.param(
+            [
+                pd.Series(
+                    [1, 2, 3.0, 1.2, 8, 100],
+                    name="New name",
+                    index=["a", "b", "c", "d", "e", "f"],
+                ),
+                pd.DataFrame(
+                    {"a": [1, 2, 4, 10, 11, 12]},
+                    index=["a", "b", "c", "d", "e", "f"],
+                ),
+            ]
+            * 7,
+            marks=pytest.mark.xfail(
+                not PANDAS_GE_220,
+                reason="https://github.com/pandas-dev/pandas/pull/56365",
             ),
-        ]
-        * 7,
+        ),
     ],
 )
 def test_concat_series_dataframe_input(objs):