Propagate failures in pandas integration tests and Skip failing tests #17521

Merged
merged 28 commits into from
Dec 13, 2024
Changes from 23 commits
0ee9ac0
Replaces uses of cudf._lib.Column.from_unique_ptr with pylibcudf.Colu…
Matt711 Dec 4, 2024
76315b0
migrate changes from Propagate failures in pandas integration tests
Matt711 Dec 4, 2024
719da3a
clean up
Matt711 Dec 4, 2024
f4be114
Merge branch 'branch-25.02' into fix/pandas/third-party-tests
Matt711 Dec 5, 2024
29798e9
remove deleted line
Matt711 Dec 5, 2024
5b870f4
add to ci job
Matt711 Dec 6, 2024
eff6883
Merge branch 'branch-25.02' into fix/pandas/third-party-tests
Matt711 Dec 6, 2024
30a9391
xfail failing tests
Matt711 Dec 6, 2024
ce69e26
remove xdist worksteal strategy
Matt711 Dec 7, 2024
4c327b2
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
581f938
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
688a561
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
a47fb4f
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
dcd54e4
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
8090316
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
22cc708
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
725bcf3
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 7, 2024
39b24cc
Merge branch 'branch-25.02' into fix/pandas/third-party-tests
Matt711 Dec 7, 2024
5106ab5
fix a bug and xfail a test
Matt711 Dec 9, 2024
75641d5
remove get call
Matt711 Dec 9, 2024
5b11e66
skip tests
Matt711 Dec 11, 2024
0b4f5fa
import or skip catboost
Matt711 Dec 11, 2024
4a2f78a
Update python/cudf/cudf_pandas_tests/third_party_integration_tests/te…
Matt711 Dec 11, 2024
3d70643
clean up
Matt711 Dec 12, 2024
ead0792
ignore catboost tests
Matt711 Dec 12, 2024
93d15a9
remove catboost tests
Matt711 Dec 12, 2024
f2e7eb0
Update .github/workflows/pr.yaml
Matt711 Dec 12, 2024
80ea00e
Update .github/workflows/pr.yaml
Matt711 Dec 12, 2024
12 changes: 12 additions & 0 deletions .github/workflows/pr.yaml
@@ -40,6 +40,7 @@ jobs:
- pandas-tests
- pandas-tests-diff
- telemetry-setup
- third-party-integration-tests-cudf-pandas
I'll delete these changes to pr.yaml once this PR is approved. I only added them to run the CI job in this PR.

secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
if: always()
@@ -325,6 +326,17 @@ jobs:
node_type: cpu4
build_type: pull-request
run_script: "ci/cudf_pandas_scripts/pandas-tests/diff.sh"
third-party-integration-tests-cudf-pandas:
needs: wheel-build-cudf
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
build_type: pull-request
node_type: "gpu-v100-latest-1"
arch: "amd64"
container_image: "rapidsai/ci-conda:latest"
run_script: |
ci/cudf_pandas_scripts/third-party-integration/test.sh python/cudf/cudf_pandas_tests/third_party_integration_tests/dependencies.yaml

telemetry-summarize:
runs-on: ubuntu-latest
16 changes: 11 additions & 5 deletions ci/cudf_pandas_scripts/third-party-integration/test.sh
@@ -26,6 +26,8 @@ main() {
LIBS=${LIBS#[}
LIBS=${LIBS%]}

ANY_FAILURES=0

for lib in ${LIBS//,/ }; do
lib=$(echo "$lib" | tr -d '""')
echo "Running tests for library $lib"
@@ -56,10 +58,6 @@ main() {
rapids-logger "Check GPU usage"
nvidia-smi

-EXITCODE=0
-trap "EXITCODE=1" ERR
-set +e

rapids-logger "pytest ${lib}"

NUM_PROCESSES=8
@@ -72,12 +70,20 @@ main() {
fi
done

+EXITCODE=0
+trap "EXITCODE=1" ERR
+set +e

TEST_DIR=${TEST_DIR} NUM_PROCESSES=${NUM_PROCESSES} ci/cudf_pandas_scripts/third-party-integration/run-library-tests.sh ${lib}

set -e
rapids-logger "Test script exiting with value: ${EXITCODE}"
+if [[ ${EXITCODE} != 0 ]]; then
+ANY_FAILURES=1
+fi
done

-exit ${EXITCODE}
+exit ${ANY_FAILURES}
}

main "$@"
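
As the diff reads, the script used to exit with an `EXITCODE` that was reset for every library inside the loop, so only the last library's result could reach the final `exit`. The new `ANY_FAILURES` flag is set once and never cleared, so a failure in any library now makes the job fail. A minimal Python sketch of the same accumulate-then-exit pattern follows; `run_all` and the command dictionary are hypothetical names used for illustration and are not part of the PR.

```python
import subprocess
import sys


def run_all(commands):
    """Run every command and return 1 if any exited non-zero, else 0.

    `commands` maps a label to an argv list. Every command runs even after
    an earlier one fails, mirroring the loop over libraries in test.sh.
    """
    any_failures = 0
    for name, argv in commands.items():
        result = subprocess.run(argv)
        print(f"{name}: exit code {result.returncode}")
        if result.returncode != 0:
            # Sticky flag: a later success never resets it.
            any_failures = 1
    return any_failures
```

A caller would pass the return value to `sys.exit`, which is the role `exit ${ANY_FAILURES}` plays in the shell script.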
@@ -3,7 +3,13 @@
import numpy as np
import pandas as pd
import pytest
-from catboost import CatBoostClassifier, CatBoostRegressor, Pool
+
+try:
+    from catboost import CatBoostClassifier, CatBoostRegressor, Pool
+except Exception:
+    pytest.skip(
+        reason="ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject", allow_module_level=True
+    )
from sklearn.datasets import make_classification, make_regression

rng = np.random.default_rng(seed=42)
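
The guarded import above catches `Exception` rather than `ImportError`, presumably because the quoted binary-incompatibility error is a `ValueError`. One possible variation, sketched below under that assumption, reports whatever exception the import actually raised instead of a hard-coded message. The helper `import_or_reason` is hypothetical and does not appear in the PR.

```python
import importlib


def import_or_reason(module_name):
    """Return (module, None) on success, or (None, reason) if the import raises.

    The except clause is deliberately broad: a binary-incompatibility error
    raised while importing a compiled extension need not be an ImportError.
    """
    try:
        return importlib.import_module(module_name), None
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"
```

In a test module, a non-`None` reason could then be passed to `pytest.skip(reason=reason, allow_module_level=True)`, which is the call the diff already uses.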
@@ -71,6 +71,9 @@ def test_holoviews_heatmap(df):
)


@pytest.mark.skip(
reason="AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'"
)
def test_holoviews_histogram(df):
return get_plot_info(hv.Histogram(df.values))

@@ -33,13 +33,19 @@ def assert_plots_equal(expect, got):
pytestmark = pytest.mark.assert_eq(fn=assert_plots_equal)


@pytest.mark.skip(
reason="AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'"
)
def test_line():
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 4, 6, 8, 10]})
(data,) = plt.plot(df["x"], df["y"], marker="o", linestyle="-")

return plt.gca()


@pytest.mark.skip(
reason="AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'"
)
def test_bar():
data = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])
ax = data.plot(kind="bar")
@@ -37,6 +37,9 @@ def test_numpy_dot(df):
return np.dot(df, df.T)


@pytest.mark.skip(
reason="AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'"
)
def test_numpy_fft(sr):
fft = np.fft.fft(sr)
return fft
@@ -116,6 +116,9 @@ def test_torch_train(data):
return model(test_x1, test_x2)


@pytest.mark.skip(
reason="AssertionError: The values for attribute 'device' do not match: cpu != cuda:0."
)
def test_torch_tensor_ctor():
s = pd.Series(range(5))
return torch.tensor(s.values)
@@ -54,6 +54,9 @@ def test_scatter(df):
return ax


@pytest.mark.skip(
reason="AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'"
)
def test_lineplot_with_sns_data():
df = sns.load_dataset("flights")
ax = sns.lineplot(data=df, x="month", y="passengers")
@@ -41,7 +41,7 @@ def test_multidimensional_distributed_timeseries(dask_client):
rng = np.random.default_rng(seed=42)
# Each row represents data from a different dimension while each column represents
# data from the same dimension
-your_time_series = rng.random(3, 1000)
+your_time_series = rng.random((3, 1000))
# Approximately, how many data points might be found in a pattern
window_size = 50
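
The one-line fix in this hunk matters because `numpy.random.Generator.random` takes the output shape as a single `size` argument, and its second positional parameter is `dtype`. A short sketch of the difference, assuming NumPy is installed; the helper names `make_series` and `positional_call_fails` are made up for this example.

```python
import numpy as np


def make_series(rng, n_dims, n_points):
    """Draw uniform samples in [0, 1) with shape (n_dims, n_points)."""
    # The shape is passed as one tuple, as in the corrected line.
    return rng.random((n_dims, n_points))


def positional_call_fails(rng):
    """Return True if rng.random(3, 1000) raises TypeError."""
    try:
        # 1000 lands in the dtype parameter, which NumPy cannot
        # interpret as a data type.
        rng.random(3, 1000)
    except TypeError:
        return True
    return False
```

With `rng = np.random.default_rng(seed=42)`, `make_series(rng, 3, 1000)` yields one row per dimension, matching the comment in the test about rows and columns.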

@@ -271,6 +271,7 @@ def call(self, values):
return tf.concat(values, axis=-1)


@pytest.mark.xfail(reason="ValueError: Invalid dtype: object")
def test_full_example_train_with_df(df, target):
# https://www.tensorflow.org/tutorials/load_data/pandas_dataframe#full_example
# Inputs are directly passed as dictionary of series
@@ -113,6 +113,9 @@ def test_with_external_memory(
return predt


@pytest.mark.skip(
reason="TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly."
)
@pytest.mark.parametrize("device", ["cpu", "cuda"])
def test_predict(device: str) -> np.ndarray:
reg = xgb.XGBRegressor(n_estimators=2, device=device)