Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to pandas-tests to a dedicated workflow file and trigger it from branch.yaml #15516

Merged
merged 13 commits into from
Apr 12, 2024
18 changes: 18 additions & 0 deletions .github/workflows/build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -108,3 +108,21 @@ jobs:
sha: ${{ inputs.sha }}
date: ${{ inputs.date }}
package-name: dask_cudf
trigger-pandas-tests:
if: inputs.build_type == 'nightly'
needs: wheel-build-cudf
runs-on: ubuntu-latest
steps:
- name: Checkout code repo
uses: actions/checkout@v4
with:
ref: ${{ inputs.sha }}
persist-credentials: false
- name: Trigger pandas-tests
env:
GH_TOKEN: ${{ github.token }}
run: |
gh workflow run pandas-tests.yaml \
-f branch=${{ inputs.branch }} \
-f sha=${{ inputs.sha }} \
-f date=${{ inputs.date }}
27 changes: 27 additions & 0 deletions .github/workflows/pandas-tests.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
name: Pandas Test Job

on:
workflow_dispatch:
inputs:
branch:
required: true
type: string
date:
required: true
type: string
sha:
required: true
type: string

jobs:
pandas-tests:
# run the Pandas unit tests
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
matrix_filter: map(select(.ARCH == "amd64" and .PY_VER == "3.9" and .CUDA_VER == "12.2.2" ))
Copy link
Member

@ajschmidt8 ajschmidt8 Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding matrix filter values is error prone since they can fail as our support matrix changes.

What's the intended matrix that this should run against? The latest CUDA version and earliest Python version?

Once I know that information, I can help you put together a more generalized matrix filter.

cc: @bdice for additional input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the baseline job is being run as nightly job and comparison job is being run as a pull-request job. The only matrix combination common to both is

- { ARCH: 'amd64', PY_VER: '3.9',  CUDA_VER: '12.2.2', LINUX_VER: 'ubuntu22.04', gpu: 'v100', driver: 'latest' }

https://github.com/rapidsai/shared-workflows/blob/13e008c746e30a830c4ebe83a6786862858dfcb8/.github/workflows/wheels-test.yaml#L75-L94`

Copy link
Contributor Author

@galipremsagar galipremsagar Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdice too pointed out this inconsistency before, but I haven't found a way to write a matrix filter that will fetch an identical job for both the jobs of pandas-tests in pr.yaml(pull-request build type) and branch.yaml (nightly build type)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using different job configurations is not an option here because that will create some noise(test failures) that's specific to certain configurations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of thing is why I've advocated for the nightly matrix to be a strict superset of the PR matrix.

rapidsai/shared-workflows#176 (comment)
rapidsai/build-planning#5

build_type: nightly
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
script: ci/cudf_pandas_scripts/pandas-tests/run.sh main
2 changes: 1 addition & 1 deletion .github/workflows/pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,7 @@ jobs:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
matrix_filter: map(select(.ARCH == "amd64" and .PY_VER == "3.9" and .CUDA_VER == "12.2.2" ))
build_type: pull-request
script: ci/cudf_pandas_scripts/pandas-tests/run.sh pr
# Hide test failures because they exceed the GITHUB_STEP_SUMMARY output limit.
Expand Down
11 changes: 0 additions & 11 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -125,14 +125,3 @@ jobs:
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
script: ci/cudf_pandas_scripts/run_tests.sh
pandas-tests:
# run the Pandas unit tests
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(min_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
build_type: nightly
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
script: ci/cudf_pandas_scripts/pandas-tests/run.sh main
7 changes: 4 additions & 3 deletions ci/cudf_pandas_scripts/pandas-tests/run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -27,9 +27,10 @@ bash python/cudf/cudf/pandas/scripts/run-pandas-tests.sh \
--dist worksteal \
--report-log=${PANDAS_TESTS_BRANCH}.json 2>&1

SUMMARY_FILE_NAME=${PANDAS_TESTS_BRANCH}-24.06-results.json
galipremsagar marked this conversation as resolved.
Show resolved Hide resolved
galipremsagar marked this conversation as resolved.
Show resolved Hide resolved
# summarize the results and save them to artifacts:
python python/cudf/cudf/pandas/scripts/summarize-test-results.py --output json pandas-testing/${PANDAS_TESTS_BRANCH}.json > pandas-testing/${PANDAS_TESTS_BRANCH}-results.json
python python/cudf/cudf/pandas/scripts/summarize-test-results.py --output json pandas-testing/${PANDAS_TESTS_BRANCH}.json > pandas-testing/${SUMMARY_FILE_NAME}
RAPIDS_ARTIFACTS_DIR=${RAPIDS_ARTIFACTS_DIR:-"${PWD}/artifacts"}
mkdir -p "${RAPIDS_ARTIFACTS_DIR}"
mv pandas-testing/${PANDAS_TESTS_BRANCH}-results.json ${RAPIDS_ARTIFACTS_DIR}/
rapids-upload-to-s3 ${RAPIDS_ARTIFACTS_DIR}/${PANDAS_TESTS_BRANCH}-results.json "${RAPIDS_ARTIFACTS_DIR}"
mv pandas-testing/${SUMMARY_FILE_NAME} ${RAPIDS_ARTIFACTS_DIR}/
rapids-upload-to-s3 ${RAPIDS_ARTIFACTS_DIR}/${SUMMARY_FILE_NAME} "${RAPIDS_ARTIFACTS_DIR}"
3 changes: 3 additions & 0 deletions ci/release/update-version.sh
Original file line number Diff line number Diff line change
Expand Up @@ -89,3 +89,6 @@ find .devcontainer/ -type f -name devcontainer.json -print0 | while IFS= read -r
sed_runner "s@rapidsai/devcontainers:[0-9.]*@rapidsai/devcontainers:${NEXT_SHORT_TAG}@g" "${filename}"
sed_runner "s@rapidsai/devcontainers/features/rapids-build-utils:[0-9.]*@rapidsai/devcontainers/features/rapids-build-utils:${NEXT_SHORT_TAG_PEP440}@" "${filename}"
done

# Update ci/cudf_pandas_scripts/pandas-tests/run.sh
sed -i "s/SUMMARY_FILE_NAME=${PANDAS_TESTS_BRANCH}-.*-results.json/SUMMARY_FILE_NAME=${PANDAS_TESTS_BRANCH}-${NEXT_SHORT_TAG}-results.json/g" ci/cudf_pandas_scripts/pandas-tests/run.sh
raydouglass marked this conversation as resolved.
Show resolved Hide resolved
Loading