Update upstream testing workflows #536

Merged · 13 commits · May 23, 2022
.github/cluster-upstream.yml (new file, 21 additions)

Contributor:
How does this differ from the other docker-compose setup file still here?

Collaborator (author):
Only difference is that we have the Dask conda nightlies specified in EXTRA_CONDA_PACKAGES here - I would've liked to handle this more gracefully with some conditionals within the docker-compose file, but it seems like conditional variable setting isn't really an easy thing to do within compose files.

Open to other options here - I imagine just replacing the files with the equivalent docker run ... commands would probably be easier, though that comes at the cost of making local debugging somewhat more difficult.
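For concreteness, a rough sketch of what that `docker run` equivalent might look like (hypothetical commands mirroring the two services in `cluster-upstream.yml`, not what this PR uses; a user-defined network stands in for the default network that compose creates):

```bash
# Hypothetical equivalent of the two compose services. Compose normally
# creates a shared network automatically, so create one explicitly here
# so the worker can resolve the scheduler by container name.
docker network create dask

docker run -d --name dask-scheduler --network dask \
  -e USE_MAMBA=true \
  -e EXTRA_CONDA_PACKAGES="dask/label/dev::dask" \
  -p 8786:8786 \
  daskdev/dask:dev dask-scheduler

docker run -d --name dask-worker --network dask \
  -e USE_MAMBA=true \
  -e EXTRA_CONDA_PACKAGES="dask/label/dev::dask pyarrow>1.0.0" \
  -v /tmp:/tmp \
  daskdev/dask:dev dask-worker dask-scheduler:8786
```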

Contributor:
Maybe an environment file would work?

cc @jacobtomlinson (in case you have thoughts here 🙂)
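If the env-file route were taken, one possible shape (file name hypothetical; `env_file` is standard compose syntax) would be a single shared compose file whose services load the variant-specific settings from a per-variant file:

```yaml
# Hypothetical: one shared compose file, with the upstream-only settings
# pulled from e.g. .github/upstream.env, which would contain a line like
#   EXTRA_CONDA_PACKAGES=dask/label/dev::dask
services:
  dask-worker:
    image: daskdev/dask:dev
    env_file:
      - .github/upstream.env
```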

Reply:
Docker compose does support env var substitution so perhaps the extra packages could be configured that way?

https://docs.docker.com/compose/environment-variables/
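A minimal sketch of that approach, assuming the CI job exports a variable (name hypothetical) before running `docker-compose up`; `${VAR:-default}` is compose's substitution-with-default syntax:

```yaml
services:
  dask-worker:
    image: daskdev/dask:dev
    environment:
      # UPSTREAM_CHANNEL_PACKAGES is empty for normal runs; a nightly CI job
      # would export UPSTREAM_CHANNEL_PACKAGES="dask/label/dev::dask" first.
      EXTRA_CONDA_PACKAGES: "pyarrow>1.0.0 ${UPSTREAM_CHANNEL_PACKAGES:-}"
```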

Full contents of the new file:

```yaml
# Docker-compose setup used during tests
version: '3'
services:
  dask-scheduler:
    container_name: dask-scheduler
    image: daskdev/dask:dev
    command: dask-scheduler
    environment:
      USE_MAMBA: "true"
      EXTRA_CONDA_PACKAGES: "dask/label/dev::dask"
    ports:
      - "8786:8786"
  dask-worker:
    container_name: dask-worker
    image: daskdev/dask:dev
    command: dask-worker dask-scheduler:8786
    environment:
      USE_MAMBA: "true"
      EXTRA_CONDA_PACKAGES: "dask/label/dev::dask pyarrow>1.0.0" # required for parquet IO
    volumes:
      - /tmp:/tmp
```
.github/docker-compose.yaml → .github/cluster.yml (file renamed without changes)
.github/workflows/test-upstream.yml (95 additions, 4 deletions)

```diff
@@ -70,6 +70,7 @@ jobs:
           use-mamba: true
           python-version: ${{ matrix.python }}
           channel-priority: strict
+          channels: dask/label/dev,conda-forge,nodefaults
           activate-environment: dask-sql
           environment-file: ${{ env.CONDA_FILE }}
       - name: Download the pre-build jar
```
```diff
@@ -89,19 +90,109 @@ jobs:
           echo "JAVA_HOME=${{ env.CONDA }}\envs\dask-sql\Library" >> $GITHUB_ENV
       - name: Install upstream dev Dask / dask-ml
         run: |
-          python -m pip install --no-deps git+https://github.com/dask/dask
-          python -m pip install --no-deps git+https://github.com/dask/distributed
+          mamba update dask
           python -m pip install --no-deps git+https://github.com/dask/dask-ml
       - name: Test with pytest
         run: |
           pytest --junitxml=junit/test-results.xml --cov-report=xml -n auto tests --dist loadfile
 
+  cluster-dev:
+    name: "Test upstream dev in a dask cluster"
+    needs: build
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - name: Cache local Maven repository
+        uses: actions/cache@v2
+        with:
+          path: ~/.m2/repository
+          key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
+      - name: Set up Python
+        uses: conda-incubator/setup-miniconda@v2
+        with:
+          miniforge-variant: Mambaforge
+          use-mamba: true
+          python-version: "3.8"
+          channel-priority: strict
+          channels: dask/label/dev,conda-forge,nodefaults
+          activate-environment: dask-sql
+          environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
+      - name: Download the pre-build jar
+        uses: actions/download-artifact@v1
+        with:
+          name: jar
+          path: dask_sql/jar/
+      - name: Install cluster dependencies
+        run: |
+          mamba install python-blosc lz4 -c conda-forge
+
+          which python
+          pip list
+          mamba list
+      - name: Install upstream dev dask-ml
+        run: |
+          mamba update dask
+          python -m pip install --no-deps git+https://github.com/dask/dask-ml
+      - name: run a dask cluster
+        run: |
+          docker-compose -f .github/cluster-upstream.yml up -d
+
+          # periodically ping logs until a connection has been established; assume failure after 2 minutes
+          timeout 2m bash -c 'until docker logs dask-worker 2>&1 | grep -q "Starting established connection"; do sleep 1; done'
+
+          docker logs dask-scheduler
+          docker logs dask-worker
+      - name: Test with pytest while running an independent dask cluster
+        run: |
+          DASK_SQL_TEST_SCHEDULER="tcp://127.0.0.1:8786" pytest --junitxml=junit/test-cluster-results.xml --cov-report=xml -n auto tests --dist loadfile
+
+  import-dev:
+    name: "Test importing with bare requirements and upstream dev"
+    needs: build
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - name: Cache local Maven repository
+        uses: actions/cache@v2
+        with:
+          path: ~/.m2/repository
+          key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
+      - name: Set up Python
+        uses: conda-incubator/setup-miniconda@v2
+        with:
+          python-version: "3.8"
+          mamba-version: "*"
+          channels: dask/label/dev,conda-forge,nodefaults
+          channel-priority: strict
+      - name: Download the pre-build jar
+        uses: actions/download-artifact@v1
+        with:
+          name: jar
+          path: dask_sql/jar/
+      - name: Install upstream dev Dask / dask-ml
+        if: needs.detect-ci-trigger.outputs.triggered == 'true'
+        run: |
+          mamba update dask
+          python -m pip install --no-deps git+https://github.com/dask/dask-ml
+      - name: Install dependencies and nothing else
+        run: |
+          pip install -e .
+
+          which python
+          pip list
+          mamba list
+      - name: Try to import dask-sql
+        run: |
+          python -c "import dask_sql; print('ok')"
+
   report-failures:
     name: Open issue for upstream dev failures
-    needs: test-dev
+    needs: [test-dev, cluster-dev]
     if: |
       always()
-      && needs.test-dev.result == 'failure'
+      && (
+        needs.test-dev.result == 'failure' || needs.cluster-dev.result == 'failure'
+      )
     runs-on: ubuntu-latest
     defaults:
       run:
```
.github/workflows/test.yml (19 additions, 14 deletions)

```diff
@@ -92,6 +92,7 @@ jobs:
           use-mamba: true
           python-version: ${{ matrix.python }}
           channel-priority: strict
+          channels: ${{ needs.detect-ci-trigger.outputs.triggered == 'true' && 'dask/label/dev,conda-forge,nodefaults' || 'conda-forge,nodefaults' }}
           activate-environment: dask-sql
           environment-file: ${{ env.CONDA_FILE }}
       - name: Download the pre-build jar
```
```diff
@@ -112,8 +113,7 @@ jobs:
       - name: Optionally install upstream dev Dask / dask-ml
         if: needs.detect-ci-trigger.outputs.triggered == 'true'
         run: |
-          python -m pip install --no-deps git+https://github.com/dask/dask
-          python -m pip install --no-deps git+https://github.com/dask/distributed
+          mamba update dask
           python -m pip install --no-deps git+https://github.com/dask/dask-ml
       - name: Test with pytest
         run: |
```
```diff
@@ -149,29 +149,35 @@ jobs:
           use-mamba: true
           python-version: "3.8"
           channel-priority: strict
+          channels: ${{ needs.detect-ci-trigger.outputs.triggered == 'true' && 'dask/label/dev,conda-forge,nodefaults' || 'conda-forge,nodefaults' }}
           activate-environment: dask-sql
           environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
       - name: Download the pre-build jar
         uses: actions/download-artifact@v1
         with:
           name: jar
           path: dask_sql/jar/
-      - name: Install dependencies
+      - name: Install cluster dependencies
         run: |
           mamba install python-blosc lz4 -c conda-forge
 
           which python
           pip list
           mamba list
-      - name: Optionally install upstream dev Dask / dask-ml
+      - name: Optionally install upstream dev dask-ml
         if: needs.detect-ci-trigger.outputs.triggered == 'true'
         run: |
-          python -m pip install --no-deps git+https://github.com/dask/dask
-          python -m pip install --no-deps git+https://github.com/dask/distributed
+          mamba update dask
           python -m pip install --no-deps git+https://github.com/dask/dask-ml
```
A review thread on the dask-ml install line:

Contributor:
Maybe we should add a dask-ml nightly next 🙂

Collaborator (author):
Yeah was thinking about that 🙂 happy to do the work if Dask maintainers are okay with it, can open up an issue around this

Contributor:
Maybe raise an issue on the Dask-ML repo asking this question. Can also link back to the issue (dask/community#76) where we had this discussion about Dask & Distributed.

The diff continues:

```diff
       - name: run a dask cluster
+        env:
+          UPSTREAM: ${{ needs.detect-ci-trigger.outputs.triggered }}
         run: |
-          docker-compose -f .github/docker-compose.yaml up -d
+          if [[ $UPSTREAM == "true" ]]; then
+            docker-compose -f .github/cluster-upstream.yml up -d
+          else
+            docker-compose -f .github/cluster.yml up -d
+          fi
 
           # periodically ping logs until a connection has been established; assume failure after 2 minutes
           timeout 2m bash -c 'until docker logs dask-worker 2>&1 | grep -q "Starting established connection"; do sleep 1; done'
```
```diff
@@ -198,26 +204,25 @@ jobs:
         with:
           python-version: "3.8"
           mamba-version: "*"
-          channels: conda-forge,defaults
+          channels: ${{ needs.detect-ci-trigger.outputs.triggered == 'true' && 'dask/label/dev,conda-forge,nodefaults' || 'conda-forge,nodefaults' }}
           channel-priority: strict
       - name: Download the pre-build jar
         uses: actions/download-artifact@v1
         with:
           name: jar
           path: dask_sql/jar/
+      - name: Optionally install upstream dev Dask / dask-ml
+        if: needs.detect-ci-trigger.outputs.triggered == 'true'
+        run: |
+          mamba update dask
+          python -m pip install --no-deps git+https://github.com/dask/dask-ml
       - name: Install dependencies and nothing else
         run: |
           pip install -e .
 
           which python
           pip list
           mamba list
-      - name: Optionally install upstream dev Dask / dask-ml
-        if: needs.detect-ci-trigger.outputs.triggered == 'true'
-        run: |
-          python -m pip install --no-deps git+https://github.com/dask/dask
-          python -m pip install --no-deps git+https://github.com/dask/distributed
-          python -m pip install --no-deps git+https://github.com/dask/dask-ml
       - name: Try to import dask-sql
         run: |
           python -c "import dask_sql; print('ok')"
```