Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial integration workflow for GHA #744

Closed
wants to merge 11 commits into from
305 changes: 305 additions & 0 deletions .github/workflows/integration.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,305 @@
# **what?**
# This workflow runs all integration tests for supported OS
# and python versions and core adapters. If triggered by PR,
# the workflow will only run tests for adapters related
# to code changes. Use the `test all` and `test ${adapter}`
# label to run all or additional tests. Use `ok to test`
# label to mark PRs from forked repositories that are safe
# to run integration tests for. Requires secrets to run
# against different warehouses.

# **why?**
# This checks the functionality of dbt from a user's perspective
# and attempts to catch functional regressions.

# **when?**
# This workflow will run on every push to a protected branch
# and when manually triggered. It will also run for all PRs, including
# PRs from forks. The workflow will be skipped until there is a label
# to mark the PR as safe to run.

name: Adapter Integration Tests

on:
# pushes to release branches
# push:
# branches:
# - "main"
# - "*.latest"
# - "releases/*"
# # all PRs, important to note that `pull_request_target` workflows
# # will run in the context of the target branch of a PR
# pull_request_target:
# manual trigger
workflow_dispatch:

# explicitly turn off permissions for `GITHUB_TOKEN`
permissions: read-all

# will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
concurrency:
group: ${{ github.workflow }}-${{ github.event_name }}-${{ contains(github.event_name, 'pull_request') && github.event.pull_request.head.ref || github.sha }}
cancel-in-progress: true

# sets default shell to bash, for all operating systems
defaults:
run:
shell: bash

jobs:
# generate test metadata about what files changed
test-metadata:
runs-on: ubuntu-latest
outputs:
relevant_changes: ${{ steps.changes.outputs.relevant_changes }}

steps:
- name: Check out the repository (non-PR)
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

- name: Check if relevant files changed
if: github.event_name == 'pull_request_target'
# https://github.com/marketplace/actions/paths-changes-filter
# For each filter, it sets output variable named by the filter to the text:
# 'true' - if any of changed files matches any of filter rules
# 'false' - if none of changed files matches any of filter rules
# also, returns:
# `changes` - JSON array with names of all filters matching any of the changed files
uses: dorny/paths-filter@v2
id: get-changes
with:
token: ${{ secrets.GITHUB_TOKEN }}
filters: |
spark:
- 'dbt/**'
- 'tests/**'
- 'dev-requirements.txt'

- name: Convert previous steps output to true bool
id: changes
if: ${{ steps.get-changes.outputs.changes }} == 'true'
run: |
if [[ ${{ steps.get-changes.outputs.changes }} == 'true' ]]
then
relevant_changes = true
else
relevant_changes = false
fi

# TODO: currently untested, to be refined by https://github.com/dbt-labs/dbt-spark/issues/723
test-spark-session:
name: integration-spark-session / python ${{ matrix.python-version }} / ${{ matrix.os }}

# run if not a PR from a forked repository or has a label to mark as safe to test
# also checks that the matrix generated is not empty
if: ${{ needs.test-metadata.outputs.relevant_changes }}
runs-on: ${{ matrix.os }}
needs: test-metadata
container:
image: godatadriven/pyspark:3.1

strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
os: [ubuntu-20.04]
include:
- python-version: 3.8
os: windows-latest
- python-version: 3.8
os: macos-latest

env:
TOXENV: integration-spark-session
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_INVOCATION_ENV: github-actions

steps:
- name: Check out the repository
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

# explicity checkout the branch for the PR,
# this is necessary for the `pull_request_target` event
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

- name: Set up Python ${{ matrix.python-version }}
uses: actions/[email protected]
with:
python-version: ${{ matrix.python-version }}

- name: "Install Spark Dependencies"
if: ${{ contains(github.repository, 'dbt-labs/dbt-spark') }}
run: |
sudo apt-get update
sudo apt-get install -y git gcc g++ unixodbc-dev libsasl2-dev

- name: Install python dependencies
run: |
python -m pip install --user --upgrade pip
python -m pip install tox
python -m pip --version
tox --version

- name: Log Adapter
run: |
echo matrix.adapter: ${{ matrix.adapter }}

- name: Run tox (spark)
if: matrix.adapter == 'spark'
env:
# TODO: fill these with spark relevant vars
DBT_TEST_USER_1: dbt_test_role_1
DBT_TEST_USER_2: dbt_test_role_2
DBT_TEST_USER_3: dbt_test_role_3
run: tox

- uses: actions/upload-artifact@v3
if: always()
with:
name: logs
path: ./logs

- name: Get current date
if: always()
id: date
run: echo "date=$(date +'%Y-%m-%dT%H_%M_%S')" >> $GITHUB_OUTPUT #no colons allowed for artifacts

- uses: actions/upload-artifact@v3
if: always()
with:
name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_integration-spark-sessions-${{ steps.date.outputs.date }}.csv
path: integration_results.csv

# TODO: currently untested, to be refined by https://github.com/dbt-labs/dbt-spark/issues/731
test-spark-thrift:
name: integration-spark-thrift / python ${{ matrix.python-version }} / ${{ matrix.os }}

# run if not a PR from a forked repository or has a label to mark as safe to test
# also checks that the matrix generated is not empty
if: ${{ needs.test-metadata.outputs.relevant_changes }}
runs-on: ${{ matrix.os }}
needs: test-metadata
# NOTE: services don't work with act for local testings
services:
spark:
image: godatadriven/spark:3.1.1
env:
WAIT_FOR: localhost:5432
options: " --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server\n"
postgres:
image: postgres:9.6.17-alpine
env:
POSTGRES_USER: dbt
POSTGRES_PASSWORD: dbt
POSTGRES_DB: metastore

strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
os: [ubuntu-20.04]
include:
- python-version: 3.8
os: windows-latest
- python-version: 3.8
os: macos-latest

env:
TOXENV: integration-spark-thrift
PYTEST_ADDOPTS: "-v --color=yes -n4 --csv integration_results.csv"
DBT_INVOCATION_ENV: github-actions

steps:
- name: Check out the repository
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

# explicity checkout the branch for the PR,
# this is necessary for the `pull_request_target` event
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

- name: Set up Python ${{ matrix.python-version }}
uses: actions/[email protected]
with:
python-version: ${{ matrix.python-version }}

- name: "Install Spark Dependencies"
if: ${{ contains(github.repository, 'dbt-labs/dbt-spark') }}
run: |
sudo apt-get update
sudo apt-get install -y git gcc g++ unixodbc-dev libsasl2-dev

- name: Install python dependencies
run: |
python -m pip install --user --upgrade pip
python -m pip install tox
python -m pip --version
tox --version

- name: Run tox (spark)
if: matrix.adapter == 'spark'
env:
# TODO: fill these with spark relevant vars
DBT_TEST_USER_1: dbt_test_role_1
DBT_TEST_USER_2: dbt_test_role_2
DBT_TEST_USER_3: dbt_test_role_3
run: tox

- uses: actions/upload-artifact@v3
if: always()
with:
name: logs
path: ./logs

- name: Get current date
if: always()
id: date
run: echo "date=$(date +'%Y-%m-%dT%H_%M_%S')" >> $GITHUB_OUTPUT #no colons allowed for artifacts

- uses: actions/upload-artifact@v3
if: always()
with:
name: integration_results_${{ matrix.python-version }}_${{ matrix.os }}_integration-spark-thrift-${{ steps.date.outputs.date }}.csv
path: integration_results.csv

notify-skipped-tests:
name: integration-spark-thrift / python ${{ matrix.python-version }} / ${{ matrix.os }}

# run if not a PR from a forked repository or has a label to mark as safe to test
# also checks that the matrix generated is not empty
if: ${{ needs.test-metadata.outputs.relevant_changes }} == false
runs-on: ubuntu-latest
needs: test-metadata

steps:
- name: Check out the repository
run: |
# Send notification
message="No changes found in relevant files so all integration tests were skipped."
title="Integration Tests Skipped"
echo "::notice title=$title::$message"
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -177,7 +177,7 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
python-version: ["3.7", "3.8", "3.9", "3.10"]
python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]

steps:
- name: Set up Python ${{ matrix.python-version }}
Expand Down