Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow committer builds to use scripts/ci, dev and actions from the PR #37057

Merged
merged 1 commit into from
Jan 28, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
165 changes: 93 additions & 72 deletions .github/workflows/build-images.yml

Large diffs are not rendered by default.

28 changes: 0 additions & 28 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -146,34 +146,6 @@ jobs:
persist-credentials: false
- name: "Install Breeze"
uses: ./.github/actions/breeze
- name: "Retrieve defaults from branch_defaults.py"
id: defaults
# We could retrieve it differently here - by just importing the variables and
# printing them from python code, however we want to have the same code as used in
# the build-images.yml (there we cannot import python code coming from the PR - we need to
# treat the python code as text and extract the variables from there.
run: |
python - <<EOF >> ${GITHUB_ENV}
from pathlib import Path
import re
import sys

DEFAULTS_CONTENT = Path('dev/breeze/src/airflow_breeze/branch_defaults.py').read_text()
BRANCH_PATTERN = r'^AIRFLOW_BRANCH = "(.*)"$'
CONSTRAINTS_BRANCH_PATTERN = r'^DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH = "(.*)"$'

branch = re.search(BRANCH_PATTERN, DEFAULTS_CONTENT, re.MULTILINE).group(1)
constraints_branch = re.search(CONSTRAINTS_BRANCH_PATTERN, DEFAULTS_CONTENT, re.MULTILINE).group(1)

output = f"""
DEFAULT_BRANCH={branch}
DEFAULT_CONSTRAINTS_BRANCH={constraints_branch}
""".strip()

print(output)
# Stdout is redirected to GITHUB_ENV but we also print it to stderr to see it in ci log
print(output, file=sys.stderr)
EOF
- name: "Get information about the Workflow"
id: source-run-info
run: breeze ci get-workflow-info 2>> ${GITHUB_OUTPUT}
Expand Down
2 changes: 1 addition & 1 deletion dev/breeze/doc/08_ci_tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ The selective-check command will produce the set of ``name=value`` pairs of outp
from the context of the commit/PR to be merged via stderr output.

More details about the algorithm used to pick the right tests and the available outputs can be
found in `Selective Checks <dev/breeze/SELECTIVE_CHECKS.md>`_.
found in `Selective Checks <ci/04_selective_checks.md>`_.

These are all available flags of ``selective-check`` command:

Expand Down

Large diffs are not rendered by default.

30 changes: 30 additions & 0 deletions dev/breeze/doc/ci/05_workflows.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
- [Workflows](#workflows)
- [Build Images Workflow](#build-images-workflow)
- [Differences for main and release branches](#differences-for-main-and-release-branches)
- [Committer vs. non-committer PRs](#committer-vs-non-committer-prs)
- [Tests Workflow](#tests-workflow)
- [CodeQL scan](#codeql-scan)
- [Publishing documentation](#publishing-documentation)
Expand Down Expand Up @@ -84,6 +85,13 @@ CI/CD scripts or changes to the CI/CD workflows). In this case the PR is
run in the context of the "apache/airflow" repository and has WRITE
access to the GitHub Container Registry.

When the PR changes important files (for example `generated/provider_depdencies.json` or
`pyproject.toml`), the PR is run in "upgrade to newer dependencies" mode - where instead
of using constraints to build images, attempt is made to upgrade all dependencies to latest
versions and build images with them. This way we check how Airflow behaves when the
dependencies are upgraded. This can also be forced by setting the `upgrade to newer dependencies`
label in the PR if you are a committer and want to force dependency upgrade.

## Canary run

This workflow is triggered when a pull request is merged into the "main"
Expand Down Expand Up @@ -184,6 +192,28 @@ the new branch and there are a few places where selection of tests is
based on whether this output is `main`. They are marked as - in the
"Release branches" column of the table below.

## Committer vs. non-committer PRs

There is a difference in how the CI jobs are run for committer and non-committer PRs from forks.
Main reason is security - we do not want to run untrusted code on our infrastructure for self-hosted runners,
but also we do not want to run unverified code during the `Build imaage` workflow, because that workflow has
access to GITHUB_TOKEN that has access to write to the Github Registry of ours (which is used to cache
images between runs). Also those images are build on self-hosted runners and we have to make sure that
those runners are not used to (fore example) mine cryptocurrencies on behalf of the person who opened the
pull request from their newly opened fork of airflow.

This is why the `Build Images` workflow checks if the actor of the PR (GITHUB_ACTOR) is one of the committers,
and if not, then workflows and scripts used to run image building are coming only from the ``target`` branch
of the repository, where such scripts were reviewed and approved by the committers before being merged.

This is controlled by `Selective checks <04_selective_checks.md>`__ that set appropriate output in
the build-info job of the workflow (see`is-committer-build` to `true`) if the actor is in the committer's
list and can be overridden by `non committer build` label in the PR.

Also, for most of the jobs, committer builds by default use "Self-hosted" runners, while non-committer
builds use "Public" runners. For committers, this can be overridden by setting the
`use public runners` label in the PR.

## Tests Workflow

This workflow is a regular workflow that performs all checks of Airflow
Expand Down
5 changes: 3 additions & 2 deletions dev/breeze/src/airflow_breeze/commands/ci_commands.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@

import click

from airflow_breeze.branch_defaults import AIRFLOW_BRANCH, DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
from airflow_breeze.commands.common_options import (
option_answer,
option_dry_run,
Expand Down Expand Up @@ -197,14 +198,14 @@ def get_changed_files(commit_ref: str | None) -> tuple[str, ...]:
@click.option(
"--default-branch",
help="Branch against which the PR should be run",
default="main",
default=AIRFLOW_BRANCH,
envvar="DEFAULT_BRANCH",
show_default=True,
)
@click.option(
"--default-constraints-branch",
help="Constraints Branch against which the PR should be run",
default="constraints-main",
default=DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH,
envvar="DEFAULT_CONSTRAINTS_BRANCH",
show_default=True,
)
Expand Down
6 changes: 2 additions & 4 deletions dev/breeze/src/airflow_breeze/params/common_build_params.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,10 +46,8 @@ class CommonBuildParams:
additional_dev_apt_env: str | None = None
additional_python_deps: str | None = None
additional_pip_install_flags: str | None = None
airflow_branch: str = os.environ.get("DEFAULT_BRANCH", AIRFLOW_BRANCH)
default_constraints_branch: str = os.environ.get(
"DEFAULT_CONSTRAINTS_BRANCH", DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
)
airflow_branch: str = AIRFLOW_BRANCH
default_constraints_branch: str = DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
airflow_constraints_location: str | None = None
builder: str = "autodetect"
build_progress: str = ALLOWED_BUILD_PROGRESS[0]
Expand Down
6 changes: 2 additions & 4 deletions dev/breeze/src/airflow_breeze/params/shell_params.py
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,7 @@ class ShellParams:
Shell parameters. Those parameters are used to determine command issued to run shell command.
"""

airflow_branch: str = os.environ.get("DEFAULT_BRANCH", AIRFLOW_BRANCH)
airflow_branch: str = AIRFLOW_BRANCH
airflow_constraints_location: str = ""
airflow_constraints_mode: str = ALLOWED_CONSTRAINTS_MODES_CI[0]
airflow_constraints_reference: str = ""
Expand All @@ -130,9 +130,7 @@ class ShellParams:
collect_only: bool = False
database_isolation: bool = False
db_reset: bool = False
default_constraints_branch: str = os.environ.get(
"DEFAULT_CONSTRAINTS_BRANCH", DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
)
default_constraints_branch: str = DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
dev_mode: bool = False
docker_host: str | None = os.environ.get("DOCKER_HOST")
downgrade_sqlalchemy: bool = False
Expand Down
12 changes: 10 additions & 2 deletions dev/breeze/src/airflow_breeze/utils/selective_checks.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
from functools import cached_property, lru_cache
from typing import Any, Dict, List, TypeVar

from airflow_breeze.branch_defaults import AIRFLOW_BRANCH, DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH
from airflow_breeze.global_constants import (
ALL_PYTHON_MAJOR_MINOR_VERSIONS,
APACHE_AIRFLOW_GITHUB_REPOSITORY,
Expand Down Expand Up @@ -62,6 +63,7 @@
FULL_TESTS_NEEDED_LABEL = "full tests needed"
DEBUG_CI_RESOURCES_LABEL = "debug ci resources"
USE_PUBLIC_RUNNERS_LABEL = "use public runners"
NON_COMMITTER_BUILD_LABEL = "non committer build"
UPGRADE_TO_NEWER_DEPENDENCIES_LABEL = "upgrade to newer dependencies"

ALL_CI_SELECTIVE_TEST_TYPES = (
Expand Down Expand Up @@ -350,8 +352,8 @@ class SelectiveChecks:
def __init__(
self,
files: tuple[str, ...] = (),
default_branch="main",
default_constraints_branch="constraints-main",
default_branch=AIRFLOW_BRANCH,
default_constraints_branch=DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH,
commit_ref: str | None = None,
pr_labels: tuple[str, ...] = (),
github_event: GithubEvents = GithubEvents.PULL_REQUEST,
Expand Down Expand Up @@ -1039,3 +1041,9 @@ def providers_compatibility_checks(self) -> str:
if check["python-version"] in self.python_versions
]
)

@cached_property
def is_committer_build(self):
if NON_COMMITTER_BUILD_LABEL in self._pr_labels:
return False
return self._github_actor in COMMITTERS
potiuk marked this conversation as resolved.
Show resolved Hide resolved
49 changes: 49 additions & 0 deletions dev/breeze/tests/test_selective_checks.py
Original file line number Diff line number Diff line change
Expand Up @@ -1719,3 +1719,52 @@ def test_mypy_matches(
pr_labels=pr_labels,
)
assert_outputs_are_printed(expected_outputs, str(stderr))


@pytest.mark.parametrize(
"files, expected_outputs, github_actor, pr_labels",
[
pytest.param(
("README.md",),
{
"is-committer-build": "false",
"runs-on": '["ubuntu-22.04"]',
},
"",
(),
id="Regular pr",
),
pytest.param(
("README.md",),
{
"is-committer-build": "true",
"runs-on": '["self-hosted", "Linux", "X64"]',
},
"potiuk",
(),
id="Committer regular PR",
),
pytest.param(
("README.md",),
{
"is-committer-build": "false",
"runs-on": '["self-hosted", "Linux", "X64"]',
},
"potiuk",
("non committer build",),
id="Committer regular PR - forcing non-committer build",
),
],
)
def test_pr_labels(
files: tuple[str, ...], expected_outputs: dict[str, str], github_actor: str, pr_labels: tuple[str, ...]
):
stderr = SelectiveChecks(
files=files,
commit_ref="HEAD",
default_branch="main",
github_actor=github_actor,
github_event=GithubEvents.PULL_REQUEST,
pr_labels=pr_labels,
)
assert_outputs_are_printed(expected_outputs, str(stderr))
3 changes: 1 addition & 2 deletions scripts/in_container/install_airflow_and_providers.py
Original file line number Diff line number Diff line change
Expand Up @@ -335,8 +335,7 @@ def find_installation_spec(
)
@click.option(
"--default-constraints-branch",
default="constraints-main",
show_default=True,
required=True,
envvar="DEFAULT_CONSTRAINTS_BRANCH",
help="Default constraints branch to use",
)
Expand Down
14 changes: 7 additions & 7 deletions scripts/in_container/run_generate_constraints.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,10 +30,11 @@
from click import Choice
from in_container_utils import click, console, run_command

AIRFLOW_SOURCES = Path(__file__).resolve().parents[2]
AIRFLOW_SOURCE_DIR = Path(__file__).resolve().parents[2]

DEFAULT_BRANCH = os.environ.get("DEFAULT_BRANCH", "main")
PYTHON_VERSION = os.environ.get("PYTHON_MAJOR_MINOR_VERSION", "3.8")
GENERATED_PROVIDER_DEPENDENCIES_FILE = AIRFLOW_SOURCES / "generated" / "provider_dependencies.json"
GENERATED_PROVIDER_DEPENDENCIES_FILE = AIRFLOW_SOURCE_DIR / "generated" / "provider_dependencies.json"

ALL_PROVIDER_DEPENDENCIES = json.loads(GENERATED_PROVIDER_DEPENDENCIES_FILE.read_text())

Expand Down Expand Up @@ -145,7 +146,7 @@ def install_local_airflow_with_eager_upgrade(
"eager",
],
github_actions=config_params.github_actions,
cwd=AIRFLOW_SOURCES,
cwd=AIRFLOW_SOURCE_DIR,
check=True,
)

Expand Down Expand Up @@ -240,7 +241,7 @@ def uninstall_all_packages(config_params: ConfigParams):
result = run_command(
["pip", "freeze"],
github_actions=config_params.github_actions,
cwd=AIRFLOW_SOURCES,
cwd=AIRFLOW_SOURCE_DIR,
text=True,
check=True,
capture_output=True,
Expand All @@ -253,7 +254,7 @@ def uninstall_all_packages(config_params: ConfigParams):
run_command(
["pip", "uninstall", "--root-user-action", "ignore", "-y", *all_installed_packages],
github_actions=config_params.github_actions,
cwd=AIRFLOW_SOURCES,
cwd=AIRFLOW_SOURCE_DIR,
text=True,
check=True,
)
Expand Down Expand Up @@ -396,8 +397,7 @@ def generate_constraints_no_providers(config_params: ConfigParams) -> None:
)
@click.option(
"--default-constraints-branch",
default="constraints-main",
show_default=True,
required=True,
envvar="DEFAULT_CONSTRAINTS_BRANCH",
help="Branch to get constraints from",
)
Expand Down
Loading