diff --git a/dev/breeze/doc/ci/04_selective_checks.md b/dev/breeze/doc/ci/04_selective_checks.md
index 9199ae7344f0e..e6c59eee8ca3f 100644
--- a/dev/breeze/doc/ci/04_selective_checks.md
+++ b/dev/breeze/doc/ci/04_selective_checks.md
@@ -28,7 +28,7 @@
   - [Suspended providers](#suspended-providers)
   - [Selective check outputs](#selective-check-outputs)
   - [Committer vs. non-committer PRs](#committer-vs-non-committer-prs)
-  - [PR labels](#pr-labels)
+  - [Changing behaviours of the CI runs by setting labels](#changing-behaviours-of-the-ci-runs-by-setting-labels)
@@ -267,17 +267,29 @@ This is controlled by `Selective checks <04_selective_checks.md>`__ that set app
 the build-info job of the workflow (see`is-committer-build` to `true`) if the actor is in the committer's list
 and can be overridden by `non committer build` label in the PR.
 
-Also, for most of the jobs, committer builds by default use "Self-hosted" runners, while non-committer
-builds use "Public" runners. For committers, this can be overridden by setting the
-`use public runners` label in the PR.
+## Changing behaviours of the CI runs by setting labels
+
+Also, currently for most of the jobs, committer builds by default use "Self-hosted" runners, while
+non-committer builds use "Public" runners. For committers, this can be overridden by setting the
+`use public runners` label in the PR. In the future we might also switch committers to public runners.
+Committers will be able to use `use self-hosted runners` label in the PR to force using self-hosted runners.
+The `use public runners` label will still be available for committers and they will be able to set it for
+builds that also have `canary` label set to also switch the `canary` builds to public runners.
 
 If you are testing CI workflow changes and want to test it for more complete matrix combinations generated by
 the jobs - you can set `all versions` label in the PR. This will run the PRs with the same combinations
-of versions as the `canary` main build. Using `all versions` is automatically set when build or project
-dependencies change in `pyproject.toml`.
-
-If you are testing CI workflow changes and change `pyproject.toml` or `generated/provider_dependencies.json` and you
-want to limit the number of matrix combinations generated by
+of versions as the `canary` main build. Using `all versions` is automatically set when build dependencies
+change in `pyproject.toml` or when dependencies change for providers in `generated/provider_dependencies.json`
+or when `hatch_build.py` changes.
+
+If you are running an `apache` PR, you can also set `canary` label for such PR and in this case, all the
+`canary` properties of build will be used: `self-hosted` runners, `full tests needed` mode, `all versions`
+as well as all canary-specific jobs will run there. You can modify this behaviour of the `canary` run by
+applying `use public runners`, and `default versions only` labels to the PR as well which will still run
+a `canary` equivalent build but with public runners and default Python/K8S versions only - respectively.
+
+If you are testing CI workflow changes and change `pyproject.toml` or `generated/provider_dependencies.json`
+and you want to limit the number of matrix combinations generated by
 the jobs - you can set `default versions only` label in the PR. This will limit the number of versions
 used in the matrix to the default ones (default Python version and default Kubernetes version).
@@ -288,30 +300,28 @@ used in the matrix to the latest ones (latest Python version and latest Kubernet
 You can also disable cache if you want to make sure your tests will run with image that does not have
 left-over package installed from the past cached image - by setting `disable image cache` label in the PR.
 
-By default all outputs of successful parallel tests are not shown. You can enable them by setting
+By default, all outputs of successful parallel tests are not shown. You can enable them by setting
 `include success outputs` label in the PR. This makes the logs of mostly successful tests a lot longer
 and more difficult to sift through, but it might be useful in case you want to compare successful and
 unsuccessful runs of the tests.
 
-## PR labels
-
-As mentioned below, you can influence the outputs of selected checks by setting labels to the PR. Here is
-am overview of possible labels and their meaning:
-
-| Label                         | Affected outputs              | Meaning                                                                                                         |
-|-------------------------------|-------------------------------|-----------------------------------------------------------------------------------------------------------------|
-| canary                        | is-canary-run                 | If set, the PR run from apache/airflow repo behaves as `canary` run (can only be run by maintainer).            |
-| debug ci resources            | debug-ci-resources            | If set, then debugging resources is enabled during parallel tests and you can see them in the output.           |
-| default versions only         | all-versions, *-versions-*    | If set, the number of Python and Kubernetes, DB versions used by the build will be limited to the default ones. |
-| disable image cache           | docker-cache                  | If set, the image cache is disables when building the image.                                                    |
-| include success outputs       | include-success-outputs       | By default, outputs of successful parallel tests are not shown - enabling this flag will make then shown.       |
-| latest versions only          | *-versions-*, *-versions-*    | If set, the number of Python, Kubernetes, DB versions used by the build will be limited to the latest ones.     |
-| all versions                  | all-versions, *-versions-*    | Run tests for all python and k8s versions.                                                                      |
-| full tests needed             | full-tests-needed             | Run complete set of tests (might be with default or all python/k8s versions)                                    |
-| non committer build           | is-committer-build            | If set then even for non-committer builds, the scripts used for images are used from target branch.             |
-| upgrade to newer dependencies | upgrade-to-newer-dependencies | If set to true (default false) then dependencies in the CI image build are upgraded to the newer ones.          |
-| use public runners            | runs-on-as-json-default       | Force using public runners even for Committer runs.                                                             |
-
+This table summarizes the labels you can use on PRs to control the selective checks and the CI runs:
+
+| Label                            | Affected outputs                 | Meaning                                                                                   |
+|----------------------------------|----------------------------------|-------------------------------------------------------------------------------------------|
+| all versions                     | all-versions, *-versions-*       | Run tests for all python and k8s versions.                                                |
+| allow suspended provider changes | allow-suspended-provider-changes | Allow changes to suspended providers.                                                     |
+| canary                           | is-canary-run                    | If set, the PR run from apache/airflow repo behaves as `canary` run.                      |
+| debug ci resources               | debug-ci-resources               | If set, then debugging resources is enabled during parallel tests and you can see them.   |
+| default versions only            | all-versions, *-versions-*       | If set, the number of Python and Kubernetes, DB versions are limited to the default ones. |
+| disable image cache              | docker-cache                     | If set, the image cache is disabled when building the image.                              |
+| full tests needed                | full-tests-needed                | If set, the complete set of tests is run.                                                 |
+| include success outputs          | include-success-outputs          | If set, outputs of successful parallel tests are shown, not only failed outputs.          |
+| latest versions only             | *-versions-*, *-versions-*       | If set, the number of Python, Kubernetes, DB versions will be limited to the latest ones. |
+| non committer build              | is-committer-build               | If set, the scripts used for images are used from target branch for committers.           |
+| upgrade to newer dependencies    | upgrade-to-newer-dependencies    | If set to true (default false) then dependencies in the CI image build are upgraded.      |
+| use public runners               | runs-on-as-json-default          | Force using public runners as default runners.                                            |
+| use self-hosted runners          | runs-on-as-json-default          | Force using self-hosted runners as default runners.                                       |
 
 -----
diff --git a/dev/breeze/src/airflow_breeze/utils/selective_checks.py b/dev/breeze/src/airflow_breeze/utils/selective_checks.py
index 9cbeacd3f0972..15919f6d66966 100644
--- a/dev/breeze/src/airflow_breeze/utils/selective_checks.py
+++ b/dev/breeze/src/airflow_breeze/utils/selective_checks.py
@@ -63,16 +63,17 @@
 from airflow_breeze.utils.provider_dependencies import DEPENDENCIES, get_related_providers
 from airflow_breeze.utils.run_utils import run_command
 
-FULL_TESTS_NEEDED_LABEL = "full tests needed"
+ALL_VERSIONS_LABEL = "all versions"
 DEBUG_CI_RESOURCES_LABEL = "debug ci resources"
-USE_PUBLIC_RUNNERS_LABEL = "use public runners"
-NON_COMMITTER_BUILD_LABEL = "non committer build"
 DEFAULT_VERSIONS_ONLY_LABEL = "default versions only"
-ALL_VERSIONS_LABEL = "all versions"
-LATEST_VERSIONS_ONLY_LABEL = "latest versions only"
 DISABLE_IMAGE_CACHE_LABEL = "disable image cache"
+FULL_TESTS_NEEDED_LABEL = "full tests needed"
 INCLUDE_SUCCESS_OUTPUTS_LABEL = "include success outputs"
+LATEST_VERSIONS_ONLY_LABEL = "latest versions only"
+NON_COMMITTER_BUILD_LABEL = "non committer build"
 UPGRADE_TO_NEWER_DEPENDENCIES_LABEL = "upgrade to newer dependencies"
+USE_PUBLIC_RUNNERS_LABEL = "use public runners"
+USE_SELF_HOSTED_RUNNERS_LABEL = "use self-hosted runners"
 
 
 ALL_CI_SELECTIVE_TEST_TYPES = (
@@ -1114,7 +1115,11 @@ def affected_providers_list_as_string(self) -> str | None:
     def runs_on_as_json_default(self) -> str:
         if self._github_repository == APACHE_AIRFLOW_GITHUB_REPOSITORY:
             if self._github_event in [GithubEvents.SCHEDULE, GithubEvents.PUSH]:
+                # Canary and Scheduled runs
                 return RUNS_ON_SELF_HOSTED_RUNNER
+            if self._pr_labels and USE_PUBLIC_RUNNERS_LABEL in self._pr_labels:
+                # Forced public runners
+                return RUNS_ON_PUBLIC_RUNNER
             actor = self._github_actor
             if self._github_event in (GithubEvents.PULL_REQUEST, GithubEvents.PULL_REQUEST_TARGET):
                 try:
@@ -1129,8 +1134,23 @@ def runs_on_as_json_default(self) -> str:
                         f"[info]Could not find the actor from pull request, "
                         f"falling back to the actor who triggered the PR: {actor}[/]"
                     )
-            if actor in COMMITTERS and USE_PUBLIC_RUNNERS_LABEL not in self._pr_labels:
+            if (
+                actor not in COMMITTERS
+                and self._pr_labels
+                and USE_SELF_HOSTED_RUNNERS_LABEL in self._pr_labels
+            ):
+                get_console().print(
+                    f"[error]The PR has `{USE_SELF_HOSTED_RUNNERS_LABEL}` label, but "
+                    f"{actor} is not a committer. This is not going to work.[/]"
+                )
+                sys.exit(1)
+            if USE_SELF_HOSTED_RUNNERS_LABEL in self._pr_labels:
+                # Forced self-hosted runners
+                return RUNS_ON_SELF_HOSTED_RUNNER
+            if actor in COMMITTERS:
                 return RUNS_ON_SELF_HOSTED_RUNNER
+            else:
+                return RUNS_ON_PUBLIC_RUNNER
         return RUNS_ON_PUBLIC_RUNNER
 
     @cached_property
diff --git a/dev/breeze/tests/test_selective_checks.py b/dev/breeze/tests/test_selective_checks.py
index a5af1ced6d3e0..ba6e12432c323 100644
--- a/dev/breeze/tests/test_selective_checks.py
+++ b/dev/breeze/tests/test_selective_checks.py
@@ -1624,13 +1624,17 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
 
 
 @pytest.mark.parametrize(
-    "github_event, github_actor, github_repository, pr_labels, github_context_dict, default_runs_on_as_string, is_self_hosted_runner, is_airflow_runner, is_amd_runner, is_arm_runner, is_vm_runner, is_k8s_runner",
+    (
+        "github_event, github_actor, github_repository, pr_labels, "
+        "github_context_dict, runs_on_as_json_default, is_self_hosted_runner, "
+        "is_airflow_runner, is_amd_runner, is_arm_runner, is_vm_runner, is_k8s_runner, exception"
+    ),
     [
         pytest.param(
             GithubEvents.PUSH,
             "user",
             "apache/airflow",
-            [],
+            (),
             dict(),
             '["self-hosted", "Linux", "X64"]',
             "true",
@@ -1639,13 +1643,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "true",
             "false",
+            False,
             id="Push event",
         ),
         pytest.param(
             GithubEvents.PUSH,
             "user",
             "private/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1654,13 +1659,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Push event for private repo",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             "user",
             "apache/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1669,13 +1675,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             "user",
             "private/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1684,13 +1691,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request private repo",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             COMMITTERS[0],
             "apache/airflow",
-            [],
+            (),
             dict(),
             '["self-hosted", "Linux", "X64"]',
             "true",
@@ -1699,13 +1707,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "true",
             "false",
+            False,
             id="Pull request committer",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             COMMITTERS[0],
             "apache/airflow",
-            [],
+            (),
             dict(event=dict(pull_request=dict(user=dict(login="user")))),
             '["ubuntu-22.04"]',
             "false",
@@ -1714,13 +1723,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request committer pr non-committer",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             COMMITTERS[0],
             "private/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1729,13 +1739,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request private repo committer",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST_TARGET,
             "user",
             "apache/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1744,13 +1755,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request target",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST_TARGET,
             "user",
             "private/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1759,6 +1771,7 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request target private repo",
         ),
         pytest.param(
@@ -1774,13 +1787,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "true",
             "false",
+            False,
             id="Pull request target committer",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST,
             COMMITTERS[0],
             "apache/airflow",
-            [],
+            (),
             dict(event=dict(pull_request=dict(user=dict(login="user")))),
             '["ubuntu-22.04"]',
             "false",
@@ -1789,13 +1803,14 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request target committer pr non-committer",
         ),
         pytest.param(
             GithubEvents.PULL_REQUEST_TARGET,
             COMMITTERS[0],
             "private/airflow",
-            [],
+            (),
             dict(),
             '["ubuntu-22.04"]',
             "false",
@@ -1804,41 +1819,84 @@ def test_helm_tests_trigger_ci_build(files: tuple[str, ...], expected_outputs: d
             "false",
             "false",
             "false",
+            False,
             id="Pull request targe private repo committer",
         ),
+        pytest.param(
+            GithubEvents.PULL_REQUEST,
+            "user",
+            "apache/airflow",
+            ("use self-hosted runners",),
+            dict(),
+            '["ubuntu-22.04"]',
+            "false",
+            "false",
+            "true",
+            "false",
+            "false",
+            "false",
+            True,
+            id="Pull request by non committer with 'use self-hosted runners' label.",
+        ),
+        pytest.param(
+            GithubEvents.PULL_REQUEST,
+            COMMITTERS[0],
+            "apache/airflow",
+            ("use public runners",),
+            dict(),
+            '["ubuntu-22.04"]',
+            "false",
+            "false",
+            "true",
+            "false",
+            "false",
+            "false",
+            False,
+            id="Pull request by committer with 'use public runners' label.",
+        ),
     ],
 )
 def test_runs_on(
     github_event: GithubEvents,
     github_actor: str,
     github_repository: str,
-    pr_labels: list[str],
+    pr_labels: tuple[str, ...],
     github_context_dict: dict[str, Any],
-    default_runs_on_as_string,
+    runs_on_as_json_default,
     is_self_hosted_runner: str,
     is_airflow_runner: str,
     is_amd_runner: str,
     is_arm_runner: str,
     is_vm_runner: str,
     is_k8s_runner: str,
+    exception: bool,
 ):
-    stderr = SelectiveChecks(
-        files=(),
-        commit_ref="",
-        github_repository=github_repository,
-        github_event=github_event,
-        github_actor=github_actor,
-        github_context_dict=github_context_dict,
-        pr_labels=(),
-        default_branch="main",
-    )
-    assert_outputs_are_printed({"runs-on-as-json-default": default_runs_on_as_string}, str(stderr))
-    assert_outputs_are_printed({"is-self-hosted-runner": is_self_hosted_runner}, str(stderr))
-    assert_outputs_are_printed({"is-airflow-runner": is_airflow_runner}, str(stderr))
-    assert_outputs_are_printed({"is-amd-runner": is_amd_runner}, str(stderr))
-    assert_outputs_are_printed({"is-arm-runner": is_arm_runner}, str(stderr))
-    assert_outputs_are_printed({"is-vm-runner": is_vm_runner}, str(stderr))
-    assert_outputs_are_printed({"is-k8s-runner": is_k8s_runner}, str(stderr))
+    def get_output() -> str:
+        return str(
+            SelectiveChecks(
+                files=(),
+                commit_ref="",
+                github_repository=github_repository,
+                github_event=github_event,
+                github_actor=github_actor,
+                github_context_dict=github_context_dict,
+                pr_labels=pr_labels,
+                default_branch="main",
+            )
+        )
+
+    if exception:
+        with pytest.raises(SystemExit):
+            get_output()
+    else:
+        stderr = get_output()
+        assert_outputs_are_printed({"runs-on-as-json-default": runs_on_as_json_default}, str(stderr))
+        assert_outputs_are_printed({"is-self-hosted-runner": is_self_hosted_runner}, str(stderr))
+        assert_outputs_are_printed({"is-airflow-runner": is_airflow_runner}, str(stderr))
+        assert_outputs_are_printed({"is-amd-runner": is_amd_runner}, str(stderr))
+        assert_outputs_are_printed({"is-arm-runner": is_arm_runner}, str(stderr))
+        assert_outputs_are_printed({"is-vm-runner": is_vm_runner}, str(stderr))
+        assert_outputs_are_printed({"is-k8s-runner": is_k8s_runner}, str(stderr))
 
 
 @pytest.mark.parametrize(