
[AIRFLOW-6786] Added Kafka components differently #11520

Closed
wants to merge 8 commits into from

Conversation

dferguson992

Dear Airflow Maintainers,

Please accept the following PR, which:

Adds the KafkaProducerHook.
Adds the KafkaConsumerHook.
Adds the KafkaSensor, which listens for messages on a specific topic.
Related Issue:
#1311

Issue link: AIRFLOW-6786

Make sure to mark the boxes below before creating PR: [x]

Description above provides context of the change
Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
Unit tests coverage for changes (not needed for documentation changes)
Commits follow "How to write a good git commit message"
Relevant documentation is updated including usage instructions.
I will engage committers as explained in Contribution Workflow Example.
For document-only changes commit message can start with [AIRFLOW-XXXX].
Reminder to contributors:

You must add an Apache License header to all new files
Please squash your commits when possible and follow the 7 rules of good Git commits
I am new to the community, so I am not sure whether the files are in the right place or whether anything is missing.

The sensor could be used as the first task of a DAG, with a TriggerDagRunOperator as the second. The messages are polled in a batch and the DAG runs are generated dynamically.
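
A rough sketch of that layout, assuming the import path this PR proposes for the Kafka sensor (not a released API) and the Airflow 2.0 location of TriggerDagRunOperator; the topic and DAG ids are invented for illustration:

```python
# Hypothetical usage sketch only: KafkaSensor import path follows this PR's layout,
# topic/DAG ids are invented, and TriggerDagRunOperator uses its Airflow 2.0 path.
from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.providers.apache.kafka.sensors.kafka_sensor import KafkaSensor
from airflow.utils.dates import days_ago

with DAG(dag_id="kafka_listener", schedule_interval=None, start_date=days_ago(1)) as dag:
    wait_for_messages = KafkaSensor(
        task_id="wait_for_messages",
        topic="orders",          # topic the sensor listens on
        poke_interval=30,        # seconds between polls
    )
    trigger_processing = TriggerDagRunOperator(
        task_id="trigger_processing",
        trigger_dag_id="process_orders",  # DAG run generated for the polled batch
    )
    wait_for_messages >> trigger_processing
```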

Thanks!

Note: as discussed in the rejected PR #1415, it is important to mention that these integrations are not suitable for low-latency/high-throughput/streaming use cases. For reference, see #1415 (comment).

Co-authored-by: Dan Ferguson [email protected]
Co-authored-by: YuanfΞi Zhu

dferguson992 and others added 2 commits October 13, 2020 20:57
* Fix 'Upload documentation' step in CI (#10981)

* Pins temporarily moto to 1.3.14 (#10986)

As described in #10985, moto upgrade causes some AWS tests to fail.
Until we fix it, we pin the version to 1.3.14.

Part of #10985

* Allows to build production images for 1.10.2 and 1.10.1 Airflow (#10983)

Airflow below 1.10.2 required SLUGIFY_USES_TEXT_UNIDECODE env
variable to be set to yes.

Our production Dockerfile and Breeze support building images
for any version of Airflow >= 1.10.1, but it failed on
1.10.2 and 1.10.1 because this variable was not set.

You can now set the variable when building the image manually,
and Breeze does it automatically if the image is 1.10.1 or 1.10.2.

Fixes #10974

* The test_find_not_should_ignore_path is now in heisentests (#10989)

It seems that the test_find_not_should_ignore_path test has some
dependency on side-effects from other tests.

See #10988 - we are moving this test to heisentests until we
solve the issue.

* Unpin 'tornado' dep pulled in by flower (#10993)

'tornado' version was pinned in https://github.com/apache/airflow/pull/4815

The underlying issue is fixed for Py 3.5.2 so that is not a problem. Also since Airflow Master is already Py 3.6+ this does not apply to us.

* Simplify the K8sExecutor and K8sPodOperator (#10393)

* Simplify Airflow on Kubernetes Story

Removes thousands of lines of code that essentially amount to us
re-creating the Kubernetes API. Will offer a faster, simpler
KubernetesExecutor for 2.0

* Fix podgen tests

* fix documentation

* simplify validate function

* @mik-laj comments

* spellcheck

* spellcheck

* Update airflow/executors/kubernetes_executor.py

Co-authored-by: Kaxil Naik <[email protected]>

Co-authored-by: Kaxil Naik <[email protected]>

* Add new teammate to Polidea (#11000)

* Fetching databricks host from connection if not supplied in extras. (#10762)

* Fetching databricks host from connection if not supplied in extras.

* Fixing formatting issue in databricks test

Co-authored-by: joshi95 <[email protected]>

* Remove redundant curly brace from breeze echo message (#11012)

Before:

```
❯ ./breeze --github-image-id 260274893

GitHub image id: 260274893}
```

After:

```
❯ ./breeze --github-image-id 260274893

GitHub image id: 260274893
```

* KubernetesJobWatcher no longer inherits from Process (#11017)

multiprocessing.Process is set up in a very unfortunate manner
that pretty much makes it impossible to test a class that inherits from
Process or to use any of its internal functions. For this reason we decided
to separate the actual process-based functionality into a class member.

* Replace PNG/text with SVG that includes name in proper typography (#11018)

* [AIP-34] TaskGroup: A UI task grouping concept as an alternative to SubDagOperator (#10153)

This commit introduces TaskGroup, which is a simple UI task grouping concept.

- TaskGroups can be collapsed/expanded in Graph View when clicked
- TaskGroups can be nested
- TaskGroups can be put upstream/downstream of tasks or other TaskGroups with >> and << operators
- Search box, hovering, focusing in Graph View treats TaskGroup properly. E.g. searching for tasks also highlights TaskGroup that contains matching task_id. When TaskGroup is expanded/collapsed, the affected TaskGroup is put in focus and moved to the centre of the graph.


What this commit does not do:

- This commit does not change or remove SubDagOperator. Although TaskGroup is intended as an alternative for SubDagOperator, deprecating SubDagOperator will need to be discussed/implemented in the future.
- This PR only implements TaskGroup handling in the Graph View. In places such as the Tree View, it will look as if TaskGroup does not exist and all tasks are in the same flat DAG.

GitHub Issue: #8078
AIP: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator
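
A minimal sketch of the TaskGroup usage described in this commit (operator choice and task ids are illustrative):

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago
from airflow.utils.task_group import TaskGroup

with DAG(dag_id="task_group_demo", schedule_interval=None, start_date=days_ago(1)) as dag:
    start = BashOperator(task_id="start", bash_command="echo start")

    with TaskGroup("section_1") as section_1:    # collapsible group in Graph View
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        load = BashOperator(task_id="load", bash_command="echo load")
        extract >> load                          # dependencies inside the group

    end = BashOperator(task_id="end", bash_command="echo end")
    start >> section_1 >> end                    # groups compose with >> and <<
```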

* Support extra_args in S3Hook and GCSToS3Operator (#11001)

* Remove Edit Button from DagModel View (#11026)

* Increase typing coverage JDBC provider (#11021)

* Add typing to amazon provider EMR (#10910)

* Fix typo in DagFileProcessorAgent._last_parsing_stat_received_at (#11022)

`_last_parsing_stat_recieved_at` -> `_last_parsing_stat_received_at`

* Fix logo issues due to generic scripting selector use (#11028)

Resolves #11025

* Get Airflow configs with sensitive data from AWS Systems Manager (#11023)

Adds support to AWS SSM for feature added in https://github.com/apache/airflow/pull/9645

* Refactor rebase copy (#11030)

* Add D204 pydocstyle check (#11031)

* Starting breeze will run an init script after the environment is setup (#11029)

Added the possibility to run an init script

* Replace JS package toggle w/ pure CSS solution (#11035)

* Only gather KinD logs if tests fail (#11058)

* Separates out user documentation for production images. (#10998)

We now have much better user-facing documentation.

Only the parts interesting for users of the image are
separated out to the "docs" of Airflow.
The README and IMAGES.rst contain links to those
docs and internal details of the images respectively.

Fixes #10997.

* All versions in CI yamls are not hard-coded any more (#10959)

GitHub Actions allows using the `fromJson` method to read arrays
or even more complex JSON objects into the CI workflow yaml files.

This, connected with set::output commands, allows us to read the
list of allowed versions as well as the default ones from the
environment variables configured in
./scripts/ci/libraries/initialization.sh

This means that we have one place in which versions are
configured. We also need to do it in "breeze-complete" as this is
a standalone script that should not source anything. We added
BATS tests to verify that the versions in breeze-complete
correspond with those defined in initialization.sh.

Also, we no longer limit tests in regular PRs - we run
all combinations of available versions. Our tests run quite a
bit faster now so we should be able to run more complete
matrices. We can still exclude individual values of the matrices
if this is too much.

MySQL 8 is disabled from Breeze for now. I plan a separate follow-up
PR where we will run MySQL 8 tests (they were not run so far)

* Add Workflow to delete old artifacts (#11064)

* [Doc] Correct description for macro task_instance_key_str (#11062)

Correction based on code https://github.com/apache/airflow/blob/master/airflow/models/taskinstance.py

* Revert "KubernetesJobWatcher no longer inherits from Process (#11017)" (#11065)

This reverts commit 1539bd051cfbc41c1c7aa317fc7df82dab28f9f8.

* Add JSON schema validation for Helm values (#10664)

fixes #10634

* Get Airflow configs with sensitive data from CloudSecretManagerBackend (#11024)

* Add some tasks using BashOperator in TaskGroup example dag (#11072)

Previously, all the tasks in airflow/example_dags/example_task_group.py used DummyOperator, which does not go to the executor and is marked as success in the Scheduler itself, so it is good to have some tasks that aren't DummyOperator to properly test TaskGroup functionality

* Replace Airflow Slack Invite old link to short link (#11071)

Follow up to https://github.com/apache/airflow/pull/10034

https://apache-airflow-slack.herokuapp.com/ to https://s.apache.org/airflow-slack/

* Fix s.apache.org Slack link (#11078)

Remove ending / from s.apache.org Slack link

* Pandas behaviour for None changed in 1.1.2 (#11004)

In Pandas version 1.1.2, the experimental NaN value started to be
returned instead of None in a number of places. That broke our tests.

Fixing the tests also requires Pandas to be updated to >=1.1.2

* Improves deletion of old artifacts. (#11079)

We introduced deletion of the old artifacts as this was
the suspected culprit of Kubernetes Job failures. It turned out
eventually that those Kubernetes Job failures were caused by
the #11017 change, but it's good to do housekeeping of the
artifacts anyway.

The delete workflow action introduced in a hurry had two problems:

* it runs for every fork if they sync master. This is a bit
  too invasive

* it fails continuously after 10 - 30 minutes every time
  as we have too many old artifacts to delete (GitHub has
  90 days retention policy so we have likely tens of
  thousands of artifacts to delete)

* it runs every hour and it causes occasional API rate limit
  exhaustion (because we have too many artifacts to loop through)

This PR introduces filtering by repo, changes the frequency
of deletion to 4 times a day, and adds a script that we are
running manually to delete the excessive artifacts now.
A back-of-the-envelope calculation puts each of the 4 daily runs
at around 2500 artifacts to delete, so we have a low risk of
reaching the 5000 API calls/hr rate limit. Eventually, when the
number of artifacts goes down, the regular job should delete
maybe a few hundred artifacts appearing within the 6-hour window
in normal circumstances, and it should stop failing then.

* Requirements might get upgraded without setup.py change (#10784)

I noticed that when there are no setup.py changes, the constraints
are not upgraded automatically. This is because of the Docker
caching strategy used - it simply does not even know that the
upgrade of pip should happen.

I believe it is really good (from a security and incremental-updates
POV) to attempt the upgrade at every successful merge. Note that
the upgrade will not be committed if any of the tests fail, and this
only happens on merges to master or scheduled runs.

This way we will have more frequent but smaller constraint changes.

Depends on #10828

* Add D202 pydocstyle check (#11032)

* Add permissions for stable API (#10594)

Related Github Issue: https://github.com/apache/airflow/issues/8112

* Make Skipmixin handle empty branch properly (#10751)

closes: #10725

Make sure SkipMixin.skip_all_except() handles empty branches like this properly. When "task1" is followed, "join" must not be skipped even though it is considered to be immediately downstream of "branch".
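
For reference, a small sketch of the shape being described, where "join" sits directly downstream of both "branch" and "task1" (operator choices and the trigger rule are illustrative):

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.dates import days_ago

with DAG(dag_id="empty_branch_demo", schedule_interval=None, start_date=days_ago(1)) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=lambda: "task1")
    task1 = BashOperator(task_id="task1", bash_command="echo task1")
    join = BashOperator(
        task_id="join",
        bash_command="echo join",
        trigger_rule="none_failed_or_skipped",  # run whichever path is followed
    )
    branch >> [task1, join]  # "join" is itself an (empty) branch target
    task1 >> join
```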

* SkipMixin: Add missing session.commit() and test (#10421)

* Fix typo in STATIC_CODE_CHECKS.rst (#11094)

`realtive` -> `relative`

* Avoid redundant SET conversion (#11091)

* Avoid redundant SET conversion

get_accessible_dag_ids() returns a SET, so no need to apply set() again

* Add type annotation for get_accessible_dag_ids()

* Fix for pydocstyle D202 (#11096)

'issues' introduced in https://github.com/apache/airflow/pull/10594

* Security upgrade lodash from 4.17.19 to 4.17.20 (#11095)

Details: https://snyk.io/vuln/SNYK-JS-LODASH-590103

* Introducing flags to skip example dags and default connections (#11099)

* Add template fields renderers for better UI rendering (#11061)

This PR adds the possibility to define template_fields_renderers for an
operator. In this way users will be able to specify which lexer
should be used for rendering a particular field. This is
super useful for custom operators and gives more flexibility than
predefined keywords.

Co-authored-by: Kamil Olszewski <[email protected]>
Co-authored-by: Felix Uellendall <[email protected]>
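
A short sketch of what the commit above describes, with a hypothetical custom operator declaring which lexer renders its templated field:

```python
from airflow.models.baseoperator import BaseOperator


class MyTransformOperator(BaseOperator):
    """Hypothetical operator used only to illustrate template_fields_renderers."""

    template_fields = ("sql",)
    template_fields_renderers = {"sql": "sql"}  # render the "sql" field with the SQL lexer in the UI

    def __init__(self, *, sql: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.sql = sql

    def execute(self, context):
        self.log.info("Would execute: %s", self.sql)
```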

* Fix sort-in-the-wild pre-commit on Mac (#11103)

* Fix typo in README (#11106)

* Add Opensignal to INTHEWILD.md (#11105)

* Revert "Introducing flags to skip example dags and default connections (#11099)" (#11110)

This reverts commit 0edc3dd57953da5c66a4b45d49c1426cc0295f9e.

* Update initialize-database.rst (#11109)

* Update initialize-database.rst

Remove ambiguity in the language as only MySQL, Postgres and SQLite are supported backends.

* Update docs/howto/initialize-database.rst

Co-authored-by: Jarek Potiuk <[email protected]>

Co-authored-by: Xiaodong DENG <[email protected]>
Co-authored-by: Jarek Potiuk <[email protected]>

* Increasing type coverage FTP (#11107)

* Adds timeout in CI/PROD waiting jobs (#11117)

In very rare cases, the waiting job might not be cancelled when
the "Build Image" job fails or gets cancelled on its own.

In the "Build Image" workflow we have this step:

- name: "Canceling the CI Build source workflow in case of failure!"
  if: cancelled() || failure()
  uses: potiuk/cancel-workflow-runs@v2
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
    cancelMode: self
    sourceRunId: ${{ github.event.workflow_run.id }}

But when this step fails or gets cancelled on its own before
cancel is triggered, the "wait for image" steps could
run for up to 6 hours.

This change sets 50 minutes timeout for those jobs.

Fixes #11114

* Add Helm Chart linting (#11108)

* README Doc: Link to Airflow directory in ASF Directory (#11137)

`https://downloads.apache.org` -> `https://downloads.apache.org/airflow` (links to precise dir)

* Fix incorrect Usage of Optional[bool] (#11138)

Optional[bool] = Union[None, bool]

There were incorrect usages where the default was already set to
a boolean value but still Optional was used
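
To illustrate the kind of fix this refers to: Optional[bool] means "bool or None", so a parameter whose default is already a boolean should be annotated as plain bool.

```python
from typing import Optional


def run_check(strict: Optional[bool] = False) -> None:  # incorrect: None is never expected
    ...


def run_check_fixed(strict: bool = False) -> None:       # correct: default is already a bool
    ...
```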

* Fix FROM directive in docs/production-deployment.rst (#11139)

`FROM:` -> `FROM`

* Increasing type coverage for salesforce provider (#11135)

* Added support for encrypted private keys in SSHHook (#11097)

* Added support for encrypted private keys in SSHHook

* Fixed Styling issues and added unit testing

* fixed last pylint styling issue by adding newline to the end of the file

* re-fixed newline issue for pylint checks

* fixed pep8 styling issues and black formatted files to pass static checks

* added comma as per suggestion to fix static check

Co-authored-by: Nadim Younes <[email protected]>

* Fix error message when checking literalinclude in docs (#11140)

Before:
```
literalinclude directive is is prohibited for example DAGs
```

After:

```
literalinclude directive is prohibited for example DAGs
```

* Upgrade to latest isort & pydocstyle (#11142)

isort: from 5.4.2 to 5.5.3
pydocstyle: from 5.0.2 to 5.1.1

* Do not silently allow the use of undefined variables in jinja2 templates (#11016)

This can have *extremely* bad consequences. After this change, a jinja2
template like the one below will cause the task instance to fail, if the
DAG being executed is not a sub-DAG. This may also display an error on
the Rendered tab of the Task Instance page.

task_instance.xcom_pull('z', key='return_value', dag_id=dag.parent_dag.dag_id)

Prior to the change in this commit, the above template would pull the
latest value for task_id 'z', for the given execution_date, from *any DAG*.
If your task_ids between DAGs are all unique, or if DAGs using the same
task_id always have different execution_date values, this will appear to
act like dag_id=None.

Our current theory is SQLAlchemy/Python doesn't behave as expected when
comparing `jinja2.Undefined` to `None`.
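
A standalone illustration of the underlying jinja2 behaviour (plain jinja2, not Airflow's template rendering code): the default Undefined renders an unknown variable silently, while StrictUndefined makes the render fail loudly.

```python
import jinja2

lenient = jinja2.Environment()  # default undefined type
print(repr(lenient.from_string("{{ missing }}").render()))  # '' -- silently swallowed

strict = jinja2.Environment(undefined=jinja2.StrictUndefined)
try:
    strict.from_string("{{ missing }}").render()
except jinja2.exceptions.UndefinedError as err:
    print(f"render failed: {err}")  # "'missing' is undefined"
```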

* Move Backport Providers docs to our docsite (#11136)

* Fix user in helm chart pgbouncer deployment (#11143)

* Fixes celery deployments for Airflow 2.0 (#11129)

The celery flower and worker commands have changed in Airflow 2.0.
The Helm Chart supported only 1.10 version of those commands and
this PR fixes it by adding both variants of them.

* Fix gitSync user in the helm Chart (#11127)

There was a problem with the user in Git Sync mode of the Helm Chart,
in connection with the git-sync image and the official Airflow
image. Since we are using the official image, most of the
containers run as the "50000" user, but the git-sync image
runs as user 65533, so we have to set that as the
default. We also exposed that value as a parameter, so that
another image could be used here as well.

* Fix incorrect Usage of Optional[str] & Optional[int] (#11141)

From https://docs.python.org/3/library/typing.html#typing.Optional

```
Optional[X] is equivalent to Union[X, None].
```

>Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default does not require the Optional qualifier on its type annotation just because it is optional.

There were incorrect usages where the default was already set to
a string or int value but still Optional was used

* Remove link to Dag Model view given the redundancy with DAG Details view (#11082)

* Make sure pgbouncer-exporter docker image is linux/amd64 (#11148)

Closes #11145

* Update to latest version of pgbouncer-exporter (#11150)

There was a problem with Mac version of pgbouncer exporter
created and released previously. This commit releases the
latest version making sure that Linux Go is used to build
the pgbouncer binary.

* Add new member to Polidea (#11153)

* Massively speed up the query returned by TI.filter_for_tis (#11147)

The previous query generated SQL like this:

```
WHERE (task_id = ? AND dag_id = ? AND execution_date = ?) OR (task_id = ? AND dag_id = ? AND execution_date = ?)
```

Which is fine for one or maybe even 100 TIs, but when testing DAGs at
extreme size (over 21k tasks!) this query was taking forever (162s on
Postgres, 172s on MySQL 5.7)

By changing this query to this

```
WHERE task_id IN (?,?) AND dag_id = ? AND execution_date = ?
```

the time is reduced to 1s! (1.03s on Postgres, 1.19s on MySQL)

Even on 100 tis the reduction is large, but the overall time is not
significant (0.01451s -> 0.00626s on Postgres).

Times included SQLA query construction time (but not time for calling
filter_for_tis, so a like-for-like comparison), not just DB query time:

```python
ipdb> start_filter_20k = time.monotonic(); result_filter_20k = session.query(TI).filter(tis_filter).all(); end_filter_20k = time.monotonic()
ipdb> end_filter_20k - start_filter_20k
172.30647455298458
ipdb> in_filter = TI.dag_id == self.dag_id, TI.execution_date == self.execution_date, TI.task_id.in_([o.task_id for o in old_states.keys()]);
ipdb> start_20k_custom = time.monotonic(); result_custom_20k = session.query(TI).filter(in_filter).all(); end_20k_custom = time.monotonic()
ipdb> end_20k_custom - start_20k_custom
1.1882996069907676
```

I have also removed the check that was ensuring everything was of the
same type (all TaskInstance or all TaskInstanceKey) as it felt needless
- both types have the three required fields, so the "duck-typing"
approach at runtime (crash if it doesn't have the required property) plus
mypy checks felt Good Enough.
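
A hedged sketch of the rewritten filter (helper name and assumptions are mine, not the actual filter_for_tis code), where dag_id and execution_date appear once and the task ids go into a single IN clause:

```python
from airflow.models.taskinstance import TaskInstance as TI


def filter_tis_for_single_run(tis, session):
    """Assumes all tis share one dag_id and execution_date, as within a single DagRun."""
    first = tis[0]
    condition = (
        (TI.dag_id == first.dag_id)
        & (TI.execution_date == first.execution_date)
        & TI.task_id.in_([ti.task_id for ti in tis])
    )
    return session.query(TI).filter(condition).all()
```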

* Increase Type coverage for IMAP provider (#11154)

* Increasing type coverage for multiple provider (#11159)

* Optionally disables PIP cache from GitHub during the build (#11173)

This is the first step of implementing a corporate-environment-friendly
way of building images; in a corporate environment, it might not
be possible to install the packages using the GitHub cache initially.

Part of #11171

* Update UPDATING.md (#11172)

* New Breeze command start-airflow, it replaces the previous flag (#11157)

* Conditional MySQL Client installation (#11174)

This is the second step of making the Production Docker Image more
corporate-environment friendly, by making MySQL client installation
optional. Installing the MySQL client on Debian requires reaching out
to Oracle deb repositories, which might not be approved by security
teams when you build the images. Also, not everyone needs the MySQL
client, and some might want to install their own MySQL or MariaDB
client - from their own repositories.

This change separates the installation step out into a
script (with a prod/dev installation option). The prod/dev separation
is needed because MySQL needs to be installed with dev libraries
in the "Build" segment of the image (requiring build essentials
etc.), but in the "Final" segment of the image only runtime libraries
are needed.

Part of #11171

Depends on #11173.

* Add example DAG and system test for MySQLToGCSOperator (#10990)

* Increase type coverage for five different providers (#11170)

* Increasing type coverage for five different providers
* Added more strict type

* Adds Kubernetes Service Account for the webserver (#11131)

The webserver did not have a Kubernetes Service Account defined and,
while we do not strictly need to use the service account for
anything now, having the Service Account defined allows defining
various capabilities for the webserver.

For example, when you are in the GCP environment, you can map
the Kubernetes service account to a GCP one using
Workload Identity, without the need to define any secrets
or perform additional authentication.
Then you can have that GCP service account get
the permissions to write logs to a GCS bucket. Similar mechanisms
exist in AWS, and it also opens up on-premises configuration.

See more at
https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity

Co-authored-by: Jacob Ferriero <[email protected]>

Co-authored-by: Jacob Ferriero <[email protected]>

* Allow overrides for pod_template_file (#11162)

* Allow overrides for pod_template_file

A pod_template_file should be treated as a *template*, not a steadfast
rule.

This PR ensures that users can override individual values set by the
pod_template_file so that the same file can be used for multiple tasks.

* fix podtemplatetest

* fix name

* Enables Kerberos sidecar support (#11130)

Some of the users of Airflow are using Kerberos to authenticate
their worker workflows. Airflow has basic support for Kerberos
for some of the operators and it has support to refresh the
temporary Kerberos tokens via the `airflow kerberos` command.

This change adds support for a Kerberos sidecar that connects
to the Kerberos Key Distribution Center and retrieves the
token using a Keytab that should be deployed as a Kubernetes Secret.

It uses a shared volume to share the temporary token. The nice
thing about setting it up as a sidecar is that the Keytab
is never shared with the workers - the secret is only mounted
by the sidecar and the workers only have access to the temporary
token.

Depends on #11129

* Make kill log in DagFileProcessorProcess more informative (#11124)

* Show the location of the queries when the assert_queries_count fails. (#11186)

Example output (I forced one of the existing tests to fail)

```
E   AssertionError: The expected number of db queries is 3. The current number is 2.
E
E   Recorded query locations:
E   	scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:94:	1
E   	scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:101:	1
```

This makes it a bit easier to see what the queries are, without having
to re-run with full query tracing and then analyze the logs.
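
A hedged usage sketch, assuming the assert_queries_count context manager lives in Airflow's test utilities (the import path and the fixture are assumptions):

```python
from tests.test_utils.asserts import assert_queries_count  # assumed location of the helper


def test_pool_metrics_query_count(scheduler_job):  # hypothetical fixture providing a SchedulerJob
    with assert_queries_count(3):  # fails with the per-location breakdown above if the count differs
        scheduler_job._emit_pool_metrics()
```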

* Improve Google Transfer header in documentation index file  (#11166)

* Fix typos in Dockerfile.ci (#11187)

Fixed some spellings

* Remove Unnecessary comprehension in 'any' builtin function (#11188)

The built-in function `any()` supports short-circuiting (evaluation stops as soon as the overall return value of the function is known), but this behavior is lost if you use a list comprehension. This affects performance.
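
Concretely:

```python
items = range(10_000_000)

any(x > 5 for x in items)    # generator: stops after the first six elements
any([x > 5 for x in items])  # list comprehension: evaluates all ten million first
```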

* Optionally tags image when building with Breeze (#11181)

Breeze tags the image based on the default Python version,
branch, and type of the image, but you might want to tag the image
in the same command, especially in automated cases of building
the image via CI scripts, or for security teams that tag the image
based on external factors (build time, person etc.).

This is part of #11171 which makes the image easier to build in
corporate environments.

* in_container bats pre-commit hook and updated bats-tests hook (#11179)

* Fixes image tag readonly failure (#11194)

The image builds fine, but produces an unnecessary error message.

Bug Introduced in c9a34d2ef9ccf6c18b379bbcb81b9381027eb803

* More customizable build process for Docker images (#11176)

* Allows more customizations for image building.

This is the third (and not last) part of making the Production
image more corporate-environment friendly. It's been prepared
at the request of one of the big Airflow users (a company) that
has rather strict security requirements when it comes to
preparing and building images. They are committed to
synchronizing with the progress of Apache Airflow 2.0 development,
and making the image customizable so that they can build it using
only sources controlled by them internally was one of their
important requirements.

This change adds the possibility of customizing various steps in
the build process:

* adding custom scripts to be run before installation of both
  build image and runtime image. This allows for example to
  add installing custom GPG keys, and adding custom sources.

* customizing the way NodeJS and Yarn are installed in the
  build image segment - as they might rely on their own way
  of installation.

* adding extra packages to be installed during both build and
  dev segment build steps. This is crucial to achieve the same
  size optimizations as the original image.

* defining additional environment variables (for example
  environment variables that indicate acceptance of the EULAs
  in case of installing proprietary packages that require
  EULA acceptance) - both in the build image and the runtime image
  (again, the goal is to keep the image optimized for size)

The image build process remains the same when no customization
options are specified, but having those options increases
flexibility of the image build process in corporate environments.

This is part of #11171.

This change also fixes some of the issues opened and raised by
other users of the Dockerfile.

Fixes: #10730
Fixes: #10555
Fixes: #10856

Input from those issues has been taken into account when this
change was designed so that the cases described in those issues
could be implemented. An example from one of the issues landed as
an example of building a highly customized Airflow image
using those customization options.

Depends on #11174

* Update IMAGES.rst

Co-authored-by: Kamil Breguła <[email protected]>

* [AIRFLOW-5545] Fixes recursion in DAG cycle testing (#6175)

* Fixes an issue where cycle detection uses recursion

and stack overflows after about 1000 tasks

(cherry picked from commit 63f1a180a17729aa937af642cfbf4ddfeccd1b9f)

* reduce test length

* slightly more efficient

* Update airflow/utils/dag_cycle_tester.py

Co-authored-by: Kaxil Naik <[email protected]>

* slightly more efficient

* actually works this time

Co-authored-by: Daniel Imberman <[email protected]>
Co-authored-by: Kaxil Naik <[email protected]>

* Add amazon glacier to GCS transfer operator (#10947)

Add Amazon Glacier to GCS transfer operator, Glacier job operator and sensor.

* Strict type coverage for Oracle and Yandex provider  (#11198)

* type coverage for yandex provider

* type coverage for oracle provider

* import optimisation and mypy fix

* import optimisation

* static check fix

* Strict type checking for SSH (#11216)

* Replace get accessible dag ids (#11027)

* Kubernetes executor can adopt tasks from other schedulers (#10996)

* KubernetesExecutor can adopt tasks from other schedulers

* simplify

* recreate tables properly

* fix pylint

Co-authored-by: Daniel Imberman <[email protected]>

* Optional import error tracebacks in web ui (#10663)

This PR allows for partial import error tracebacks to be exposed on the UI, if enabled. This extra context can be very helpful for users without access to the parsing logs to determine why their DAGs are failing to import properly.

* Strict type check for multiple providers (#11229)

* Fix typo in command in CI.rst (#11233)

* Add Python version to Breeze cmd (#11228)

* Use more meaningful message for DagBag timeouts (#11235)

Instead of 'Timeout, PID: 1234' we can use something more meaningful
that will help users understand the logs.

* Prepare Backport release 2020.09.07 (#11238)

* Airflow 2.0 UI Overhaul/Refresh (#11195)

Resolves #10953.

A refreshed UI for the 2.0 release. The existing "theming" is a bit long in the tooth and this PR attempts to give it a modern look and some freshness to complement all of the new features under the hood.

The majority of the changes to the UI have been done through updates to the Bootstrap theme contained in bootstrap-theme.css. These are simply overrides of the default stylings that are packaged with Bootstrap.

* Restore description for provider packages. (#11239)

The change #10445 caused empty descriptions for all packages.

This change restores them and also makes sure package creation works
when there is no README.md

* Small updates to provider preparation docs. (#11240)

* Fixed month in backport packages to October (#11242)

* Add task adoption to CeleryKubernetesExecutor (#11244)

Routes task adoption based on queue name to CeleryExecutor
or KubernetesExecutor

Co-authored-by: Daniel Imberman <[email protected]>

* Remove redundant parentheses (#11248)

* Fix Broken Markdown links in Providers README TOC (#11249)

* Add option to bulk clear DAG Runs in Browse DAG Runs page (#11226)

closes: #11076

* Update yamllint & isort pre-commit hooks (#11252)

yamllint: v1.24.2 -> v1.25.0
isort: 5.5.3 -> 5.5.4

* Ensure target_dedicated_nodes or enable_auto_scale is set in AzureBatchOperator (#11251)

* Add s3 key to template fields for s3/redshift transfer operators (#10890)

* Add missing "example" tag on example DAG (#11253)

`example_task_group` and `example_nested_branch_dag` didn't have the example tag while all the other ones do have it

* Breeze: Fix issue with pulling an image via ID (#11255)

* Move test tools from tests.utils to tests.test_utils (#10889)

* Add Github Code Scanning (#11211)

Github just released Github Code Scanning:
https://github.blog/2020-09-30-code-scanning-is-now-available/

* Add Hacktoberfest topic to the repo (#11258)

* Add operator link to access DAG triggered by TriggerDagRunOperator (#11254)

This commit adds TriggerDagRunLink, which allows users to easily
access in the Web UI a DAG triggered by TriggerDagRunOperator

* The bats script for CI image is now placed in the docker folder (#11262)

The script was previously placed in scripts/ci which caused
a bit of a problem in 1-10-test branch where PRs were using
scripts/ci from the v1-10-test HEAD but they were missing
the ci script from the PR.

The scripts "ci" are parts of the host scripts that are
always taken from master when the image is built, but
all the other stuff should be taken from "docker"
folder - which will be taken from the PR.

* Limits CodeQL workflow to run only in the Apache Airflow repo (#11264)

It has been raised quite a few times that workflows added in forked
repositories might be pretty invasive for the forks - especially
when it comes to scheduled workflows, as they might eat quota
or at least jobs for those organisations/people who fork
repositories.

This is not strictly necessary because GitHub recently recognized this as
a problem and introduced new rules for scheduled workflows. But for people who
have already forked, it would be nice not to run those actions. It is enough
that the CodeQL check is done when a PR is opened against the "apache/airflow"
repository.

Quote from the emails received by Github (no public URL explaining it yet):

> Scheduled workflows will be disabled by default in forks of public repos and in
public repos with no activity for 60 consecutive days.  We’re making two
changes to the usage policy for GitHub Actions. These changes will enable
GitHub Actions to scale with the incredible adoption we’ve seen from the GitHub
community. Here’s a quick overview:

> * Starting today, scheduled workflows will be disabled by default in new forks of
public repositories.
> * Scheduled workflows will be disabled in public repos with
no activity for 60 consecutive days.

* Enable MySQL 8 CI jobs (#11247)

closes https://github.com/apache/airflow/issues/11164

* Improve running and cancelling of the PR-triggered builds. (#11268)

The PR builds are now better handled with regards to both
running (using merge-request) and canceling (with cancel notifications).

First of all we are using merged commit from the PR, not the original commit
from the PR.

Secondly - the workflow run notifies the original PR with comment
stating that the image is being built in a separate workflow -
including the link to that workflow.

Thirdly - when canceling duplicate PRs or PRs with failed
jobs, the workflow will add a comment to the PR stating the
reason why the PR is being cancelled.

Last but not least, we also add a cancel job for the duplicate CodeQL
runs. They run for ~ 12 minutes so it makes perfect sense to
also cancel those CodeQL jobs for which someone pushed fixups in
quick succession.

Fixes: #10471

* Fix link to static checks in CONTRIBUTING.rst (#11271)

* fix job deletion (#11272)

* Allow labels in KubernetesPodOperator to be templated (#10796)

* Access task type via the property, not dundervars (#11274)

We don't currently create TIs from serialized dags, but we are about to
start -- at which point some of these cases would have just shown
"SerializedBaseOperator", rather than the _real_ class name.

The other changes are just for "consistency" -- we should always get the
task type from this property, not via `__class__.__name__`.

I haven't set up a pre-commit rule for this, as this dunder
accessor is used elsewhere on things that are not BaseOperator
instances, and detecting that is hard to do in a pre-commit rule.

* When sending tasks to celery from a sub-process, reset signal handlers (#11278)

Since these processes are spawned from SchedulerJob after it has
registered its signals, if any of them got signaled they would kill
the ProcessorAgent process group! (multiprocessing has a default
start method of fork on Linux, so they inherit all previous state --
signals, and access to the `_process.pid` inside the ProcessorAgent
instance.)

This behaviour is not what we want for these multiprocessing.Pool processes.

This _may_ be a source of the long-standing "scheduler is alive but not
scheduling any jobs" issue. Maybe.

* Switched to Run Checks for Building Images. (#11276)

Replaces the annoying comments with "workflow_run" links
with Run Checks. Now we will be able to see the "Build Image"
checks in the "Checks" section including their status and direct
link to the steps running the image builds as "Details" link.

Unfortunately Github Actions do not handle well the links to
details - even if you provide details_url to link to the other
run, the "Build Image" checks appear in the original workflow,
that's why we had to introduce another link in the summary of
the Build Image check that links to the actual workflow.

* Single/Multi-Namespace mode for helm chart (#11034)

* Multi-Namespace mode for helm chart

Users should not REQUIRE a ClusterRole/ClusterRolebinding
to run airflow via helm. This change will allow "single" and "multi"
namespace modes so users can add airflow to managed kubernetes clusters

* add namespace to role

* add rolebinding too

* add docs

* add values.schema.json change

* Add LocalToAzureDataLakeStorageOperator (#10814)

* Add CeleryKubernetesExecutor to helm chart (#11288)

Users of the CeleryKubernetesExecutor will require both
Celery and Kubernetes features to launch tasks.

This PR will also serve as the basis for integration tests for this
executor

Co-authored-by: Daniel Imberman <[email protected]>

* Strict type check for all hooks in amazon (#11250)

* Replaces deprecated set-env with env file (#11292)

Github Actions deprecated the set-env action due to a moderate security
vulnerability they found.

https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/

This commit replaces set-env with env file as explained in

https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#environment-files

* Breeze start-airflow command wasn't able to initialize the db in 1.10.x (#11207)

* Add type annotations to ZendeskHook, update unit test (#10888)

* Add type annotations to ZendeskHook

__What__

* Add correct type annotations to ZendeskHook and each method
* Update one unit test to call an empty dictionary rather than a
NoneType since the argument should be a dictionary

__Why__

* Building out type annotations is good for the code base
* The query parameter is accessed with an index at one point, which
means that it cannot be a None type, but should rather be defaulted to
an empty dictionary if not provided

* Remove useless return

* Add acl_policy parameter to GCSToS3Operator (#10804) (#10829)

* add releasing airflow docs to dev readme (#11245)

* Prevent race condition in trying to collect result from DagFileProcessor (#11306)

A rare race condition was noticed in the Scheduler HA PR where the
"test_dags_with_system_exit" test would occasionally fail with the
following symptoms:

- The pipe was "readable" as returned by
  `multiprocessing.connection.wait`
- On reading it yielded an EOFError, meaning the other side had closed
  the connection
- But the process was still alive/running

This previously would result in the Manager process dying with an error.

This PR makes a few changes:

- It ensures that the pipe is simplex, not duplex (we only ever send one
  piece of data) as this is simpler
- We ensure that the "other" end of the pipe is correctly closed in both
  parent and child processes. Without this the pipe would be kept open
  (sometimes) until the child process had closed anyway.
- When we get an EOFError on reading and the process is still alive, we
  give it a few seconds to shut down cleanly, and then kill it.
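
A standard-library sketch of the first two fixes, a simplex pipe plus closing the unused end in each process, so that EOF on read reliably means the child is done (names are illustrative):

```python
import multiprocessing


def child(write_end):
    write_end.send("parsing result")   # the single piece of data we ever send
    write_end.close()


if __name__ == "__main__":
    read_end, write_end = multiprocessing.Pipe(duplex=False)  # simplex: one direction only
    proc = multiprocessing.Process(target=child, args=(write_end,))
    proc.start()
    write_end.close()       # parent closes its copy of the write end
    print(read_end.recv())  # an EOFError here now reliably means the child closed its end
    proc.join()
```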

* Bump tenacity to 6.2 (#11313)

* Move latest_only_operator.py to latest_only.py (#11178) (#11304)

* Adds --no-rbac-ui flag for Breeze airflow 1.10 installation (#11315)

When installing Airflow 1.10 via Breeze we now enable RBAC
by default, but we can disable it with the --no-rbac-ui flag.

This is useful to test different variants of 1.10 when testing
release candidates in connection with the 'start-airflow'
command.

* Add remaining community guidelines to CONTRIBUTING.rst (#11312)

We are cleaning up the docs from CWiki and this is what's left of
community guidelines that were maintained there.

Fixes #10181

* Improve handling of job_id in BigQuery operators (#11287)

Make autogenerated job_id more unique by using microseconds and hash of configuration. Replace dots in job_id.
Closes: #11280
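
A hedged illustration of that job_id scheme (not the operator's actual code): microseconds plus a hash of the configuration, with characters BigQuery does not accept in job IDs replaced.

```python
import hashlib
import json
import re
from datetime import datetime


def make_job_id(task_id: str, configuration: dict) -> str:
    ts = datetime.utcnow().strftime("%Y_%m_%d_%H_%M_%S_%f")  # includes microseconds
    config_hash = hashlib.md5(json.dumps(configuration, sort_keys=True).encode()).hexdigest()
    # BigQuery job ids may only contain letters, digits, underscores and dashes
    return re.sub(r"[^0-9a-zA-Z_\-]", "_", f"airflow_{task_id}_{ts}_{config_hash}")


print(make_job_id("my.task", {"query": {"query": "SELECT 1"}}))
```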

* Prints nicer message in case of git push errors (#11320)

We started to get "unknown blob" kinds of errors more often when
pushing the images to the GitHub Registry. While this is clearly a
GitHub issue, its frequency of occurrence and unclear message
make it a good candidate for an additional message with
instructions to the users, especially now that they have
an easy way to get to that information via status checks and
links leading to the log file when this problem happens during the
image building process.

This way users will know that they should simply rebase or
amend/force-push their change to fix it.

* Add AzureFileShareToGCSOperator (#10991)

* Automatically upgrade old default navbar color (#11322)

As part of #11195 we re-styled the UI, changing a lot of the default
colours to make them look more modern. However for anyone upgrading and
keeping their airflow.cfg from 1.10 to 2.0 they would end up with things
looking a bit ugly, as the old navbar color would be kept.

This uses the existing config value upgrade feature to automatically
change the old default colour into the new default colour.

* Pin versions of "untrusted" 3rd-party GitHub Actions (#11319)

According to https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/security-hardening-for-github-actions#using-third-party-actionsa
it's best practice not to use tags in case of untrusted
3rd-party actions in order to avoid potential attacks.

* Moves Commiter's guide to CONTRIBUTING.rst (#11314)

I decided to move it to CONTRIBUTING.rst as it is important
documentation on what policies we have agreed to as a community, and
it is also a great resource for contributors to learn what
the committers' responsibilities are.

Fixes: #10179

* Add environment variables documentation to cli-ref.rst. (#10970)

Co-authored-by: Fai Hegberg <[email protected]>

* Update link for Announcement Page (#11337)

* Strict type check for azure hooks (#11342)

* Adds --install-wheels flag to breeze command line (#11317)

If this flag is specified it will look for wheel packages placed in dist
folder and it will install the wheels from there after installing
Airflow. This is useful for testing backport packages as well as in the
future for testing provider packages for 2.0.

* Improve code quality of SLA mechanism in SchedulerJob (#11257)

* Improve Committer's guide docs (#11338)

* Add Azure Blob Storage to GCS transfer operator (#11321)

* Better message when Building Image fails or gets cancelled. (#11333)

* Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)

This reverts commit de07d135ae1bda3f71dd83951bcfafc2b3ad9f89.

* Fix command to run tmux with breeze in BREEZE.rst (#11340)

`breeze --start-airflow` -> `breeze start-airflow`

* Improve instructions to install Airflow Version (#11339)

The instructions can be replaced by `./breeze start-airflow` command

* Reduce "start-up" time for tasks in LocalExecutor (#11327)

Spawning a whole new Python process and then re-loading all of Airflow
is expensive. Although this time fades to insignificance for long
running tasks, this delay gives a "bad" experience for new users when
they are just trying out Airflow for the first time.

For the LocalExecutor this cuts the "queued time" down from 1.5s to 0.1s
on average.

* Bump cache version for kubernetes tests (#11355)

Seems that the k8s cache for virtualenv got broken during the
recent problems. This commit bumps the cache version to start
it afresh

* Better diagnostics when there are problems with Kerberos (#11353)

* Fix to make y-axis of Tries chart visible (#10071)

Co-authored-by: Venkatesh Selvaraj <[email protected]>

* Bugfix: Error in SSHOperator when command is None (#11361)

closes https://github.com/apache/airflow/issues/10656

* Always use Airflow db in FAB (#11364)

* Use only-if-needed upgrade strategy for PRs (#11363)

Currently, upgrading dependencies in setup.py still runs with the previous versions of the packages for the PR, which fails.

This will change to upgrade only the packages that are required for the PR

* Fix DagBag bug when a dag has invalid schedule_interval (#11344)

* Adding ElastiCache Hook for creating, describing and deleting replication groups (#8701)

* Fix regression in DataflowTemplatedJobStartOperator (#11167)

* Strict type check for Microsoft  (#11359)

* Reduce "start-up" time for tasks in CeleryExecutor (#11372)

This is similar to #11327, but for Celery this time.

The impact is not quite as pronounced here (for simple dags at least)
but takes the average queued to start delay from 1.5s to 0.4s

* Set start_date, end_date & duration for tasks failing without DagRun (#11358)

* Replace nuke with useful information on error page (#11346)

This PR replaces the nuke ASCII art with text about reporting a bug.
As we are no longer using ASCII art, this PR removes it.

* Users can specify sub-secrets and paths k8spodop (#11369)

Allows users to specify items for specific key path projections
when using the airflow.kubernetes.secret.Secret class

* Add capability of adding service account annotations to Helm Chart (#11387)

We can now add annotations to the service accounts in a generic
way. This allows, for example, adding Workload Identity in a GKE
environment, but it is not limited to that.

Co-authored-by: Kamil Breguła <[email protected]>

Co-authored-by: Jacob Ferriero <[email protected]>
Co-authored-by: Kamil Breguła <[email protected]>

* Add pypirc initialization (#11386)

This PR needs to be merged first in order to handle the #11385
which requires .pypirc to be created before dockerfile gets build.

This means that the script change needs to be merged to master
first in this PR.

* Fully support running more than one scheduler concurrently (#10956)

* Fully support running more than one scheduler concurrently.

This PR implements scheduler HA as proposed in AIP-15. The high level
design is as follows:

- Move all scheduling decisions into SchedulerJob (requiring DAG
  serialization in the scheduler)
- Use row-level locks to ensure schedulers don't stomp on each other
  (`SELECT ... FOR UPDATE`)
- Use `SKIP LOCKED` for better performance when multiple schedulers are
  running. (Mysql < 8 and MariaDB don't support this)
- Scheduling decisions are not tied to the parsing speed, but can
  operate just on the database

*DagFileProcessorProcess*:

Previously this component was responsible for more than just parsing the
DAG files as its name might imply. It was also responsible for creating
DagRuns, and also for making scheduling decisions for TIs, sending them from
"None" to "scheduled" state.

This commit changes it so that the DagFileProcessorProcess now will
update the SerializedDAG row for this DAG, and make no scheduling
decisions itself.

To make the scheduler's job easier (so that it can make as many
decisions as possible without having to load the possibly-large
SerializedDAG row) we store/update some columns on the DagModel table:

- `next_dagrun`: The execution_date of the next dag run that should be created (or
  None)
- `next_dagrun_create_after`: The earliest point at which the next dag
  run can be created

Pre-computing these values (and updating them every time the DAG is
parsed) reduces the overall load on the DB, as many decisions can be taken
by selecting just these two columns/the small DagModel row.

In case of max_active_runs, or `@once` these columns will be set to
null, meaning "don't create any dag runs"

*SchedulerJob*

The SchedulerJob used to only queue/send tasks to the executor after
they were parsed, and returned from the DagFileProcessorProcess.

This PR breaks the link between parsing and enqueuing of tasks, instead
of looking at DAGs as they are parsed, we now:

-  store a new datetime column, `last_scheduling_decision` on DagRun
  table, signifying when a scheduler last examined a DagRun
- Each time around the loop the scheduler will get (and lock) the next
  _n_ DagRuns via `DagRun.next_dagruns_to_examine`, prioritising DagRuns
  which haven't been touched by a scheduler in the longest period
- SimpleTaskInstance etc have been almost entirely removed now, as we
  use the serialized versions

* Move callbacks execution from Scheduler loop to DagProcessorProcess

* Don’t run verify_integrity if the Serialized DAG hasn’t changed

dag_run.verify_integrity is slow, and we don't want to call it every time, just when the dag structure changes (which we can know now thanks to DAG Serialization)

* Add escape hatch to disable newly added "SELECT ... FOR UPDATE" queries

We are worried that these extra uses of row-level locking will cause
problems on MySQL 5.x (most likely deadlocks) so we are providing users
an "escape hatch" to be able to make these queries non-locking -- this
means that only a single scheduler should be run, but being able to run
one is better than having the scheduler crash.

Co-authored-by: Kaxil Naik <[email protected]>
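
A minimal SQLAlchemy sketch of the locking pattern described above (function name, filter and ordering details are assumptions, not the actual DagRun.next_dagruns_to_examine implementation):

```python
from airflow.models.dagrun import DagRun


def next_dagruns_to_examine(session, batch_size=20):
    return (
        session.query(DagRun)
        .filter(DagRun.state == "running")
        .order_by(DagRun.last_scheduling_decision.asc().nullsfirst())  # least recently examined first
        .limit(batch_size)
        .with_for_update(skip_locked=True)  # SELECT ... FOR UPDATE SKIP LOCKED
        .all()
    )
```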

* Revert "Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)" (#11356)

This reverts commit f67e6cb805ebb88db1ca2c995de690dc21138b6b.

* Replaced basestring with str in the Exasol hook (#11360)

* [airflow/providers/cncf/kubernetes] correct hook methods name (#11008)

* Fix airflow_local_settings.py showing up as directory (#10999)

Fixes a bug where the airflow_local_settings.py mounts as a volume
if there is no value (this causes k8sExecutor pods to fail)

* Fix case of JavaScript. (#10957)

* Add tests for Custom cluster policy (#11381)

The custom ClusterPolicyViolation was added in #10282.
This one adds more comprehensive tests for it.

Co-authored-by: Jacob Ferriero <[email protected]>

* KubernetesPodOperator should retry log tailing in case of interruption (#11325)

* KubernetesPodOperator can retry log tailing in case of interruption

* fix failing test

* change read_pod_logs method formatting

* KubernetesPodOperator retry log tailing based on last read log timestamp

* fix test_parse_log_line test  formatting

* add docstring to parse_log_line method

* fix kubernetes integration test

* fix tests (#11368)

* Constraints and PIP packages can be installed from local sources (#11382)

* Constraints and PIP packages can be installed from local sources

This is the final part of implementing #11171, based on feedback
from enterprise customers we worked with. They want the
capability of building the image using binary wheel packages
that are locally available and the official Dockerfile. This means
that, besides the official APT sources, the Dockerfile build should
not need GitHub, nor any other external files pulled from outside,
including a PIP repository.

This change also includes documentation on how to prepare a set of
such binaries ready for inspection and review by security teams
in an enterprise environment. Such sets of "known-working-binary-whl"
files can then be separately committed, tracked and scrutinized
in an artifact repository of such an enterprise.

Fixes: #11171

* Update docs/production-deployment.rst

* Push and schedule duplicates are not cancelled. (#11397)

The push and schedule builds should not be cancelled even if
they are duplicates. By seeing which of the master merges
failed, we have better visibility on which merge caused
a problem and we can trace its origin faster, even if the builds
take longer overall.

Scheduled builds also serve their purpose and they should
always be run to completion.

* Remove redundant parentheses from Python files (#10967)

* Fixes automated upgrade to latest constraints. (#11399)

A wrong if condition in the GitHub action meant that the upgrade to latest
constraints did not work for a while.

* Fixes cancelling of too many workflows. (#11403)

A problem was introduced in #11397 where a bit too many "Build Image"
jobs are being cancelled by a subsequent Build Image run. For now it
cancels all the Build Image jobs that are running :(.

* Fix spelling (#11401)

* Fix spelling (#11404)

* Workarounds "unknown blob" issue by introducing retries (#11411)

We have started to experience "unknown_blob" errors intermittently
recently with the GitHub Docker registry. We might eventually need
to migrate to GCR (which eventually is going to replace the
Docker Registry for GitHub).

A ticket has been opened with Apache Infrastructure to enable
access to GCR and to make some statements about Access
Rights management for GCR: https://issues.apache.org/jira/projects/INFRA/issues/INFRA-20959
Also a ticket to GitHub Support has been raised about it
https://support.github.com/ticket/personal/0/861667 as we
cannot delete our public images in the Docker registry.

But until this happens, the workaround might help us
to handle the situations where we get intermittent errors
while pushing to the registry. This seems to be a common
error when an NGINX proxy is used to proxy the GitHub Registry, so
it is likely that retrying will work around the issue.

* Add capability of customising PyPI sources (#11385)

* Add capability of customising PyPI sources

This change adds the capability of customising the installation of PyPI
modules via a custom .pypirc file. This might allow installing
dependencies from an in-house, vetted PyPI registry.

* Moving the test to quarantine. (#11405)

I've seen the test being flaky and failing intermittently several times.

Moving it to quarantine for now.

* Optionally set null marker in csv exports in BaseSQLToGCSOperator (#11409)

* Fixes SHA used for cancel-workflow-action (#11400)

The SHA of cancel-workflow-action in #11397 was pointing to previous
(3.1) version of the action. This PR fixes it to point to the
right (3.2) version.

* Split tests to more sub-types (#11402)

We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI, therefore it makes
sense to split the tests into more batches. This is not yet the full
implementation of selective tests but it is going in this direction
by splitting into Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of the #10507 issue.

This split is possible thanks to #10422 which moved building the image
to a separate workflow - this way each image is only built once
and it is uploaded to a shared registry, where it is quickly
downloaded from rather than built by all the jobs separately - this
way we can have many more jobs as there is very little per-job
overhead before the tests start running.

* Fix incorrect typing, remove hardcoded argument values and improve code in AzureContainerInstancesOperator (#11408)

* Fix constraints generation script (#11412)

Constraints generation script was broken by recent changes
in naming of constraints URL variables and moving generation
of the link to the Dockerfile

This change restores the script's behaviour.

* Fix spelling in CeleryExecutor (#11407)

* Add more info about dag_concurrency (#11300)

* Update MySQLToS3Operator's s3_bucket to template_fields (#10778)

* Change prefix of AwsDynamoDB hook module (#11209)

* align import path of AwsDynamoDBHook in aws providers

Co-authored-by: Tomek Urbaszek <[email protected]>

* Strict type check for google ads and cloud hooks (#11390)

* Mutual SSL added in PGBouncer configuration in the Chart (#11384)

Adds SSL configuration for PGBouncer in the Helm Chart. PGBouncer
is crucial to handle the large number of connections that airflow
opens for the database, but often the database is outside of the
Kubernetes Cluster or, generally, the environment where Airflow is
installed, and PGBouncer needs to connect securely to such a database.

This PR adds the capability of setting the CA/Certificate and Private Key
in the PGBouncer configuration, which allows for mTLS authentication
(both client and server are authenticated) and a secure connection
even over a public network.

* Merge Airflow and Backport Packages preparation instructions (#11310)

This commit extracts common parts of Apache Airflow package
preparation and Backport Packages preparation.

Common parts were extracted as prerequisites, the release process
has been put in chronological order, some details about preparing
backport packages have been moved to a separate README.md
in the Backport Packages to not confuse release instructions
with tool instructions.

* Fix syntax highlightling for concurrency in configurations doc (#11438)

`concurrency` -> ``concurrency`` since it is rendered in rst

* Fix typo in airflow/utils/dag_processing.py (#11445)

`availlablle` -> `available`

* Selective tests - depends on files changed in the commit. (#11417)

This is the final step of implementing #10507 - selective tests.
Depending on the files changed by the incoming commit, only a subset of
the tests is executed. The conditions below are evaluated in the
sequence in which they are listed:

* In case of "push" and "schedule" type of events, all tests
  are executed.

* If no important files and folders changed - no tests are executed.
  This is a typical case for doc-only changes.

* If any of the environment files (Dockerfile/setup.py etc.) changed,
  all tests are executed.

* If no "core/other" files are changed, only the relevant types
  of tests are executed:

  * API - if any of the API files/tests changed
  * CLI - if any of the CLI files/tests changed
  * WWW - if any of the WWW files/tests changed
  * Providers - if any of the Providers files/tests changed

* Integration, Heisentests, Quarantined, Postgres and MySQL
  runs always happen unless all tests are skipped, as in the
  case of doc-only changes.

* If "Kubernetes" related files/tests are changed, the
  "Kubernetes" tests with Kind are run. Note that those tests
  are run separately using Host environment and those tests
  are stored in "kubernetes_tests" folder.

* If some of the core/other files change, all tests are run. This
  is calculated by subtracting the count of files matched
  above from the total count of important files.

Fixes: #10507
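
The rules above can be read as a simple decision function. The sketch below is purely illustrative Python (it is not the actual CI selection script), with file patterns and group names made up to restate the evaluation order:

```
# Illustrative only -- NOT the actual CI script; file patterns and helpers below are assumptions.
ENVIRONMENT_FILES = {"Dockerfile", "Dockerfile.ci", "setup.py", "setup.cfg"}
GROUP_PREFIXES = {
    "API": ("airflow/api", "tests/api"),
    "CLI": ("airflow/cli", "tests/cli"),
    "WWW": ("airflow/www", "tests/www"),
    "Providers": ("airflow/providers", "tests/providers"),
    "Kubernetes": ("kubernetes_tests",),
}
DOC_ONLY_PREFIXES = ("docs/", "README")


def select_tests(event_type, changed_files):
    # Push and scheduled builds always run everything.
    if event_type in ("push", "schedule"):
        return {"all"}

    # Doc-only changes touch no important files, so nothing runs.
    important = [f for f in changed_files if not f.startswith(DOC_ONLY_PREFIXES)]
    if not important:
        return set()

    # Changes to environment files force a full run.
    if any(f in ENVIRONMENT_FILES for f in important):
        return {"all"}

    selected = set()
    unmatched = []
    for f in important:
        groups = [g for g, prefixes in GROUP_PREFIXES.items() if f.startswith(prefixes)]
        if groups:
            selected.update(groups)
        else:
            unmatched.append(f)

    # Any remaining core/other files trigger the full suite.
    if unmatched:
        return {"all"}

    # These runs always happen unless everything was skipped above.
    selected |= {"Integration", "Heisentests", "Quarantined", "Postgres", "MySQL"}
    return selected
```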

* Fix correct Sphinx return type for DagFileProcessorProcess.result (#11444)

* Use augmented assignment (#11449)

`tf_count += 1` instead of `tf_count = tf_count + 1`

* Remove redundant None provided as default to dict.get() (#11448)

* Fix spelling (#11453)

* Refactor celery worker command (#11336)

This commit does a small refactor of the way we start the celery worker.
This will make it easier to migrate to Celery 5.0.

* Move the test_process_dags_queries_count test to quarantine (#11455)

The test (test_process_dags_queries_count)
randomly produces a bigger number of query counts. Example here:

https://github.com/apache/airflow/runs/1239572585#step:6:421

* Google cloud operator strict type check (#11450)

import optimisation

* Increase timeout for waiting for images (#11460)

Now that we have many more jobs to run, it might happen that
when a lot of PRs are submitted one after the other, there is
a longer waiting time for building the image.

There is only one waiting job per image type, so it does not
cost a lot to wait a bit longer, in order to avoid cancellation
after 50 minutes of waiting.

* Add more testing methods to dev/README.md (#11458)

* Adds missing schema for kerberos sidecar configuration (#11413)

* Adds missing schema for kerberos sidecar configuration

The kerberos support added in #11130 did not have schema added
to the values.yml. This PR fixes it.

Co-authored-by: Jacob Ferriero <[email protected]>

* Update chart/values.schema.json

Co-authored-by: Jacob Ferriero <[email protected]>

* Rename backport packages to provider packages (#11459)

In preparation for adding provider packages to the 2.0 line, we
are renaming backport packages to provider packages.

We want to implement this in stages - first rename the
packages, then split out backport/2.0 providers as part of
issue #11421.

* Add option to enable TCP keepalive for communication with Kubernetes API (#11406)

* Add option to enable TCP keepalive mechanism for communication with Kubernetes API

* Add keepalive default options to default_airflow.cfg

* Add reference to PR

* Quote parameters names in configuration

* Add problematic words to spelling_wordlist.txt
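
A hedged sketch of how such options are typically enabled; the exact option names (`enable_tcp_keepalive` and friends under the `[kubernetes]` section) are assumptions based on the commit description, shown here via Airflow's standard AIRFLOW__SECTION__KEY environment-variable convention:

```
# Hypothetical sketch: option names are assumptions; only the AIRFLOW__SECTION__KEY
# environment-variable convention itself is standard Airflow behaviour.
import os

os.environ["AIRFLOW__KUBERNETES__ENABLE_TCP_KEEPALIVE"] = "True"
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_IDLE"] = "120"   # seconds of idle time before probes start (assumed)
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_INTVL"] = "30"   # seconds between keepalive probes (assumed)
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_CNT"] = "6"      # failed probes before the connection is dropped (assumed)
```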

* Enables back duplicate cancelling on push/schedule (#11471)

We disabled duplicate cancelling on push/schedule in #11397,
but it caused a lot of extra strain when several commits
were merged in quick succession. The master merges are always
full builds and take a lot of time, but if we merge PRs
quickly, the subsequent merge cancels the previous ones.

This has the negative consequence that we might not know who
broke the master build, but this happens rarely enough to accept
the pain in exchange for a much less strained queue in GitHub Actions.

* Fix typo in docker-context-files/README.md (#11473)

`th` -> `the`

* added type hints for aws cloud formation (#11470)

* Mask Password in Log table when using the CLI (#11468)

* Mount volumes and volumemounts into scheduler and workers (#11426)

* Mount arbitrary volumes and volumeMounts to scheduler and worker

Allows users to mount volumes to scheduler and workers

* tested

* Bump FAB to 3.1 (#11475)

FAB released a new version today - https://pypi.org/project/Flask-AppBuilder/3.1.0/ - which removes the annoying "missing font file format for glyphicons" error.

* Allow multiple schedulers in helm chart (#11330)

* Allow multiple schedulers in helm chart

* schema

* add docs

* add to readme

Co-authored-by: Daniel Imberman <[email protected]>

* Fix Harcoded Airflow version (#11483)

This test will fail or will need fixing whenever we release a new Airflow
version.

* Add link on External Task Sensor to navigate to target dag (#11481)

Co-authored-by: Kaz Ukigai <[email protected]>

* Spend less time waiting for LocalTaskJob's subprocess to finish (#11373)

* Spend less time waiting for LocalTaskJob's subprocess to finish

This is about a 20% speed up for short-running tasks!

This change doesn't affect the "duration" reported in the TI table, but
does affect the time before the slot is freed up in the executor -
which does affect overall task/dag throughput.

(All these tests are with the same BashOperator tasks, just running `echo 1`.)

**Before**

```
Task airflow.executors.celery_executor.execute_command[5e0bb50c-de6b-4c78-980d-f8d535bbd2aa] succeeded in 6.597011625010055s: None
Task airflow.executors.celery_executor.execute_command[0a39ec21-2b69-414c-a11b-05466204bcb3] succeeded in 6.604327297012787s: None

```

**After**

```
Task airflow.executors.celery_executor.execute_command[57077539-e7ea-452c-af03-6393278a2c34] succeeded in 1.7728257849812508s: None
Task airflow.executors.celery_executor.execute_command[9aa4a0c5-e310-49ba-a1aa-b0760adfce08] succeeded in 1.7124666879535653s: None
```

**After, including change from #11372**

```
Task airflow.executors.celery_executor.execute_command[35822fc6-932d-4a8a-b1d5-43a8b35c52a5] succeeded in 0.5421732050017454s: None
Task airflow.executors.celery_executor.execute_command[2ba46c47-c868-4c3a-80f8-40adaf03b720] succeeded in 0.5469810889917426s: None
```

* Add endpoints for task instances (#9597)
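
A hedged example (not part of the PR) of calling the task-instance listing endpoint of the stable REST API; the host, credentials and DAG/run identifiers are placeholders:

```
# Hypothetical call to the task-instances listing endpoint; host, credentials and IDs are placeholders.
import requests

resp = requests.get(
    "http://localhost:8080/api/v1/dags/example_dag/dagRuns/example_run_id/taskInstances",
    auth=("admin", "admin"),
)
resp.raise_for_status()
for ti in resp.json()["task_instances"]:
    print(ti["task_id"], ti["state"])
```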

* Enable serialization by default (#11491)

We actually need to make serialization the default, but this is an
interim measure for the Airflow 2.0.0.alpha1 release.

Since many of the tests will fail with it enabled (they need fixing up
to ensure DAGs are in the serialized table), as a hacky measure we have
set it back to false in the tests.

* Add missing values entries to Parameters in chart/README.md (#11477)

* Rename "functional DAGs" to "Decorated Flows" (#11497)

Functional DAGs were so called because the DAG is "made up of functions",
but this AIP adds much more than just the task decorator change -- it
adds nicer XCom use, and in many cases automatic dependencies between
tasks.

"Functional" also invokes "functional programming" which this isn't.

* Prevent text-selection of scheduler interval when selecting DAG ID (#11503)

* Mark Smart Sensor as an early-access feature (#11499)

* Fix spelling for Airbnb (#11505)

* Added support for provider packages for Airflow 2.0 (#11487)

* Separate changes/readmes for backport and regular providers

We have now separate release notes for backport provider
packages and regular provider packages.

They have different versioning - backport provider
packages with CALVER, regular provider packages with
semver.

* Added support for provider packages for Airflow 2.0

This change consists of the following changes:

* adds provider package support for 2.0
* adds generation of package readme and change notes
* versions are for now hard-coded to 0.0.1 for first release
* adds automated tests for installation of the packages
* rename backport package readmes/changes to BACKPORT_*
* adds regular package readmes/changes
* updates documentation on generating the provider packages
* adds CI tests for the packages
* maintains backport packages generation with --backports flag

Fixes #11421
Fixes #11424
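
As a hedged illustration of what installing a regular provider package looks like from user code; the package and class names below are examples from the Google provider and are assumptions as far as this PR is concerned:

```
# After `pip install apache-airflow-providers-google` (package name assumed),
# operators are imported from the providers namespace:
from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator

run_query = BigQueryExecuteQueryOperator(
    task_id="run_query",
    sql="SELECT 1",
    use_legacy_sql=False,
)
```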

* Airflow tutorial to use Decorated Flows (#11308)

Created a new Airflow tutorial to use Decorated Flows (a.k.a. functional
DAGs). Also created a DAG to perform the same operations without using
functional DAGs to be compatible with Airflow 1.10.x and to show the
difference.

* Apply suggestions from code review

It makes sense to simplify the return variables being passed around without needlessly conv…
@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:dev-tools labels Oct 14, 2020
@github-actions

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@mik-laj mik-laj removed the area:API Airflow's REST/HTTP API label Oct 19, 2020
@kaxil
Member

kaxil commented Nov 4, 2020

Can you please rebase your PR on the latest Master, since we have applied Black and PyUpgrade on Master.

It will help if you squash your commits into a single commit first so that there are fewer conflicts.

@stale

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Dec 25, 2020
@github-actions github-actions bot closed this Mar 1, 2021