
[AIRFLOW-6786] Added Kafka components differently #11520

Closed
wants to merge 8 commits into from

Conversation

dferguson992

Dear Airflow Maintainers,

Please accept the following PR, which:

Adds the KafkaProducerHook.
Adds the KafkaConsumerHook.
Adds the KafkaSensor, which listens for messages on a specific topic.
Related Issue:
#1311

Issue link: AIRFLOW-6786

Make sure to mark the boxes below before creating PR: [x]

Description above provides context of the change
Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
Unit tests coverage for changes (not needed for documentation changes)
Commits follow "How to write a good git commit message"
Relevant documentation is updated including usage instructions.
I will engage committers as explained in Contribution Workflow Example.
For document-only changes commit message can start with [AIRFLOW-XXXX].
Reminder to contributors:

You must add an Apache License header to all new files
Please squash your commits when possible and follow the 7 rules of good Git commits
I am new to the community, so I am not sure whether the files are in the right place or whether anything is missing.

The sensor could be used as the first task of a DAG, with a TriggerDagRunOperator as the second. The messages are polled in a batch and the DAG runs are generated dynamically.
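
A rough sketch of that layout, assuming the import path this PR proposes for the Kafka sensor (not a released API) and the Airflow 2.0 location of TriggerDagRunOperator; the topic and DAG ids are invented for illustration:

```python
# Hypothetical usage sketch only: KafkaSensor import path follows this PR's layout,
# topic/DAG ids are invented, and TriggerDagRunOperator uses its Airflow 2.0 path.
from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.providers.apache.kafka.sensors.kafka_sensor import KafkaSensor
from airflow.utils.dates import days_ago

with DAG(dag_id="kafka_listener", schedule_interval=None, start_date=days_ago(1)) as dag:
    wait_for_messages = KafkaSensor(
        task_id="wait_for_messages",
        topic="orders",          # topic the sensor listens on
        poke_interval=30,        # seconds between polls
    )
    trigger_processing = TriggerDagRunOperator(
        task_id="trigger_processing",
        trigger_dag_id="process_orders",  # DAG run generated for the polled batch
    )
    wait_for_messages >> trigger_processing
```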

Thanks!

Note: as discussed in the rejected PR #1415, it is important to mention that these integrations are not suitable for low-latency/high-throughput/streaming use cases. For reference, see #1415 (comment).

Co-authored-by: Dan Ferguson [email protected]
Co-authored-by: YuanfΞi Zhu

dferguson992 and others added 2 commits October 13, 2020 20:57
* Fix 'Upload documentation' step in CI (#10981)

* Pins temporarily moto to 1.3.14 (#10986)

As described in #10985, moto upgrade causes some AWS tests to fail.
Until we fix it, we pin the version to 1.3.14.

Part of #10985

* Allows to build production images for 1.10.2 and 1.10.1 Airflow (#10983)

Airflow below 1.10.2 required SLUGIFY_USES_TEXT_UNIDECODE env
variable to be set to yes.

Our production Dockerfile and Breeze support building images
for any version of Airflow >= 1.10.1, but it failed on
1.10.2 and 1.10.1 because this variable was not set.

You can now set the variable when building the image manually,
and Breeze does it automatically if the image is 1.10.1 or 1.10.2.

Fixes #10974

* The test_find_not_should_ignore_path is now in heisentests (#10989)

It seems that the test_find_not_should_ignore_path test has some
dependency on side-effects from other tests.

See #10988 - we are moving this test to heisentests until we
solve the issue.

* Unpin 'tornado' dep pulled in by flower (#10993)

'tornado' version was pinned in https://github.com/apache/airflow/pull/4815

The underlying issue is fixed for Py 3.5.2 so that is not a problem. Also since Airflow Master is already Py 3.6+ this does not apply to us.

* Simplify the K8sExecutor and K8sPodOperator (#10393)

* Simplify Airflow on Kubernetes Story

Removes thousands of lines of code that essentially amount to us
re-creating the Kubernetes API. Will offer a faster, simpler
KubernetesExecutor for 2.0

* Fix podgen tests

* fix documentation

* simplify validate function

* @mik-laj comments

* spellcheck

* spellcheck

* Update airflow/executors/kubernetes_executor.py

Co-authored-by: Kaxil Naik <[email protected]>

Co-authored-by: Kaxil Naik <[email protected]>

* Add new teammate to Polidea (#11000)

* Fetching databricks host from connection if not supplied in extras. (#10762)

* Fetching databricks host from connection if not supplied in extras.

* Fixing formatting issue in databricks test

Co-authored-by: joshi95 <[email protected]>

* Remove redundant curly brace from breeze echo message (#11012)

Before:

```
❯ ./breeze --github-image-id 260274893

GitHub image id: 260274893}
```

After:

```
❯ ./breeze --github-image-id 260274893

GitHub image id: 260274893
```

* KubernetesJobWatcher no longer inherits from Process (#11017)

multiprocessing.Process is set up in a very unfortunate manner
that pretty much makes it impossible to test a class that inherits from
Process or to use any of its internal functions. For this reason we decided
to separate the actual process-based functionality into a class member.

* Replace PNG/text with SVG that includes name in proper typography (#11018)

* [AIP-34] TaskGroup: A UI task grouping concept as an alternative to SubDagOperator (#10153)

This commit introduces TaskGroup, which is a simple UI task grouping concept.

- TaskGroups can be collapsed/expanded in Graph View when clicked
- TaskGroups can be nested
- TaskGroups can be put upstream/downstream of tasks or other TaskGroups with >> and << operators
- Search box, hovering, focusing in Graph View treats TaskGroup properly. E.g. searching for tasks also highlights TaskGroup that contains matching task_id. When TaskGroup is expanded/collapsed, the affected TaskGroup is put in focus and moved to the centre of the graph.


What this commit does not do:

- This commit does not change or remove SubDagOperator. Although TaskGroup is intended as an alternative for SubDagOperator, deprecating SubDagOperator will need to be discussed/implemented in the future.
- This PR only implements TaskGroup handling in the Graph View. In places such as the Tree View, it will look as if TaskGroup does not exist and all tasks are in the same flat DAG.

GitHub Issue: #8078
AIP: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator
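
A minimal sketch of the TaskGroup usage described in this commit (operator choice and task ids are illustrative):

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago
from airflow.utils.task_group import TaskGroup

with DAG(dag_id="task_group_demo", schedule_interval=None, start_date=days_ago(1)) as dag:
    start = BashOperator(task_id="start", bash_command="echo start")

    with TaskGroup("section_1") as section_1:    # collapsible group in Graph View
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        load = BashOperator(task_id="load", bash_command="echo load")
        extract >> load                          # dependencies inside the group

    end = BashOperator(task_id="end", bash_command="echo end")
    start >> section_1 >> end                    # groups compose with >> and <<
```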

* Support extra_args in S3Hook and GCSToS3Operator (#11001)

* Remove Edit Button from DagModel View (#11026)

* Increase typing coverage JDBC provider (#11021)

* Add typing to amazon provider EMR (#10910)

* Fix typo in DagFileProcessorAgent._last_parsing_stat_received_at (#11022)

`_last_parsing_stat_recieved_at` -> `_last_parsing_stat_received_at`

* Fix logo issues due to generic scripting selector use (#11028)

Resolves #11025

* Get Airflow configs with sensitive data from AWS Systems Manager (#11023)

Adds support to AWS SSM for feature added in https://github.com/apache/airflow/pull/9645

* Refactor rebase copy (#11030)

* Add D204 pydocstyle check (#11031)

* Starting breeze will run an init script after the environment is setup (#11029)

Added the possibility to run an init script

* Replace JS package toggle w/ pure CSS solution (#11035)

* Only gather KinD logs if tests fail (#11058)

* Separates out user documentation for production images. (#10998)

We now have much better user-facing documentation.

Only the parts interesting for users of the image are
separated out to the "docs" of Airflow.
The README and IMAGES.rst contain links to those
docs and internal details of the images respectively.

Fixes #10997.

* All versions in CI yamls are not hard-coded any more (#10959)

GitHub Actions allows using the `fromJson` method to read arrays
or even more complex JSON objects into the CI workflow yaml files.

This, connected with set::output commands, allows us to read the
list of allowed versions as well as the default ones from the
environment variables configured in
./scripts/ci/libraries/initialization.sh

This means that we have one place in which versions are
configured. We also need to do it in "breeze-complete" as this is
a standalone script that should not source anything. We added
BATS tests to verify that the versions in breeze-complete
correspond with those defined in initialization.sh.

Also, we no longer limit tests in regular PRs - we run
all combinations of available versions. Our tests run quite a
bit faster now so we should be able to run more complete
matrices. We can still exclude individual values of the matrices
if this is too much.

MySQL 8 is disabled from Breeze for now. I plan a separate follow-up
PR where we will run MySQL 8 tests (they were not run so far)

* Add Workflow to delete old artifacts (#11064)

* [Doc] Correct description for macro task_instance_key_str (#11062)

Correction based on code https://github.com/apache/airflow/blob/master/airflow/models/taskinstance.py

* Revert "KubernetesJobWatcher no longer inherits from Process (#11017)" (#11065)

This reverts commit 1539bd051cfbc41c1c7aa317fc7df82dab28f9f8.

* Add JSON schema validation for Helm values (#10664)

fixes #10634

* Get Airflow configs with sensitive data from CloudSecretManagerBackend (#11024)

* Add some tasks using BashOperator in TaskGroup example dag (#11072)

Previously, all the tasks in airflow/example_dags/example_task_group.py used DummyOperator, which does not go to the executor and is marked as success in the Scheduler itself, so it is good to have some tasks that aren't DummyOperator to properly test TaskGroup functionality

* Replace Airflow Slack Invite old link to short link (#11071)

Follow up to https://github.com/apache/airflow/pull/10034

https://apache-airflow-slack.herokuapp.com/ to https://s.apache.org/airflow-slack/

* Fix s.apache.org Slack link (#11078)

Remove ending / from s.apache.org Slack link

* Pandas behaviour for None changed in 1.1.2 (#11004)

In Pandas version 1.1.2, the experimental NaN value started to be
returned instead of None in a number of places. That broke our tests.

Fixing the tests also requires Pandas to be updated to >=1.1.2

* Improves deletion of old artifacts. (#11079)

We introduced deletion of the old artifacts as this was
the suspected culprit of Kubernetes Job failures. It turned out
eventually that those Kubernetes Job failures were caused by
the #11017 change, but it's good to do housekeeping of the
artifacts anyway.

The delete workflow action introduced in a hurry had two problems:

* it runs for every fork if they sync master. This is a bit
  too invasive

* it fails continuously after 10 - 30 minutes every time
  as we have too many old artifacts to delete (GitHub has
  90 days retention policy so we have likely tens of
  thousands of artifacts to delete)

* it runs every hour and it causes occasional API rate limit
  exhaustion (because we have too many artifacts to loop through)

This PR introduces filtering by repo, changes the frequency
of deletion to 4 times a day, and adds a script that we are
running manually to delete the excessive artifacts now.
A back-of-the-envelope calculation puts each of the 4 daily runs
at around 2500 artifacts to delete, so we have a low risk of
reaching the 5000 API calls/hr rate limit. Eventually, when the
number of artifacts goes down, the regular job should delete
maybe a few hundred artifacts appearing within the 6-hour window
in normal circumstances, and it should stop failing then.

* Requirements might get upgraded without setup.py change (#10784)

I noticed that when there are no setup.py changes, the constraints
are not upgraded automatically. This is because of the Docker
caching strategy used - it simply does not even know that the
upgrade of pip should happen.

I believe it is really good (from a security and incremental-updates
POV) to attempt the upgrade at every successful merge. Note that
the upgrade will not be committed if any of the tests fail, and this
only happens on merges to master or scheduled runs.

This way we will have more frequent but smaller constraint changes.

Depends on #10828

* Add D202 pydocstyle check (#11032)

* Add permissions for stable API (#10594)

Related Github Issue: https://github.com/apache/airflow/issues/8112

* Make Skipmixin handle empty branch properly (#10751)

closes: #10725

Make sure SkipMixin.skip_all_except() handles empty branches like this properly. When "task1" is followed, "join" must not be skipped even though it is considered to be immediately downstream of "branch".
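
For reference, a small sketch of the shape being described, where "join" sits directly downstream of both "branch" and "task1" (operator choices and the trigger rule are illustrative):

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.dates import days_ago

with DAG(dag_id="empty_branch_demo", schedule_interval=None, start_date=days_ago(1)) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=lambda: "task1")
    task1 = BashOperator(task_id="task1", bash_command="echo task1")
    join = BashOperator(
        task_id="join",
        bash_command="echo join",
        trigger_rule="none_failed_or_skipped",  # run whichever path is followed
    )
    branch >> [task1, join]  # "join" is itself an (empty) branch target
    task1 >> join
```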

* SkipMixin: Add missing session.commit() and test (#10421)

* Fix typo in STATIC_CODE_CHECKS.rst (#11094)

`realtive` -> `relative`

* Avoid redundant SET conversion (#11091)

* Avoid redundant SET conversion

get_accessible_dag_ids() returns a SET, so no need to apply set() again

* Add type annotation for get_accessible_dag_ids()

* Fix for pydocstyle D202 (#11096)

'issues' introduced in https://github.com/apache/airflow/pull/10594

* Security upgrade lodash from 4.17.19 to 4.17.20 (#11095)

Details: https://snyk.io/vuln/SNYK-JS-LODASH-590103

* Introducing flags to skip example dags and default connections (#11099)

* Add template fields renderers for better UI rendering (#11061)

This PR adds the possibility to define template_fields_renderers for an
operator. In this way users will be able to specify which lexer
should be used for rendering a particular field. This is
super useful for custom operators and gives more flexibility than
predefined keywords.

Co-authored-by: Kamil Olszewski <[email protected]>
Co-authored-by: Felix Uellendall <[email protected]>
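
A short sketch of what the commit above describes, with a hypothetical custom operator declaring which lexer renders its templated field:

```python
from airflow.models.baseoperator import BaseOperator


class MyTransformOperator(BaseOperator):
    """Hypothetical operator used only to illustrate template_fields_renderers."""

    template_fields = ("sql",)
    template_fields_renderers = {"sql": "sql"}  # render the "sql" field with the SQL lexer in the UI

    def __init__(self, *, sql: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.sql = sql

    def execute(self, context):
        self.log.info("Would execute: %s", self.sql)
```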

* Fix sort-in-the-wild pre-commit on Mac (#11103)

* Fix typo in README (#11106)

* Add Opensignal to INTHEWILD.md (#11105)

* Revert "Introducing flags to skip example dags and default connections (#11099)" (#11110)

This reverts commit 0edc3dd57953da5c66a4b45d49c1426cc0295f9e.

* Update initialize-database.rst (#11109)

* Update initialize-database.rst

Remove ambiguity in the language as only MySQL, Postgres and SQLite are supported backends.

* Update docs/howto/initialize-database.rst

Co-authored-by: Jarek Potiuk <[email protected]>

Co-authored-by: Xiaodong DENG <[email protected]>
Co-authored-by: Jarek Potiuk <[email protected]>

* Increasing type coverage FTP (#11107)

* Adds timeout in CI/PROD waiting jobs (#11117)

In very rare cases, the waiting job might not be cancelled when
the "Build Image" job fails or gets cancelled on its own.

In the "Build Image" workflow we have this step:

- name: "Canceling the CI Build source workflow in case of failure!"
  if: cancelled() || failure()
  uses: potiuk/cancel-workflow-runs@v2
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
    cancelMode: self
    sourceRunId: ${{ github.event.workflow_run.id }}

But when this step fails or gets cancelled on its own before
cancel is triggered, the "wait for image" steps could
run for up to 6 hours.

This change sets 50 minutes timeout for those jobs.

Fixes #11114

* Add Helm Chart linting (#11108)

* README Doc: Link to Airflow directory in ASF Directory (#11137)

`https://downloads.apache.org` -> `https://downloads.apache.org/airflow` (links to precise dir)

* Fix incorrect Usage of Optional[bool] (#11138)

Optional[bool] = Union[None, bool]

There were incorrect usages where the default was already set to
a boolean value but still Optional was used
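
To illustrate the kind of fix this refers to: Optional[bool] means "bool or None", so a parameter whose default is already a boolean should be annotated as plain bool.

```python
from typing import Optional


def run_check(strict: Optional[bool] = False) -> None:  # incorrect: None is never expected
    ...


def run_check_fixed(strict: bool = False) -> None:       # correct: default is already a bool
    ...
```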

* Fix FROM directive in docs/production-deployment.rst (#11139)

`FROM:` -> `FROM`

* Increasing type coverage for salesforce provider (#11135)

* Added support for encrypted private keys in SSHHook (#11097)

* Added support for encrypted private keys in SSHHook

* Fixed Styling issues and added unit testing

* fixed last pylint styling issue by adding newline to the end of the file

* re-fixed newline issue for pylint checks

* fixed pep8 styling issues and black formatted files to pass static checks

* added comma as per suggestion to fix static check

Co-authored-by: Nadim Younes <[email protected]>

* Fix error message when checking literalinclude in docs (#11140)

Before:
```
literalinclude directive is is prohibited for example DAGs
```

After:

```
literalinclude directive is prohibited for example DAGs
```

* Upgrade to latest isort & pydocstyle (#11142)

isort: from 5.4.2 to 5.5.3
pydocstyle: from 5.0.2 to 5.1.1

* Do not silently allow the use of undefined variables in jinja2 templates (#11016)

This can have *extremely* bad consequences. After this change, a jinja2
template like the one below will cause the task instance to fail, if the
DAG being executed is not a sub-DAG. This may also display an error on
the Rendered tab of the Task Instance page.

task_instance.xcom_pull('z', key='return_value', dag_id=dag.parent_dag.dag_id)

Prior to the change in this commit, the above template would pull the
latest value for task_id 'z', for the given execution_date, from *any DAG*.
If your task_ids between DAGs are all unique, or if DAGs using the same
task_id always have different execution_date values, this will appear to
act like dag_id=None.

Our current theory is SQLAlchemy/Python doesn't behave as expected when
comparing `jinja2.Undefined` to `None`.
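
A standalone illustration of the underlying jinja2 behaviour (plain jinja2, not Airflow's template rendering code): the default Undefined renders an unknown variable silently, while StrictUndefined makes the render fail loudly.

```python
import jinja2

lenient = jinja2.Environment()  # default undefined type
print(repr(lenient.from_string("{{ missing }}").render()))  # '' -- silently swallowed

strict = jinja2.Environment(undefined=jinja2.StrictUndefined)
try:
    strict.from_string("{{ missing }}").render()
except jinja2.exceptions.UndefinedError as err:
    print(f"render failed: {err}")  # "'missing' is undefined"
```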

* Move Backport Providers docs to our docsite (#11136)

* Fix user in helm chart pgbouncer deployment (#11143)

* Fixes celery deployments for Airflow 2.0 (#11129)

The celery flower and worker commands have changed in Airflow 2.0.
The Helm Chart supported only 1.10 version of those commands and
this PR fixes it by adding both variants of them.

* Fix gitSync user in the helm Chart (#11127)

There was a problem with the user in Git Sync mode of the Helm Chart,
in connection with the git-sync image and the official Airflow
image. Since we are using the official image, most of the
containers run as the "50000" user, but the git-sync image
runs as user 65533, so we have to set that as the
default. We also exposed that value as a parameter, so that
another image could be used here as well.

* Fix incorrect Usage of Optional[str] & Optional[int] (#11141)

From https://docs.python.org/3/library/typing.html#typing.Optional

```
Optional[X] is equivalent to Union[X, None].
```

>Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default does not require the Optional qualifier on its type annotation just because it is optional.

There were incorrect usages where the default was already set to
a string or int value but still Optional was used

* Remove link to Dag Model view given the redundancy with DAG Details view (#11082)

* Make sure pgbouncer-exporter docker image is linux/amd64 (#11148)

Closes #11145

* Update to latest version of pgbouncer-exporter (#11150)

There was a problem with Mac version of pgbouncer exporter
created and released previously. This commit releases the
latest version making sure that Linux Go is used to build
the pgbouncer binary.

* Add new member to Polidea (#11153)

* Massively speed up the query returned by TI.filter_for_tis (#11147)

The previous query generated SQL like this:

```
WHERE (task_id = ? AND dag_id = ? AND execution_date = ?) OR (task_id = ? AND dag_id = ? AND execution_date = ?)
```

Which is fine for one or maybe even 100 TIs, but when testing DAGs at
extreme size (over 21k tasks!) this query was taking forever (162s on
Postgres, 172s on MySQL 5.7)

By changing this query to this

```
WHERE task_id IN (?,?) AND dag_id = ? AND execution_date = ?
```

the time is reduced to 1s! (1.03s on Postgres, 1.19s on MySQL)

Even on 100 tis the reduction is large, but the overall time is not
significant (0.01451s -> 0.00626s on Postgres).

Times included SQLA query construction time (but not time for calling
filter_for_tis, so a like-for-like comparison), not just DB query time:

```python
ipdb> start_filter_20k = time.monotonic(); result_filter_20k = session.query(TI).filter(tis_filter).all(); end_filter_20k = time.monotonic()
ipdb> end_filter_20k - start_filter_20k
172.30647455298458
ipdb> in_filter = TI.dag_id == self.dag_id, TI.execution_date == self.execution_date, TI.task_id.in_([o.task_id for o in old_states.keys()]);
ipdb> start_20k_custom = time.monotonic(); result_custom_20k = session.query(TI).filter(in_filter).all(); end_20k_custom = time.monotonic()
ipdb> end_20k_custom - start_20k_custom
1.1882996069907676
```

I have also removed the check that was ensuring everything was of the
same type (all TaskInstance or all TaskInstanceKey) as it felt needless
- both types have the three required fields, so the "duck-typing"
approach at runtime (crash if it doesn't have the required property) plus
mypy checks felt Good Enough.
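
A hedged sketch of the rewritten filter (helper name and assumptions are mine, not the actual filter_for_tis code), where dag_id and execution_date appear once and the task ids go into a single IN clause:

```python
from airflow.models.taskinstance import TaskInstance as TI


def filter_tis_for_single_run(tis, session):
    """Assumes all tis share one dag_id and execution_date, as within a single DagRun."""
    first = tis[0]
    condition = (
        (TI.dag_id == first.dag_id)
        & (TI.execution_date == first.execution_date)
        & TI.task_id.in_([ti.task_id for ti in tis])
    )
    return session.query(TI).filter(condition).all()
```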

* Increase Type coverage for IMAP provider (#11154)

* Increasing type coverage for multiple provider (#11159)

* Optionally disables PIP cache from GitHub during the build (#11173)

This is the first step of implementing a corporate-environment-friendly
way of building images; in a corporate environment, it might not
be possible to install the packages using the GitHub cache initially.

Part of #11171

* Update UPDATING.md (#11172)

* New Breeze command start-airflow, it replaces the previous flag (#11157)

* Conditional MySQL Client installation (#11174)

This is the second step of making the Production Docker Image more
corporate-environment friendly, by making MySQL client installation
optional. Installing the MySQL client on Debian requires reaching out
to Oracle deb repositories, which might not be approved by security
teams when you build the images. Also, not everyone needs the MySQL
client, and some might want to install their own MySQL or MariaDB
client - from their own repositories.

This change separates the installation step out into a
script (with a prod/dev installation option). The prod/dev separation
is needed because MySQL needs to be installed with dev libraries
in the "Build" segment of the image (requiring build essentials
etc.), but in the "Final" segment of the image only runtime libraries
are needed.

Part of #11171

Depends on #11173.

* Add example DAG and system test for MySQLToGCSOperator (#10990)

* Increase type coverage for five different providers (#11170)

* Increasing type coverage for five different providers
* Added more strict type

* Adds Kubernetes Service Account for the webserver (#11131)

The webserver did not have a Kubernetes Service Account defined and,
while we do not strictly need to use the service account for
anything now, having the Service Account defined allows defining
various capabilities for the webserver.

For example, when you are in the GCP environment, you can map
the Kubernetes service account to a GCP one using
Workload Identity, without the need to define any secrets
or perform additional authentication.
Then you can have that GCP service account get
the permissions to write logs to a GCS bucket. Similar mechanisms
exist in AWS, and it also opens up on-premises configuration.

See more at
https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity

Co-authored-by: Jacob Ferriero <[email protected]>

Co-authored-by: Jacob Ferriero <[email protected]>

* Allow overrides for pod_template_file (#11162)

* Allow overrides for pod_template_file

A pod_template_file should be treated as a *template*, not a steadfast
rule.

This PR ensures that users can override individual values set by the
pod_template_file so that the same file can be used for multiple tasks.

* fix podtemplatetest

* fix name

* Enables Kerberos sidecar support (#11130)

Some of the users of Airflow are using Kerberos to authenticate
their worker workflows. Airflow has basic support for Kerberos
for some of the operators and it has support to refresh the
temporary Kerberos tokens via the `airflow kerberos` command.

This change adds support for a Kerberos sidecar that connects
to the Kerberos Key Distribution Center and retrieves the
token using a Keytab that should be deployed as a Kubernetes Secret.

It uses a shared volume to share the temporary token. The nice
thing about setting it up as a sidecar is that the Keytab
is never shared with the workers - the secret is only mounted
by the sidecar and the workers only have access to the temporary
token.

Depends on #11129

* Make kill log in DagFileProcessorProcess more informative (#11124)

* Show the location of the queries when the assert_queries_count fails. (#11186)

Example output (I forced one of the existing tests to fail)

```
E   AssertionError: The expected number of db queries is 3. The current number is 2.
E
E   Recorded query locations:
E   	scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:94:	1
E   	scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:101:	1
```

This makes it a bit easier to see what the queries are, without having
to re-run with full query tracing and then analyze the logs.
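
A hedged usage sketch, assuming the assert_queries_count context manager lives in Airflow's test utilities (the import path and the fixture are assumptions):

```python
from tests.test_utils.asserts import assert_queries_count  # assumed location of the helper


def test_pool_metrics_query_count(scheduler_job):  # hypothetical fixture providing a SchedulerJob
    with assert_queries_count(3):  # fails with the per-location breakdown above if the count differs
        scheduler_job._emit_pool_metrics()
```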

* Improve Google Transfer header in documentation index file  (#11166)

* Fix typos in Dockerfile.ci (#11187)

Fixed some spellings

* Remove Unnecessary comprehension in 'any' builtin function (#11188)

The built-in function `any()` supports short-circuiting (evaluation stops as soon as the overall return value of the function is known), but this behavior is lost if you use a list comprehension. This affects performance.
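
Concretely:

```python
items = range(10_000_000)

any(x > 5 for x in items)    # generator: stops after the first six elements
any([x > 5 for x in items])  # list comprehension: evaluates all ten million first
```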

* Optionally tags image when building with Breeze (#11181)

Breeze tags the image based on the default Python version,
branch, and type of the image, but you might want to tag the image
in the same command, especially in automated cases of building
the image via CI scripts, or for security teams that tag the image
based on external factors (build time, person etc.).

This is part of #11171 which makes the image easier to build in
corporate environments.

* in_container bats pre-commit hook and updated bats-tests hook (#11179)

* Fixes image tag readonly failure (#11194)

The image builds fine, but produces an unnecessary error message.

Bug Introduced in c9a34d2ef9ccf6c18b379bbcb81b9381027eb803

* More customizable build process for Docker images (#11176)

* Allows more customizations for image building.

This is the third (and not last) part of making the Production
image more corporate-environment friendly. It's been prepared
at the request of one of the big Airflow users (a company) that
has rather strict security requirements when it comes to
preparing and building images. They are committed to
synchronizing with the progress of Apache Airflow 2.0 development,
and making the image customizable so that they can build it using
only sources controlled by them internally was one of their
important requirements.

This change adds the possibility of customizing various steps in
the build process:

* adding custom scripts to be run before installation of both
  build image and runtime image. This allows for example to
  add installing custom GPG keys, and adding custom sources.

* customizing the way NodeJS and Yarn are installed in the
  build image segment - as they might rely on their own way
  of installation.

* adding extra packages to be installed during both build and
  dev segment build steps. This is crucial to achieve the same
  size optimizations as the original image.

* defining additional environment variables (for example
  environment variables that indicate acceptance of the EULAs
  in case of installing proprietary packages that require
  EULA acceptance) - both in the build image and the runtime image
  (again, the goal is to keep the image optimized for size)

The image build process remains the same when no customization
options are specified, but having those options increases
flexibility of the image build process in corporate environments.

This is part of #11171.

This change also fixes some of the issues opened and raised by
other users of the Dockerfile.

Fixes: #10730
Fixes: #10555
Fixes: #10856

Input from those issues has been taken into account when this
change was designed so that the cases described in those issues
could be implemented. An example from one of the issues landed as
an example of building a highly customized Airflow image
using those customization options.

Depends on #11174

* Update IMAGES.rst

Co-authored-by: Kamil Breguła <[email protected]>

* [AIRFLOW-5545] Fixes recursion in DAG cycle testing (#6175)

* Fixes an issue where cycle detection uses recursion

and stack overflows after about 1000 tasks

(cherry picked from commit 63f1a180a17729aa937af642cfbf4ddfeccd1b9f)

* reduce test length

* slightly more efficient

* Update airflow/utils/dag_cycle_tester.py

Co-authored-by: Kaxil Naik <[email protected]>

* slightly more efficient

* actually works this time

Co-authored-by: Daniel Imberman <[email protected]>
Co-authored-by: Kaxil Naik <[email protected]>

* Add amazon glacier to GCS transfer operator (#10947)

Add Amazon Glacier to GCS transfer operator, Glacier job operator and sensor.

* Strict type coverage for Oracle and Yandex provider  (#11198)

* type coverage for yandex provider

* type coverage for oracle provider

* import optimisation and mypy fix

* import optimisation

* static check fix

* Strict type checking for SSH (#11216)

* Replace get accessible dag ids (#11027)

* Kubernetes executor can adopt tasks from other schedulers (#10996)

* KubernetesExecutor can adopt tasks from other schedulers

* simplify

* recreate tables properly

* fix pylint

Co-authored-by: Daniel Imberman <[email protected]>

* Optional import error tracebacks in web ui (#10663)

This PR allows for partial import error tracebacks to be exposed on the UI, if enabled. This extra context can be very helpful for users without access to the parsing logs to determine why their DAGs are failing to import properly.

* Strict type check for multiple providers (#11229)

* Fix typo in command in CI.rst (#11233)

* Add Python version to Breeze cmd (#11228)

* Use more meaningful message for DagBag timeouts (#11235)

Instead of 'Timeout, PID: 1234' we can use something more meaningful
that will help users understand the logs.

* Prepare Backport release 2020.09.07 (#11238)

* Airflow 2.0 UI Overhaul/Refresh (#11195)

Resolves #10953.

A refreshed UI for the 2.0 release. The existing "theming" is a bit long in the tooth and this PR attempts to give it a modern look and some freshness to complement all of the new features under the hood.

The majority of the changes to the UI have been done through updates to the Bootstrap theme contained in bootstrap-theme.css. These are simply overrides of the default stylings that are packaged with Bootstrap.

* Restore description for provider packages. (#11239)

The change #10445 caused empty descriptions for all packages.

This change restores them and also makes sure package creation works
when there is no README.md

* Small updates to provider preparation docs. (#11240)

* Fixed month in backport packages to October (#11242)

* Add task adoption to CeleryKubernetesExecutor (#11244)

Routes task adoption based on queue name to CeleryExecutor
or KubernetesExecutor

Co-authored-by: Daniel Imberman <[email protected]>

* Remove redundant parentheses (#11248)

* Fix Broken Markdown links in Providers README TOC (#11249)

* Add option to bulk clear DAG Runs in Browse DAG Runs page (#11226)

closes: #11076

* Update yamllint & isort pre-commit hooks (#11252)

yamllint: v1.24.2 -> v1.25.0
isort: 5.5.3 -> 5.5.4

* Ensure target_dedicated_nodes or enable_auto_scale is set in AzureBatchOperator (#11251)

* Add s3 key to template fields for s3/redshift transfer operators (#10890)

* Add missing "example" tag on example DAG (#11253)

`example_task_group` and `example_nested_branch_dag` didn't have the example tag while all the other ones do have it

* Breeze: Fix issue with pulling an image via ID (#11255)

* Move test tools from tests.utils to tests.test_utils (#10889)

* Add Github Code Scanning (#11211)

Github just released Github Code Scanning:
https://github.blog/2020-09-30-code-scanning-is-now-available/

* Add Hacktoberfest topic to the repo (#11258)

* Add operator link to access DAG triggered by TriggerDagRunOperator (#11254)

This commit adds TriggerDagRunLink, which allows users to easily
access in the Web UI a DAG triggered by TriggerDagRunOperator

* The bats script for CI image is now placed in the docker folder (#11262)

The script was previously placed in scripts/ci which caused
a bit of a problem in 1-10-test branch where PRs were using
scripts/ci from the v1-10-test HEAD but they were missing
the ci script from the PR.

The scripts "ci" are parts of the host scripts that are
always taken from master when the image is built, but
all the other stuff should be taken from "docker"
folder - which will be taken from the PR.

* Limits CodeQL workflow to run only in the Apache Airflow repo (#11264)

It has been raised quite a few times that workflows added in forked
repositories might be pretty invasive for the forks - especially
when it comes to scheduled workflows, as they might eat quota
or at least jobs for those organisations/people who fork
repositories.

This is not strictly necessary because GitHub recently recognized this as
a problem and introduced new rules for scheduled workflows. But for people who
have already forked, it would be nice not to run those actions. It is enough
that the CodeQL check is done when a PR is opened against the "apache/airflow"
repository.

Quote from the emails received by Github (no public URL explaining it yet):

> Scheduled workflows will be disabled by default in forks of public repos and in
public repos with no activity for 60 consecutive days.  We’re making two
changes to the usage policy for GitHub Actions. These changes will enable
GitHub Actions to scale with the incredible adoption we’ve seen from the GitHub
community. Here’s a quick overview:

> * Starting today, scheduled workflows will be disabled by default in new forks of
public repositories.
> * Scheduled workflows will be disabled in public repos with
no activity for 60 consecutive days.

* Enable MySQL 8 CI jobs (#11247)

closes https://github.com/apache/airflow/issues/11164

* Improve running and cancelling of the PR-triggered builds. (#11268)

The PR builds are now better handled with regards to both
running (using merge-request) and canceling (with cancel notifications).

First of all we are using merged commit from the PR, not the original commit
from the PR.

Secondly - the workflow run notifies the original PR with comment
stating that the image is being built in a separate workflow -
including the link to that workflow.

Thirdly - when canceling duplicate PRs or PRs with failed
jobs, the workflow will add a comment to the PR stating the
reason why the PR is being cancelled.

Last but not least, we also add a cancel job for the duplicate CodeQL
runs. They run for ~ 12 minutes so it makes perfect sense to
also cancel those CodeQL jobs for which someone pushed fixups in
quick succession.

Fixes: #10471

* Fix link to static checks in CONTRIBUTING.rst (#11271)

* fix job deletion (#11272)

* Allow labels in KubernetesPodOperator to be templated (#10796)

* Access task type via the property, not dundervars (#11274)

We don't currently create TIs from serialized dags, but we are about to
start -- at which point some of these cases would have just shown
"SerializedBaseOperator", rather than the _real_ class name.

The other changes are just for "consistency" -- we should always get the
task type from this property, not via `__class__.__name__`.

I haven't set up a pre-commit rule for this, as this dunder
accessor is used elsewhere on things that are not BaseOperator
instances, and detecting that is hard to do in a pre-commit rule.

* When sending tasks to celery from a sub-process, reset signal handlers (#11278)

Since these processes are spawned from SchedulerJob after it has
registered its signals, if any of them got signaled they would kill
the ProcessorAgent process group! (multiprocessing has a default
start method of fork on Linux, so they inherit all previous state --
signals, and access to the `_process.pid` inside the ProcessorAgent
instance.)

This behaviour is not what we want for these multiprocessing.Pool processes.

This _may_ be a source of the long-standing "scheduler is alive but not
scheduling any jobs" issue. Maybe.

* Switched to Run Checks for Building Images. (#11276)

Replaces the annoying comments with "workflow_run" links
with Run Checks. Now we will be able to see the "Build Image"
checks in the "Checks" section including their status and direct
link to the steps running the image builds as "Details" link.

Unfortunately Github Actions do not handle well the links to
details - even if you provide details_url to link to the other
run, the "Build Image" checks appear in the original workflow,
that's why we had to introduce another link in the summary of
the Build Image check that links to the actual workflow.

* Single/Multi-Namespace mode for helm chart (#11034)

* Multi-Namespace mode for helm chart

Users should not REQUIRE a ClusterRole/ClusterRolebinding
to run airflow via helm. This change will allow "single" and "multi"
namespace modes so users can add airflow to managed kubernetes clusters

* add namespace to role

* add rolebinding too

* add docs

* add values.schema.json change

* Add LocalToAzureDataLakeStorageOperator (#10814)

* Add CeleryKubernetesExecutor to helm chart (#11288)

Users of the CeleryKubernetesExecutor will require both
Celery and Kubernetes features to launch tasks.

This PR will also serve as the basis for integration tests for this
executor

Co-authored-by: Daniel Imberman <[email protected]>

* Strict type check for all hooks in amazon (#11250)

* Replaces deprecated set-env with env file (#11292)

Github Actions deprecated the set-env action due to a moderate security
vulnerability they found.

https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/

This commit replaces set-env with env file as explained in

https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#environment-files

* Breeze start-airflow command wasn't able to initialize the db in 1.10.x (#11207)

* Add type annotations to ZendeskHook, update unit test (#10888)

* Add type annotations to ZendeskHook

__What__

* Add correct type annotations to ZendeskHook and each method
* Update one unit test to call an empty dictionary rather than a
NoneType since the argument should be a dictionary

__Why__

* Building out type annotations is good for the code base
* The query parameter is accessed with an index at one point, which
means that it cannot be a None type, but should rather be defaulted to
an empty dictionary if not provided

* Remove useless return

* Add acl_policy parameter to GCSToS3Operator (#10804) (#10829)

* add releasing airflow docs to dev readme (#11245)

* Prevent race condition in trying to collect result from DagFileProcessor (#11306)

A rare race condition was noticed in the Scheduler HA PR where the
"test_dags_with_system_exit" test would occasionally fail with the
following symptoms:

- The pipe was "readable" as returned by
  `multiprocessing.connection.wait`
- On reading it yielded an EOFError, meaning the other side had closed
  the connection
- But the process was still alive/running

This previously would result in the Manager process dying with an error.

This PR makes a few changes:

- It ensures that the pipe is simplex, not duplex (we only ever send one
  piece of data) as this is simpler
- We ensure that the "other" end of the pipe is correctly closed in both
  parent and child processes. Without this the pipe would be kept open
  (sometimes) until the child process had closed anyway.
- When we get an EOFError on reading and the process is still alive, we
  give it a few seconds to shut down cleanly, and then kill it.
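
A standard-library sketch of the first two fixes, a simplex pipe plus closing the unused end in each process, so that EOF on read reliably means the child is done (names are illustrative):

```python
import multiprocessing


def child(write_end):
    write_end.send("parsing result")   # the single piece of data we ever send
    write_end.close()


if __name__ == "__main__":
    read_end, write_end = multiprocessing.Pipe(duplex=False)  # simplex: one direction only
    proc = multiprocessing.Process(target=child, args=(write_end,))
    proc.start()
    write_end.close()       # parent closes its copy of the write end
    print(read_end.recv())  # an EOFError here now reliably means the child closed its end
    proc.join()
```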

* Bump tenacity to 6.2 (#11313)

* Move latest_only_operator.py to latest_only.py (#11178) (#11304)

* Adds --no-rbac-ui flag for Breeze airflow 1.10 installation (#11315)

When installing Airflow 1.10 via Breeze we now enable RBAC
by default, but we can disable it with the --no-rbac-ui flag.

This is useful to test different variants of 1.10 when testing
release candidates in connection with the 'start-airflow'
command.

* Add remaining community guidelines to CONTRIBUTING.rst (#11312)

We are cleaning up the docs from CWiki and this is what's left of
community guidelines that were maintained there.

Fixes #10181

* Improve handling of job_id in BigQuery operators (#11287)

Make autogenerated job_id more unique by using microseconds and hash of configuration. Replace dots in job_id.
Closes: #11280
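
A hedged illustration of that job_id scheme (not the operator's actual code): microseconds plus a hash of the configuration, with characters BigQuery does not accept in job IDs replaced.

```python
import hashlib
import json
import re
from datetime import datetime


def make_job_id(task_id: str, configuration: dict) -> str:
    ts = datetime.utcnow().strftime("%Y_%m_%d_%H_%M_%S_%f")  # includes microseconds
    config_hash = hashlib.md5(json.dumps(configuration, sort_keys=True).encode()).hexdigest()
    # BigQuery job ids may only contain letters, digits, underscores and dashes
    return re.sub(r"[^0-9a-zA-Z_\-]", "_", f"airflow_{task_id}_{ts}_{config_hash}")


print(make_job_id("my.task", {"query": {"query": "SELECT 1"}}))
```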

* Prints nicer message in case of git push errors (#11320)

We started to get "unknown blob" kinds of errors more often when
pushing the images to the GitHub Registry. While this is clearly a
GitHub issue, its frequency of occurrence and unclear message
make it a good candidate for an additional message with
instructions to the users, especially now that they have
an easy way to get to that information via status checks and
links leading to the log file when this problem happens during the
image building process.

This way users will know that they should simply rebase or
amend/force-push their change to fix it.

* Add AzureFileShareToGCSOperator (#10991)

* Automatically upgrade old default navbar color (#11322)

As part of #11195 we re-styled the UI, changing a lot of the default
colours to make them look more modern. However for anyone upgrading and
keeping their airflow.cfg from 1.10 to 2.0 they would end up with things
looking a bit ugly, as the old navbar color would be kept.

This uses the existing config value upgrade feature to automatically
change the old default colour into the new default colour.

* Pin versions of "untrusted" 3rd-party GitHub Actions (#11319)

According to https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/security-hardening-for-github-actions#using-third-party-actionsa
it's best practice not to use tags in case of untrusted
3rd-party actions in order to avoid potential attacks.

* Moves Commiter's guide to CONTRIBUTING.rst (#11314)

I decided to move it to CONTRIBUTING.rst as it is important
documentation on what policies we have agreed to as a community, and
it is also a great resource for contributors to learn what
the committers' responsibilities are.

Fixes: #10179

* Add environment variables documentation to cli-ref.rst. (#10970)

Co-authored-by: Fai Hegberg <[email protected]>

* Update link for Announcement Page (#11337)

* Strict type check for azure hooks (#11342)

* Adds --install-wheels flag to breeze command line (#11317)

If this flag is specified it will look for wheel packages placed in dist
folder and it will install the wheels from there after installing
Airflow. This is useful for testing backport packages as well as in the
future for testing provider packages for 2.0.

* Improve code quality of SLA mechanism in SchedulerJob (#11257)

* Improve Committer's guide docs (#11338)

* Add Azure Blob Storage to GCS transfer operator (#11321)

* Better message when Building Image fails or gets cancelled. (#11333)

* Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)

This reverts commit de07d135ae1bda3f71dd83951bcfafc2b3ad9f89.

* Fix command to run tmux with breeze in BREEZE.rst (#11340)

`breeze --start-airflow` -> `breeze start-airflow`

* Improve instructions to install Airflow Version (#11339)

The instructions can be replaced by `./breeze start-airflow` command

* Reduce "start-up" time for tasks in LocalExecutor (#11327)

Spawning a whole new Python process and then re-loading all of Airflow
is expensive. Although this time fades to insignificance for long
running tasks, this delay gives a "bad" experience for new users when
they are just trying out Airflow for the first time.

For the LocalExecutor this cuts the "queued time" down from 1.5s to 0.1s
on average.

* Bump cache version for kubernetes tests (#11355)

Seems that the k8s cache for virtualenv got broken during the
recent problems. This commit bumps the cache version to start
it afresh

* Better diagnostics when there are problems with Kerberos (#11353)

* Fix to make y-axis of Tries chart visible (#10071)

Co-authored-by: Venkatesh Selvaraj <[email protected]>

* Bugfix: Error in SSHOperator when command is None (#11361)

closes https://github.com/apache/airflow/issues/10656

* Always use Airflow db in FAB (#11364)

* Use only-if-needed upgrade strategy for PRs (#11363)

Currently, upgrading dependencies in setup.py still runs with the previous versions of the packages for the PR, which fails.

This will change to upgrade only the packages that are required for the PR

* Fix DagBag bug when a dag has invalid schedule_interval (#11344)

* Adding ElastiCache Hook for creating, describing and deleting replication groups (#8701)

* Fix regression in DataflowTemplatedJobStartOperator (#11167)

* Strict type check for Microsoft  (#11359)

* Reduce "start-up" time for tasks in CeleryExecutor (#11372)

This is similar to #11327, but for Celery this time.

The impact is not quite as pronounced here (for simple dags at least)
but takes the average queued to start delay from 1.5s to 0.4s

* Set start_date, end_date & duration for tasks failing without DagRun (#11358)

* Replace nuke with useful information on error page (#11346)

This PR replaces the nuke ASCII art with text about reporting a bug.
As we are no longer using ASCII art, this PR removes it.

* Users can specify sub-secrets and paths k8spodop (#11369)

Allows users to specify items for specific key path projections
when using the airflow.kubernetes.secret.Secret class

* Add capability of adding service account annotations to Helm Chart (#11387)

We can now add annotations to the service accounts in a generic
way. This allows, for example, adding Workload Identity in a GKE
environment, but it is not limited to that.

Co-authored-by: Kamil Breguła <[email protected]>

Co-authored-by: Jacob Ferriero <[email protected]>
Co-authored-by: Kamil Breguła <[email protected]>

* Add pypirc initialization (#11386)

This PR needs to be merged first in order to handle the #11385
which requires .pypirc to be created before dockerfile gets build.

This means that the script change needs to be merged to master
first in this PR.

* Fully support running more than one scheduler concurrently (#10956)

* Fully support running more than one scheduler concurrently.

This PR implements scheduler HA as proposed in AIP-15. The high level
design is as follows:

- Move all scheduling decisions into SchedulerJob (requiring DAG
  serialization in the scheduler)
- Use row-level locks to ensure schedulers don't stomp on each other
  (`SELECT ... FOR UPDATE`)
- Use `SKIP LOCKED` for better performance when multiple schedulers are
  running. (Mysql < 8 and MariaDB don't support this)
- Scheduling decisions are not tied to the parsing speed, but can
  operate just on the database

*DagFileProcessorProcess*:

Previously this component was responsible for more than just parsing the
DAG files as its name might imply. It was also responsible for creating
DagRuns, and also for making scheduling decisions for TIs, sending them from
"None" to "scheduled" state.

This commit changes it so that the DagFileProcessorProcess now will
update the SerializedDAG row for this DAG, and make no scheduling
decisions itself.

To make the scheduler's job easier (so that it can make as many
decisions as possible without having to load the possibly-large
SerializedDAG row) we store/update some columns on the DagModel table:

- `next_dagrun`: The execution_date of the next dag run that should be created (or
  None)
- `next_dagrun_create_after`: The earliest point at which the next dag
  run can be created

Pre-computing these values (and updating them every time the DAG is
parsed) reduces the overall load on the DB, as many decisions can be taken
by selecting just these two columns/the small DagModel row.

In case of max_active_runs, or `@once` these columns will be set to
null, meaning "don't create any dag runs"

*SchedulerJob*

The SchedulerJob used to only queue/send tasks to the executor after
they were parsed, and returned from the DagFileProcessorProcess.

This PR breaks the link between parsing and enqueuing of tasks, instead
of looking at DAGs as they are parsed, we now:

-  store a new datetime column, `last_scheduling_decision` on DagRun
  table, signifying when a scheduler last examined a DagRun
- Each time around the loop the scheduler will get (and lock) the next
  _n_ DagRuns via `DagRun.next_dagruns_to_examine`, prioritising DagRuns
  which haven't been touched by a scheduler in the longest period
- SimpleTaskInstance etc have been almost entirely removed now, as we
  use the serialized versions

* Move callbacks execution from Scheduler loop to DagProcessorProcess

* Don’t run verify_integrity if the Serialized DAG hasn’t changed

dag_run.verify_integrity is slow, and we don't want to call it every time, just when the dag structure changes (which we can know now thanks to DAG Serialization)

* Add escape hatch to disable newly added "SELECT ... FOR UPDATE" queries

We are worried that these extra uses of row-level locking will cause
problems on MySQL 5.x (most likely deadlocks) so we are providing users
an "escape hatch" to be able to make these queries non-locking -- this
means that only a single scheduler should be run, but being able to run
one is better than having the scheduler crash.

Co-authored-by: Kaxil Naik <[email protected]>
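
A minimal SQLAlchemy sketch of the locking pattern described above (function name, filter and ordering details are assumptions, not the actual DagRun.next_dagruns_to_examine implementation):

```python
from airflow.models.dagrun import DagRun


def next_dagruns_to_examine(session, batch_size=20):
    return (
        session.query(DagRun)
        .filter(DagRun.state == "running")
        .order_by(DagRun.last_scheduling_decision.asc().nullsfirst())  # least recently examined first
        .limit(batch_size)
        .with_for_update(skip_locked=True)  # SELECT ... FOR UPDATE SKIP LOCKED
        .all()
    )
```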

* Revert "Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)" (#11356)

This reverts commit f67e6cb805ebb88db1ca2c995de690dc21138b6b.

* Replaced basestring with str in the Exasol hook (#11360)

* [airflow/providers/cncf/kubernetes] correct hook methods name (#11008)

* Fix airflow_local_settings.py showing up as directory (#10999)

Fixes a bug where the airflow_local_settings.py mounts as a volume
if there is no value (this causes k8sExecutor pods to fail)

* Fix case of JavaScript. (#10957)

* Add tests for Custom cluster policy (#11381)

The custom ClusterPolicyViolation was added in #10282.
This one adds more comprehensive tests for it.

Co-authored-by: Jacob Ferriero <[email protected]>

* KubernetesPodOperator should retry log tailing in case of interruption (#11325)

* KubernetesPodOperator can retry log tailing in case of interruption

* fix failing test

* change read_pod_logs method formatting

* KubernetesPodOperator retry log tailing based on last read log timestamp

* fix test_parse_log_line test  formatting

* add docstring to parse_log_line method

* fix kubernetes integration test

* fix tests (#11368)

* Constraints and PIP packages can be installed from local sources (#11382)

* Constraints and PIP packages can be installed from local sources

This is the final part of implementing #11171, based on feedback
from enterprise customers we worked with. They want the
capability of building the image using binary wheel packages
that are locally available and the official Dockerfile. This means
that, besides the official APT sources, the Dockerfile build should
not need GitHub, nor any other external files pulled from outside,
including a PIP repository.

This change also includes documentation on how to prepare a set of
such binaries ready for inspection and review by security teams
in an enterprise environment. Such sets of "known-working-binary-whl"
files can then be separately committed, tracked and scrutinized
in an artifact repository of such an enterprise.

Fixes: #11171

* Update docs/production-deployment.rst

* Push and schedule duplicates are not cancelled. (#11397)

The push and schedule builds should not be cancelled even if
they are duplicates. By seeing which of the master merges
failed, we have better visibility on which merge caused
a problem and we can trace its origin faster, even if the builds
take longer overall.

Scheduled builds also serve their purpose and they should
always be run to completion.

* Remove redundant parentheses from Python files (#10967)

* Fixes automated upgrade to latest constraints. (#11399)

A wrong if condition in the GitHub action meant that the upgrade to latest
constraints did not work for a while.

* Fixes cancelling of too many workflows. (#11403)

A problem was introduced in #11397 where a bit too many "Build Image"
jobs are being cancelled by a subsequent Build Image run. For now it
cancels all the Build Image jobs that are running :(.

* Fix spelling (#11401)

* Fix spelling (#11404)

* Workarounds "unknown blob" issue by introducing retries (#11411)

We have started to experience "unknown_blob" errors intermittently
recently with the GitHub Docker registry. We might eventually need
to migrate to GCR (which eventually is going to replace the
Docker Registry for GitHub).

A ticket has been opened with Apache Infrastructure to enable
access to GCR and to make some statements about Access
Rights management for GCR: https://issues.apache.org/jira/projects/INFRA/issues/INFRA-20959
Also a ticket to GitHub Support has been raised about it
https://support.github.com/ticket/personal/0/861667 as we
cannot delete our public images in the Docker registry.

But until this happens, the workaround might help us
to handle the situations where we get intermittent errors
while pushing to the registry. This seems to be a common
error when an NGINX proxy is used to proxy the GitHub Registry, so
it is likely that retrying will work around the issue.

* Add capability of customising PyPI sources (#11385)

* Add capability of customising PyPI sources

This change adds the capability of customising the installation of PyPI
modules via a custom .pypirc file. This might allow installing
dependencies from an in-house, vetted PyPI registry.

* Moving the test to quarantine. (#11405)

I've seen the test being flaky and failing intermittently several times.

Moving it to quarantine for now.

* Optionally set null marker in csv exports in BaseSQLToGCSOperator (#11409)

* Fixes SHA used for cancel-workflow-action (#11400)

The SHA of cancel-workflow-action in #11397 was pointing to previous
(3.1) version of the action. This PR fixes it to point to the
right (3.2) version.

* Split tests to more sub-types (#11402)

We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI, therefore it makes
sense to split the tests into more batches. This is not yet the full
implementation of selective tests but it is going in this direction
by splitting into Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of the #10507 issue.

This split is possible thanks to #10422 which moved building the image
to a separate workflow - this way each image is only built once
and it is uploaded to a shared registry, where it is quickly
downloaded from rather than built by all the jobs separately - this
way we can have many more jobs as there is very little per-job
overhead before the tests start running.

* Fix incorrect typing, remove hardcoded argument values and improve code in AzureContainerInstancesOperator (#11408)

* Fix constraints generation script (#11412)

Constraints generation script was broken by recent changes
in naming of constraints URL variables and moving generation
of the link to the Dockerfile

This change restores the script's behaviour.

* Fix spelling in CeleryExecutor (#11407)

* Add more info about dag_concurrency (#11300)

* Update MySQLToS3Operator's s3_bucket to template_fields (#10778)

* Change prefix of AwsDynamoDB hook module (#11209)

* align import path of AwsDynamoDBHook in aws providers

Co-authored-by: Tomek Urbaszek <[email protected]>

* Strict type check for google ads and cloud hooks (#11390)

* Mutual SSL added in PGBouncer configuration in the Chart (#11384)

Adds SSL configuration for PGBouncer in the Helm Chart. PGBouncer
is crucial to handle the large number of connections that airflow
opens for the database, but often the database is outside of the
Kubernetes Cluster or, generally, the environment where Airflow is
installed, and PGBouncer needs to connect securely to such a database.

This PR adds the capability of setting the CA/Certificate and Private Key
in the PGBouncer configuration, which allows for mTLS authentication
(both client and server are authenticated) and a secure connection
even over a public network.

* Merge Airflow and Backport Packages preparation instructions (#11310)

This commit extracts common parts of Apache Airflow package
preparation and Backport Packages preparation.

Common parts were extracted as prerequisites, the release process
has been put in chronological order, some details about preparing
backport packages have been moved to a separate README.md
in the Backport Packages to not confuse release instructions
with tool instructions.

* Fix syntax highlightling for concurrency in configurations doc (#11438)

`concurrency` -> ``concurrency`` since it is rendered in rst

* Fix typo in airflow/utils/dag_processing.py (#11445)

`availlablle` -> `available`

* Selective tests - depends on files changed in the commit. (#11417)

This is the final step of implementing #10507 - selective tests.
Depending on the files changed by the incoming commit, only a subset of
the tests is executed. The conditions below are evaluated in the
sequence in which they are listed:

* In case of "push" and "schedule" type of events, all tests
  are executed.

* If no important files and folders changed - no tests are executed.
  This is a typical case for doc-only changes.

* If any of the environment files (Dockerfile/setup.py etc.) changed,
  all tests are executed.

* If no "core/other" files are changed, only the relevant types
  of tests are executed:

  * API - if any of the API files/tests changed
  * CLI - if any of the CLI files/tests changed
  * WWW - if any of the WWW files/tests changed
  * Providers - if any of the Providers files/tests changed

* Integration, Heisentests, Quarantined, Postgres and MySQL
  runs always happen unless all tests are skipped, as in the
  case of doc-only changes.

* If "Kubernetes" related files/tests are changed, the
  "Kubernetes" tests with Kind are run. Note that those tests
  are run separately using Host environment and those tests
  are stored in "kubernetes_tests" folder.

* If some of the core/other files change, all tests are run. This
  is calculated by subtracting the count of files matched
  above from the total count of important files.

Fixes: #10507
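
The rules above can be read as a simple decision function. The sketch below is purely illustrative Python (it is not the actual CI selection script), with file patterns and group names made up to restate the evaluation order:

```
# Illustrative only -- NOT the actual CI script; file patterns and helpers below are assumptions.
ENVIRONMENT_FILES = {"Dockerfile", "Dockerfile.ci", "setup.py", "setup.cfg"}
GROUP_PREFIXES = {
    "API": ("airflow/api", "tests/api"),
    "CLI": ("airflow/cli", "tests/cli"),
    "WWW": ("airflow/www", "tests/www"),
    "Providers": ("airflow/providers", "tests/providers"),
    "Kubernetes": ("kubernetes_tests",),
}
DOC_ONLY_PREFIXES = ("docs/", "README")


def select_tests(event_type, changed_files):
    # Push and scheduled builds always run everything.
    if event_type in ("push", "schedule"):
        return {"all"}

    # Doc-only changes touch no important files, so nothing runs.
    important = [f for f in changed_files if not f.startswith(DOC_ONLY_PREFIXES)]
    if not important:
        return set()

    # Changes to environment files force a full run.
    if any(f in ENVIRONMENT_FILES for f in important):
        return {"all"}

    selected = set()
    unmatched = []
    for f in important:
        groups = [g for g, prefixes in GROUP_PREFIXES.items() if f.startswith(prefixes)]
        if groups:
            selected.update(groups)
        else:
            unmatched.append(f)

    # Any remaining core/other files trigger the full suite.
    if unmatched:
        return {"all"}

    # These runs always happen unless everything was skipped above.
    selected |= {"Integration", "Heisentests", "Quarantined", "Postgres", "MySQL"}
    return selected
```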

* Fix correct Sphinx return type for DagFileProcessorProcess.result (#11444)

* Use augmented assignment (#11449)

`tf_count += 1` instead of `tf_count = tf_count + 1`

* Remove redundant None provided as default to dict.get() (#11448)

* Fix spelling (#11453)

* Refactor celery worker command (#11336)

This commit does a small refactor of the way we start the celery worker.
This will make it easier to migrate to Celery 5.0.

* Move the test_process_dags_queries_count test to quarantine (#11455)

The test (test_process_dags_queries_count)
randomly produces a bigger number of query counts. Example here:

https://github.com/apache/airflow/runs/1239572585#step:6:421

* Google cloud operator strict type check (#11450)

import optimisation

* Increase timeout for waiting for images (#11460)

Now that we have many more jobs to run, it might happen that
when a lot of PRs are submitted one after the other, there is
a longer waiting time for building the image.

There is only one waiting job per image type, so it does not
cost a lot to wait a bit longer, in order to avoid cancellation
after 50 minutes of waiting.

* Add more testing methods to dev/README.md (#11458)

* Adds missing schema for kerberos sidecar configuration (#11413)

* Adds missing schema for kerberos sidecar configuration

The kerberos support added in #11130 did not have schema added
to the values.yml. This PR fixes it.

Co-authored-by: Jacob Ferriero <[email protected]>

* Update chart/values.schema.json

Co-authored-by: Jacob Ferriero <[email protected]>

* Rename backport packages to provider packages (#11459)

In preparation for adding provider packages to the 2.0 line, we
are renaming backport packages to provider packages.

We want to implement this in stages - first rename the
packages, then split out backport/2.0 providers as part of
issue #11421.

* Add option to enable TCP keepalive for communication with Kubernetes API (#11406)

* Add option to enable TCP keepalive mechanism for communication with Kubernetes API

* Add keepalive default options to default_airflow.cfg

* Add reference to PR

* Quote parameters names in configuration

* Add problematic words to spelling_wordlist.txt
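
A hedged sketch of how such options are typically enabled; the exact option names (`enable_tcp_keepalive` and friends under the `[kubernetes]` section) are assumptions based on the commit description, shown here via Airflow's standard AIRFLOW__SECTION__KEY environment-variable convention:

```
# Hypothetical sketch: option names are assumptions; only the AIRFLOW__SECTION__KEY
# environment-variable convention itself is standard Airflow behaviour.
import os

os.environ["AIRFLOW__KUBERNETES__ENABLE_TCP_KEEPALIVE"] = "True"
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_IDLE"] = "120"   # seconds of idle time before probes start (assumed)
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_INTVL"] = "30"   # seconds between keepalive probes (assumed)
os.environ["AIRFLOW__KUBERNETES__TCP_KEEP_CNT"] = "6"      # failed probes before the connection is dropped (assumed)
```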

* Enables back duplicate cancelling on push/schedule (#11471)

We disabled duplicate cancelling on push/schedule in #11397,
but it caused a lot of extra strain when several commits
were merged in quick succession. The master merges are always
full builds and take a lot of time, but if we merge PRs
quickly, the subsequent merge cancels the previous ones.

This has the negative consequence that we might not know who
broke the master build, but this happens rarely enough to accept
the pain in exchange for a much less strained queue in GitHub Actions.

* Fix typo in docker-context-files/README.md (#11473)

`th` -> `the`

* added type hints for aws cloud formation (#11470)

* Mask Password in Log table when using the CLI (#11468)

* Mount volumes and volumemounts into scheduler and workers (#11426)

* Mount arbitrary volumes and volumeMounts to scheduler and worker

Allows users to mount volumes to scheduler and workers

* tested

* Bump FAB to 3.1 (#11475)

FAB released a new version today - https://pypi.org/project/Flask-AppBuilder/3.1.0/ - which removes the annoying "missing font file format for glyphicons" error.

* Allow multiple schedulers in helm chart (#11330)

* Allow multiple schedulers in helm chart

* schema

* add docs

* add to readme

Co-authored-by: Daniel Imberman <[email protected]>

* Fix Harcoded Airflow version (#11483)

This test will fail or will need fixing whenever we release a new Airflow
version.

* Add link on External Task Sensor to navigate to target dag (#11481)

Co-authored-by: Kaz Ukigai <[email protected]>

* Spend less time waiting for LocalTaskJob's subprocess to finish (#11373)

* Spend less time waiting for LocalTaskJob's subprocess to finish

This is about a 20% speed up for short-running tasks!

This change doesn't affect the "duration" reported in the TI table, but
does affect the time before the slot is freed up in the executor -
which does affect overall task/dag throughput.

(All these tests are with the same BashOperator tasks, just running `echo 1`.)

**Before**

```
Task airflow.executors.celery_executor.execute_command[5e0bb50c-de6b-4c78-980d-f8d535bbd2aa] succeeded in 6.597011625010055s: None
Task airflow.executors.celery_executor.execute_command[0a39ec21-2b69-414c-a11b-05466204bcb3] succeeded in 6.604327297012787s: None

```

**After**

```
Task airflow.executors.celery_executor.execute_command[57077539-e7ea-452c-af03-6393278a2c34] succeeded in 1.7728257849812508s: None
Task airflow.executors.celery_executor.execute_command[9aa4a0c5-e310-49ba-a1aa-b0760adfce08] succeeded in 1.7124666879535653s: None
```

**After, including change from #11372**

```
Task airflow.executors.celery_executor.execute_command[35822fc6-932d-4a8a-b1d5-43a8b35c52a5] succeeded in 0.5421732050017454s: None
Task airflow.executors.celery_executor.execute_command[2ba46c47-c868-4c3a-80f8-40adaf03b720] succeeded in 0.5469810889917426s: None
```

* Add endpoints for task instances (#9597)
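
A hedged example (not part of the PR) of calling the task-instance listing endpoint of the stable REST API; the host, credentials and DAG/run identifiers are placeholders:

```
# Hypothetical call to the task-instances listing endpoint; host, credentials and IDs are placeholders.
import requests

resp = requests.get(
    "http://localhost:8080/api/v1/dags/example_dag/dagRuns/example_run_id/taskInstances",
    auth=("admin", "admin"),
)
resp.raise_for_status()
for ti in resp.json()["task_instances"]:
    print(ti["task_id"], ti["state"])
```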

* Enable serialization by default (#11491)

We actually need to make serialization the default, but this is an
interim measure for the Airflow 2.0.0.alpha1 release.

Since many of the tests will fail with it enabled (they need fixing up
to ensure DAGs are in the serialized table), as a hacky measure we have
set it back to false in the tests.

* Add missing values entries to Parameters in chart/README.md (#11477)

* Rename "functional DAGs" to "Decorated Flows" (#11497)

Functional DAGs were so called because the DAG is "made up of functions",
but this AIP adds much more than just the task decorator change -- it
adds nicer XCom use, and in many cases automatic dependencies between
tasks.

"Functional" also invokes "functional programming" which this isn't.

* Prevent text-selection of scheduler interval when selecting DAG ID (#11503)

* Mark Smart Sensor as an early-access feature (#11499)

* Fix spelling for Airbnb (#11505)

* Added support for provider packages for Airflow 2.0 (#11487)

* Separate changes/readmes for backport and regular providers

We have now separate release notes for backport provider
packages and regular provider packages.

They have different versioning - backport provider
packages with CALVER, regular provider packages with
semver.

* Added support for provider packages for Airflow 2.0

This change consists of the following changes:

* adds provider package support for 2.0
* adds generation of package readme and change notes
* versions are for now hard-coded to 0.0.1 for first release
* adds automated tests for installation of the packages
* rename backport package readmes/changes to BACKPORT_*
* adds regular package readmes/changes
* updates documentation on generating the provider packages
* adds CI tests for the packages
* maintains backport packages generation with --backports flag

Fixes #11421
Fixes #11424
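
As a hedged illustration of what installing a regular provider package looks like from user code; the package and class names below are examples from the Google provider and are assumptions as far as this PR is concerned:

```
# After `pip install apache-airflow-providers-google` (package name assumed),
# operators are imported from the providers namespace:
from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator

run_query = BigQueryExecuteQueryOperator(
    task_id="run_query",
    sql="SELECT 1",
    use_legacy_sql=False,
)
```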

* Airflow tutorial to use Decorated Flows (#11308)

Created a new Airflow tutorial to use Decorated Flows (a.k.a. functional
DAGs). Also created a DAG to perform the same operations without using
functional DAGs to be compatible with Airflow 1.10.x and to show the
difference.

* Apply suggestions from code review

It makes sense to simplify the return variables being passed around without needlessly conv…
@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:dev-tools labels Oct 14, 2020
@github-actions

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@mik-laj mik-laj removed the area:API Airflow's REST/HTTP API label Oct 19, 2020
@kaxil
Member

kaxil commented Nov 4, 2020

Can you please rebase your PR on the latest Master, since we have applied Black and PyUpgrade on Master.

It will help if you squash your commits into a single commit first so that there are fewer conflicts.

@stale

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Dec 25, 2020
@github-actions github-actions bot closed this Mar 1, 2021