[AIRFLOW-2156] Parallelize Celery Executor task state fetching #3830
Conversation
Force-pushed from a8bd2b5 to 23faf39
Codecov Report
@@            Coverage Diff            @@
##           master    #3830     +/-  ##
=========================================
- Coverage   77.51%    77.5%    -0.02%
=========================================
  Files         200      200
  Lines       15815    15852      +37
=========================================
+ Hits        12259    12286      +27
- Misses       3556     3566      +10
Continue to review full report at Codecov.
Nice job! Are we aiming this improvement for 1.10.1?
@feng-tao Not sure what our timeline for 1.10.1 is, but there are two other PRs on the same topic coming in the next week, and they are less mature (in terms of how long they've been running in our cluster). I would hope that if this makes it into 1.10.1, all three go in together. (With the three of them together, Airflow should conservatively be able to handle 30k concurrently running tasks and 4k DAG files while keeping scheduling delay within 5 minutes, even in extreme cases where 30k tasks need to be scheduled at the same time.) @aoen @Fokko @bolkedebruin @mistercrunch @afernandez @saguziel @YingboWang PTAL
Thanks @yrqls21. I don't have a Celery executor at hand to give this a try, so I need some more time. @kaxil @fenglu-g This might be interesting for Cloud Composer as well.
@kaxil Tyvm. We definitely should test thoroughly. Just to provide a data point here: the change has been running in Airbnb production for 2+ months, plus more time in a stress-test cluster (we're running 1.8 + Celery executor). For the Codecov failure, should I rebase to fix it?
I wonder what stat is used (scheduler delay?) in the perf test graph. Overall it makes sense. Great job.
# roughly uniform in size
chunksize = self._num_tasks_per_process()

self.log.debug("Waiting for inquiries to complete...")
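For context, a minimal sketch of how such a chunk size could be derived — num_tasks_per_process here is a hypothetical stand-in for the method referenced in the hunk above, not the PR's actual implementation:

import math

def num_tasks_per_process(num_tasks, sync_parallelism):
    # Split the tracked tasks across the sync processes in roughly
    # uniform chunks, never returning less than 1.
    return max(1, int(math.ceil(num_tasks / float(sync_parallelism))))

# e.g. 100 tracked tasks across 16 sync processes -> chunks of 7
print(num_tasks_per_process(100, 16))  # 7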
The logging usage doesn't seem to be consistent (self.logger vs self.log).
My bad here, we are still on 1.8. Will update.
self.traceback = exception_traceback


def fetch_celery_task_state(celery_task):
Does the network call happen here? Based on http://docs.celeryproject.org/en/latest/reference/celery.result.html, it uses celery_task.collect to fetch the result. I don't quite get how the network call happens in this func though.
And do we retry gathering the task state?
Accessing celery_task.state will make the network call to fetch the task state from Celery. collect() and get() are blocking methods that try to get the result of the task. Here I don't retry but just expect the next fetching attempt to succeed, to keep the logic simpler, since we didn't retry previously (though fetching in parallel may have a bigger chance of failing, waiting for a retry on the next loop does not hurt too much).
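A rough sketch of the pattern being described, assuming tasks maps a task key to a celery.result.AsyncResult (names and error handling simplified from the actual diff):

from multiprocessing import Pool

def fetch_celery_task_state(celery_task):
    # celery_task is a (key, AsyncResult) tuple; reading .state is what
    # triggers the network round trip to the Celery result backend.
    try:
        return celery_task[0], celery_task[1].state
    except Exception as e:
        # No retry here: the next sync loop will simply fetch this task again.
        return celery_task[0], e

# Fan the lookups out over a process pool, e.g.:
# pool = Pool(processes=sync_parallelism)
# states = pool.map(fetch_celery_task_state, tasks.items(), chunksize)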
@@ -380,6 +380,9 @@ flower_port = 5555
# Default queue that tasks get assigned to and that worker listen on.
default_queue = default

# How many processes CeleryExecutor uses to sync task state.
sync_parallelism = 16
I think the default value should be 1 to keep backward compatibility (unless we mention it in UPDATING.md). Users could always modify this value to take advantage of this feature. And if we set the default value to 16, I wonder what the configuration is for the box where the test was run.
I can tune it back to 1, but I can hardly call it a backward-compatibility issue, since we are using subprocesses to fetch and the functionality doesn't actually change. About the test env: we were running with 16, as mentioned. Sorry for missing that.
Does multiprocessing do a fork, or an exec of a whole new Python process? If it's exec then the memory consumption is drastically higher, and this should default to 1 or 2. If it's a fork and we can take advantage of COW to reduce memory then I guess this is okay.
Still probably worth a note in UPDATING.md about this new setting and how people should tweak it based on their scheduler node size.
Couldn't we get the number of CPUs? There are already a lot of parameters to tune; adding more will only make it more complex.
We can decide on the default value, but I do think it is better to make it configurable; we can put comments stating clearly whether and how to tune it. In an ideal world, matching the # of processes with the # of (v)cores maximizes performance, but in our case there might be other processes running (e.g. task runner processes from LocalExecutor, DAG parsing processes, processes from other services such as the webserver).
Agreed, but set it to max(1, cores - 1); this is quite a common pattern.
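A minimal sketch of that default (the helper name is hypothetical):

from multiprocessing import cpu_count

def default_sync_parallelism():
    # Leave one core for the scheduler's other work (DAG parsing,
    # LocalExecutor task runners, webserver, etc.), flooring at 1.
    return max(1, cpu_count() - 1)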
👍 Will update
@feng-tao Thank you very much for reviewing. The scheduling delay is generated by a monitoring task that runs on the cluster at a 5-minute interval. It compares the current timestamp against the expected scheduling timestamp (execution_date + 5m) and sends the time difference in minutes as one data point on the metric graph; e.g. a monitoring task with execution_date 2018-01-01T00:00:00 that started at 2018-01-01T00:06:00 will put a 1m scheduling-delay data point onto the metric graph. (I'll update the description too.)
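As a sketch of the arithmetic behind that metric (the 5-minute interval is taken from the description above; the helper name is an assumption):

from datetime import datetime, timedelta

SCHEDULE_INTERVAL = timedelta(minutes=5)

def scheduling_delay_minutes(execution_date, started_at):
    # Delay is how far past the expected scheduling timestamp
    # (execution_date + one schedule interval) the task actually started.
    expected = execution_date + SCHEDULE_INTERVAL
    return (started_at - expected).total_seconds() / 60.0

# execution_date 2018-01-01T00:00:00, started 2018-01-01T00:06:00 -> 1.0
print(scheduling_delay_minutes(datetime(2018, 1, 1, 0, 0),
                               datetime(2018, 1, 1, 0, 6)))  # 1.0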
Force-pushed from 23faf39 to fdc7423
LGTM, but let's wait a couple of days and see if others have a chance to review / run it.
@feng-tao Definitely. Thank you very much!
Let's not pull this in to 1.10.1 -- it very much doesn't fall into the category of "bug fix or small features".
Smaller change than I thought from the subject of the PR, so I'm less against putting this in 1.10.1, but it still feels like quite a large change for a point release. Thoughts?
self.traceback = exception_traceback


def fetch_celery_task_state(celery_task):
Does this have to be a single param? Could we instead have this as:
def fetch_celery_task_state(task_id, async_result):
No, it does not have to be a single param. Doing it this way means we don't need to unpack the tasks dict, and thus it looks cleaner.
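To illustrate the trade-off (a sketch, not the final code): Pool.map passes exactly one argument per item, so keeping the (key, AsyncResult) tuple lets tasks.items() be mapped directly, while the two-param signature would need Pool.starmap (Python 3 only):

def fetch_state_single(item):
    key, async_result = item  # one packed argument, as Pool.map delivers it
    return key, async_result.state

def fetch_state_pair(key, async_result):
    return key, async_result.state  # two args, requires pool.starmap

# pool.map(fetch_state_single, tasks.items())
# pool.starmap(fetch_state_pair, tasks.items())  # Python 3 only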
airflow/executors/celery_executor.py (Outdated)
def __init__(self):
    super(CeleryExecutor, self).__init__()

    # Parallelize Celery requests here since Celery does not support parallelization.
I understand what this comment means, but seeing "Celery does not support parallelization" is... odd.
What about "Parallelize Celery requests here since Celery does not support parallelization natively"?
# Celery doesn't support querying the state of multiple tasks in parallel (which can become a bottleneck on bigger clusters) so we use a multiprocessing pool to speed this up
airflow/executors/celery_executor.py (Outdated)
"""

try:
    res = (celery_task[0], celery_task[1].state)
I'm guessing that it's accessing the .state property that causes Celery to make the network request? Could you add a comment here saying so? (Otherwise this whole function looks odd and pointless.)
Exactly. Will add the comment there.
airflow/executors/celery_executor.py (Outdated)
self.log.error(
    CELERY_FETCH_ERR_MSG_HEADER + ", ignoring it:{}\n{}\n".format(
        key_and_state.exception, key_and_state.traceback))
else:
Rather than an else: block here, if we add a continue on the line above then we can un-indent the whole block following this (L182-202). I aim for shallower indents/less nesting where possible, as I find it easier to follow the code as a result.
I don't have a preference here on the style; will update with continue.
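The guard-clause shape being suggested, as a generic sketch (names are hypothetical, simplified from the diff above):

def apply_fetched_states(results, last_state, log):
    for key_and_state in results:
        if isinstance(key_and_state, Exception):
            log.error("Error fetching Celery task state, ignoring it: %s",
                      key_and_state)
            continue  # skip the bad result; the happy path stays un-indented
        key, state = key_and_state
        last_state[key] = state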
airflow/executors/celery_executor.py (Outdated)
    self.last_state[key] = state
except Exception as e:
    self.log.error("Error syncing the Celery executor, ignoring "
                   "it:\n{}\n{}".format(e, traceback.format_exc()))
self.log.exception("Error syncing the Celery executor, ignoring it:") and it will include the exception and traceback automatically?
I did not know log.exception would automatically include the traceback. Thank you for the tip! Will update to self.log.exception("Error syncing the Celery executor, ignoring it.")
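For reference, a self-contained sketch of that behavior: logging's exception() logs at ERROR level and appends the active traceback automatically, so the manual traceback.format_exc() formatting becomes unnecessary.

import logging

logging.basicConfig()
log = logging.getLogger("celery_executor")

try:
    raise ValueError("result backend unreachable")
except Exception:
    # Must be called from an exception handler; the current traceback
    # is appended to the log record automatically.
    log.exception("Error syncing the Celery executor, ignoring it.")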
len(self.tasks), num_processes)

# Recreate the process pool each sync in case processes in the pool die
self._sync_pool = Pool(processes=num_processes)
Do we need to do any special handling of the SQLA connection when using multiprocessing? (I guess to make sure we don't end up with 16 extra connections to the DB that we don't need.)
I believe multiprocessing will use os.fork() on Unix systems, and thus we can take advantage of COW to reduce RAM usage. However, AFAIK on Windows the child process will re-import all module-level imports and thus may require some extra RAM. But I don't think the extra imports will create a big burden on RAM usage (or maybe our scheduler box is just too big :P). I'll add an entry in UPDATING.md regarding this config line.
About the SQLA connection, you actually have a point. On Windows we might end up configuring 16 extra connection pools while re-importing. And since subprocesses spun up by the multiprocessing module do not run atexit() handlers, we might in theory leave some hanging connections there. However, from my observation and testing, SQLA initializes connections lazily, so we at most have an empty pool in the subprocesses.
I might be wrong about the Windows thing and the SQLA lazy-initialization thing; open to discussing better handling if that is the case.
FYI this is the test script/result I was playing with:
▶ cat test.py
import os
from multiprocessing import Pool

print('execute module code')

def test_func(num):
    print(num)

if __name__ == '__main__':
    pool = Pool(4)
    results = pool.map(test_func, [1, 2, 3, 4], 1)
    pool.close()
    pool.join()
▶ python test.py
execute module code
1
2
3
4
-------------- mimic Windows behavior ----------------
▶ cat test.py
import os
from multiprocessing import Pool

print('execute module code')

def test_func(num):
    from airflow import settings
    print(num)
    print(settings.engine.pool.status())

if __name__ == '__main__':
    pool = Pool(4)
    results = pool.map(test_func, [1, 2, 3, 4], 1)
    pool.close()
    pool.join()
▶ python test.py
execute module code
execute module code
execute module code
execute module code
execute module code
airflow.settings [2018-09-07 17:37:24,218] {{settings.py:148}} DEBUG - Setting up DB connection pool (PID 80204)
airflow.settings [2018-09-07 17:37:24,218] {{settings.py:148}} DEBUG - Setting up DB connection pool (PID 80202)
airflow.settings [2018-09-07 17:37:24,218] {{settings.py:148}} DEBUG - Setting up DB connection pool (PID 80201)
airflow.settings [2018-09-07 17:37:24,218] {{settings.py:148}} DEBUG - Setting up DB connection pool (PID 80203)
airflow.settings [2018-09-07 17:37:24,219] {{settings.py:176}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=3600
airflow.settings [2018-09-07 17:37:24,219] {{settings.py:176}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=3600
airflow.settings [2018-09-07 17:37:24,219] {{settings.py:176}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=3600
airflow.settings [2018-09-07 17:37:24,219] {{settings.py:176}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=3600
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
airflow.utils.log.logging_mixin.LoggingMixin [2018-09-07 17:37:24,332] {{__init__.py:42}} DEBUG - Cannot import due to doesn't look like a module path
airflow.utils.log.logging_mixin.LoggingMixin [2018-09-07 17:37:24,332] {{__init__.py:42}} DEBUG - Cannot import due to doesn't look like a module path
airflow.utils.log.logging_mixin.LoggingMixin [2018-09-07 17:37:24,332] {{__init__.py:42}} DEBUG - Cannot import due to doesn't look like a module path
airflow.utils.log.logging_mixin.LoggingMixin [2018-09-07 17:37:24,332] {{__init__.py:42}} DEBUG - Cannot import due to doesn't look like a module path
4
1
2
3
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
Pool size: 5 Connections in pool: 0 Current Overflow: -5 Current Checked out connections: 0
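If the lazy-initialization assumption ever failed to hold, the usual SQLAlchemy guidance for forked workers would be to dispose of the inherited engine in each child so it opens fresh connections rather than reusing the parent's sockets. A hedged sketch using a pool initializer (not part of this PR):

from multiprocessing import Pool

from airflow import settings

def _dispose_inherited_engine():
    # Per SQLAlchemy's multiprocessing guidance: discard any pooled
    # connections inherited across fork(); children then connect lazily.
    settings.engine.dispose()

# pool = Pool(processes=4, initializer=_dispose_inherited_engine)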
@ashb Thank you for reviewing. I forgot to mention the releasing part. I agree with you on not releasing it in 1.10.1. Besides your point, this PR is more meaningful together with some follow-up PRs that I'm working on, which provide a better story on scheduler scaling and can then go into a bigger release.
Force-pushed from 2552558 to 59910b2
Thanks @ashb for pointing out, I've changed the commit title and PR title to
Force-pushed from 59910b2 to 9b977b0
I would prefer the default value to be 1.
Force-pushed from 9b977b0 to b0b5b98
@yrqls21, sorry, I didn't see that you changed the default from 16 to a value depending on the number of cores. I am +1 for the current way of setting the default value.
LGTM @yrqls21, merging now.
I've closed the Jira ticket.
`schedule_interval` should be pass to DAG as a separate parameter and not as a default arg. [AIRFLOW-3005] Replace 'Airbnb Airflow' with 'Apache Airflow' (#3845) [AIRFLOW-3002] Fix variable & tests in GoogleCloudBucketHelper (#3843) [AIRFLOW-2991] Log path to driver output after Dataproc job (#3827) [AIRFLOW-XXX] Fix python3 and flake8 errors in dev/airflow-jira This is a script that checks if the Jira's marked as fixed in a release are actually merged in - getting this working is helpful to me in preparing 1.10.1 [AIRFLOW-2883] Add import and export for pool cli using JSON [AIRFLOW-3021] Add Censys to who uses Airflow list > Censys > Find and analyze every reachable server and device on the Internet > https://censys.io/ closes AIRFLOW-3021 https://issues.apache.org/jira/browse/AIRFLOW-3021 Add Branch to Company List [AIRFLOW-3008] Move Kubernetes example DAGs to contrib [AIRFLOW-2997] Support cluster fields in bigquery (#3838) This adds a cluster_fields argument to the bigquery hook, GCS to bigquery operator and bigquery query operators. This field requests that bigquery store the result of the query/load operation sorted according to the specified fields (the order of fields given is significant). [AIRFLOW-XXX] Redirect FAQ `airflow[crypto]` to How-to Guides. [AIRFLOW-XXX] Remove redundant space in Kerberos (#3866) [AIRFLOW-3028] Update Text & Images in Readme.md [AIRFLOW-1917] Trim extra newline and trailing whitespace from log (#3862) [AIRFLOW-2985] Operators for S3 object copying/deleting (#3823) 1. Copying: Under the hood, it's `boto3.client.copy_object()`. It can only handle the situation in which the S3 connection used can access both source and destination bucket/key. 2. Deleting: 2.1 Under the hood, it's `boto3.client.delete_objects()`. It supports either deleting one single object or multiple objects. 2.2 If users try to delete a non-existent object, the request will still succeed, but there will be an entry 'Errors' in the response. There may also be other reasons which may cause similar 'Errors' ( request itself would succeed without explicit exception). So an argument `silent_on_errors` is added to let users decide if this sort of 'Errors' should fail the operator. The corresponding methods are added into S3Hook, and these two operators are 'wrappers' of these methods. [AIRFLOW-3030] Fix CLI docs (#3872) [AIRFLOW-XXX] Update kubernetes.rst docs (#3875) Update kubernetes.rst with correct KubernetesPodOperator inputs for the volumes. [AIRFLOW-XXX] Add Enigma to list of companies [AIRFLOW-2965] CLI tool to show the next execution datetime Cover different cases - schedule_interval is "@once" or None, then following_schedule method would always return None - If dag is paused, print reminder - If latest_execution_date is not found, print warning saying not applicable. [AIRFLOW-XXX] Add Bombora Inc using Airflow [AIRFLOW-XXX] Move Dag level access control out of 1.10 section (#3882) It isn't in 1.10 (and wasn't in this section when the PR was created). [AIRFLOW-3012] Fix Bug when passing emails for SLA [AIRFLOW-2797] Create Google Dataproc cluster with custom image (#3871) [AIRFLOW-XXX] Updated README to include CAVA [AIRFLOW-3035] Allow custom 'job_error_states' in dataproc ops (#3884) Allow caller to pass in custom list of Dataproc job states into the DataProc*Operator classes that should result in the _DataProcJob.raise_error() method raising an Exception. 
[AIRFLOW-3034]: Readme updates : Add Slack & Twitter, remove Gitter [AIRFLOW-3056] Add happn to Airflow user list [AIRFLOW-3052] Add logo options to Airflow (#3892) [AIRFLOW-2524] Add SageMaker Batch Inference (#3767) * Fix for comments * Fix sensor test * Update non_terminal_states and failed_states to static variables of SageMakerHook Add SageMaker Transform Operator & Sensor Co-authored-by: srrajeev-aws <[email protected]> [AIRFLOW-XXX] Added Jeitto as one of happy Airflow users! (#3902) [AIRFLOW-XXX] Add Jeitto as one happy Airflow user! [AIRFLOW-3044] Dataflow operators accept templated job_name param (#3887) * Default value of new job_name param is templated task_id, to match the existing behavior as much as possible. * Change expected value in test_mlengine_operator_utils.py to match default for new job_name param. [AIRFLOW-2707] Validate task_log_reader on upgrade from <=1.9 (#3881) We changed the default logging config and config from 1.9 to 1.10, but anyone who upgrades and has an existing airflow.cfg won't know they need to change this value - instead they will get nothing displayed in the UI (ajax request fails) and see "'NoneType' object has no attribute 'read'" in the error log. This validates that config section at start up, and seamlessly upgrades the old previous value. [AIRFLOW-3025] Enable specifying dns and dns_search options for DockerOperator (#3860) Enable specifying dns and dns_search options for DockerOperator [AIRFLOW-1298] Clear UPSTREAM_FAILED using the clean cli (#3886) * [AIRFLOW-1298] Fix 'clear only_failed' * [AIRFLOW-1298] Fix 'clear only_failed' [AIRFLOW-3059] Log how many rows are read from Postgres (#3905) To know how many data is being read from Postgres, it is nice to log this to the Airflow log. Previously when there was no data, it would still create a single file. This is not something that we want, and therefore we've changed this behaviour. Refactored the tests to make use of Postgres itself since we have it running. This makes the tests more realistic, instead of mocking everything. [AIRFLOW-XXX] Fix typo in docs/timezone.rst (#3904) [AIRFLOW-3068] Remove deprecated imports [AIRFLOW-3036] Add relevant ECS options to ECS operator. (#3908) The ECS operator currently supports only a subset of available options for running ECS tasks. This patch adds all ECS options that could be relevant to airflow; options that wouldn't make sense here, like `count`, were skipped. [AIRFLOW-1195] Add feature to clear tasks in Parent Dag (#3907) [AIRFLOW-3073] Add note-Profiling feature not supported in new webserver (#3909) Adhoc queries and Charts features are no longer supported in new FAB-based webserver and UI. But this is not mentioned at all in the doc "Data Profiling" (https://airflow.incubator.apache.org/profiling.html) This commit adds a note to remind users for this. [AIRFLOW-XXX] Fix SlackWebhookOperator docs (#3915) The docs refer to `conn_id` while the actual argument is `http_conn_id`. 
[AIRFLOW-XXX] Fix airflow.models.DAG docstring mistake. Closes #4004 from Sambeth/sambeth.
Updated the tests written for s3/sftp operators
Fixed the flake8 diff errors.
Fixed aws connection test
Updated test_s3_to_sftp_operator with correct class name.
Fixed test_s3_to_sftp_operator error reported in travis.
Fixed test_s3_to_sftp_operator error reported in travis.
Changed default values for s3_to_sftp_operator
Updated test for checking for sftp file content.
Fixed flake8 diff error.
[AIRFLOW-XXX] Adding Home Depot as users of Apache airflow (#4013)
[AIRFLOW-XXX] Added ThoughtWorks as user of Airflow in README (#4012)
[AIRFLOW-XXX] Added DataCamp to list of companies in README (#4009)
[AIRFLOW-3165] Document interpolation of '%' and warn (#4007)
[AIRFLOW-3099] Complete list of optional airflow.cfg sections (#4002)
[AIRFLOW-3162] Fix HttpHook URL parse error when port is specified (#4001)
[AIRFLOW-3055] Add get_dataset and get_datasets_list to bigquery_hook (#3894)
Jira
Description
This change was mostly authored by @aoen; I am merely doing the tests, a small fix, and the PR publishing, since he has changed jobs.
According to Airbnb production traffic and stress tests, this change enables Airflow to meet a 5-minute scheduling SLA with 30k running tasks. The performance of the Celery state-querying step improves by 15x+ with 16 processes, and it can potentially go faster with more processes (check out our stress test results below).
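For readers skimming the diff, the core idea looks roughly like the sketch below: fan the per-task state queries out over a process pool instead of issuing them serially. This is a minimal illustration of the approach, not the PR's exact code; the function names, pool sizing, and chunking heuristic here are assumptions.

```python
from multiprocessing import Pool


def fetch_celery_task_state(async_result):
    """Runs in a worker process.

    Each `.state` access on a Celery AsyncResult is a network round trip
    to the result backend, which is why fanning out across processes
    speeds up the sync step.
    """
    try:
        return async_result.id, async_result.state
    except Exception as e:
        # Raising across the process boundary would lose context, so hand
        # the exception back as a value for the parent process to log.
        return async_result.id, e


def fetch_all_states(async_results, num_processes=16):
    # Chunk the work so each process receives a roughly uniform share
    # instead of paying IPC overhead for every single task.
    chunksize = max(1, len(async_results) // num_processes)
    pool = Pool(processes=num_processes)
    try:
        return dict(pool.map(fetch_celery_task_state, async_results,
                             chunksize=chunksize))
    finally:
        pool.close()
        pool.join()
```

With serial fetching, 30k tasks mean 30k sequential round trips to the result backend per sync; with 16 worker processes the same inquiry completes in roughly a sixteenth of the wall-clock time, consistent with the 15x+ figure above.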
The scheduling delay is produced by a monitoring task that runs on the cluster at a 5-minute interval. It compares the current timestamp against the expected scheduling timestamp (execution_date + 5m) and sends the difference in minutes as one data point on the metric graph. For example, a monitoring task with execution_date 2018-01-01T00:00:00 that starts at 2018-01-01T00:06:00 puts a 1-minute scheduling-delay data point onto the graph.
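Concretely, each data point can be computed as below. This is a hypothetical sketch of the measurement described above (the real monitoring job is internal to our cluster), with the 5-minute interval hard-coded as an assumption.

```python
from datetime import datetime, timedelta

SCHEDULE_INTERVAL = timedelta(minutes=5)


def scheduling_delay_minutes(execution_date, now):
    # A run with execution_date T is expected to start at T + 5m;
    # any time beyond that counts as scheduling delay.
    expected_start = execution_date + SCHEDULE_INTERVAL
    return max((now - expected_start).total_seconds() / 60.0, 0.0)


# The example from the description: execution_date 2018-01-01T00:00:00,
# actual start 2018-01-01T00:06:00 -> a 1-minute delay data point.
assert scheduling_delay_minutes(datetime(2018, 1, 1, 0, 0, 0),
                                datetime(2018, 1, 1, 0, 6, 0)) == 1.0
```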
Notes:
Syncing no longer happens at the end of the Celery executor's run (e.g. when the scheduler shuts down). The sync never actually guaranteed that tasks had finished anyway, and it prolonged the shutdown protocol.
There is no timeout on the subprocesses, but there was no timeout before this change either.
What about logging for the multiprocessing tasks? It is OK to skip it; they aren't currently logged either.
Tests
tests/executors/test_celery_executor.py:CeleryExecutorTest.test_exception_propagation
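The test above covers the case where a state fetch blows up inside a pool worker. Below is a hypothetical sketch of the pattern being verified, not the actual test body; `flaky_fetch` and `safe_fetch` are illustrative stand-ins.

```python
from multiprocessing import Pool


def flaky_fetch(task_id):
    # Stand-in for a state fetch whose backend call fails.
    raise RuntimeError("backend unreachable for %s" % task_id)


def safe_fetch(task_id):
    try:
        return flaky_fetch(task_id)
    except Exception as e:
        return e  # returned as a value so it survives the process boundary


def test_exception_propagation():
    pool = Pool(2)
    try:
        results = pool.map(safe_fetch, ["task_a", "task_b"])
    finally:
        pool.close()
        pool.join()
    # The parent process receives the exceptions as results and can log
    # them instead of crashing or silently losing them.
    assert all(isinstance(r, RuntimeError) for r in results)
```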
Commits
Documentation
Code Quality
`git diff upstream/master -u -- "*.py" | flake8 --diff`