Deferrable mode for ECS operators #31881
Conversation
@@ -67,6 +71,15 @@ def execute(self, context: Context):
        """Must overwrite in child classes."""
        raise NotImplementedError("Please implement execute() in subclass")

    def _complete_exec_with_cluster_desc(self, context, event=None):
this callback is shared between the create and delete cluster operators, so I put it here. It felt like a better solution than copy-pasting it into both.
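For illustration, a minimal sketch of what such a shared callback could look like on the common base class. The `hook` attribute, method body, and class shape are assumptions, not the PR's exact code; only the event shape (`{"status": "success", "arn": ...}`) is taken from the trigger shown later in this thread:

```python
from airflow.exceptions import AirflowException


class EcsBaseOperator:  # hypothetical common base of the create/delete cluster operators
    def _complete_exec_with_cluster_desc(self, context, event=None):
        """Shared execute_complete callback: validate the trigger event,
        then return the final cluster description, as both operators need."""
        if event is None or event.get("status") != "success":
            raise AirflowException(f"Error while waiting on cluster: {event}")
        # self.hook is assumed to expose a boto3 ECS client as .conn
        response = self.hook.conn.describe_clusters(clusters=[event["arn"]])
        return response["clusters"][0]
```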
        ...  # TODO return last log line but task_log_fetcher will always be None here

    @provide_session
    def _after_execution(self, session=None):
I wanted to extract this to reuse it in execute and execute_complete, but I couldn't find a great name for it.
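A hedged sketch of the extraction being described, assuming the surrounding helpers exist roughly under these names (the stubs are illustrative, not the PR's code):

```python
from airflow.utils.session import provide_session


class EcsRunTaskOperator:  # abbreviated, hypothetical shape
    def _start_wait_task(self, context): ...  # starts the ECS task, waits if configured
    def _check_success_task(self): ...        # raises if the task ended in failure

    @provide_session
    def _after_execution(self, session=None):
        # Everything that must run after the task has finished, whether we
        # waited synchronously (execute) or were resumed by the triggerer
        # (execute_complete): the final status check lives here now.
        self._check_success_task()

    def execute(self, context):
        self._start_wait_task(context)
        self._after_execution()

    def execute_complete(self, context, event=None):
        self._after_execution()
```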
-    def _start_wait_check_task(self, context):
+    def _start_wait_task(self, context):
the check went to _after_execution
        yield TriggerEvent({"status": "success", "task_arn": self.task_arn})
    async def _forward_logs(self, logs_client, next_token: str | None = None) -> str | None:
this code is heavily inspired by https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/hooks/logs.py#L53, but since I need to make an async call in the middle, refactoring the existing code to allow that seemed like it would add a lot of complexity.
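For reference, a sketch of the token-driven loop such a coroutine could use, assuming `logs_client` is an aiobotocore-style async CloudWatch Logs client and that `self.log_group`, `self.log_stream`, and `self.log` exist. This mirrors the hook code linked above rather than reproducing the PR verbatim:

```python
async def _forward_logs(self, logs_client, next_token: str | None = None) -> str | None:
    """Print any new log events since next_token and return the new token."""
    while True:
        kwargs = {
            "logGroupName": self.log_group,
            "logStreamName": self.log_stream,
            "startFromHead": True,
        }
        if next_token is not None:
            kwargs["nextToken"] = next_token
        response = await logs_client.get_log_events(**kwargs)
        for event in response["events"]:
            self.log.info(event["message"])
        # CloudWatch returns the same forward token once the stream is drained,
        # so that's the signal to stop and hand the token back for next time.
        if response["nextForwardToken"] == next_token:
            return next_token
        next_token = response["nextForwardToken"]
```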
should this be part of the hook implementation?
You mean the logs hook? I don't know; I wasn't sure it was a good idea to duplicate the code in the hook, and I found it easier to write something fitting exactly my need here.
        # In some circumstances the ECS Cluster is deleted immediately,
        # so there is no reason to wait for completion.
        # if the cluster doesn't have capacity providers that are associated with it,
        # the deletion is instantaneous, and we don't need to wait for it.
cluster_details has the capacityProviders associated with the nodegroup. Would that be a better way to decide whether we want to wait for completion or not?
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecs/client/delete_cluster.html
hmm, we could do that, but the status check above already takes care of that. We could write a different check, but the result would be the same.
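A sketch of the status short-circuit being discussed, assuming boto3's ECS client and illustrative attribute names:

```python
def execute(self, context):
    cluster = self.client.delete_cluster(cluster=self.cluster_name)["cluster"]
    if cluster["status"] == "INACTIVE":
        # No capacity providers were attached, so the deletion completed
        # immediately and there is nothing to defer or wait on.
        return cluster
    # ...otherwise defer to the trigger / start the waiter as usual.
```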
        )
        # we reach this point only if the waiter met a success criterion
        yield TriggerEvent({"status": "success", "arn": self.cluster_arn})
        return
I feel that, in this case, by using a generic Trigger for both the cluster_active and cluster_inactive, we are simplifying the code, but at the expense of user experience. Specifically, I think the exceptions raised should tell the user what the specific error was (i.e. create cluster failed etc.)
I don't know if this is a moot point because of the EMR Serverless custom waiters PR, which will clean up a lot of this code.
I'd vote for not modifying that code now since we can replace it with the common helper as soon as it's merged
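To make the trade-off concrete, one way a generic trigger could still surface operation-specific errors is to accept a failure message from each operator. Everything below is a hypothetical sketch, not the PR's code; the polling body is elided and the classpath in `serialize` is made up:

```python
from airflow.triggers.base import BaseTrigger, TriggerEvent


class ClusterWaiterTrigger(BaseTrigger):
    def __init__(self, waiter_name: str, cluster_arn: str, failure_message: str):
        super().__init__()
        self.waiter_name = waiter_name          # "cluster_active" or "cluster_inactive"
        self.cluster_arn = cluster_arn
        self.failure_message = failure_message  # e.g. "Create cluster failed"

    def serialize(self):
        return (
            "example.triggers.ClusterWaiterTrigger",  # hypothetical import path
            {
                "waiter_name": self.waiter_name,
                "cluster_arn": self.cluster_arn,
                "failure_message": self.failure_message,
            },
        )

    async def run(self):
        try:
            ...  # poll the named waiter here
        except Exception as e:
            # The event now says *which* operation failed, not just
            # that some waiter errored out.
            yield TriggerEvent({"status": "failure", "message": f"{self.failure_message}: {e}"})
            return
        yield TriggerEvent({"status": "success", "arn": self.cluster_arn})
```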
I believe this will close #31636
LGTM
This method is just causing trouble by handling several things; it's hiding the logic. A bug fixed in #31838 was reintroduced in #31881, because the check that used to be skipped when `wait_for_completion` was off was no longer skipped. The bug is that checking the status will always fail when not waiting for completion, because obviously the task is not ready just after creation.
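In code terms, the guard the follow-up fix restores looks roughly like this (method names are illustrative, not the exact ones in the operator):

```python
def execute(self, context):
    self._start_task(context)
    if not self.wait_for_completion:
        # The task was only just created, so any status check would fail;
        # skip both the wait and the check entirely.
        return
    self._wait_for_task_ended()
    self._check_success_task()
```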
Add deferrable mode to ECS operators that can make use of it:
The trickiest one is the Run Task operator, because it used a thread to fetch logs while waiting. I implemented this in the triggerer by staying in the same thread but pulling logs between waiter checks, as sketched below.
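A sketch of that single-threaded alternation, assuming an async botocore waiter and the `_forward_logs` coroutine sketched earlier; `waiter` and `logs_client` are assumed to have been created above from async clients, and error handling is simplified:

```python
import asyncio

from botocore.exceptions import WaiterError
from airflow.triggers.base import TriggerEvent


async def run(self):  # the trigger's async generator, abbreviated
    next_token = None
    while True:
        try:
            # A single waiter attempt per loop iteration, instead of
            # letting the waiter block through all of its retries.
            await waiter.wait(
                cluster=self.cluster,
                tasks=[self.task_arn],
                WaiterConfig={"MaxAttempts": 1},
            )
            break  # the waiter met its success criterion
        except WaiterError as error:
            if "terminal failure" in str(error):
                raise
        # Between checks we stay in the same coroutine and drain new logs.
        next_token = await self._forward_logs(logs_client, next_token)
        await asyncio.sleep(self.waiter_delay)
    yield TriggerEvent({"status": "success", "task_arn": self.task_arn})
```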
closes #31636
cc @syedahsn