Various fixes on ECS run task operator #31838

vandonr-amz · 2023-06-11T00:51:33Z

This fixes several problems around the ECS run task operator:

- It was checking the task status even when we were not waiting for completion, which would fail if the check happened fast enough, because the task would still be pending.
- Added a warning about trying to pull logs without waiting for completion, which is a weird thing to do: the logs might be truncated, or might not even be there yet. I'm assuming it was always undefined behavior, so I'm going ahead and not even starting the log-fetching thread if wait_for_completion is false.
- Added a log message when logs are not present (yet), so that if the issue persists, users have a message guiding them towards resolution rather than complete silence.
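The first two points can be sketched as the control flow below. This is a minimal illustration under stated assumptions, not the operator's actual code: `FakeEcs` and `run_task_sketch` are hypothetical names, and the ARN is made up. Only the `wait_for_completion` flag comes from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEcs:
    """Hypothetical stand-in for ECS: a task is PENDING right after creation."""

    statuses: list = field(default_factory=lambda: ["PENDING", "RUNNING", "STOPPED"])
    polls: int = 0

    def run_task(self) -> str:
        # Made-up ARN, for illustration only.
        return "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef"

    def describe_status(self) -> str:
        status = self.statuses[min(self.polls, len(self.statuses) - 1)]
        self.polls += 1
        return status


def run_task_sketch(ecs: FakeEcs, wait_for_completion: bool) -> str:
    task_arn = ecs.run_task()
    if not wait_for_completion:
        # Fire and forget: no status check and no log thread, because the
        # task is most likely still PENDING and has produced no logs yet.
        return task_arn
    # Only poll (and, in a real operator, fetch logs) when asked to wait.
    while ecs.describe_status() != "STOPPED":
        pass
    return task_arn
```

With `wait_for_completion=False` the fake is never polled, so a task that is still pending cannot be mistaken for a failure.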

The system tests were not properly configured for logs either. I added the necessary configuration and a cleanup step, and moved the sensor to ecs_fargate so that the vanilla ECS test keeps the normal behavior of querying logs.

vandonr-amz · 2023-06-11T01:00:13Z

airflow/providers/amazon/aws/operators/ecs.py

+    @staticmethod
+    def _get_ecs_task_id(task_arn: str) -> str:
+        return task_arn.split("/")[-1]


I'm doing this because remembering to update the ecs_task_id every time the ARN changes seemed a bit brittle. This method makes the ARN -> task ID dependency explicit.
Yes, it means we recompute it each time we need it, but I think the cost is negligible.
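As a quick illustration of what the helper returns, here is the same expression as a standalone function. The ARNs are made-up examples of the two task ARN layouts (with and without the cluster name); in both cases the task ID is the last path segment.

```python
def get_ecs_task_id(task_arn: str) -> str:
    # Same expression as the static method in the diff: keep the last path segment.
    return task_arn.split("/")[-1]


# Hypothetical ARNs, for illustration only.
with_cluster = "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef"
without_cluster = "arn:aws:ecs:us-east-1:123456789012:task/0123456789abcdef"

print(get_ecs_task_id(with_cluster))     # 0123456789abcdef
print(get_ecs_task_id(without_cluster))  # 0123456789abcdef
```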

vandonr-amz · 2023-06-11T01:03:00Z

tests/system/providers/amazon/aws/example_ecs.py

+    # A bit brutal to delete the whole group, I know,
+    # but we don't have the access to the arn of the task which is used in the stream name
+    # and also those logs just contain "hello world", which is not very interesting.
+    client.delete_log_group(logGroupName=group_name)


This is going to fail if the group does not exist, so in a way it makes sure the log configuration stays correct.
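A small sketch of why the unguarded delete doubles as a check. `FakeLogsClient` and the local `ResourceNotFoundException` class are hypothetical stand-ins written for this example; they only imitate the way a CloudWatch Logs client rejects deleting a log group that does not exist.

```python
class ResourceNotFoundException(Exception):
    """Stand-in for the error CloudWatch Logs returns for a missing log group."""


class FakeLogsClient:
    """Hypothetical in-memory substitute for a CloudWatch Logs client."""

    def __init__(self, existing_groups):
        self.groups = set(existing_groups)

    def delete_log_group(self, logGroupName: str) -> None:
        if logGroupName not in self.groups:
            raise ResourceNotFoundException(logGroupName)
        self.groups.remove(logGroupName)


def delete_logs(client, group_name: str) -> None:
    # No existence check on purpose: if the group is missing, the log
    # configuration is off and the cleanup step should fail loudly.
    client.delete_log_group(logGroupName=group_name)
```

Calling `delete_logs` on a group that was never created raises instead of passing silently, which is the behavior the comment above relies on.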

airflow/providers/amazon/aws/hooks/ecs.py

o-nikolas · 2023-06-12T16:42:29Z

airflow/providers/amazon/aws/operators/ecs.py

+        if not self.wait_for_completion:
+            return
+


Whatever logs users were getting for the short period of time without a wait_for_completion they will no longer get. So we're calling it a bug fix with no deprecation?

Yeah, idk, we may want to keep the existing behavior, but what I don't like about it is that it made the operator slower just for the sake of maybe getting a couple of logs...
Since we were starting the thread, which slept for 30 seconds (or the configured value) before checking whether it had been stopped, this operator would take 30 seconds to return no matter what, even when the job was done in a second and a half.
It's like "don't wait for completion but still wait a bit"

I agree with Raph. I don't think this is desired behavior, but rather a forgotten edge case. I would call it a bug fix.

Ack, I'll call that quorum then, let's call it a bug fix 👍
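The delay discussed in this thread can be reproduced with a small sketch. `SleepyLogFetcher` is a hypothetical poller written for this example, not the provider's class: it sleeps a full interval before it looks at its stop flag, so joining it always costs at least one interval, however quickly the work finished.

```python
import threading
import time


class SleepyLogFetcher(threading.Thread):
    """Hypothetical log poller that sleeps before checking whether it was stopped."""

    def __init__(self, interval: float):
        super().__init__()
        self.interval = interval
        self._stop_requested = threading.Event()

    def run(self) -> None:
        while True:
            time.sleep(self.interval)  # stop() cannot interrupt this sleep
            # ... a real fetcher would pull log events here ...
            if self._stop_requested.is_set():
                break

    def stop(self) -> None:
        self._stop_requested.set()


def time_start_stop_join(interval: float) -> float:
    """Start the fetcher, stop it right away, and return how long join() took overall."""
    fetcher = SleepyLogFetcher(interval)
    started = time.monotonic()
    fetcher.start()
    fetcher.stop()  # the task is "done" immediately
    fetcher.join()  # but we still wait out the sleep
    return time.monotonic() - started
```

With `interval=30`, this would block for about 30 seconds even though nothing was left to wait for, which is the "don't wait for completion but still wait a bit" behavior described above. Not starting the thread at all avoids the cost.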

vincbeck · 2023-06-16T14:15:15Z

airflow/providers/amazon/aws/operators/ecs.py

+        if not self.wait_for_completion:
+            return
+


I agree with Raph. I don't think this is desired behavior, but rather a forgotten edge case. I would call it a bug fix.

tests/system/providers/amazon/aws/example_ecs.py

This method is just causing trouble by handling several things; it's hiding the logic. A bug fixed in apache#31838 was reintroduced in apache#31881 because the check that was skipped when `wait_for_completion` was false was not skipped anymore. The bug is that checking the status always fails when not waiting for completion, because the task is not ready just after creation.

vandonr-amz added 2 commits June 10, 2023 17:11

ECS Run Task op should not try to get logs or check the status if not waiting for completion

a3afef6

fix system tests

f96c0fa

vandonr-amz requested review from eladkal and o-nikolas as code owners June 11, 2023 00:51

boring-cyborg bot added the area:providers, area:system-tests, and provider:amazon-aws labels Jun 11, 2023

forgot the trigger

a899019

vandonr-amz commented Jun 11, 2023

uranusjr reviewed Jun 12, 2023

airflow/providers/amazon/aws/hooks/ecs.py (outdated, resolved)

o-nikolas reviewed Jun 12, 2023

vandonr-amz added 4 commits June 13, 2023 14:59

single line comment & static check fix

176cc6a

fix tests

3347548

Merge remote-tracking branch 'origin/main' into vandonr/tests

7ea0126

fix doc

e1c9268

vincbeck reviewed Jun 16, 2023

vandonr-amz added 2 commits June 16, 2023 09:00

dynamic log group

9c2a3b5

Merge remote-tracking branch 'origin/main' into vandonr/tests

ce01b0c

vincbeck approved these changes Jun 16, 2023

o-nikolas approved these changes Jun 16, 2023

o-nikolas merged commit e0f21f4 into apache:main Jun 16, 2023

vandonr-amz deleted the vandonr/tests branch June 16, 2023 19:28

eladkal mentioned this pull request Jun 20, 2023

Status of testing Providers that were prepared on June 20, 2023 #32030

Closed

86 tasks

vandonr-amz mentioned this pull request Jun 23, 2023

bugfix: break down run+wait method in ECS operator #32104

Merged
