
Enable trigger logging in webserver #27758

Merged
dstandish merged 100 commits into apache:main from enable-trigger-logging on Feb 4, 2023

Conversation

@dstandish (Contributor) commented Nov 17, 2022

Implemented for all handlers except Alibaba.

Different handlers may be implemented in slightly different ways depending on their characteristics.

Blob storage handlers work by writing to a file and then uploading it when the task is complete. For these handlers, each trigger writes to its own file, and at trigger end the file is uploaded with a suffix that distinguishes it from the task log file (see the sketch after the list below).

Blob storage handlers include the following:

  • GCS
  • S3
  • WASB
  • Alibaba (not implemented due to missing functionality in hook)
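
Here is a minimal sketch of that write-then-upload pattern, assuming a hypothetical upload_to_blob_storage helper and an invented suffix convention; the actual provider handlers differ in detail:

import logging


def upload_to_blob_storage(local_path: str, remote_path: str) -> None:
    """Hypothetical placeholder for the provider-specific upload (GCS, S3, WASB)."""
    raise NotImplementedError


class TriggerBlobHandler(logging.FileHandler):
    """Each trigger writes to its own local file; the upload happens on close."""

    def __init__(self, trigger_id: int, remote_base: str):
        self.local_path = f"/tmp/trigger_{trigger_id}.log"
        # The suffix distinguishes the trigger log from the task log file.
        self.remote_path = f"{remote_base}.trigger_{trigger_id}.log"
        super().__init__(self.local_path)

    def close(self) -> None:
        super().close()  # flush and close the local file first
        upload_to_blob_storage(self.local_path, self.remote_path)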

For file-based handlers, we have to do two things to make this work with triggers.

The first is that we don't emit to them synchronously. Instead, we pipe the messages through a queue to a QueueListener that runs in a thread and emits them to the potentially blocking handler.
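
This pattern can be illustrated with the standard library alone (the wiring below is simplified relative to Airflow's actual setup):

import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.SimpleQueue()
blocking_handler = logging.FileHandler("trigger.log")  # stand-in for a slow handler

# The listener drains the queue in its own thread and calls the blocking handler.
listener = QueueListener(log_queue, blocking_handler)
listener.start()

logger = logging.getLogger("airflow.triggers")
logger.addHandler(QueueHandler(log_queue))  # emit() here just enqueues the record
logger.warning("this call returns immediately, without blocking the event loop")

listener.stop()  # flushes remaining records and joins the listener thread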

The second is that we need to create a distinct instance of the task handler for each trigger, because each instance corresponds to a specific file and each trigger needs to write to a distinct file. To accomplish this we add a wrapper handler that manages the individual handlers and routes the messages accordingly.
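
Sketched roughly, the routing wrapper looks something like this (class and attribute names are invented for illustration, not the actual implementation):

import logging


class TriggerRoutingHandler(logging.Handler):
    """Keeps one file-based handler per trigger and routes each record to it."""

    def __init__(self, handler_factory):
        super().__init__()
        self.handler_factory = handler_factory  # builds a handler for a given trigger
        self.handlers = {}  # trigger_id -> handler instance

    def emit(self, record: logging.LogRecord) -> None:
        trigger_id = getattr(record, "trigger_id", None)  # assumed to be set on the record
        if trigger_id is None:
            return
        if trigger_id not in self.handlers:
            self.handlers[trigger_id] = self.handler_factory(trigger_id)
        self.handlers[trigger_id].handle(record)

    def close_trigger(self, trigger_id) -> None:
        handler = self.handlers.pop(trigger_id, None)
        if handler is not None:
            handler.close()  # for blob handlers, this is when the upload happens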

The other category is "streaming" handlers, where messages are pushed through an API to a remote logging service more or less as they are emitted. This category includes the following handlers:

  • Cloudwatch
  • Elasticsearch
  • Stackdriver

Each of the streaming handlers has slightly different characteristics.

Stackdriver essentially has fully native support for triggers: it doesn't require wrapping or routing messages through a QueueListener. For one thing, it already runs in a thread and routes messages through a queue internally so as not to block. Additionally, it already attaches the necessary labels to the record in each call to emit, so updating it to generate those labels from the task instance attached to the LogRecord was trivial.
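
The gist of that label generation, in a hedged sketch (the attachment point and attribute names here are assumptions, not the actual handler code):

import logging


def labels_for_record(record: logging.LogRecord) -> dict:
    """Derive per-record labels from a task instance attached to the record."""
    ti = getattr(record, "task_instance", None)  # e.g. logger.info(..., extra={"task_instance": ti})
    if ti is None:
        return {}
    return {
        "dag_id": ti.dag_id,
        "task_id": ti.task_id,
        "run_id": ti.run_id,
        "try_number": str(ti.try_number),
    }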

Cloudwatch could be made "native" like Stackdriver, but the underlying library (watchtower, a third-party community library) doesn't quite have the necessary behavior. So for now it still requires the wrapper and QueueListener.

Elasticsearch is a bit of a hybrid. By default it is file-based and assumes you have something like Filebeat to ship the files; in that way it behaves the same as the blob storage handlers. But it has an optional config where messages go to stdout. It is possible that it could be modified such that, when run in stdout mode, it is "native" like Stackdriver, with just one instance of the handler, but that work is not taken on here.
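
For reference, the stdout mode amounts to writing each record as a line to stdout for a shipper to collect; a minimal sketch (the field names are illustrative, not the actual Elasticsearch handler's format):

import json
import logging
import sys


class StdoutJsonHandler(logging.StreamHandler):
    """Write each record as a JSON line to stdout for a log shipper to pick up."""

    def __init__(self):
        super().__init__(stream=sys.stdout)

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({"level": record.levelname, "message": record.getMessage()})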

Try it with this sample DAG:

import datetime
import pendulum
from airflow import DAG
from airflow.sensors.time_delta import TimeDeltaSensorAsync

with DAG(
    dag_id="example_time_delta_sensor_async",
    schedule=None,
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
    tags=["example"],
):
    wait = TimeDeltaSensorAsync(task_id="wait", delta=datetime.timedelta(seconds=30))

Example:

screen-recording-tigger-logs.mov

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:CLI area:logging area:Scheduler including HA (high availability) scheduler labels Nov 17, 2022
@potiuk (Member) commented Nov 18, 2022

nice

@dstandish dstandish force-pushed the enable-trigger-logging branch from d8911a8 to 20b4ce2 on November 21, 2022 08:39
@dstandish dstandish force-pushed the enable-trigger-logging branch from bf660ba to 782c5a0 on November 30, 2022 06:36
@dstandish dstandish force-pushed the enable-trigger-logging branch 5 times, most recently from 6e929a8 to f6ef57a on December 12, 2022 22:45
@dstandish dstandish changed the title from "WIP - Enable trigger logging in webserver -- proof of concept" to "WIP - Enable trigger logging in webserver" on Dec 15, 2022
@dstandish dstandish force-pushed the enable-trigger-logging branch 2 times, most recently from 3450f65 to 8469e8d on December 17, 2022 08:31
airflow/utils/serve_logs.py (review thread, outdated, resolved)
@ashb (Member) left a comment:

I'll need to re-read the configure_trigger_log_handler a couple of times to grok it I think :lolsob:

@dstandish dstandish force-pushed the enable-trigger-logging branch 3 times, most recently from 975b75e to 2921f37 on December 22, 2022 09:30
@dstandish (Contributor, Author) replied:

> I'll need to re-read the configure_trigger_log_handler a couple of times to grok it I think :lolsob:

yeah it's def a bit much

i have just pushed a change trying to make it a bit more readable

@dstandish dstandish force-pushed the enable-trigger-logging branch 3 times, most recently from 7534d8c to 3b40361 on December 22, 2022 09:52
@dstandish dstandish merged commit 1b18a50 into apache:main Feb 4, 2023
@dstandish dstandish deleted the enable-trigger-logging branch February 4, 2023 04:39
o-nikolas added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Feb 11, 2023
dstandish pushed a commit that referenced this pull request Feb 11, 2023
dstandish added a commit to astronomer/airflow that referenced this pull request Feb 12, 2023
dstandish added a commit that referenced this pull request Feb 12, 2023
Restores trigger logging PR #27758 which was reverted (#29472) due to CI problems.

See commit 5e8aed9 for the fix. We just mock _serve_logs so the log server isn't started during the test.

This reverts commit 60d4bcd.
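
The shape of that test fix, with a patch target path assumed for illustration (the real test patches wherever _serve_logs is looked up):

from unittest import mock

# The dotted path below is an assumption, not the actual test code.
with mock.patch("airflow.jobs.triggerer_job._serve_logs"):
    ...  # exercise the triggerer without starting the log server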
pankajkoti added a commit to astronomer/astronomer-providers that referenced this pull request Feb 27, 2023
With the 5.0.0 release of the Microsoft Azure provider, the PR apache/airflow#27758
introduced a breaking change whereby, when creating new connections,
the ``extra__`` prefix is no longer set for extra fields in the connection.
This issue was not identified when testing the 5.0.0 RC because it
only happens for newly created connections. Existing connections
still contain the extra fields with the ``extra__`` prefix. Hence, the
existing code, which looks for the connection field with the prefix
``extra__azure_data_factory__subscriptionId``, works on older deployments
with the new provider release (where the connection was created before the release),
but it fails on new deployments when a fresh connection is created.

To fix this, we remove the prefix while retrieving the connection
field, and at the same time we support previously created connections
by using the same ``get_field`` method from Airflow OSS, introduced
in the same PR above, to allow backward compatibility.
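
The backward-compatible lookup amounts to something like this sketch (the signature and names are illustrative, not the exact OSS helper):

def get_field(extras: dict, conn_type: str, field_name: str):
    """Prefer the unprefixed key, falling back to the legacy prefixed key."""
    if field_name in extras:
        return extras[field_name]  # connections created after the change
    return extras.get(f"extra__{conn_type}__{field_name}")  # older connections

For example, get_field(conn.extra_dejson, "azure_data_factory", "subscriptionId") would find the value whether or not the connection was created with the ``extra__azure_data_factory__subscriptionId`` key.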
@pierrejeambrun pierrejeambrun added the type:new-feature Changelog: New Features label Feb 27, 2023