Include run_id in TaskInstance Logging for Enhanced Traceability and Debugging #39272

Closed
2 tasks done
RyuSA opened this issue Apr 26, 2024 · 1 comment · Fixed by #39280

Comments

@RyuSA
Contributor

RyuSA commented Apr 26, 2024

Description

Currently, the log outputs for TaskInstance state changes within TaskInstance.py do not include the run_id.

e.g. https://github.com/apache/airflow/blob/2.9.0/airflow/models/taskinstance.py#L1200

This omission makes it difficult to directly trace and verify the status of specific Task Instances without relying on inference from the execution_date. Including run_id in these logs would greatly enhance the traceability and debugging process, particularly in environments where multiple runs might be managed concurrently.

I propose including run_id in the log output of TaskInstance._log_state and in the TaskDeferred log entry in TaskInstance._run_raw_task. The run_id is part of the primary key for TaskInstances, so I think its inclusion in the logs should not introduce any issues.

Current Log Outputs

  • Pausing task as DEFERRED: dag_id=Test1, task_id=task1, execution_date=..., start_date=...
  • Marking task as SUCCESS: dag_id=Test1, task_id=task1, execution_date=..., start_date=..., end_date=...

Proposed Log Outputs

  • Pausing task as DEFERRED: dag_id=Test1, task_id=task1, run_id=RUN_ID, execution_date=..., start_date=...
  • Marking task as SUCCESS: dag_id=Test1, task_id=task1, run_id=RUN_ID, execution_date=..., start_date=..., end_date=...
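The proposed format can be sketched as a small standalone helper. This is only an illustration of the message layout shown above, not the code in taskinstance.py; the function name format_state_log and its parameters are hypothetical.

```python
from __future__ import annotations


def format_state_log(
    action: str,
    state: str,
    *,
    dag_id: str,
    task_id: str,
    run_id: str,
    **dates: str,
) -> str:
    """Build a state-change message that places run_id right after task_id.

    Hypothetical helper for illustration; Airflow's real logging code differs.
    """
    # Keyword order is preserved, so date fields appear in the order passed in.
    params = {"dag_id": dag_id, "task_id": task_id, "run_id": run_id, **dates}
    fields = ", ".join(f"{key}={value}" for key, value in params.items())
    return f"{action} task as {state}: {fields}"


if __name__ == "__main__":
    print(
        format_state_log(
            "Marking",
            "SUCCESS",
            dag_id="Test1",
            task_id="task1",
            run_id="RUN_ID",
            execution_date="...",
            start_date="...",
            end_date="...",
        )
    )
```

Placing run_id next to dag_id and task_id keeps the identifying fields together, ahead of the date fields.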

Use case/motivation

Debugging and troubleshooting specific DAGRuns on Workers can be cumbersome without the ability to search logs by run_id.
Currently, we need to infer the execution_date from the run_id in order to search the worker log for the final status of a TaskInstance on a Worker. This leads to unnecessary queries and steps. Including run_id in the logs would simplify log searches in distributed environments (like Airflow on Kubernetes) and would likely assist other users facing similar challenges.
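With run_id in the message, the worker-log search becomes a direct match. A minimal sketch, assuming log lines follow the proposed key=value layout; the helper name find_lines_for_run and the sample run_id values are made up for illustration.

```python
import re
from typing import Iterable, List


def find_lines_for_run(lines: Iterable[str], run_id: str) -> List[str]:
    """Return the log lines whose run_id field equals the given run_id exactly.

    Assumes the proposed "run_id=<value>, ..." layout (hypothetical helper).
    """
    # Escape the run_id because real values can contain characters such as "+".
    # The lookahead requires a comma, whitespace, or end of line after the
    # value, so a run_id that is a prefix of another one does not match.
    pattern = re.compile(rf"\brun_id={re.escape(run_id)}(?=,|\s|$)")
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    sample = [
        "Marking task as SUCCESS: dag_id=Test1, task_id=task1, run_id=run_a, execution_date=...",
        "Marking task as SUCCESS: dag_id=Test1, task_id=task1, run_id=run_ab, execution_date=...",
    ]
    for line in find_lines_for_run(sample, "run_a"):
        print(line)
```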

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

RyuSA added the kind:feature and needs-triage labels Apr 26, 2024

boring-cyborg bot commented Apr 26, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
