Include run_id in TaskInstance Logging for Enhanced Traceability and Debugging
#39272
Closed
2 tasks done
Description
Currently, the log outputs for TaskInstance state changes within taskinstance.py do not include the run_id (see https://github.com/apache/airflow/blob/2.9.0/airflow/models/taskinstance.py#L1200). This omission makes it difficult to directly trace and verify the status of specific Task Instances without relying on inference from the execution_date. Including run_id in these logs would greatly enhance traceability and debugging, particularly in environments where multiple runs might be managed concurrently.
I propose changing the log outputs to include run_id in TaskInstance._log_state and in the TaskDeferred log entry of TaskInstance._run_raw_task. The run_id is one of the primary keys for TaskInstances, and I think its inclusion in the logs should not introduce any issues.
Current Log Outputs
Proposed Log Outputs
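As a rough illustration of the proposal, here is a minimal sketch of a state-change message that carries run_id alongside the other identifiers. The helper name format_state_log, the logger name, the field order, and the sample values are all assumptions made for illustration; this is not the code in taskinstance.py.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("taskinstance_sketch")


def format_state_log(state, dag_id, task_id, run_id, execution_date):
    """Build a state-change message that includes run_id.

    Hypothetical helper: the field names and their order are assumptions,
    not the actual format used by Airflow.
    """
    return (
        f"Marking task as {state.upper()}. "
        f"dag_id={dag_id}, task_id={task_id}, run_id={run_id}, "
        f"execution_date={execution_date}"
    )


# Sample values are made up for the example.
message = format_state_log(
    state="success",
    dag_id="example_dag",
    task_id="example_task",
    run_id="manual__2024-04-26T00:00:00+00:00",
    execution_date="20240426T000000",
)
logger.info(message)
```

With run_id present in the message, a worker log could be searched directly for the run in question rather than for a derived execution_date.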
Use case/motivation
Debugging and troubleshooting specific DAGRuns on Workers can be cumbersome without the ability to search logs by run_id. Currently, we need to infer the execution_date from the run_id in order to search the worker log for the final status of a TaskInstance on a Worker. This leads to unnecessary queries and steps. Including run_id in the logs would simplify log searches in distributed environments (like Airflow on Kubernetes) and would likely help other users facing similar challenges.
Related issues
No response
Are you willing to submit a PR?
Code of Conduct