Don't Fail LocalTaskJob on heartbeat (#41704) (#41810)
* Never fail an ltj over a heartbeat

* Log a warning on failed heartbeat

* Avoid using f-string in log

* Remove unnecessary pass statement

(cherry picked from commit 6647610)

Co-authored-by: Collin McNulty <[email protected]>
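
The "Avoid using f-string in log" item can be illustrated with a small sketch. It assumes only the standard library `logging` module; `CountingStr` and the logger name `heartbeat_demo` are hypothetical names made up for this example and are not part of Airflow. With `%s`-style arguments the message is formatted only if the record is actually emitted, whereas an f-string is rendered before the logging call is even made.

```python
import logging


class CountingStr:
    """Hypothetical helper that counts how often it is rendered to text."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "boom"


log = logging.getLogger("heartbeat_demo")
log.setLevel(logging.ERROR)  # WARNING records are filtered out by this logger

# %-style arguments: formatting is deferred, and skipped for a filtered record.
lazy = CountingStr()
log.warning("Heartbeat failed with Exception: %s", lazy)

# f-string: the argument is rendered eagerly, even though nothing is logged.
eager = CountingStr()
log.warning(f"Heartbeat failed with Exception: {eager}")
```

After running this, `lazy.renders` stays at zero while `eager.renders` is one, which is the cost the deferred form avoids.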
jedcunningham and collinmcnulty authored Aug 28, 2024
1 parent ada9003 commit d906b51
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions airflow/jobs/local_task_job_runner.py
@@ -208,9 +208,14 @@ def sigusr2_debug_handler(signum, frame):

 if span.is_recording():
     span.add_event(name="perform_heartbeat")
-perform_heartbeat(
-    job=self.job, heartbeat_callback=self.heartbeat_callback, only_if_necessary=False
-)
+try:
+    perform_heartbeat(
+        job=self.job, heartbeat_callback=self.heartbeat_callback, only_if_necessary=False
+    )
+except Exception as e:
+    # Failing the heartbeat should never kill the localtaskjob
+    # If it repeatedly can't heartbeat, it will be marked as a zombie anyhow
+    self.log.warning("Heartbeat failed with Exception: %s", e)

 # If it's been too long since we've heartbeat, then it's possible that
 # the scheduler rescheduled this task, so kill launched processes.
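
The pattern in the diff above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the Airflow implementation: `run_loop`, `flaky_heartbeat`, and the logger name `ltj_sketch` are hypothetical names invented here, and the real job runner does considerably more per iteration. The sketch only shows that a heartbeat which raises is logged as a warning and does not stop the surrounding loop.

```python
import logging

log = logging.getLogger("ltj_sketch")


def run_loop(heartbeat, iterations):
    """Call `heartbeat` once per iteration without letting its failure end the loop.

    Hypothetical helper for illustration only; not an Airflow API.
    Returns (completed_iterations, failed_heartbeats).
    """
    completed = 0
    failures = 0
    for _ in range(iterations):
        try:
            heartbeat()
        except Exception as e:
            # Same policy as the diff: warn and carry on.
            failures += 1
            log.warning("Heartbeat failed with Exception: %s", e)
        completed += 1
    return completed, failures


def flaky_heartbeat():
    """Stand-in for a heartbeat that cannot reach its backend."""
    raise ConnectionError("database unreachable")


result = run_loop(flaky_heartbeat, iterations=3)
```

Because the handler catches `Exception` rather than `BaseException`, `KeyboardInterrupt` and `SystemExit` still propagate, so the loop remains interruptible.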