Delayed evaluations for stop_after_client_disconnect can cause unwanted extra followup evaluations around job garbage collection
#8098
stop_after_client_disconnect was introduced in Nomad 0.11.2. If a Nomad client is separated from the network, causing the scheduler to delay the evaluation, and that job is subsequently garbage collected, the followup evaluation will create two more followup evaluations, each with a zero WaitUntil. Unfortunately, each of those will in turn create two more, so the number of evaluations keeps doubling, ultimately causing the cluster leader to become unresponsive.

This failure mode requires the combination of jobs opting in to the new feature, the feature being used to delay rescheduling, and the job being garbage collected (or removed with stop -purge).
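To give a rough sense of how quickly the doubling described above gets out of hand, here is a minimal sketch of the arithmetic. It is a toy model, not Nomad code: `followupsPerEval` and `pendingAfter` are hypothetical names made up for illustration, and the model assumes every followup evaluation spawns exactly two more with no delay between generations, as a zero WaitUntil would suggest.

```go
package main

import "fmt"

// followupsPerEval is the fan-out described in the report: each followup
// evaluation for the garbage-collected job creates two more.
// (Hypothetical constant for this toy model, not a Nomad identifier.)
const followupsPerEval = 2

// pendingAfter returns how many followup evaluations exist in the given
// generation, starting from a single delayed evaluation at generation 0.
func pendingAfter(generations int) int {
	pending := 1
	for i := 0; i < generations; i++ {
		pending *= followupsPerEval
	}
	return pending
}

func main() {
	// Print the size of a few generations to show the growth rate.
	for _, g := range []int{1, 5, 10, 20} {
		fmt.Printf("generation %2d: %d evaluations\n", g, pendingAfter(g))
	}
}
```

Under these assumptions the twentieth generation alone would contain over a million evaluations, which is consistent with the leader becoming unresponsive shortly after the job is garbage collected.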