scheduler: fix job update placement on prev node penalized #6781
Conversation
dd4a98e to 6f98b61
scheduler/generic_sched.go (Outdated)

```go
// If alloc failed, penalize the node it failed on to encourage
// rescheduling on a new node.
if prevAllocation.ClientStatus == "failed" {
```
I think it's ok to omit "lost", but it might mean that the whole node reboots and the job gets scheduled back on the same node when it becomes available again; I'm not sure we want to apply a penalty there. Also, for google-fu I think we should use structs.AllocClientStatusFailed.
Yeah, I couldn't think of a reason to penalize lost -- if the node comes back in time for the scheduler to place it there (seems extremely unlikely), what's the harm? If it's a flaky node I guess that's a problem, but now we're compounding unlikely scenarios.

If there's a chance the scheduler worker's state store is using a snapshot that doesn't see the node as lost yet, that's a good reason to penalize it, I suppose. That could cause a pathological situation where many lost allocations are placed back on the lost node, only to fail at placement and go through scheduling all over again at a fresher snapshot. This too seems pretty unlikely to me, and impossible if we ensure the snapshot is at least as fresh as the evaluation being processed in the first place.
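To make the "snapshot at least as fresh as the evaluation" idea concrete, here's a minimal sketch; every type and method below is a hypothetical stand-in for illustration, not Nomad's actual scheduler API:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the scheduler worker's state store and the
// evaluation being processed; these are not Nomad's real types.
type Snapshot struct{ Index uint64 }

type StateStore struct{ latestIndex func() uint64 }

func (s *StateStore) Snapshot() Snapshot { return Snapshot{Index: s.latestIndex()} }

type Evaluation struct{ ModifyIndex uint64 }

// waitForFreshSnapshot blocks until the store can produce a snapshot at
// least as fresh as the evaluation, so a node already marked lost at eval
// time can't still look eligible through a stale view of the cluster.
func waitForFreshSnapshot(ctx context.Context, store *StateStore, eval *Evaluation) (Snapshot, error) {
	for {
		if snap := store.Snapshot(); snap.Index >= eval.ModifyIndex {
			return snap, nil
		}
		select {
		case <-ctx.Done():
			return Snapshot{}, ctx.Err()
		case <-time.After(10 * time.Millisecond): // poll until the store catches up
		}
	}
}

func main() {
	start := time.Now()
	store := &StateStore{latestIndex: func() uint64 {
		// Simulate a store whose index advances every 50ms.
		return uint64(time.Since(start) / (50 * time.Millisecond))
	}}
	snap, err := waitForFreshSnapshot(context.Background(), store, &Evaluation{ModifyIndex: 3})
	fmt.Println(snap.Index, err) // 3 <nil>, after roughly 150ms
}
```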
+1 to using structs.AllocClientStatusFailed
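Applied to the snippet above, the check would compare against the named constant instead of a raw string. A minimal sketch, with a local stand-in for the structs constant since this isn't the exact upstream code:

```go
package main

import "fmt"

// Stand-in for structs.AllocClientStatusFailed; comparing against the named
// constant keeps call sites greppable and avoids typos in the literal.
const AllocClientStatusFailed = "failed"

type Allocation struct{ ClientStatus string }

// If the alloc failed, penalize the node it failed on to encourage
// rescheduling on a new node.
func shouldPenalizePrevNode(prev *Allocation) bool {
	return prev != nil && prev.ClientStatus == AllocClientStatusFailed
}

func main() {
	fmt.Println(shouldPenalizePrevNode(&Allocation{ClientStatus: "failed"}))   // true
	fmt.Println(shouldPenalizePrevNode(&Allocation{ClientStatus: "complete"})) // false
}
```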
Fixes #5856

When the scheduler looks for a placement for an allocation that's replacing another allocation, it's supposed to penalize the previous node only if the allocation had been rescheduled or failed. But we're currently always penalizing the node, which leads to unnecessary migrations on job update.

This commit leaves in place the existing behavior where, if the previous alloc was itself rescheduled, its previous nodes are also penalized. This is conservative, but it is the right behavior, especially on larger clusters where a group of hosts might be having correlated trouble (like an AZ failure).

Co-Authored-By: Michael Schurter <[email protected]>
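As a rough sketch of the behavior the commit message describes (simplified stand-ins for the scheduler's types; the reschedule-history shape is an assumption for illustration):

```go
package main

import "fmt"

const AllocClientStatusFailed = "failed"

// Simplified stand-ins for the scheduler's allocation and its reschedule
// history; field shapes here are assumptions for illustration.
type RescheduleEvent struct{ PrevNodeID string }

type Allocation struct {
	NodeID           string
	ClientStatus     string
	RescheduleEvents []*RescheduleEvent
}

// previousNodePenalties returns the nodes to penalize when placing a
// replacement: the immediate previous node only if the alloc failed (so a
// plain job update can land back on the same node), plus every node in the
// alloc's reschedule history (the conservative behavior left in place).
func previousNodePenalties(prev *Allocation) map[string]struct{} {
	penalized := map[string]struct{}{}
	if prev == nil {
		return penalized
	}
	if prev.ClientStatus == AllocClientStatusFailed {
		penalized[prev.NodeID] = struct{}{}
	}
	for _, ev := range prev.RescheduleEvents {
		penalized[ev.PrevNodeID] = struct{}{}
	}
	return penalized
}

func main() {
	updated := &Allocation{NodeID: "node-1", ClientStatus: "complete"}
	failed := &Allocation{
		NodeID:           "node-2",
		ClientStatus:     AllocClientStatusFailed,
		RescheduleEvents: []*RescheduleEvent{{PrevNodeID: "node-1"}},
	}
	fmt.Println(len(previousNodePenalties(updated))) // 0: job update, no penalty
	fmt.Println(len(previousNodePenalties(failed)))  // 2: failed node plus history
}
```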
6f98b61 to e82244c
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.