-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
system: re-evaluate node on feasability changes #11007
Conversation
Fix a bug where system jobs may fail to be placed on a node that initially was not eligible for system job placement. This changes causes the reschedule to re-evaluate the node if any attribute used in feasability checks changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great fix! Sad how long this has been outstanding but happy to see it closed. I think a few more fields need to included and/or documented why they aren't included.
It's unfortunate how this logic depends on logic faraway in feasibility checking... wish there was some way to test them together.
nomad/node_endpoint.go
Outdated
// check fields used by the feasability checkers, through direct | ||
// or interpolated constraints. | ||
return !(original.ID == updated.ID && | ||
original.Datacenter == updated.Datacenter && | ||
original.Name == updated.Name && | ||
original.NodeClass == updated.NodeClass && | ||
reflect.DeepEqual(original.Attributes, updated.Attributes) && | ||
reflect.DeepEqual(original.Meta, updated.Meta) && | ||
reflect.DeepEqual(original.Drivers, updated.Drivers)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, some other fields that might need to be considered:
- NodeResources
- ReservedResources
- HostVolumes
It might also be worth documenting that DrainStrategy and SchedulingEligibility do not need to be considered here since they're only mutated through other endpoints that will emit node evals of their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch on HostVolumes!.
Resource constraints are handled differently: they are handled by blocking evals that got addressed in #5900 .
Fix a bug where system jobs may fail to be placed on a node that initially was not eligible for system job placement. This changes causes the reschedule to re-evaluate the node if any attribute used in feasibility checks changes. Fixes #8448
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Fix a bug where system jobs may fail to be placed on a node that
initially was not eligible for system job placement.
This changes causes the reschedule to re-evaluate the node if any
attribute used in feasability checks changes.
Fixes #8448