-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement affinity support in the scheduler (batch and service jobs) #4513
Conversation
0f4f8df
to
2c61158
Compare
for _, in := range input { | ||
cleaned := strings.TrimSpace(in) | ||
lookup[cleaned] = struct{}{} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this get a test?
scheduler/rank.go
Outdated
if collisions > 0 { | ||
scorePenalty := -1 * float64(collisions) * iter.penalty | ||
option.Score += scorePenalty | ||
scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be desiredCount - 1
? If you have count = 2 and they both end up on the same node, shouldn't that be the max negative score?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we would want (collisions +1)/desiredCount)
as the score then? i.e get the count of current and proposed allocs in this node, and add one for the current placement being decided in the iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah what you said is correct
iter.penaltyNodes = make(map[string]struct{}) | ||
iter.source.Reset() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments
scheduler/rank.go
Outdated
source RankIterator | ||
} | ||
|
||
func NewScoreNormalizationIterator(ctx Context, source RankIterator) *ScoreNormalizationIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments
scheduler/rank.go
Outdated
|
||
func (iter *ScoreNormalizationIterator) Next() *RankedNode { | ||
option := iter.source.Next() | ||
if option == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if option == nil || len(option.Scores) == 0 {
return nil
}
sum := 0.0
for _, score := range option.Scores {
sum += score
}
option.FinalScore = sum / float64(numScorers)
scheduler/rank_test.go
Outdated
} | ||
// Score should be averaged between both scorers | ||
// -0.5 from job anti affinity and -1 from node rescheduling penalty | ||
if out[0].FinalScore != -0.75 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use require package
scheduler/rank.go
Outdated
} | ||
|
||
func (iter *NodeAffinityIterator) SetJob(job *structs.Job) { | ||
if job.Affinities != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the guard
scheduler/rank.go
Outdated
if option == nil { | ||
return nil | ||
} | ||
if len(iter.affinities) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !iter.hasAffinities()
return option | ||
} | ||
|
||
func matchesAffinity(ctx Context, affinity *structs.Affinity, option *structs.Node) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should put a TODO to use the computed node class as a cache. We could skip resolving and checking based on the computed node class. Since we are examine all nodes this can be a decent speed up
scheduler/stack.go
Outdated
@@ -304,6 +302,7 @@ func (s *SystemStack) SetJob(job *structs.Job) { | |||
|
|||
func (s *SystemStack) Select(tg *structs.TaskGroup, options *SelectOptions) (*RankedNode, *structs.Resources) { | |||
// Reset the binpack selector and context | |||
s.scoreNorm.Reset() | |||
s.binPack.Reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete binPack reset as the scoreNorm will reset everything
14dc4ef
to
65d7e22
Compare
jobspec/test-fixtures/basic.hcl
Outdated
@@ -76,11 +83,25 @@ job "binstore-storagelocker" { | |||
healthy_deadline = "11m" | |||
} | |||
|
|||
affinity { | |||
attribute = "${node.datacenter}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vertical align equal signs and values
jobspec/test-fixtures/basic.hcl
Outdated
@@ -16,6 +16,13 @@ job "binstore-storagelocker" { | |||
value = "windows" | |||
} | |||
|
|||
affinity { | |||
attribute = "${meta.team}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vertical align equal signs and values
jobspec/test-fixtures/basic.hcl
Outdated
task "binstore" { | ||
driver = "docker" | ||
user = "bob" | ||
leader = true | ||
|
||
affinity { | ||
attribute = "${meta.foo}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vertical align equal signs and values
nomad/structs/diff.go
Outdated
[]string{"str"}, | ||
"Affinity", | ||
contextual) | ||
if affinitiesDiff != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
append()
will actually Do The Right Thing (nothing) with nil and empty slices, so you don't need this
https://play.golang.org/p/Cvgda2DSO1r
skipScoreThreshold = -10.0 | ||
// that have a score lower than this. -1 is the lowest possible score for a | ||
// node with penalties (based on job anti affinity and node rescheduling penalties | ||
skipScoreThreshold = 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment is missing a closing parenthesis, but I'm curious why this changed from being equal to the value in the comment to 0. Seems strange for the comment to mention one number and the const is set to another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit but LGTM
scheduler/rank.go
Outdated
if collisions > 0 { | ||
scorePenalty := -1 * float64(collisions) * iter.penalty | ||
option.Score += scorePenalty | ||
scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah what you said is correct
scheduler/rank.go
Outdated
totalAffinityScore := 0.0 | ||
for _, affinity := range iter.affinities { | ||
if matchesAffinity(iter.ctx, affinity, option.Node) { | ||
normScore := affinity.Weight / sumWeight |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be more accurate to just keep adding the weights and dividing only once at the end. By dividing each time we will accumulate more floating point losses in accuracy
65d7e22
to
f2dc2fe
Compare
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR builds on top of #4512 and implements incorporating affinities into the placement algorithm. Changes include:
The scheduler changes from having different ranges of scores from bin packing (0-18), job anti affinity (-30 to -10) and node anti affinity (-50), to all scores from different factors being normalized to a value between [-1, 1]. Each scoring iterator appends its scores to a list, and a normalization iterator as the last step averages them.
Node affinities/anti affinities as a new step in the scoring task. Inter affinity weights are also normalized. (e.g if a job has two affinities each with a weight of 100, their weight in the scoring layer becomes 0.5 each).
If the job or task group has affinities, then the scheduler examines all nodes rather than log(n) nodes.