Implement spreading allocations based on a target node attribute #4527
40fd04c to 75dab6d
8ec5642 to ed4cb93
75dab6d to 96a67ee
nomad/structs/structs.go
Outdated
@@ -2185,6 +2189,13 @@ func (j *Job) Validate() error {
}
}

for idx, spread := range j.Spreads {
Should error if specifying it on system job?
nomad/structs/structs.go
Outdated
@@ -5384,6 +5407,82 @@ func (a *Affinity) Validate() error {
return mErr.ErrorOrNil()
}

type Spread struct {
Comments
nomad/structs/structs.go
Outdated
str string
}

type SpreadTarget struct {
Don't intermix struct definitions and methods. Move SpreadTarget and its methods either below or above Spread
nomad/structs/structs.go
Outdated
if s.str != "" {
return s.str
}
s.str = fmt.Sprintf("%s %v", s.Value, s.Percent)
%q %v%%?
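A minimal sketch of what the suggested format string would produce. The `spreadTarget` type here is a hypothetical stand-in for the `SpreadTarget` struct in the diff, not the code that was merged; `%q` quotes the attribute value and `%%` emits a literal percent sign.

```go
package main

import "fmt"

// spreadTarget is a hypothetical stand-in for the SpreadTarget struct
// under review; only the two fields shown in the diff are modeled.
type spreadTarget struct {
	Value   string
	Percent uint32
}

// String formats the target using the reviewer's suggested verbs:
// %q quotes the value, %v prints the number, %% prints a literal '%'.
func (s *spreadTarget) String() string {
	return fmt.Sprintf("%q %v%%", s.Value, s.Percent)
}

func main() {
	t := &spreadTarget{Value: "dc1", Percent: 80}
	fmt.Println(t.String())
}
```

Quoting the value makes an empty or whitespace-only attribute value visible in logs, which the plain `%s` verb would hide.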
nomad/structs/structs.go
Outdated
type SpreadTarget struct {
Value string
Percent uint32
Validate percent is 0-100? Should we also validate that the sum of these isn't greater than 100?
scheduler/propertyset.go
Outdated
@@ -110,7 +124,7 @@ func (p *propertySet) setConstraint(constraint *structs.Constraint, taskGroup st

// populateExisting is a helper shared when setting the constraint to populate
// the existing values.
func (p *propertySet) populateExisting(constraint *structs.Constraint) {
func (p *propertySet) populateExisting(targetAttribute string) {
Why does this take the targetAttribute? Just read off of p.targetAttribute
scheduler/spread.go
Outdated
job *structs.Job
tg *structs.TaskGroup
jobSpreads []*structs.Spread
tgSpreadInfo map[string]spreadAttributeMap
Can you put some comments on these non-obvious fields and on the types
scheduler/spread.go
Outdated
}
}

// Check if there is a distinct property
Wrong comment
scheduler/spread.go
Outdated
combinedSpreads = append(combinedSpreads, tg.Spreads...)
combinedSpreads = append(combinedSpreads, iter.jobSpreads...)
for _, spread := range combinedSpreads {
sumWeight := uint32(0)
sumPercents?
scheduler/spread.go
Outdated
nValue, errorMsg, usedCount := pset.UsedCount(option.Node, tgName)
// Skip if there was errors in resolving this attribute to compute used counts
if errorMsg != "" {
continue
Log?
scheduler/spread.go
Outdated
desiredCount, ok := spreadDetails.desiredCounts[nValue]
if !ok {
// Warn about missing ratio
iter.ctx.Logger().Printf("[WARN] sched: missing desired distribution percentage for attribute value %v in spread stanza for job %v", nValue, iter.job.ID)
I wouldn't log this since it will be noisy if the user doesn't enumerate every value
Also talked about this offline, but I don't think this behavior is right. Refer to what we talked about
@@ -3336,6 +3347,10 @@ type TaskGroup struct {
// Affinities can be specified at the task group level to express
// scheduling preferences.
Affinities []*Affinity

// Spread can be specified at the task group level to express spreading
Will it be possible to specify this at the job level and have the same behavior as update{}, where it cascades / inherits down from job -> spread to job -> group -> spread?
Similar question for affinity{}.
@jippi Yes, this and affinities both work similarly to the update stanza and cascade down.
nomad/structs/structs.go
Outdated
mErr.Errors = append(mErr.Errors, outer)
if j.Type == JobTypeSystem {
if j.Spreads != nil {
mErr.Errors = append(mErr.Errors, fmt.Errorf("System jobs may not have a s stanza"))
"a s" -> "a spread"
nomad/structs/structs.go
Outdated
Weight int
// Attribute is the node attribute used as the spread criteria
Attribute string
// Weight is the relative weight of this spread, useful when there are multiple
Blank line between fields when you add comments
nomad/structs/structs.go
Outdated
sumPercent += target.Percent
}
if sumPercent > 100 {
mErr.Errors = append(mErr.Errors, errors.New("Sum of spread target percentages must not be greater than 100"))
should we make it 100%?
Sum of spread target percentages must not be greater than 100%; got %d%%
nomad/structs/structs.go
Outdated
// for each attribute value
type SpreadTarget struct {
// Value is a single attribute value, like "dc1"
Value string
Same comment here about spacing
nomad/structs/structs.go
Outdated
}
return mErr.ErrorOrNil()
}

// SpreadTarget is used to specify desired percentages
80 width lines? Why is this so truncated
scheduler/spread.go
Outdated
desiredCount, ok = spreadDetails.desiredCounts[ImplicitTarget]
if !ok {
// The desired count for this attribute is zero if it gets here
// don't boost the score
Should the score be negative here? If I have specified targets that add up to 100% and this attribute isn't there, the operator is saying it shouldn't be selected?
scheduler/spread.go
Outdated
}
if float64(usedCount) < desiredCount {
// Calculate the relative weight of this specific spread attribute
spreadWeight := float64(spreadDetails.weight) / float64(iter.sumSpreadWeights)
Add whitespace to improve readability
scheduler/spread.go
Outdated
continue
}
}
if float64(usedCount) < desiredCount {
Why is this only boosting the score? Shouldn't we add a penalty when you are > than desired count?
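One way to read this request is a single signed score that is positive below the desired count and negative above it. The sketch below is an illustration under stated assumptions, not the scheduler's implementation: `targetScore` and its parameters are hypothetical names, and a desired count of zero is treated as the maximum penalty of -1.0, following the discussion in the threads here.

```go
package main

import "fmt"

// targetScore is a hypothetical helper that returns a signed score for
// placing on a node whose attribute value already holds usedCount
// allocations, given the desired count for that value. The result is
// scaled by this spread's share of the total spread weight.
func targetScore(usedCount int, desiredCount float64, weight, sumWeights int) float64 {
	if sumWeights <= 0 {
		// No weights to normalize against; contribute nothing.
		return 0
	}
	if desiredCount <= 0 {
		// Assumption: a value with no desired allocations gets the
		// maximum possible penalty.
		return -1.0
	}

	spreadWeight := float64(weight) / float64(sumWeights)

	// Positive when under the desired count, negative when over it.
	return (desiredCount - float64(usedCount)) / desiredCount * spreadWeight
}

func main() {
	fmt.Println(targetScore(2, 8, 50, 100))    // under target: boost
	fmt.Println(targetScore(10, 8, 100, 100))  // over target: penalty
	fmt.Println(targetScore(0, 0, 100, 100))   // not a desired target
}
```

Using one expression for both directions removes the `usedCount < desiredCount` branch, so a value that is over its target is penalized in proportion to how far over it is.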
scheduler/spread.go
Outdated
// Get the nodes property value
nValue, ok := getProperty(option, pset.targetAttribute)
currentAttributeCount := uint64(0)
if ok {
If !ok, shouldn't we return a negative score?
Good catch, will fix.
I conflated !ok with the case where that specific value is not present in the combinedUse map.
scheduler/spread.go
Outdated
}
if currentAttributeCount < minCount {
// Small positive boost for attributes with min count
return evenSpreadBoost
This should be a function of the discrepancy
scheduler/spread.go
Outdated
// don't boost the score
continue
// so use the maximum possible penalty for this node
totalSpreadScore += -1.0
-= 1.0
You are falling through to the rest of the code at 140-149 which is incorrect
scheduler/spread.go
Outdated
if currentAttributeCount < minCount {
// Small positive boost for attributes with min count
return evenSpreadBoost
// positive boost for attributes with min count
The first two cases can be simplified to if currentAttributeCount != minCount {
scheduler/spread.go
Outdated
return evenSpreadBoost
// Maximum possible boost when there is another attribute with
// more allocations
return 1.0
Why isn't this delta based from the max? You could imagine attributes with the following alloc counts:
{a1: 2, a2: 3, a3: 4}
min = 2, max = 4.
a1 should have a score higher than a2?
Small changes requested but LGTM
scheduler/spread.go
Outdated
@@ -171,6 +185,9 @@ func evenSpreadScoreBoost(pset *propertySet, option *structs.Node) float64 {
currentAttributeCount := uint64(0)
if ok {
currentAttributeCount = combinedUseMap[nValue]
} else {
// If the attribute isn't set on the node, it should get the maximum possible penalty
return -1.0
nValue, ok := getProperty(option, pset.targetAttribute)
if !ok {
return -1
}
currentAttributeCount := uint64(combinedUseMap[nValue]) // Defaults to 0 anyways
scheduler/spread.go
Outdated
} else if currentAttributeCount > minCount {
// Negative boost if attribute count is greater than minimum
if currentAttributeCount != minCount {
// Boost based on delta between current and min
return deltaBoost
} else {
if currentAttributeCount != minCount {
return deltaBoost
} else if minCount == maxCount {
return -1.0
}
// Penalty based on delta from max value
delta := int(maxCount - minCount)
deltaBoost = float64(delta) / float64(minCount)
return deltaBoost
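The suggestion above can be read as the following standalone sketch. It is an interpretation, not the merged code: `evenSpreadScore` is a hypothetical function, the use map is passed in directly instead of coming from a `propertySet`, and the handling of a minimum count of zero (to avoid dividing by zero) is an assumption. Scores are not clamped to [-1, 1] here.

```go
package main

import "fmt"

// evenSpreadScore is a hypothetical, simplified reading of the review
// suggestion. counts maps each attribute value to its allocation count;
// value is the candidate node's attribute value and hasValue reports
// whether the node has the attribute set at all.
func evenSpreadScore(counts map[string]uint64, value string, hasValue bool) float64 {
	if !hasValue {
		// Attribute missing on the node: maximum penalty.
		return -1.0
	}
	current := counts[value] // defaults to 0 when the value is unseen

	var minCount, maxCount uint64
	first := true
	for _, c := range counts {
		if first || c < minCount {
			minCount = c
		}
		if first || c > maxCount {
			maxCount = c
		}
		first = false
	}

	if minCount == 0 {
		// Assumption: avoid dividing by zero. An empty value gets the
		// full boost when others are in use, otherwise the penalty.
		if current == 0 && maxCount > 0 {
			return 1.0
		}
		return -1.0
	}

	if current != minCount {
		// Delta between the minimum and the current count, relative
		// to the minimum; negative when current is above the minimum.
		return (float64(minCount) - float64(current)) / float64(minCount)
	}
	if minCount == maxCount {
		// Distribution is already even.
		return -1.0
	}
	// Current value holds the minimum: boost based on delta from max.
	return float64(maxCount-minCount) / float64(minCount)
}

func main() {
	counts := map[string]uint64{"a1": 2, "a2": 3, "a3": 4}
	for _, v := range []string{"a1", "a2", "a3"} {
		fmt.Printf("%s: %v\n", v, evenSpreadScore(counts, v, true))
	}
}
```

With the counts from the earlier example, {a1: 2, a2: 3, a3: 4}, this ordering ranks a1 above a2 and a2 above a3, which is the property the reviewer asked for.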
nomad/structs/structs.go
Outdated
@@ -2185,6 +2189,19 @@ func (j *Job) Validate() error {
}
}

if j.Type == JobTypeSystem {
if j.Spreads != nil {
mErr.Errors = append(mErr.Errors, fmt.Errorf("System jobs may not have a s stanza"))
have a s stanza?
nomad/structs/structs.go
Outdated
sumPercent += target.Percent
}
if sumPercent > 100 {
mErr.Errors = append(mErr.Errors, errors.New("Sum of spread target percentages must not be greater than 100"))
Sum of spread target percentages must not be greater than 100%; got %d%%
@@ -602,6 +602,89 @@ func TestServiceSched_JobRegister_DistinctProperty_TaskGroup_Incr(t *testing.T)
h.AssertEvalStatus(t, structs.EvalStatusComplete)
}

// Test job registration with spread configured
func TestServiceSched_Spread(t *testing.T) {
Add one for even spread?
done
scheduler/generic_sched_test.go
Outdated
&structs.Spread{
Attribute: "${node.datacenter}",
Weight: 100,
SpreadTarget: []*structs.SpreadTarget{
Can we maybe make this test a subtest that takes the percentages and expectations and runs through all values asserting the right outcome? [(100, 0), (90,10), (80,20), ...]
nomad/structs/structs.go
Outdated
@@ -5479,7 +5479,7 @@ func (s *Spread) Validate() error {
sumPercent += target.Percent
}
if sumPercent > 100 {
mErr.Errors = append(mErr.Errors, errors.New("Sum of spread target percentages must not be greater than 100"))
mErr.Errors = append(mErr.Errors, errors.New(fmt.Sprintf("Sum of spread target percentages must not be greater than 100%%; got %d%%", sumPercent)))
maybe just fmt.Errorf
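A self-contained sketch of the validation discussed in these threads, using `fmt.Errorf` directly as suggested. The names `spreadTarget` and `validateTargets` are hypothetical, and `errors.Join` (Go 1.20+) stands in for the multierror value (`mErr`) used in the diff so that the example has no external dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

// spreadTarget is a hypothetical stand-in for structs.SpreadTarget.
type spreadTarget struct {
	Value   string
	Percent uint32
}

// validateTargets checks that each percentage is at most 100 and that
// the percentages do not sum to more than 100. It collects every
// problem it finds instead of stopping at the first one.
func validateTargets(targets []spreadTarget) error {
	var errs []error
	sumPercent := uint32(0)

	for _, t := range targets {
		if t.Percent > 100 {
			errs = append(errs, fmt.Errorf(
				"spread target %q percentage must be between 0 and 100; got %d%%",
				t.Value, t.Percent))
		}
		sumPercent += t.Percent
	}

	if sumPercent > 100 {
		// fmt.Errorf replaces errors.New(fmt.Sprintf(...)).
		errs = append(errs, fmt.Errorf(
			"sum of spread target percentages must not be greater than 100%%; got %d%%",
			sumPercent))
	}

	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	ok := []spreadTarget{{"dc1", 70}, {"dc2", 30}}
	bad := []spreadTarget{{"dc1", 70}, {"dc2", 40}}
	fmt.Println(validateTargets(ok))
	fmt.Println(validateTargets(bad))
}
```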
step := uint32(10)

for i := 0; i < 10; i++ {
name := fmt.Sprintf("%d%% in dc1", start)
This test cycles through combinations of spread across both data centers, going from (100, 0) to (0, 100) in increments of 10%.
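As an illustration of how such a table of cases could be generated, here is a hedged sketch. `spreadCase` and `buildSpreadCases` are hypothetical names, and the expected counts assume an exactly proportional outcome for a fixed total; the PR's test asserts against the scheduler's plan instead.

```go
package main

import "fmt"

// spreadCase is a hypothetical table row: the percentages given to each
// datacenter and the allocation counts expected under an exactly
// proportional placement.
type spreadCase struct {
	name       string
	dc1Percent uint32
	dc2Percent uint32
	wantDC1    int
	wantDC2    int
}

// buildSpreadCases walks dc1 from 100% down to 0% in steps of 10 and
// derives the complementary dc2 percentage. total is the number of
// allocations to place; a multiple of 10 keeps the counts exact.
func buildSpreadCases(total int) []spreadCase {
	var cases []spreadCase
	for dc1 := 100; dc1 >= 0; dc1 -= 10 {
		dc2 := 100 - dc1
		cases = append(cases, spreadCase{
			name:       fmt.Sprintf("%d%% in dc1", dc1),
			dc1Percent: uint32(dc1),
			dc2Percent: uint32(dc2),
			wantDC1:    total * dc1 / 100,
			wantDC2:    total * dc2 / 100,
		})
	}
	return cases
}

func main() {
	for _, c := range buildSpreadCases(10) {
		fmt.Printf("%s: want %d in dc1, %d in dc2\n", c.name, c.wantDC1, c.wantDC2)
	}
}
```

Each row could then be passed to `t.Run(c.name, ...)` so that a failure names the percentage split that produced it.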
…red count in each target. Added this as a new step in the stack and some unit tests
instead, calculate them based on delta between current and minimum value
Also addressed other small code review comments
8c3e316 to d8b5ec2
ed4cb93 to bb26ba3
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
This PR implements support for spreading allocations across a given target node attribute (datacenter, rack, etc.) in the scheduler.
Spread can be configured at the task group level, or inherited from the job for all task groups. Spread can be combined with affinities and given weights.
Builds on top of #4513 and #4512. Note to reviewers: it's easier to review this after those two PRs have been merged (should happen sometime this week).
Open questions: