-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multiregion: all regions start in running if no max_parallel #8209
Conversation
If `max_parallel` is not set, all regions should begin in a `running` state rather than a `pending` state. Otherwise the first region is set to `running` and then all the remaining regions once it enters `blocked. That behavior is technically correct in that we have at most `max_parallel` regions running, but definitely not what a user expects.
@@ -198,7 +198,6 @@ func (a *allocReconciler) Compute() *reconcileResults { | |||
// Detect if the deployment is paused | |||
if a.deployment != nil { | |||
a.deploymentPaused = a.deployment.Status == structs.DeploymentStatusPaused | |||
//|| a.deployment.Status == structs.DeploymentStatusPending |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to me and should go into beta - but one question.
scheduler/reconcile.go
Outdated
// region starts in the running state | ||
if a.job.IsMultiregion() && | ||
a.job.Multiregion.Strategy != nil && | ||
a.job.Multiregion.Strategy.MaxParallel != 0 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed some of the changes of PRs - do we also want to check MaxParallel against current region index (e.g. second region with MaxParallel=2); is that handled elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's a good catch. That's the same sort of thing -- we'd safely have at most max_parallel
going, but the operator probably expects us to start with max_parallel
. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here is getting gnarly though so I'm going to pull it out into a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pulled this logic out to a function on Job
in f10cc93
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
In #8209 we fixed the max_parallel stanza for multiregion by introducing the IsMultiregionStarter check, but didn't apply it to the earlier place its required. The result is that deployments start but don't place allocations.
In #8209 we fixed the `max_parallel` stanza for multiregion by introducing the `IsMultiregionStarter` check, but didn't apply it to the earlier place its required. The result is that deployments start but don't place allocations.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
If
max_parallel
is not set, all regions should begin in arunning
staterather than a
pending
state. Otherwise the first region is set torunning
and then all the remaining regions once it enters
blocked
. That behavior istechnically correct in that we have at most
max_parallel
regions running,but definitely not what a user expects.