mrd: reconcile should treat pending deployments as paused (#8446)
If a job update includes a task group that has no changes, the allocations
for that group have their version bumped in-place. This ends up triggering an
eval from `deploymentwatcher` when it verifies their health. Although this
eval is a no-op, we were only treating pending deployments the same as paused
when the deployment was a new MRD (multi-region deployment). This meant that
any eval after the initial one would kick off the deployment, which caused
pending deployments to "jump the queue" and run ahead of schedule, breaking
MRD invariants and resulting in a state with all regions blocked.

This behavior can be replicated even in the case of job updates with no
in-place updates by patching `deploymentwatcher` to inject a spurious no-op
eval. This changeset fixes the behavior by treating pending deployments the
same as paused in all cases in the reconciler.
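
The rule described above can be sketched in isolation. This is a minimal
illustration, not the reconciler itself: `treatAsPaused` is a hypothetical
helper, and the status constants are local stand-ins for the
`structs.DeploymentStatus*` values referenced in the diff (their string
values here are assumptions).

```go
package main

import "fmt"

// Local stand-ins for the deployment statuses; the real constants live in
// Nomad's structs package. The string values are assumed for illustration.
const (
	statusRunning = "running"
	statusPaused  = "paused"
	statusPending = "pending"
	statusFailed  = "failed"
)

// treatAsPaused is a hypothetical helper mirroring the fixed condition:
// a pending deployment is handled the same as a paused one, on every eval.
func treatAsPaused(status string) bool {
	return status == statusPaused || status == statusPending
}

func main() {
	for _, s := range []string{statusRunning, statusPaused, statusPending, statusFailed} {
		fmt.Printf("%-8s treated as paused: %v\n", s, treatAsPaused(s))
	}
}
```

Because the check depends only on the deployment status, a spurious no-op
eval sees the same answer as the initial eval and cannot start a pending
deployment early.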
tgross authored Jul 16, 2020
1 parent 238f7dc commit 3b52b39
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion scheduler/reconcile.go
@@ -197,7 +197,8 @@ func (a *allocReconciler) Compute() *reconcileResults {

 	// Detect if the deployment is paused
 	if a.deployment != nil {
-		a.deploymentPaused = a.deployment.Status == structs.DeploymentStatusPaused
+		a.deploymentPaused = a.deployment.Status == structs.DeploymentStatusPaused ||
+			a.deployment.Status == structs.DeploymentStatusPending
 		a.deploymentFailed = a.deployment.Status == structs.DeploymentStatusFailed
 	}
 	if a.deployment == nil {