Fix multiple bugs with progress deadline handling #4842

Merged
merged 3 commits into from
Nov 8, 2018

@dadgar dadgar commented Nov 6, 2018

Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment, regardless of
whether the task group with that deadline had already finished.

The PR also makes handling deployment status updates from the client more robust.

Further fix an issue where the blocked eval optimization would prevent
any evals from being created to progress the deployment. To reproduce
this issue prior to this commit, create a job with two task groups.
The first group has count 1 and resources such that it cannot be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this job first and then update the second group to trigger a
deployment. It will place the first of three but never progress, since
a blocked eval exists. However, the existence of that blocked eval
doesn't capture the fact that there are two groups being deployed.
@dadgar dadgar requested a review from preetapan November 6, 2018 01:08
@preetapan preetapan changed the title Fix multiple tgs with progress deadline handling Fix multiple bugs with progress deadline handling Nov 6, 2018
@@ -419,7 +417,12 @@ FAIL:
default:
}
}
deadlineTimer.Reset(next.Sub(time.Now()))

// If the next deadline is zero, we should not reset the timer

Could this comment clarify what problem this !zero check is actually fixing? I'm concerned that this means that under some conditions the deadline timer would never get reset.

@@ -820,5 +871,10 @@ func (w *deploymentWatcher) jobEvalStatus() (latestIndex uint64, blocked bool, e
}
}

return max, false, nil
if max == uint64(0) {

This index returns the max eval index across all jobs, but we care about a single job. We could miss an update if evals were GC'd before this ran. This should return zero instead.

copyAlloc.DeploymentStatus.Canary = true
// The client can only set its deployment health and timestamp, so just take
// those
if copyAlloc.DeploymentStatus != nil && alloc.DeploymentStatus != nil {

Why did this change?

@preetapan preetapan left a comment

LGTM, but there was a drive-by fix in the state store that looked like it was about minimizing the number of settable fields in an alloc update. That didn't seem related to the bugfix.

dadgar commented Nov 8, 2018

@preetapan The server was actually panicking if the client sent an update without the deployment status set, so the new code is more correct and safer.
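
A minimal sketch of the nil-safe merge described above, assuming simplified types: `Allocation`, `DeploymentStatus`, and `mergeClientUpdate` are hypothetical stand-ins for this example and do not reproduce the state store code. The idea is to copy only the client-settable fields (health and timestamp) and to tolerate a missing status on either side.

```go
package main

import (
	"fmt"
	"time"
)

// DeploymentStatus and Allocation are hypothetical, trimmed-down types.
type DeploymentStatus struct {
	Healthy   *bool     // set by the client; nil means "not yet known"
	Timestamp time.Time // set by the client alongside Healthy
	Canary    bool      // server-owned; a client update must not change it
}

type Allocation struct {
	ID               string
	DeploymentStatus *DeploymentStatus
}

// mergeClientUpdate returns a copy of existing with only the client-settable
// deployment fields taken from update. A nil status on either side is safe.
func mergeClientUpdate(existing, update *Allocation) *Allocation {
	merged := *existing
	if existing.DeploymentStatus != nil {
		ds := *existing.DeploymentStatus // copy so the stored value is not mutated
		merged.DeploymentStatus = &ds
	}
	if update.DeploymentStatus == nil {
		// The client sent no deployment status: keep what the server has.
		return &merged
	}
	if merged.DeploymentStatus == nil {
		// First status seen for this allocation.
		merged.DeploymentStatus = &DeploymentStatus{}
	}
	if update.DeploymentStatus.Healthy != nil {
		healthy := *update.DeploymentStatus.Healthy
		merged.DeploymentStatus.Healthy = &healthy
		merged.DeploymentStatus.Timestamp = update.DeploymentStatus.Timestamp
	}
	return &merged
}

func main() {
	existing := &Allocation{ID: "a1", DeploymentStatus: &DeploymentStatus{Canary: true}}

	// An update with no deployment status must not panic.
	out := mergeClientUpdate(existing, &Allocation{ID: "a1"})
	fmt.Println("canary preserved:", out.DeploymentStatus.Canary)

	healthy := true
	out = mergeClientUpdate(existing, &Allocation{
		ID:               "a1",
		DeploymentStatus: &DeploymentStatus{Healthy: &healthy, Timestamp: time.Now()},
	})
	fmt.Println("healthy:", *out.DeploymentStatus.Healthy, "canary preserved:", out.DeploymentStatus.Canary)
}
```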

@github-actions
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2023