Nomad doesn't really honor count during updates #2873

Closed
hsmade opened this issue Jul 20, 2017 · 2 comments

hsmade (Contributor) commented Jul 20, 2017

Nomad version

Nomad v0.6.0-rc1 (eac3f49)

Operating system and Environment details

The Nomad repo's Vagrant environment

Issue

When I do a rolling update, tasks are stopped before the new task is healthy. This could degrade performance. I think I even saw a task being stopped before the new one was started, but I can't reproduce that (it's too damn fast! :P)

I would expect Nomad, on a rolling update, to first start a new task and wait for it to become healthy before stopping an 'old' task.

Reproduction steps

$ nomad init

Modify count to 6 and set the update stanza to:

update {
  stagger = "10s"
  max_parallel = 1
  health_check = "checks"
}
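
For reference, here is a rough sketch of what the edited example.nomad ends up looking like (the group and task names match the stock job that nomad init generates; the Redis image tag and the elided stanzas are assumptions, so check your generated file):

job "example" {
  datacenters = ["dc1"]
  type        = "service"

  # New-style update stanza (Nomad 0.6+): with health_check = "checks",
  # an allocation only counts as healthy once its Consul checks pass.
  update {
    stagger      = "10s"
    max_parallel = 1
    health_check = "checks"
  }

  group "cache" {
    count = 6  # raised from the default of 1

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"  # bump this tag before the second nomad run below
      }

      # service, check, and resources stanzas as generated by nomad init
    }
  }
}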

Now:

$ nomad run example.nomad
# modify redis version to something else
$ nomad run example.nomad
$ nomad status
ID            = example
Name          = example
Submit Date   = 07/20/17 08:49:12 UTC
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         5        0       1         0

Latest Deployment
ID          = 09ab9a85
Status      = running
Description = Deployment is running

Deployed
Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy
cache       true         5        1       0        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created At
6487cdab  21bfefd9  cache       1        run      running   07/20/17 08:49:12 UTC
02a12612  21bfefd9  cache       0        run      running   07/20/17 08:48:37 UTC
1d265c23  21bfefd9  cache       0        run      running   07/20/17 08:48:37 UTC
4a143bf8  21bfefd9  cache       0        stop     complete  07/20/17 08:48:37 UTC
6b98ad60  21bfefd9  cache       0        run      running   07/20/17 08:48:37 UTC
772bfe2f  21bfefd9  cache       0        run      running   07/20/17 08:48:37 UTC

Here we see that one of the original 6 tasks has been stopped and a new task has been started that is not yet healthy. This means I effectively have 5 healthy tasks instead of the requested 6.

When you have a count of 1, this actually gets worse:

ID            = example
Name          = example
Submit Date   = 07/20/17 09:14:04 UTC
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       1         0        0       1         0

Latest Deployment
ID          = 5ec1f03a
Status      = running
Description = Deployment is running

Deployed
Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy
cache       true         1        1       0        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created At
522464a3  1bc934ec  cache       1        run      pending   07/20/17 09:14:04 UTC
1f7a2aac  1bc934ec  cache       0        stop     complete  07/20/17 08:58:47 UTC

Here I have no tasks running at all because of the job update I just did.

dadgar (Contributor) commented Jul 21, 2017

Hey,

For the time being, the rolling update will be subtractive and not additive. This may become an option down the line, but for now the feature is working as intended. To get additive behavior for now, you can use a canary and then promote it. Thanks!
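
For readers landing here later, a rough sketch of the canary-and-promote flow dadgar describes, using the Nomad 0.6 update stanza and deployment commands (the values below are illustrative, not taken from this issue):

update {
  max_parallel = 1
  canary       = 1          # place 1 new allocation alongside the old ones
  health_check = "checks"   # the canary must pass its Consul checks first
}

# Once the canary allocation is healthy, promote the deployment so the
# remaining old allocations get replaced one by one:
$ nomad deployment list
$ nomad deployment promote <deployment-id>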

github-actions (bot) commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
