Inconsistent behaviour adding the same job twice or updating without any change. #10981

Closed
blmhemu opened this issue Aug 1, 2021 · 4 comments · Fixed by #10990

blmhemu commented Aug 1, 2021

Nomad version

1.1.3

Operating system and Environment details

Ubuntu 21.04 arm64

Issue

I am submitting the same job (Task1, see below) to Nomad twice without changing it. Nomad still reports that the job is being updated (see Pic1). This happens only for some jobs, so the behaviour is inconsistent (compare Pic1 and Pic2).

Task1:

job "redis" {
  datacenters = ["dc1"]
  type = "service"
  group "redis" {
    count = 1
    task "redis" {
      driver = "docker"
      config {
        image = "redis:6-alpine"
      }
      resources {
        cpu    = 200
        memory = 128
      }
    }
  }
}

Pic1:
image

Task2:

A big traefik job

Pic2:
image

See the difference between pic1 and pic2.

Also, when deploying the same jobs twice with Ansible, some jobs are reported as changed while others are reported as unchanged.
image

Previously (probably 0.10), this did not happen.

blmhemu commented Aug 1, 2021

This can also be reproduced in the UI: go to Jobs -> job (traefik) -> Definition -> Edit (just click; do NOT edit anything) -> Plan.
For some jobs the plan says "(forces in-place update)"; for others it says "(1 ignore)".

blmhemu changed the title from "Inconsistent behaviour adding the same job twice." to "Inconsistent behaviour adding the same job twice or updating without any change." on Aug 1, 2021
notnoop self-assigned this on Aug 1, 2021

blmhemu commented Aug 2, 2021

@notnoop Thanks for the quick fix! Will it be backported to 1.1.3, or will there be a new release (say 1.1.4)?

notnoop commented Aug 2, 2021

The change should be out in 1.1.4 (as well as 1.0.10), due to be released very soon. In my testing, this issue seems to affect only planning; it doesn't appear to actually perform in-place updates. For example, when retrying your job in the CLI, I noticed that while `job plan` returned the "forces in-place update" indicator, `job run` and the subsequent `job status` didn't bump the job version or make any updates:

$ ./nomad version
Nomad v1.1.3 (8c0c8140997329136971e66e4c2337dfcf932692)
$ ./nomad job run ./redis.nomad
==> 2021-08-02T12:07:11-04:00: Monitoring evaluation "ef5fa0dc"
    2021-08-02T12:07:11-04:00: Evaluation triggered by job "redis"
==> 2021-08-02T12:07:13-04:00: Monitoring evaluation "ef5fa0dc"
    2021-08-02T12:07:13-04:00: Evaluation within deployment: "e6fccfcf"
    2021-08-02T12:07:13-04:00: Allocation "533dca8d" created: node "27a22149", group "redis"
    2021-08-02T12:07:13-04:00: Evaluation status changed: "pending" -> "complete"
==> 2021-08-02T12:07:13-04:00: Evaluation "ef5fa0dc" finished with status "complete"
==> 2021-08-02T12:07:13-04:00: Monitoring deployment "e6fccfcf"
  ✓ Deployment "e6fccfcf" successful

    2021-08-02T12:07:24-04:00
    ID          = e6fccfcf
    Job ID      = redis
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    redis       1        1       1        0          2021-08-02T12:17:23-04:00
$ ./nomad job status redis
ID            = redis
Name          = redis
Submit Date   = 2021-08-02T12:07:11-04:00
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
redis       0       0         1        0       0         0

Latest Deployment
ID          = e6fccfcf
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
redis       1        1       1        0          2021-08-02T12:17:23-04:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
533dca8d  27a22149  redis       0        run      running  26s ago  14s ago
$ ./nomad job plan redis.nomad
+/- Job: "redis"
+/- Task Group: "redis" (1 ignore)
  +/- Task: "redis" (forces in-place update)

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 10
To submit the job with version verification run:

nomad job run -check-index 10 redis.nomad

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
$ ./nomad job run ./redis.nomad
==> 2021-08-02T12:07:52-04:00: Monitoring evaluation "2602d546"
    2021-08-02T12:07:52-04:00: Evaluation triggered by job "redis"
    2021-08-02T12:07:52-04:00: Evaluation within deployment: "e6fccfcf"
    2021-08-02T12:07:52-04:00: Evaluation status changed: "pending" -> "complete"
==> 2021-08-02T12:07:52-04:00: Evaluation "2602d546" finished with status "complete"
==> 2021-08-02T12:07:52-04:00: Monitoring deployment "e6fccfcf"
  ✓ Deployment "e6fccfcf" successful

    2021-08-02T12:07:52-04:00
    ID          = e6fccfcf
    Job ID      = redis
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    redis       1        1       1        0          2021-08-02T12:17:23-04:00
$ ./nomad job status redis
ID            = redis
Name          = redis
Submit Date   = 2021-08-02T12:07:11-04:00
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
redis       0       0         1        0       0         0

Latest Deployment
ID          = e6fccfcf
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
redis       1        1       1        0          2021-08-02T12:17:23-04:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
533dca8d  27a22149  redis       0        run      running  47s ago  35s ago

notnoop pushed a commit that referenced this issue Aug 2, 2021
1.1.3 had a bug where task.VolumeMounts would be an empty slice instead of nil. It is eventually canonicalized and set to `nil`, but the empty slice seems to confuse dry-run planning.

The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where the `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally.

Fixes #10981
notnoop pushed a commit that referenced this issue Aug 26, 2021
notnoop pushed a commit that referenced this issue Aug 26, 2021

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2022