volume_mount does not update allocation #13333

Closed
pikeas opened this issue Jun 10, 2022 · 6 comments

pikeas commented Jun 10, 2022

Nomad version: 1.3.1
Operating system and Environment details: Ubuntu 22.04 LTS

Issue

Updating a job spec by adding a volume mount doesn't launch a new container with access to the volume. Stopping and starting the job does work, so it appears the scheduler does not view the new mount as requiring a relaunch.

Reproduction steps

Run job, add a new volume_mount, run the job again to update it.

Expected Result: new volume mount is available in the container.
Actual Result: new volume mount is not available.

Job file

job "name" {
  datacenters = ["dc1"]

  group "name" {
    volume "aaa" {
      type   = "host"
      source = "aaa"
    }

    volume "bbb" {
      type   = "host"
      source = "bbb"
    }


    task "name" {
      driver = "docker"

      config {
        image = "..."
      }

      volume_mount {
        volume      = "aaa"
        destination = "/aaa"
      }

      // run, uncomment, run again to update
      // volume_mount {
      //   volume      = "bbb"
      //   destination = "/bbb"
      // }
    }
  }
}

tgross (Member) commented Jun 10, 2022

Hi @pikeas! This was a known bug prior to Nomad 1.3.1 but it should have been fixed with #13008. I just tested with the released binary and wasn't able to reproduce the problem.

Here's my jobspec:

job "httpd" {
  datacenters = ["dc1"]

  group "web" {

    volume "host_data" {
      type      = "host"
      read_only = false
      source    = "shared_data"
    }

    task "http" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/local"]
      }

      # volume_mount {
      #   volume      = "host_data"
      #   destination = "/host_data"
      #   read_only   = false
      # }

      resources {
        cpu    = 128
        memory = 128
      }

    }
  }
}
$ nomad job run ./jobs/hostvolume.nomad
==> 2022-06-10T14:44:57-04:00: Monitoring evaluation "622b75cd"
    2022-06-10T14:44:57-04:00: Evaluation triggered by job "httpd"
    2022-06-10T14:44:57-04:00: Allocation "69bf398e" created: node "58b610ce", group "web"
==> 2022-06-10T14:44:58-04:00: Monitoring evaluation "622b75cd"
    2022-06-10T14:44:58-04:00: Evaluation within deployment: "59c0e42d"
    2022-06-10T14:44:58-04:00: Allocation "69bf398e" status changed: "pending" -> "running" (Tasks are running)
    2022-06-10T14:44:58-04:00: Evaluation status changed: "pending" -> "complete"
==> 2022-06-10T14:44:58-04:00: Evaluation "622b75cd" finished with status "complete"
==> 2022-06-10T14:44:58-04:00: Monitoring deployment "59c0e42d"
  ⠴ Deployment "59c0e42d" in progress...

    2022-06-10T14:45:00-04:00
    ID          = 59c0e42d
    Job ID      = httpd
    Job Version = 0
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    web         1        1       0        0          2022-06-10T18:54:56Z^C

$ nomad alloc status 69b
...
Recent Events:
Time                       Type        Description
2022-06-10T14:44:57-04:00  Started     Task started by client
2022-06-10T14:44:56-04:00  Task Setup  Building Task Directory
2022-06-10T14:44:56-04:00  Received    Task received by client

Then I uncommented the volume_mount block and ran it again, resulting in a new allocation:

$ nomad job run ./jobs/hostvolume.nomad
==> 2022-06-10T14:45:14-04:00: Monitoring evaluation "21868afe"
    2022-06-10T14:45:14-04:00: Evaluation triggered by job "httpd"
    2022-06-10T14:45:14-04:00: Allocation "d728a3b8" created: node "58b610ce", group "web"
==> 2022-06-10T14:45:15-04:00: Monitoring evaluation "21868afe"
    2022-06-10T14:45:15-04:00: Evaluation within deployment: "5559142e"
    2022-06-10T14:45:15-04:00: Evaluation status changed: "pending" -> "complete"
==> 2022-06-10T14:45:15-04:00: Evaluation "21868afe" finished with status "complete"
==> 2022-06-10T14:45:15-04:00: Monitoring deployment "5559142e"
  ⠦ Deployment "5559142e" in progress...

    2022-06-10T14:45:17-04:00
    ID          = 5559142e
    Job ID      = httpd
    Job Version = 1
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    web         1        1       0        0          2022-06-10T18:55:13Z^C

$ nomad job status httpd
...

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created  Modified
d728a3b8  58b610ce  web         1        run      running   9s ago   3s ago
69bf398e  58b610ce  web         0        stop     complete  26s ago  4s ago

tgross self-assigned this Jun 10, 2022

pikeas (Author) commented Jun 10, 2022

Thanks for the quick response!

I just tried with your jobspec and am having the same issue.

Notes:

  • Server: Nomad v1.3.1 (2b054e3) running on up-to-date Ubuntu 22.04 LTS.
  • Client: same version, running on macOS 12.3.1, installed via Homebrew. Command: nomad job run jobs/httpd.nomad (uses NOMAD_ADDR=http://<addr>:4646)
  • The job definition does change: the volume mount section goes from null to the new mount.
  • No new allocation is created. The same happens if I revert to the previous version.
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
79b2667d  c9ec4432  web         3        run      running  16m36s ago  7s ago

pikeas (Author) commented Jun 10, 2022

> Hi @pikeas! This was a known bug prior to Nomad 1.3.1 but it should have been fixed with #13008. I just tested with the released binary and wasn't able to reproduce the problem.

I believe this would be the check a few lines later at https://github.com/hashicorp/nomad/blob/main/scheduler/util.go#L563-L565:

		if !reflect.DeepEqual(at.VolumeMounts, bt.VolumeMounts) {
			return true
		}

So this should work. I'm not sure why we're seeing different behaviors. Please let me know if there are any other logs I can check.
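
As a rough illustration of why that check should fire when a mount is added, here is a minimal Go sketch. The VolumeMount and Task types and the volumeMountsChanged helper are simplified, hypothetical stand-ins written for this example, not Nomad's actual structs or scheduler function. Only the reflect.DeepEqual comparison is taken from the snippet above.

```go
package main

import (
	"fmt"
	"reflect"
)

// VolumeMount is a hypothetical, pared-down stand-in for a task's volume mount.
type VolumeMount struct {
	Volume      string
	Destination string
	ReadOnly    bool
}

// Task is a hypothetical task holding only the field under discussion.
type Task struct {
	Name         string
	VolumeMounts []*VolumeMount
}

// volumeMountsChanged applies the same comparison as the quoted check:
// any deep difference between the two mount lists counts as a change.
func volumeMountsChanged(a, b *Task) bool {
	return !reflect.DeepEqual(a.VolumeMounts, b.VolumeMounts)
}

func main() {
	// Version 0 of the job: no volume_mount block, so the slice is nil.
	before := &Task{Name: "http"}

	// Version 1 of the job: the volume_mount block has been uncommented.
	after := &Task{
		Name: "http",
		VolumeMounts: []*VolumeMount{
			{Volume: "host_data", Destination: "/host_data"},
		},
	}

	fmt.Println(volumeMountsChanged(before, after)) // true
	fmt.Println(volumeMountsChanged(after, after))  // false
}
```

Under these assumptions, going from no mounts to one mount is reported as a change, which matches the expectation that the update should replace the allocation.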

tgross (Member) commented Jun 13, 2022

All the scheduling decisions are made on the server, so if no new allocation is being created I'd expect there to be some information in the server's debug-level logs. It would also help to get the nomad eval status output for the evaluation(s) mentioned in the output when you run the job.

tgross (Member) commented Jul 5, 2022

I'm going to close this issue for now as we don't have the requested information that we'd need to debug. If you do get that info, please feel free to post here and we can re-open the issue. Thanks!

tgross closed this as completed Jul 5, 2022

github-actions bot commented Nov 3, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Nov 3, 2022