Tasks not getting restarted when secret used in template is updated #4397

Closed
tad-lispy opened this issue Jun 8, 2018 · 5 comments

@tad-lispy

Nomad version

Nomad v0.8.4-rc1 (26e6ffd1c42fcf300b213d80257765d4ae94e24d)

Operating system and Environment details

Ubuntu 16.04

Vault version:

Vault v0.10.1 ('756fdc4587350daf1c65b93647b2cc31a6f119cd')

Everything backed by Consul. Version:

Consul v1.1.0
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Issue

A task using a template stanza with change_mode = "restart" is not restarted when the value of a secret changes.

Reproduction steps

When I first run the job below, both ${secret} and ${public} are populated as expected. When I change the value in Consul, the task is restarted as expected. But when I write a new value to Vault:

vault kv put secret/personal gossip=https://youtu.be/Lin-a2lTelg

Nothing happens.

When I explicitly stop and start the job again:

nomad job stop vault-test
nomad job run jobs/vault-test.nomad

then the new secret value is there.

Perhaps I'm missing something, but I would expect the task to be restarted as soon as I put a new value for the secret (as it happens with values coming from Consul).

I'm not sure if it's relevant, but Vault uses TLS, whereas neither Nomad nor Consul do. I'm currently working on it.

Nomad Server and Client logs

My Nomad instances run both client and server, so the logs are combined. Below is the output of journalctl --follow --unit=nomad.service. Empty lines and lines starting with -- were added by me.

Jun 08 12:09:34 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:09:34 [INFO] (runner) rendered "(dynamic)" => "/var/nomad/data/alloc/d21f4133-daf1-fd04-9974-d836f9f26528/echo-service/secrets/file.env"
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.356Z [DEBUG] plugin: starting plugin: path=/usr/local/sbin/nomad args="[/usr/local/sbin/nomad executor {"LogFile":"/var/nomad/data/alloc/d21f4133-daf1-fd04-9974-d836f9f26528/echo-service/executor.out","LogLevel":"INFO"}]"
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.356Z [DEBUG] plugin: starting plugin: path=/usr/local/sbin/nomad args="[/usr/local/sbin/nomad executor {"LogFile":"/var/nomad/data/alloc/d88757b7-d69b-7081-0877-fc2bd8c62efe/echo-service/executor.out","LogLevel":"INFO"}]"
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.357Z [DEBUG] plugin: waiting for RPC address: path=/usr/local/sbin/nomad
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.362Z [DEBUG] plugin: waiting for RPC address: path=/usr/local/sbin/nomad
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.384Z [DEBUG] plugin.nomad: plugin address: timestamp=2018-06-08T12:09:36.384Z address=/tmp/plugin526203665 network=unix
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:09:36.402Z [DEBUG] plugin.nomad: plugin address: timestamp=2018-06-08T12:09:36.402Z address=/tmp/plugin937876244 network=unix
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:09:36.440204 [INFO] driver.docker: created container 3027e6b3f2aa0b17dbe653043b93f0633fa5d269cc72dcac88e05d890fc3e7d5
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:09:36.476106 [INFO] driver.docker: created container d583324672c5b59935ade2cb980889177b69375f9cb9713eb80b866b48c5d71d
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:09:36.877008 [INFO] driver.docker: started container d583324672c5b59935ade2cb980889177b69375f9cb9713eb80b866b48c5d71d
Jun 08 12:09:36 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:09:36.913436 [INFO] driver.docker: started container 3027e6b3f2aa0b17dbe653043b93f0633fa5d269cc72dcac88e05d890fc3e7d5

-- The job is running now and displays the original secret.

-- Changing the secret in Vault with $ vault kv put secret/personal gossip=https://youtu.be/Lin-a2lTelg
-- No new logs are produced in Nomad.

-- Stopping the job with: $ nomad stop vault-test

Jun 08 12:19:42 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:42.414064 [INFO] driver.docker: stopped container 3027e6b3f2aa0b17dbe653043b93f0633fa5d269cc72dcac88e05d890fc3e7d5
Jun 08 12:19:42 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:42.487599 [INFO] driver.docker: stopped container d583324672c5b59935ade2cb980889177b69375f9cb9713eb80b866b48c5d71d
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:46.417Z [DEBUG] plugin.nomad: 2018/06/08 12:19:46 [ERR] plugin: plugin server: accept unix /tmp/plugin526203665: use of closed network connection
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:46.418Z [DEBUG] plugin: plugin process exited: path=/usr/local/sbin/nomad
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46 [INFO] (runner) stopping
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46.425817 [INFO] client.gc: marking allocation d88757b7-d69b-7081-0877-fc2bd8c62efe for GC
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46 [INFO] (runner) received finish
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:46.490Z [DEBUG] plugin.nomad: 2018/06/08 12:19:46 [ERR] plugin: plugin server: accept unix /tmp/plugin937876244: use of closed network connection
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:46.491Z [DEBUG] plugin: plugin process exited: path=/usr/local/sbin/nomad
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46 [INFO] (runner) stopping
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46.500695 [INFO] client.gc: marking allocation d21f4133-daf1-fd04-9974-d836f9f26528 for GC
Jun 08 12:19:46 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:46 [INFO] (runner) received finish

-- Starting the job again with $ nomad run jobs/vault-test.nomad

Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) creating new runner (dry: false, once: false)
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) creating watcher
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) starting
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) creating new runner (dry: false, once: false)
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) creating watcher
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) starting
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) rendered "(dynamic)" => "/var/nomad/data/alloc/0aebdd19-45e6-4129-c74b-468c309a2980/echo-service/secrets/file.env"
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) initiating run
Jun 08 12:19:55 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:55 [INFO] (runner) rendered "(dynamic)" => "/var/nomad/data/alloc/0e85ef0b-69eb-f8c1-3d6f-cc54d50be221/echo-service/secrets/file.env"
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.399Z [DEBUG] plugin: starting plugin: path=/usr/local/sbin/nomad args="[/usr/local/sbin/nomad executor {"LogFile":"/var/nomad/data/alloc/0aebdd19-45e6-4129-c74b-468c309a2980/echo-service/executor.out","LogLevel":"INFO"}]"
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.399Z [DEBUG] plugin: starting plugin: path=/usr/local/sbin/nomad args="[/usr/local/sbin/nomad executor {"LogFile":"/var/nomad/data/alloc/0e85ef0b-69eb-f8c1-3d6f-cc54d50be221/echo-service/executor.out","LogLevel":"INFO"}]"
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.405Z [DEBUG] plugin: waiting for RPC address: path=/usr/local/sbin/nomad
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.406Z [DEBUG] plugin: waiting for RPC address: path=/usr/local/sbin/nomad
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.440Z [DEBUG] plugin.nomad: plugin address: timestamp=2018-06-08T12:19:56.440Z address=/tmp/plugin927723151 network=unix
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]: 2018-06-08T12:19:56.444Z [DEBUG] plugin.nomad: plugin address: timestamp=2018-06-08T12:19:56.443Z address=/tmp/plugin663721688 network=unix
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:56.496003 [INFO] driver.docker: created container 96a51d49c491006ce43a06aecf2005a2edc9ca925598ee8cc2ec28225e9f622d
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:56.513045 [INFO] driver.docker: created container add2a169ea5aae8e522a18f92fb980a42bda72a76ffb12853b0c2b3e4e98dabd
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:56.861725 [INFO] driver.docker: started container 96a51d49c491006ce43a06aecf2005a2edc9ca925598ee8cc2ec28225e9f622d
Jun 08 12:19:56 consul-nomad-01-ams3 nomad[3699]:     2018/06/08 12:19:56.939429 [INFO] driver.docker: started container add2a169ea5aae8e522a18f92fb980a42bda72a76ffb12853b0c2b3e4e98dabd

-- The job is running and displaying the new secret.

During all of this there are no logs at all from the active Vault instance. Maybe I should increase verbosity?

Job file (if appropriate)

job "vault-test" {
  datacenters = ["dc1"]

  type = "service"

  group "vault-test" {
    count = 4

    task "echo-service" {
      driver = "docker"

      resources {
        network {
          port "http" {}
        }
      }

      config {
        image = "hashicorp/http-echo"
        port_map {
          http = 5678
        }
        args = [
          "-text",
          "${NOMAD_ALLOC_INDEX}: ${secret} ${public}"
        ]
      }

      template {
        data = <<EOF
          important="important value"
          secret="{{with secret "secret/personal"}}{{.Data.gossip}}{{end}}"
          public="{{key "public"}}"
        EOF

        destination = "secrets/file.env"
        env = true
        change_mode = "restart"
      }

      service {
        name = "leaking-secrets"

        tags = [
          "traefik.enable=true",
          "traefik.tags=api",
          "traefik.tags=external"
        ]

        port = "http"

        check {
          type = "http"
          path = "/"
          interval = "10s"
          timeout = "30s"
        }
      }
    }
  }
}
@chelseakomlo
Contributor

This looks similar to #4226; we will work on reproducing it on our end.

@tad-lispy
Author

Ok, let me know if I can provide any more details.

As a workaround, I am setting a very short TTL on the secrets. See the discussion here: https://gitter.im/hashicorp-nomad/Lobby?at=5b1a7adcdd54362753f79ee7
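
A minimal sketch of this workaround, assuming the secret lives in a KV version 1 mount (the `.Data.gossip` lookup in the job file suggests so, but that is an assumption). With KV v1, a `ttl` field stored alongside the data is returned as the lease duration when the secret is read, which is the hint a template renderer can use to decide when to read again. The `30s` value, the `put_with_ttl` helper, and the `RUN` switch are illustrative and not taken from the linked discussion.

```shell
#!/bin/sh
# Sketch only: assumes a KV v1 mount at secret/, where a "ttl" field in the
# secret is reported back as lease_duration when the secret is read.

# put_with_ttl SECRET_PATH TTL KEY=VALUE...
# Prints the command by default; set RUN=1 to execute it against Vault.
put_with_ttl() {
  secret_path=$1
  ttl=$2
  shift 2
  if [ "${RUN:-0}" = "1" ]; then
    vault kv put "$secret_path" "$@" "ttl=$ttl"
  else
    echo "vault kv put $secret_path $* ttl=$ttl"
  fi
}

# Dry run: shows the write that would store the secret with a 30-second TTL.
put_with_ttl secret/personal 30s gossip=https://youtu.be/Lin-a2lTelg
```

A shorter TTL means the new value is picked up sooner, at the cost of more frequent reads against Vault.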

@dadgar
Contributor

dadgar commented Jun 11, 2018

@lzrski Hey, I am going to close this. Unlike Consul, Vault doesn't have a blocking-query mechanism that notifies clients when a value changes. The tunable for staleness with Vault is the secret TTL.
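
To illustrate the difference described above: with no way to be notified of a change, a reader can only re-read the secret on a timer and compare it with what it saw before. The sketch below is a generic polling loop, not how Nomad or consul-template implement it; `poll_for_change` and its arguments are hypothetical names.

```shell
#!/bin/sh
# Sketch only: without a change-notification mechanism, a client has to
# re-read the secret on a timer and compare it with the last value it saw.

# poll_for_change READ_CMD INTERVAL_SECONDS MAX_POLLS
# Runs READ_CMD once, then re-runs it every INTERVAL_SECONDS. Prints the new
# value and returns 0 as soon as it differs from the first read; returns 1
# if nothing changed after MAX_POLLS re-reads.
poll_for_change() {
  read_cmd=$1
  interval=$2
  max_polls=$3
  last=$(eval "$read_cmd")
  i=0
  while [ "$i" -lt "$max_polls" ]; do
    sleep "$interval"
    current=$(eval "$read_cmd")
    if [ "$current" != "$last" ]; then
      echo "changed: $current"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Example against Vault (needs a reachable server and a valid token):
#   poll_for_change 'vault kv get -field=gossip secret/personal' 30 10
```

The polling interval plays the same role as the secret TTL here: it is the upper bound on how long a stale value can go unnoticed.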

@dadgar dadgar closed this as completed Jun 11, 2018
@tad-lispy
Author

tad-lispy commented Jun 12, 2018

Thanks for the clarification, @dadgar.

Frankly, I wouldn't consider this issue resolved. At the very least, the Nomad and Consul Template docs should clearly explain how the TTL works. The Nomad docs imply that changing a secret will restart the task: https://www.nomadproject.io/docs/job-specification/template.html (there is no clear distinction between Consul's key and Vault's secret statements). I would say the docs are misleading in this regard.

If one reads them very carefully (as I just did), it's possible to infer the mechanics from the section about the vault_grace option, but I don't think that's enough unless the reader is already an expert in your tool stack.

As a further improvement, I would like lease times to be configurable outside of Vault, so that processes with only read capability could set the TTL of the secrets they obtain. Perhaps there should be an option similar to vault_grace, but taking an absolute time after which the secrets are re-acquired (I guess ttl would be a good name for it). What do you think? Shall I open a separate feature request for it?

I appreciate your hard work on all the projects, and I know how difficult it is to write good docs. The above critique is backed by the best intentions 🙂

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 29, 2022