Job with Vault template re-rendered unexpectedly just after task started #15307
Comments
Hi @maxramqvist! I see you've said in #15057
So this looks pretty strongly to me like a duplicate of #15057, unless you can reproduce with |
Thanks for the feedback @tgross! I just set this config in the Nomad job and had the same result: an extra, unexpected template render just after the deployment finished. So this should probably be reopened, right? "Templates": [
{
"DestPath": "/local/crm-postgres.vars",
"EmbeddedTmpl": "{{ with secret \"database/creds/crm-postgres-crm_app-role\" }}\n CRM_POSTGRES_CRM_CRM_APP=\"postgres://{{ .Data.username }}:{{ .Data.password }}@crm-postgres-dev.postgres.database.azure.com:5432/crm?sslmode=require\"\n CRM_POSTGRES_CRM_CRM_APP_USERNAME=\"{{ .Data.username }}\"\n CRM_POSTGRES_CRM_CRM_APP_PASSWORD=\"{{ .Data.password }}\"\n CRM_POSTGRES_CRM_CRM_APP_HOST=\"database.com\"\n CRM_POSTGRES_CRM_CRM_APP_PORT=\"5432\"\n CRM_POSTGRES_CRM_CRM_APP_DATABASE=\"crm\"\n CRM_POSTGRES_CRM_CRM_APP_SSL_MODE=\"require\"\n{{ end }}",
"Envvars": false
}
], Update: |
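For readers unfamiliar with how the `EmbeddedTmpl` above resolves, here is a minimal sketch using Go's standard `text/template`. It is not Nomad's or consul-template's actual rendering code: the `secret` function below is a hypothetical stub that returns fixed values, `renderCreds` and `secretData` are names invented for illustration, and only two of the template's lines are reproduced.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// secretData mimics the shape the template expects: a value with a Data map.
// Hypothetical type, for illustration only.
type secretData struct {
	Data map[string]interface{}
}

// renderCreds renders a shortened version of the job's template, using a
// stubbed "secret" function in place of a real Vault read.
func renderCreds(username, password string) (string, error) {
	funcs := template.FuncMap{
		// Stub: ignores the path and returns the supplied credentials.
		"secret": func(path string) *secretData {
			return &secretData{Data: map[string]interface{}{
				"username": username,
				"password": password,
			}}
		},
	}
	const tmpl = `{{ with secret "database/creds/crm-postgres-crm_app-role" }}CRM_POSTGRES_CRM_CRM_APP_USERNAME="{{ .Data.username }}"
CRM_POSTGRES_CRM_CRM_APP_PASSWORD="{{ .Data.password }}"
{{ end }}`
	t, err := template.New("vars").Funcs(funcs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, nil); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderCreds("u1", "p1")
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

Every value in the rendered file comes from whatever the `secret` call returns at render time, so a second render against a dynamic secrets backend would be expected to produce different credentials.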
Thanks @maxramqvist! Reopening. I'll circle back here once I get a chance to dig into those logs. |
Ok @maxramqvist, I took a look at those logs and I've extracted the relevant bits for allocation ID At 09:21:24.294Z we see the allocation marked as healthy, and that state is broadcast inside the client to make sure that all the allocation runner components know it:
Several heartbeat intervals pass, and during this time we get updates that include updating the state of other allocations:
Then we get a new update from the server that tells us that allocation ID
Update hooks are mostly idempotent, but not in the case of Vault dynamic secrets. So the underlying problem isn't that the Vault template is getting re-rendered, but that the server is telling the client to update these allocations! One thing that jumps out at me is that |
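The point about idempotency can be illustrated with a small sketch. This is not Nomad's hook code: `fakeVault`, `staticHook`, and `dynamicSecretHook` are hypothetical names, and the stub only models the fact that each read of a dynamic secret issues a fresh credential.

```go
package main

import "fmt"

// fakeVault stands in for a dynamic secrets backend: every read issues a
// brand-new credential. Hypothetical stub, not the Vault API.
type fakeVault struct{ issued int }

func (v *fakeVault) ReadDynamicSecret() string {
	v.issued++
	return fmt.Sprintf("v-user-%d", v.issued)
}

// staticHook models an idempotent update hook: same input, same output,
// no matter how many times it runs.
func staticHook(port int) string {
	return fmt.Sprintf("PORT=%d", port)
}

// dynamicSecretHook models a hook that re-reads a dynamic secret on every
// update, so running it twice yields two different results.
func dynamicSecretHook(v *fakeVault) string {
	return "USERNAME=" + v.ReadDynamicSecret()
}

func main() {
	v := &fakeVault{}
	// First run: the initial render at task start.
	fmt.Println(staticHook(5432), dynamicSecretHook(v))
	// Second run: the server pushes an update for the same allocation.
	fmt.Println(staticHook(5432), dynamicSecretHook(v))
}
```

In this model a redundant update is harmless for `staticHook` but visible for `dynamicSecretHook`, which matches the symptom reported here: new credentials appear even though nothing about the job changed.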
I'm still investigating, but seeing same behaviour with Nomad vars and a block like this:
[edit] |
Hi @MikeN123 and @ahjohannessen, the issue in #15433 is definitely unrelated to what we're seeing in the logs on this issue, which is that the template has rendered just fine but the server is updating the alloc out of the blue. I'd definitely encourage you to take those reports over to that issue though so we can keep debugging there. |
Last night we restarted a whole datacenter (DR testing): Consul, Vault, Nomad servers and clients, and so on. The issue hasn't appeared since, but it was consistently happening during every deployment before that. |
I've revisited this and found that we've independently fixed the problem at the client updates in #15915, which shipped in Nomad 1.5.0 with backports to Nomad 1.4.5 and 1.3.10. Should be safe to close this now. |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Operating system and Environment details
Ubuntu 20.04, x86_64
Issue
Nomad jobs with Vault templates that use the database secrets engine are unexpectedly re-rendered once, ~15-45 seconds after the task starts.
It happens with both the Mongo and Postgres Vault database integrations.
At first I thought #15057 might be related, but it's not really the same, is it? Although getting credentials twice from Vault sounds like it could be similar.
The issue happens both with and without Connect for the job.
This is reproducible every time in our environment, for different images and different Nomad job configurations.
Connect / no Connect. Different types of health checks. I've tried changing basically everything I could imagine affecting the code path that sets up the templates, with no luck. We still get the extra templating.
This could of course be a Vault bug. If you think that's the case, I'm happy to open an issue there.
Reproduction steps
Post the attached job to Nomad. Wait for deployment to finish. A couple of seconds later a re-render of the template will happen.
Expected Result
No extra templating after deployment.
Actual Result
An extra templating after deployment. After the templating there are new credentials in the environment variables.
Job file (if appropriate)
Logs
I've seen nothing in the logs indicating a reason for the extra templating.