Nomad Connect doesn't manage TLS Consul endpoints #6594
Comments
What a coincidence. I was about to create a ticket for this since I'm running into the same issue. Like @vvanholl says: Nomad currently assumes the local Consul agent is available over plain HTTP. Our configuration has TLS enabled on the Consul clients and Consul servers, and we don't expose a plain HTTP endpoint on the Consul agent. The problem is that Nomad starts the Consul Envoy proxy without any HTTP flags: https://github.com/hashicorp/nomad/blob/master/client/allocrunner/taskrunner/envoybootstrap_hook.go#L89 Therefore the Consul proxy fails to connect to the local Consul agent: https://github.com/hashicorp/consul/blob/cc9a6f79934a6da58b7aec63c057681d82aded5a/command/connect/proxy/proxy.go#L221 What Nomad should do is grab the Consul client configuration (the |
Thanks for reporting this @vvanholl and @rkettelerij ! As of right now Consul ACL support is one of the known limitations of our implementation but is in the works. For TLS, I do see that we have an open issue for testing that properly (#6502) but this looks like a bug in how we look up the Consul address. |
There is a short-term workaround. You can provide the necessary Consul values as environment variables in your init script or systemd unit. I was able to work around this by adding the following values to the Nomad systemd unit on my Nomad client.
replacing the paths above with paths to your actual certificates. |
Fixes #6594, #6711, #6714, #7567. e2e testing is still TBD in #6502. Before, we only passed the Nomad agent's configured Consul HTTP address on to the `consul connect envoy ...` bootstrap command. This meant any Consul setup with TLS enabled would not work with Nomad's Connect integration. This change now sets CLI args and environment variables that configure the TLS options for communicating with Consul during the Envoy bootstrap, as described in https://www.consul.io/docs/commands/connect/envoy.html#usage
There is still an issue with Nomad Consul Connect jobs when Consul has TLS enabled. These are my environment variables:
|
I enabled TLS on consul and I am also seeing this problem. I've ensured that I have the following in
I also have in my systemd unit file
Nomad = 0.11.1 |
@spuder, if you're talking about the deployment issue that Crizstian mentioned, I'd encourage you to head over to #7715 and chime in there. If you are experiencing something else, you might want to post a fresh issue. As an aside, as of Nomad 0.11 you do not need to provide the CONSUL SSL environment variables. That workaround is only necessary for Nomad 0.10.4 |
Should have Nomad and Consul deployed and configured with mTLS. ACLs are currently not enabled on Consul, only Nomad. This should provide the minimal working example using mTLS to get the count dashboard working after a ton of tinkering. 😭 The links I used during my investigation/debugging session:
* hashicorp/nomad#6463
* https://learn.hashicorp.com/nomad/consul-integration/nomad-connect-acl#run-a-connect-enabled-job
* hashicorp/nomad#6594
* hashicorp/nomad#4276 hashicorp/nomad#7715
* https://www.consul.io/docs/agent/options ⭐
* hashicorp/nomad#7602
@shoenig Since this is closed, someone should update https://learn.hashicorp.com/tutorials/nomad/consul-service-mesh#create-the-job-specification
|
What about auto_config? The Consul client certificates are dynamic there, or am I wrong? |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Hi,
Some context:
I am using Nomad 0.10.0 and Consul 1.6.1. Both Nomad and Consul are running with TLS and ACLs enabled.
I am trying to get my Nomad jobs running with Connect, but I always see these error messages in the logs:
2019-10-30T20:34:42.894Z [ERROR] client.alloc_runner.task_runner.task_hook.envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: alloc_id=4660d74d-c834-9219-e8ee-c0fbd6911732 task=connect-proxy-test error="exit status 1" stderr="==> Failed looking up sidecar proxy info for _nomad-task-4660d74d-c834-9219-e8ee-c0fbd6911732-group-test_group-test-1313: Unexpected response code: 400 (Client sent an HTTP request to an HTTPS server.
Digging further, I noticed that Nomad runs this process, which fails:
consul connect envoy -grpc-addr unix://alloc/tmp/consul_grpc.sock -http-addr endpoint.local.compuscene.net:8500 -bootstrap -sidecar-for _nomad-task-4660d74d-c834-9219-e8ee-c0fbd6911732-group-test_group-test-131
Running it by hand doesn't work either and fails with exactly the same error message.
But if I put https:// before endpoint.local.compuscene.net:8500, the command works fine.
It seems Nomad ignores its own configuration here, in particular the ssl=true option:
"consul": { "address": "endpoint.local.compuscene.net:8500", "auto_advertise": true, "checks_use_advertise": true, "ssl": true, "token": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" },
Moreover, when I dig into the Nomad code, I see no reference to the Consul ssl option when the Connect classes are created. Only the address is used.
I hope this is clear. If you have any questions, don't hesitate to ask.
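To illustrate the behaviour this report expects, here is a minimal Go sketch that derives the `-http-addr` value from both the `address` and `ssl` settings. The helper name `consulHTTPAddr` is hypothetical and this is not taken from the Nomad code base. It only shows the idea of prefixing `https://` when `ssl` is true.

```go
package main

import (
	"fmt"
	"strings"
)

// consulHTTPAddr is a hypothetical helper: it prefixes the configured address
// with a scheme chosen from the ssl flag, unless a scheme is already present.
func consulHTTPAddr(address string, ssl bool) string {
	if strings.Contains(address, "://") {
		return address // respect an explicit scheme
	}
	if ssl {
		return "https://" + address
	}
	return "http://" + address
}

func main() {
	// Values taken from the configuration quoted in this issue.
	fmt.Println(consulHTTPAddr("endpoint.local.compuscene.net:8500", true))
	// prints "https://endpoint.local.compuscene.net:8500"
}
```

With `ssl` set to true this yields the same `https://` address that made the manual `consul connect envoy` invocation succeed.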
Vincent