Using a task name that is a prefix of another task's name can cause consul service flapping #2474
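The failure mode named in the title can be illustrated with a minimal sketch. It assumes, hypothetically, that each task's executor decides which Consul services it owns by matching a string prefix built from the task name. The ID format and the function names below are invented for illustration and are not taken from Nomad's source.

```python
# Hypothetical service ID scheme: "<alloc_id>-<task>-<service>".
# This is an assumption for illustration, not Nomad's real format.
def service_id(alloc_id: str, task: str, service: str) -> str:
    return f"{alloc_id}-{task}-{service}"


def stale_by_prefix(alloc_id: str, task: str, desired: set, registered: set) -> set:
    """Treat every registered ID that starts with this task's prefix as owned
    by the task, and return the owned IDs that the task no longer wants."""
    prefix = f"{alloc_id}-{task}"
    owned = {sid for sid in registered if sid.startswith(prefix)}
    return owned - desired


def stale_by_exact(previously_registered: set, desired: set) -> set:
    """Track the exact IDs this task registered instead of guessing by prefix."""
    return previously_registered - desired


registered = {
    service_id("a1", "task1", "web"),
    service_id("a1", "task1-sidecar", "proxy"),
}
task1_desired = {service_id("a1", "task1", "web")}

# Prefix matching makes task1 claim (and deregister) the sidecar's service.
print(stale_by_prefix("a1", "task1", task1_desired, registered))

# Exact tracking leaves the sidecar's service alone.
print(stale_by_exact(task1_desired, task1_desired))
```

Under these assumptions, appending a delimiter to the prefix would not help, because task names may themselves contain hyphens: "a1-task1-" is still a prefix of "a1-task1-sidecar-proxy". Renaming the shorter task so that neither name is a prefix of the other avoids the collision.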
Comments
My WIP Consul refactor branch will fix this. #2478
Fixes #2478 #2474 #1995 #2294 The new client only handles agent and task service advertisement. Server discovery is mostly unchanged. The Nomad client agent now handles all Consul operations instead of the executor handling task-related operations. When upgrading from an earlier version of Nomad, existing executors will be told to deregister from Consul so that the Nomad agent can re-register the task's services and checks. Drivers - other than qemu - now support an Exec method for executing arbitrary commands in a task's environment. This is used to implement script checks. Interfaces are used extensively to avoid interacting with Consul in tests that don't assert any Consul-related behavior.
I can confirm that this also occurs with Consul 0.8.1 and Nomad 0.5.6 with a single task, as follows:
Tried running the new version:
The service registers properly. However, if I run a second instance, only one is registered in Consul: the Nomad API reports a count of two, but Consul shows only one instance.
The above Consul service entry does not change, and another instance is not added.
@maguec I've been unable to reproduce that issue using either the build above or master. Steps to reproduce:
Attached a build from 499ada5. Please open a new bug if it is count related, since this one is specifically for name prefixes (which hopefully is fixed!). Thanks!
I figured out that changing the Nomad config to remove the remote Consul server, running a Consul server on the Nomad node instead, and leaving the defaults in place fixed the issue.
@maguec Ah, fantastic! That's a requirement our docs should probably make explicit: sharing a remote Consul doesn't work.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Output from nomad version:
Nomad v0.5.5
Consul v0.7.5
(issue replicated on previous versions of nomad as well)
Operating system and Environment details
Ubuntu 16.04 + test Nomad+Consul cluster
Issue
AFAICT the Nomad executor of the shorter-named task (task1) deregisters the service registered by the Nomad executor of the longer-named task (task1-sidecar) every several seconds.
Reproduction steps
With the job running, run
consul watch -service=task1 -type=service cat
and you'll notice the service is re-registered every few seconds.
Renaming task "task1" to task "task1-main" works around the problem.
Nomad/Consul Server logs (if appropriate)
The Nomad server log is clean and does not show any signs of misbehavior, but every few seconds you'll see Consul logs such as:
Job file (if appropriate)