client: fix omitted datacenter of a server discovered via Consul #4997
@AlexanderZagaevskiy Sorry, but the dc in this PR doesn't correspond to Nomad datacenters. It is a Consul datacenter, which doesn't necessarily match the Nomad one. The real issue is that
@tantra35 You are right. But maybe it's worth not checking the Nomad servers' DCs in a method Otherwise, a server list discovered via Consul never equals the one that comes in an RPC response from the cluster leader (see a method
@AlexanderZagaevskiy I think this is a bad idea. Because As a workaround, you may combine two methods to implement a third: https://github.com/hashicorp/nomad/blob/master/nomad/status_endpoint.go#L61 and to get datacenter info. Then you may produce a third method, for example
Also, when PR #4688 is used, Consul discovery is called very rarely (at the initial state and when the quorum of servers is broken), so this can't be a very big problem. Otherwise, it means you have very big problems with your Nomad servers (there must be at least 3-5 of them); for example, they may lose contact with each other under very heavy load or network latency.
Hey @AlexanderZagaevskiy, we believe #4666 to be resolved by #5654. I agree with @tantra35 that while this may fix part of the problem for your case, it's not a generalized solution and will break for folks whose Consul DC doesn't align with their Nomad DC. If you feel there's still an unresolved problem, please open an issue so we can discuss. Thanks!
Thanks! But I have a proposal: exclude the DC from the comparison of servers altogether. Servers discovered via Consul always have the DC omitted, and this causes the server list to be replaced and triggers extra checks and pings while the client looks for an appropriate responding server. So the client's heartbeat check would likely fail, the client would be considered failed, and its currently running tasks would be restarted. Am I not right?
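To make the proposal above concrete, here is a minimal sketch of comparing two server lists by address only. It is an illustration, not Nomad's actual code: the `server` type, its `Addr` and `DC` fields, and the `sameServers` function are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// server is a hypothetical stand-in for a client's record of a Nomad server.
type server struct {
	Addr string
	DC   string // left empty when the server was discovered via Consul
}

// sameServers reports whether a and b hold the same addresses,
// ignoring both the DC field and the order of the entries.
func sameServers(a, b []server) bool {
	if len(a) != len(b) {
		return false
	}
	// addrs extracts the addresses of a list and sorts them.
	addrs := func(list []server) []string {
		out := make([]string, len(list))
		for i, s := range list {
			out[i] = s.Addr
		}
		sort.Strings(out)
		return out
	}
	x, y := addrs(a), addrs(b)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

func main() {
	// Assumed sample data: the same two servers, seen from two sources.
	fromConsul := []server{{Addr: "10.0.0.1:4647"}, {Addr: "10.0.0.2:4647"}}
	fromLeader := []server{
		{Addr: "10.0.0.2:4647", DC: "dc1"},
		{Addr: "10.0.0.1:4647", DC: "dc1"},
	}
	fmt.Println(sameServers(fromConsul, fromLeader)) // prints "true"
}
```

With a comparison like this, a list discovered via Consul (no DC) and a list returned by the leader (DC set) would be treated as equal, so the client would have no reason to replace its server list.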
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
No description provided.