Improper IP address registration when address mode is 'driver' #3681
Comments
Oh no! I'm sorry upgrading to 0.7.1 broke this.
|
We upgraded from version 0.6.3 and use Docker
|
For now we have reverted all Nomad agents back to version 0.6.3 and manually removed, via the Consul API, all the service registrations that Nomad 0.7.1 left behind. It was an epic fail :-) |
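A cleanup like the one described above could be scripted against the Consul agent HTTP API (`GET /v1/agent/services` and `PUT /v1/agent/service/deregister/<id>`). This is only a minimal sketch, not the script used in this thread: the agent address, the `_nomad` ID prefix, and all helper names are assumptions, so check the service IDs in your own catalog before deregistering anything.

```python
import json
import urllib.parse
import urllib.request

CONSUL_ADDR = "http://127.0.0.1:8500"  # assumption: a local Consul agent


def stale_service_ids(services, prefix):
    """Return the sorted IDs of services whose ID starts with `prefix`.

    `services` is the JSON object returned by GET /v1/agent/services,
    which maps each service ID to its registration.
    """
    return sorted(sid for sid in services if sid.startswith(prefix))


def list_services(addr=CONSUL_ADDR):
    """Fetch the services registered with this agent."""
    with urllib.request.urlopen(f"{addr}/v1/agent/services") as resp:
        return json.load(resp)


def deregister(service_id, addr=CONSUL_ADDR):
    """Deregister one service from this agent by its ID."""
    quoted = urllib.parse.quote(service_id, safe="")
    req = urllib.request.Request(
        f"{addr}/v1/agent/service/deregister/{quoted}", method="PUT"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def cleanup(prefix="_nomad", addr=CONSUL_ADDR, dry_run=True):
    """Deregister every matching service; with dry_run, only list them."""
    ids = stale_service_ids(list_services(addr), prefix)
    if not dry_run:
        for sid in ids:
            deregister(sid, addr)
    return ids
```

The agent endpoints only see services registered with the agent they are called on, so a script like this would have to run against each client node. Calling `cleanup()` with the default `dry_run=True` lists the matching IDs without removing anything.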
@tantra35 And just to be clear |
@schmichael Yes. For example, before the upgrade, and again after we cleaned the stale Nomad 0.7.1 service registrations out of Consul, we see the following:
|
@tantra35 Ah! I see the issue. You're not specifying a port.
Prior to 0.6 if a port wasn't specified we'd simply use the IP and set the port to 0. In 0.7.1 rc1 I required ports to be set. This broke backward compatibility and #3673 was filed. We decided unset ports should be allowed, and I attempted to revert to pre-0.7.1-rc1 behavior in PR #3674. However, when a port isn't specified, I don't register the IP either! That still breaks backward compatibility.
Workaround
While we're discussing the proper fix, you can work around this by specifying a non-zero port for the service. You don't even have to create a port label. The service can literally be:
service {
  name         = "zabbixserver"
  address_mode = "driver"
  # set any non-zero port to get the IP to register
  port         = "1"
} |
Hm, we did think about the port, but the panic forced us to revert to the previous version. And when we read the discussion in GH-3673, I thought that a missing port in the job description would not cause any issue, because of this:
And thanks to the workaround, we will try to upgrade again |
Fixes #3681. When in driver address mode, Nomad should always advertise the driver's IP in Consul even when no network exists. This matches the 0.6 behavior. When in host address mode, Nomad advertises the alloc's network's IP if one exists. Otherwise it lets Consul determine the IP. I also added some much needed logging around Docker's network discovery.
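The selection rule in that PR description can be sketched as a small function. This is an illustration only, not Nomad's code: the name `advertise_ip` and its parameters are hypothetical, an empty string stands for "let Consul determine the IP", and raising an error when the driver reports no IP is an assumption, since the thread does not say what happens in that case.

```python
from typing import Optional


def advertise_ip(
    address_mode: str,
    driver_ip: Optional[str],
    alloc_network_ip: Optional[str],
) -> str:
    """Pick the IP to advertise in Consul for a service.

    Returns "" when the choice is left to Consul.
    """
    if address_mode == "driver":
        # Driver mode: always the driver's IP, even without an alloc network.
        if not driver_ip:
            # Assumption: the thread does not describe this case.
            raise ValueError("driver did not report an IP")
        return driver_ip
    if address_mode == "host":
        # Host mode: the alloc network's IP if one exists, else defer to Consul.
        return alloc_network_ip or ""
    # Only the two modes discussed in this thread are modeled here.
    raise ValueError(f"unsupported address_mode {address_mode!r}")
```

Under this sketch, `advertise_ip("driver", "172.17.0.5", None)` returns the driver's IP even though no alloc network exists, which is the 0.6 behavior the fix restores.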
Just updated the PR again with some logging improvements and included #3680 in the binaries if you have time to test. |
@schmichael Cool! I don't have enough English skill to fully express my admiration for your work. We will try the fix as soon as possible, within 1-2 days.
@schmichael We investigated this issue from our side, inspected all our jobs, and found one with this description (I omit some job details):
but in this case |
We have just tested your binary and something strange happens. For some time, some services disappear from the Consul DNS zone (this does not happen for all services), and after 5 minutes they appear again. The Nomad client logs contain no information about this. The service declaration was absolutely legal:
On the server side we see the following in the logs:
I don't think this problem is specific to this custom binary; I think it affects the whole Nomad 0.7.1 branch |
@schmichael In this jobs
When we specify a port in the service definition, we get the following error:
and the job goes to the dead state. Only when |
That is an unfortunate error message that should be improved. You should not use Sorry it's gotten so complicated. I will try to improve the error message to at least point you in the right direction! |
Related to #3681 If a user specifies an invalid port *label* when using address_mode=driver they'll get an error message about the label being an invalid number which is very confusing. I also added a bunch of testing around Service.AddressMode validation since I was concerned by the linked issue that there were cases I was missing. Unfortunately when address_mode=driver is used there's only so much validation that can be done as structs/structs.go validation never peeks into the driver config which would be needed to verify the port labels/map.
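The confusing message comes from a value that is not a known port label being treated as a number. A rough sketch of that kind of lookup follows, with the caveats that `resolve_driver_port`, the `port_map` dict standing in for the driver's port map, the lookup order, and the error wording are all assumptions for illustration rather than Nomad's actual code:

```python
def resolve_driver_port(port: str, port_map: dict) -> int:
    """Resolve a service's `port` value when address_mode is "driver".

    `port` may be a label defined in the driver's port map or a literal
    port number written as a string.
    """
    if port in port_map:
        # The value names a label the driver knows about.
        return port_map[port]
    try:
        # Otherwise fall back to treating the value as a literal number.
        return int(port)
    except ValueError:
        # Name both possibilities so a mistyped label is easier to spot.
        raise ValueError(
            f"port {port!r} is neither a label in the driver's port map "
            "nor a valid port number"
        ) from None
```

With this shape, a typo such as `"htpp"` for a label `"http"` produces an error that mentions labels as well as numbers, which is the kind of hint the improved message is meant to give.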
I pushed an improvement to the error message in #3682 to hopefully help make debugging this easier. If you remove |
Hopefully helps prevent more issues like #3681 and #4008. The port/address_mode logic is really subtle, and it took me a long time to diagnose #4008 despite being the one to have addressed the duplicate issue before! Not to mention I wrote the code! Definitely need to do something to make it more understandable...
This is still an issue for IPv6 to the point where it completely overrides any IPv6 network settings you have on Docker @schmichael |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Nomad v0.7.1 (0b295d3)
We have the following job specification:
Prior to the upgrade to 0.7.1, Nomad properly registered the driver's IP address for the service. Now Nomad registers the host's IP address instead of the driver's.