host_network configuration ignored #8432
Comments
Setting a CIDR to one that should exclude the assigned IP changes nothing:
I have run into the same problem as well. I have done some preliminary debugging, and from what I can see the method that selects the network doesn't appear to consider the host_network configuration at all.
The spec from the job is ignored:
Nomad currently schedules using the primary (meaning publicly routable) interface, contrary to the job specification.
As a side note, this is very easy to verify using a multi-interface system on DigitalOcean.
Hi @joliver and @Legogris, I'm working on this one today. One thing that I noticed is that your job files both use the network stanza in the task -> resources stanza. The host_network field only works in group network stanzas. We should at least be throwing a warning about this, so I opened #8497 to track that. Moving forward we're planning to remove the usage of a network stanza inside each task and encourage users to use the group network stanza; if there are any problems with this, please open an issue. I'm going to work on spinning up a DO droplet to test this, but could you also try updating your jobs to make use of the group network stanza and report back whether that does or doesn't work? Thanks!
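For reference, a minimal sketch of a job that declares its port in a group-level network stanza (the job name, port label, and the "private" host network label are placeholders, not taken from the job files discussed above):

job "example" {
  datacenters = ["dc1"]

  group "web" {
    # host_network is only honored here, in the group-level network stanza,
    # not inside a task -> resources -> network stanza.
    network {
      port "http" {
        host_network = "private"  # must match a host_network label in the client config
      }
    }

    task "web" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo"
        args  = ["-listen", ":${NOMAD_PORT_http}", "-text", "ok"]
        ports = ["http"]
      }
    }
  }
}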
Oh wow, I had no idea that it was preferred to specify the network at the group level rather than the task level! I'll try again in the next couple of days. BTW, a bit of a separate note, but the docs do list the different varieties of the network stanza …
So I tried moving the network stanza to the group level: …
This might be a separate issue from what's described above, but I'm also noticing that when the network mode is set to …
However, the web UI, as well as the service registered in Consul, shows a different IP (one that is not used anywhere in the configuration, differs from any of Nomad's bind/advertise addresses, and doesn't show up anywhere in the Nomad logs) -
Moving the network config to the group level results in an error when used with Docker containers. Please see #8488. My job file:
Retried after updating to Nomad 0.12.1. Defining the network at group level still results in the port mapping error. I then tried moving the network back to the resources level. My client config:
My job file:
After starting the container, the port mapping is still bound to the default network interface (eth0):
Job file as shown by the Nomad UI:
Haven't gone in-depth with this yet, but since you mentioned keepalived: I noticed that Nomad requires IPs to be bound at Nomad startup in order to be mappable. That means that if you have two Nomad clients sharing a VRRP IP, only the one that has the IP assigned when Nomad starts will succeed in scheduling the job, and when IPs are reassigned the change is not recognized and the Nomad process needs to be restarted. So as of right now, use of dynamically assigned IPs is practically unusable with Nomad host networks.
I copied the example config provided by @HumanPrinter and it doesn't work for me:

host_network "wg" {
  cidr      = "10.0.0.1/32"
  interface = "wg0"
}

Even if I select a non-existing interface in the job, there is no error, even though I'm using the wrong interface. EDIT: I noticed I didn't set …
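In case it helps others hitting the same symptom: the job also has to opt in to the host network by label on the port, in a group-level network stanza. A minimal sketch of the relevant group fragment (group and port labels are placeholders):

group "app" {
  network {
    port "http" {
      # "wg" refers to the host_network label defined in the client config above
      host_network = "wg"
    }
  }
}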
I also noticed the following troubling behavior:
It seems like host network assignment is arbitrary when not defined for jobs, which means that previously working jobs can break when a new host_network is added for the sake of a specific job. So once host networks are enabled on the client, all jobs that may be allocated to that client will need to have host_network set explicitly. This is with Nomad …
@Legogris Your remark regarding keepalived is a good point, but in our use case this could be solved by adding a script to keepalived that automatically restarts or reloads Nomad when a machine becomes master. The short offline period is acceptable in our situation; however, that might not be the case for everyone, so as said, it is a valid point and it would be nice if Nomad would somehow add support for this in the near future (but that is beside the subject of this issue).
Hey all, just as an FYI, it looks like I missed updating the UI to read from the new host_network-aware fields for the IP address, so the UI may not be the best source of truth. I'm working on getting better visibility into the UI and CLI, but for now here is how you can inspect things through the API. Host networks will show up as part of the node's API response.
Any matched host networks will have a unique Address entry on the NodeNetwork. For an allocation, there is a new structure under the allocation's AllocatedResources.Shared called simply …
I hope this helps. I'm still working through a couple of different issues that were brought up in this thread and will report back with findings. Thank you for your patience and debugging work!
Hey folks, after spending some time on this I've identified the following items and opened issues to track them.
@Legogris your following comment needs some more exploration, as I have not experienced this behaviour in testing. Could you please open a new issue with some more details on how you came to that state/conclusion? Thanks!
Since this has become a bit of a catch-all issue for host networks, I'd like to close it in order to better organize the work. I've opened separate issues for the items above to track them individually. The intent is not to shut down any conversation, so if I missed or did not address something, please open a new issue and @ me in it. Cheers!
Great follow-up @nickethier! I will see if I can make a reproducible config and post it in the relevant issue.
Was this resolved? This is still not working for me
I also found something weird:

{
  "Mode": "host",
  "Device": "eth1",
  "MacAddress": "4a:e8:57:91:00:c5",
  "Speed": 1000,
  "Addresses": [
    {
      "Family": "ipv4",
      "Alias": "name", // shouldn't this be `"test"`?
      "Address": "10.0.0.2",
      "ReservedPorts": "",
      "Gateway": ""
    }
  ]
},

Nomad client config:

host_network {
  name      = "test"
  interface = "eth1"
}

Jobs with …

EDIT: oh man, seems like it should be …
I think I have the networking stuff set up correctly, but I noticed that for some reason Nomad registers the public IP in Consul, so service checks and service discovery are basically broken. Here's the job file (I removed the service check):

job "echo" {
  datacenters = ["ams3"]
  type        = "service"

  group "echo" {
    count = 1

    network {
      mode = "bridge"
      port "http" {
        host_network = "lan"
      }
    }

    service {
      address_mode = "host"
      name         = "echo"
      port         = "http"
    }

    task "echo" {
      driver = "docker"
      config {
        image = "echo-server"
        args  = ["--bind", ":${NOMAD_PORT_http}"]
      }
    }
  }
}

Here's the Consul API answer:

[
  {
    "ID": "c39dc71f-0573-4cee-238a-7a09c10fcdfe",
    "Node": "main-001",
    "Address": "10.0.0.2", // => matches `host_network "lan"`
    "Datacenter": "ams3",
    "TaggedAddresses": {
      "lan": "10.0.0.2",
      "lan_ipv4": "10.0.0.2",
      "wan": "10.0.0.2", // doesn't match `host_network "wan"`
      "wan_ipv4": "10.0.0.2" // same ^
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-fd58156b-0ed5-d1a0-00f9-68762c2ea980-group-echo-echo-http",
    "ServiceName": "echo",
    "ServiceTags": [],
    "ServiceAddress": "<public-ip>",
    "ServiceTaggedAddresses": {
      "lan_ipv4": {
        "Address": "<public-ip>",
        "Port": 25509
      },
      "wan_ipv4": {
        "Address": "<public-ip>",
        "Port": 25509
      }
    },
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {
      "external-source": "nomad"
    },
    "ServicePort": 25509,
    "ServiceEnableTagOverride": false,
    "ServiceProxy": {
      "MeshGateway": {},
      "Expose": {}
    },
    "ServiceConnect": {},
    "CreateIndex": 492,
    "ModifyIndex": 492
  }
]

Same issue with DNS:

root@main-001:~# dig +short echo.service.consul
<public-ip>

Let me know if I should open a new issue or move this to Discuss; happy to provide further details.
@nickethier Just got back to this - maybe this issue should be reopened? Seems like there's still something here not covered by other open issues. Given this client configuration on Nomad 0.12.3:
And this job spec:
The Consul service doesn't get registered on the expected IP. So this doesn't seem like a UI or CLI issue per se. Note that in this case all interfaces are up and IPs are assigned at Nomad startup, and there are no floating IPs/VRRP in play.
I can confirm the same issue as @Legogris is happening to me with Nomad 0.12.3. I'm using host networking and the service that is registered with Consul has the wrong IP: https://discuss.hashicorp.com/t/incorrect-service-ip-registered-with-consul/13000
+1 I can reproduce this issue with the latest release, 0.12.3. Multi-interface networking doesn't work with any recommended configuration. Basically, I had public/private host networks defined in the Nomad client with a CIDR ("IP/32") and an interface. In the job definition I tried to use a specific host_network in host and bridge mode, with and without Docker port mapping...
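For anyone trying to reproduce this, a minimal sketch of the kind of client configuration described above, with one public and one private host network (interface names and addresses are placeholders, not taken from any reporter's setup):

client {
  enabled = true

  # Each host_network narrows address selection by CIDR and/or interface.
  host_network "public" {
    cidr      = "203.0.113.10/32"
    interface = "eth0"
  }

  host_network "private" {
    cidr      = "10.0.0.2/32"
    interface = "eth1"
  }
}

A job port then selects one of these by setting host_network = "public" or host_network = "private" in its group network stanza.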
We have run into this issue as well; the service is configured with the public IP but should be configured with the private IP as specified in our client config. A full gist of our configuration is here: https://gist.github.com/neilmock/12f075e3b22e5bc52e17ad7591af8b82
Hey folks! I'm sorry for the long silence on this issue. I just merged what I think is a fix for this in #9095. I was able to reproduce the Consul registration issue and this fixed it for me. I'd like to see if someone could test against master before I close it.
We have problems with
Seems to be resolved as of 1.0.1 |
1.0.1 fixes this for us so far in production FYI. |
I have the same issue as @urusha (second part), which I posted here: https://discuss.hashicorp.com/t/question-how-to-run-task-in-multi-interface-configuration-with-access-to-docker-network/20768 When I set …
@Davasny can you open a new issue with the jobspec, configuration, and expected vs actual? That'll help us resolve that for you. |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
I don't seem to be able to get the newly introduced multi-interface networking working.

Scenario: Client with two NICs:

- enp2s0: Public NIC. Static IP 192.168.1.2 and VRRP IP 192.168.1.4. The VRRP IP is shared with another identical instance.
- enp3s0: Private NIC. Static IP 192.168.1.22 and VRRP IP 192.168.1.154. The VRRP IP is shared with another identical instance.

Nomad IP is 192.168.1.22 (private IP on the internal NIC).

The job to be deployed is a reverse proxy/load balancer: multiple ports/services on the public VRRP IP, a single port/service on the private IP. Because the VRRP public IP is only set on one of the two instances at any time, I am opting to set the interface rather than the CIDR (since the subnets are overlapping).

Nomad version
Nomad v0.12.0 (8f7fbc8e7b5a4ed0d0209968faf41b238e6d5817)

Operating system and Environment details
Debian 11 bullseye
Linux 5.7.0-1-amd64 #1 SMP Debian 5.7.6-1 (2020-06-24) x86_64 GNU/Linux

Issue
I expect the lb-http service to be registered with IP 192.168.1.2 (ideally 192.168.1.4, but it seems configuring IPs not assigned at Nomad startup will fail). Instead, both the service and the container port get bound to the same IP as the public service, 192.168.1.22, despite the host network being configured on the client and set on the port in the job config.

Reproduction steps
Client configuration:

Job file
Example below with a single public http port and a private api port.
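Since the original client configuration and job file did not survive extraction, here is a minimal sketch of the kind of setup described above, with interface-based host networks and one port pinned to each; the "public"/"private" labels, the job name, and the task image are placeholders (lb-http and the http/api port labels come from the report):

# Client configuration (sketch): host networks selected by interface,
# because the public and private subnets overlap.
client {
  enabled = true

  host_network "public" {
    interface = "enp2s0"
  }

  host_network "private" {
    interface = "enp3s0"
  }
}

# Job (sketch): a public port and a private port, each pinned to a host network.
job "lb" {
  datacenters = ["dc1"]

  group "lb" {
    network {
      port "http" {
        host_network = "public"
      }
      port "api" {
        host_network = "private"
      }
    }

    service {
      name         = "lb-http"
      port         = "http"
      address_mode = "host"
    }

    task "lb" {
      driver = "docker"
      config {
        image = "haproxy:2.4"   # placeholder image
        ports = ["http", "api"]
      }
    }
  }
}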