host_network configuration ignored #8432

Closed
Legogris opened this issue Jul 14, 2020 · 29 comments

Comments

@Legogris

I don't seem to be able to get the newly introduced multi-interface networking working. Scenario: Client with two NICs:

  • enp2s0: Public NIC. Static IP 192.168.1.2 and VRRP IP 192.168.1.4. VRRP IP is shared with another identical instance
  • enp3s0: Private NIC. Static IP 192.168.1.22 and VRRP IP 192.168.1.154. VRRP IP is shared with another identical instance.

Nomad IP is 192.168.1.22 (private IP on internal NIC).

The job to be deployed is a reverse proxy/load balancer: multiple ports/services on the public VRRP IP, and a single port/service on the private IP. Because the VRRP public IP is only assigned to one of the two instances at any given time, I am opting to select the interface rather than the CIDR (since the subnets overlap).

Nomad version

Nomad v0.12.0 (8f7fbc8e7b5a4ed0d0209968faf41b238e6d5817)

Operating system and Environment details

Debian 11 bullseye
Linux 5.7.0-1-amd64 #1 SMP Debian 5.7.6-1 (2020-06-24) x86_64 GNU/Linux

Issue

I expect the lb-http service to be registered with IP 192.168.1.2 (ideally 192.168.1.4, but it seems that configuring IPs not assigned at Nomad startup will fail). Instead, both the service and the container port get bound to 192.168.1.22, the same IP as the private service, despite host_network being configured on the client and set on the port in the job config.

Reproduction steps

Client configuration:

client {
  host_network "lb" {
    interface = "enp2s0"
  }
}

Job file

Example below with a single public http port and a private api port:


job "lb" {
  group "lb" {
    count = 2
    constraint {
      operator = "distinct_hosts"
      value = "true"
    }
    task "lb" {
      driver = "docker"
      config {
        network_mode = "bridge"
      }
      resources {
        network {
          port "http" {
            static = "80"
            host_network = "lb"
          }
          port "api" {
            static = "8081"
          }
        }
      }
      service {
        name = "lb-http"
        port = "http"
        check {
          ...
        }
      }
      service {
        name = "lb-api"
        port = "api"
        check {
          ...
        }
      }
    }
  }
}
@Legogris
Author

Setting a CIDR to one that should exclude the assigned IP changes nothing:

client {
  host_network "lb" {
    interface = "enp2s0"
    cidr = "192.168.1.0/29"
  }
}
# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS                                                                                            NAMES
abc123        lb:latest       "/entrypoint.sh ..."   About a minute ago   Up About a minute   192.168.1.22:80->80/tcp, 192.168.1.22:80->80/udp, 192.168.1.22:8081->8081/tcp, 192.168.1.22:8081->8081/udp   lb-123456

@joliver

joliver commented Jul 14, 2020

I have run into the same problem as well. I have done some preliminary debugging and from what I can see the structs/network.go file has a method:

func (idx *NetworkIndex) AssignNetwork(ask *NetworkResource) (out *NetworkResource, err error) {
...
}

This method doesn't appear to consider the host_network client configuration when offering an allocation solution.

The above method appears to be called from scheduler/rank.go. One thing I noticed is that ask.IP is blank and, even if it were populated, it isn't considered within the AssignNetwork call.

In other words, when a client.hcl defines various host_network values:

client {
  host_network "wan"  {
    cidr      = "1.2.3.4/32"
    interface = "eth0"
  }

  host_network "lan" {
    cidr      = "10.0.0.0/8"
    interface = "eth1"
  }
}

The spec from the job is ignored:

resources {
  network {
    port "http" {
      host_network = "lan"
    }
  }
}

Nomad currently schedules using the primary (meaning publicly routable) interface, contrary to the job specification.

@joliver

joliver commented Jul 15, 2020

As a side note, this is very easy to verify using a multi-interface system on DigitalOcean.

@nickethier
Member

Hi @joliver and @Legogris I'm working on this one today.

One thing that I noticed is that your job files both use the network stanza in the task-level resources stanza. The host_network field only works in group network stanzas. We should at least be throwing a warning about this, so I opened #8497 to track that. Moving forward we're planning to remove the usage of a network stanza inside each task and encourage users to use the group network stanza. If there are any problems with this, please open an issue.

I'm going to work on spinning up a DO droplet to test this, but could you also try updating your jobs to make use of the group network stanza and report back if that does/doesn't work? Thanks!
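
For reference, a minimal sketch of the job from the original post with the network stanza moved from the task's resources up to the group level (check blocks omitted, everything else unchanged):

job "lb" {
  group "lb" {
    count = 2

    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }

    # group-level network: host_network is honored here
    network {
      port "http" {
        static       = 80
        host_network = "lb"
      }
      port "api" {
        static = 8081
      }
    }

    task "lb" {
      driver = "docker"
      config {
        network_mode = "bridge"
      }

      service {
        name = "lb-http"
        port = "http"
      }

      service {
        name = "lb-api"
        port = "api"
      }
    }
  }
}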

@nickethier nickethier self-assigned this Jul 22, 2020
@Legogris
Author

Legogris commented Jul 22, 2020

Oh wow, had no idea that it was preferred to specify network at group level rather than task level!

I'll try again in the next couple of days.

BTW, a bit of a separate note, but the docs list the different varieties of the network stanza for jobs as identical, and there are plenty of examples using the deprecated form: https://www.nomadproject.io/docs/job-specification/network#mapped-ports

@Legogris
Author

So I tried moving the network stanza to the group level for one job. Both ports still end up on the same IP on the same interface.

client {
    host_network "lb" {
            cidr = "192.168.1.0/29"
            interface = "enp2s0"
    }

    host_network "internal" {
            cidr = "192.168.1.254/32"
            interface = "enp3s0"
    }
}
group "lb" {
    network {
      mbits = 11

      port "http" {
        static = 80
        host_network = "lb"
      }

      port "api" {
        static = 8081
        host_network = "internal"
      }
    }
}

[screenshot omitted]

@Legogris
Author

Legogris commented Jul 23, 2020

This might be a separate issue from what's described above, but I'm also noticing that when the network mode is set to host, the allocation addresses get assigned correctly in the CLI:

$ nomad alloc status 095a9457-4641-d5c2-9ba9-b1bfd11012d1

Allocation Addresses
Label  Dynamic  Address
*http  yes      192.168.1.4:80
*api   yes      192.168.1.254:8081

However, the web UI, as well as the service registered in Consul, gets a different IP, 192.168.1.2, which is not used anywhere in the configuration, differs from all of Nomad's bind/advertise addresses, and doesn't show up anywhere in the Nomad logs. This is also an IP on the wrong interface.

    host_network "traefik" {
            cidr = "192.168.1.4/32"
            interface = "enp2s0"
    }

    host_network "internal" {
            cidr = "192.168.1.254/32"
            interface = "enp3s0"
    }

group "lb" {
    network {

      port "http" {
        static = 80
        host_network = "lb"
      }

      port "api" {
        static = 8081
        host_network = "internal"
      }
    }

    service {
      name = "lb-api"
      port = "api"
      address_mode = "auto" // changing this value has no effect on behavior

      check {
        name     = "alive"
        type     = "tcp"
        port     = "api"
        interval = "10s"
        timeout  = "2s"
        check_restart {
          grace = "90s"
        }
      }
    }
}

@HumanPrinter

Moving the network config to the group level results in an error when used with Docker containers. Please see #8488.

My job file:

job "nginx-test" {
  datacenters = ["dcf"]
  type = "system"

  group "containers" {
    network {
      mode = "bridge"

      port "http" {
        static = "80"
        host_network = "test_inet"
      }
    }

    service {
      name = "authweb-loadbalancer"
      port = "http"
      check {
        type     = "http"
        protocol = "http"
        method   = "GET"
        header {
          Host = [ "test.zentest.nl" ]
        }
        port     = "http"
        path     = "/health"
        tls_skip_verify = true
        interval = "1s"
        timeout  = "1s"
      }
    }

    volume "html" {
      type = "host"
      read_only = true
      source = "html-files"
    }

    task "nginx-test" {
      template {
        data = <<EOF
server {
  listen 80;
  server_name test.zentest.nl;

  location / {
    root html;
  }

  location /health {
    return 200 'OK';
  }

  location /test {
    return 200 'OK';
  }
}
EOF
        destination = "local/conf.d/default.conf"
        change_mode = "signal"
        change_signal = "SIGHUP"
      }

      driver = "docker"

      config {
        image = "nginx:1.19.0-alpine-perl"

        volumes = [
          "local/conf.d/:/etc/nginx/conf.d/"
        ]

        port_map {
          http = 80
        }
      }

      env {
        TZ = "Europe/Amsterdam"
      }

      volume_mount {
        volume = "html"
        destination = "/etc/nginx/html/"
      }

      resources {
        memory = 300
      }
    }
  }
}

The resulting error after starting the allocation:
[screenshot of the port mapping error omitted]

@HumanPrinter

HumanPrinter commented Jul 24, 2020

Retried after updating to Nomad 0.12.1. Defining the network at group level still results in the port mapping error. I then tried moving the network back to the resources level.
The container is deployed, but the port mapping is bound to the internal (default) interface instead of the interface that was defined in the job. The job definition in the Nomad web interface does, however, show the correct host_network. Is something not working, or am I doing something wrong?

My client config:

client {
  network_interface = "eth0"
  ...
  host_network "test_inet" {
    cidr = "11.22.33.44/32" # replaced the IP-address with a dummy value because I don't want to expose the actual IP through Github. The IP is assigned by KeepaliveD
    interface = "eth1"
  }
}
...

My job file:

job "nginx-test" {
  datacenters = ["mydc"]
  type = "system"

  group "containers" {
    task "nginx-test" {
      template {
        data = <<EOF
server {
  listen 80;
  server_name test.myserver.nl;

  location /health {
    return 200 'OK';
  }

  location /test {
    return 200 'OK';
  }
}
EOF
        destination = "local/conf.d/default.conf"
      }

      driver = "docker"
      config {
        image = "nginx:1.19.0-alpine-perl"
        volumes = [
          "local/conf.d/:/etc/nginx/conf.d/"
        ]

        port_map {
          http = 80
        }
      }
    
      resources {
        network {
          mode = "bridge"
          port "http" {
            static = "80"
            host_network = "test_inet"
          }
        }
      }
    }
  }
}

After starting the container, the port mapping is still bound to the default network interface (eth0):

>docker ps
949150ea463e    nginx:1.19.0-alpine-perl    "/docker-entrypoint.…"    10.10.80.211:80->80/tcp, 10.10.80.211:80->80/udp

Job file as shown by the Nomad UI:

{
  ...
  "ID": "nginx-test",
  "Name": "nginx-test",
  "Type": "system",
  "TaskGroups": [
    {
      "Name": "containers",
      "Count": 1,
      ...
      "Tasks": [
        {
          "Name": "nginx-test",
          "Driver": "docker",
          ...
          "Config": {
            "port_map": [
              {
                "http": 80
              }
            ],
            "image": "nginx:1.19.0-alpine-perl"
          },
          ...
          "Resources": {
            ...
            "Networks": [
              {
                "Mode": "bridge",
                "Device": "",
                "CIDR": "",
                "IP": "",
                "MBits": 10,
                "DNS": null,
                "ReservedPorts": [
                  {
                    "Label": "http",
                    "Value": 80,
                    "To": 0,
                    "HostNetwork": "test_inet"
                  }
                ],
                "DynamicPorts": null
              }
            ],
            "Devices": null
          },
          ...
        }
      ],
      ...
      "Networks": null,
      ...
    }
  ],
  ...
}

@Legogris
Author

Haven't gone in-depth with this yet, but since you mentioned keepalived: I noticed that Nomad requires IPs to be bound at Nomad startup in order to be mappable. This means that if you have two Nomad clients sharing a VRRP IP, only the one that has the IP assigned when Nomad starts will succeed in scheduling the job, and when IPs are reassigned, the change isn't recognized and the Nomad process needs to be restarted. So as of right now, dynamically assigned IPs are practically unusable with Nomad host networks.

@SystemZ

SystemZ commented Jul 26, 2020

I copied the example config provided by @HumanPrinter and it doesn't work for me in v0.12.1 either.
I'm using a WireGuard interface like this in the Nomad config:

    host_network "wg" {
        cidr = "10.0.0.1/32"
        interface = "wg0"
    }

Even if I select a non-existent host_network in the job, there is no error, even though I'm pointing at the wrong interface. IMHO there should be some validation to prevent that, or at least a warning in the log. Silently assigning a private service to the public network is no good in production.

EDIT: I noticed I hadn't set network_interface in the Nomad client config. After I set network_interface = "lo", the validation error for a job using a wrong host_network name like wgtest became visible in the UI (or did I just miss it before?).
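
Putting those two pieces together, the client config described in this comment would look roughly like this (a sketch; the values are the ones quoted above):

client {
  network_interface = "lo"   # without this, the bad host_network name was not flagged

  host_network "wg" {
    cidr      = "10.0.0.1/32"
    interface = "wg0"
  }
}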

@Legogris
Author

I also noticed the following troubling behavior:

  1. No client host_network defined, no job host_network declared: the job is assigned to the Nomad IP.
  2. Add a new client host_network named lb and a host_network named default: the job gets assigned to lb.

It seems like host network assignment is arbitrary when not defined for jobs, which means that previously working jobs will break when a new host_network is added for the sake of a specific job. So once host networks are enabled on the client, every job that may be allocated to that client will need host_network explicitly specified to avoid unspecified behavior.

This is with Nomad 0.12.1

@HumanPrinter

@Legogris Your remark regarding keepalived is a good point, but in our use case this could be solved by adding a script to keepalived that automatically restarts or reloads Nomad when a machine becomes master. The short offline period is acceptable in our situation; however, that might not be the case for everyone, so, as said, it is a valid point and it would be nice if Nomad somehow added support for this in the near future (but that is beside the subject of this issue).
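
For reference, a rough sketch of that keepalived-side workaround (script path and VRRP instance details are placeholders; notify_master runs a script when this node takes over the VRRP address, and the script would simply run something like systemctl restart nomad so the newly assigned IP gets fingerprinted):

vrrp_instance lb {
  interface enp2s0
  virtual_router_id 51
  virtual_ipaddress {
    192.168.1.4/32   # floating VRRP IP from the original post
  }

  # runs when this node becomes MASTER and now owns the VRRP IP
  notify_master "/usr/local/bin/restart-nomad.sh"
}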

@nickethier
Member

Hey all, just as an FYI: it looks like I missed updating the UI to read from the new host_network-aware fields for the IP address, so the UI may not be the best source of truth. I'm working on getting better visibility into the UI and CLI, but for now here is how you can inspect things through the API.

Host networks will show up as part of a response to /v1/node/<node_id>. Here is an example from the vagrant environment:

$> curl localhost:4646/v1/node/341e3a9c-3f48-c41e-19ce-e5a6258c6c28 | jq .NodeResources.NodeNetworks
[
  {
    "Addresses": null,
    "Device": "",
    "MacAddress": "",
    "Mode": "bridge",
    "Speed": 0
  },
  {
    "Addresses": [
      {
        "Address": "10.0.2.15",
        "Alias": "default",
        "Family": "ipv4",
        "Gateway": "",
        "ReservedPorts": ""
      }
    ],
    "Device": "eth0",
    "MacAddress": "08:00:27:51:2c:84",
    "Mode": "host",
    "Speed": 1000
  }
]

Any matched host networks will have a unique Address entry on the NodeNetwork.

For an allocation, there is a new structure under an allocation's AllocatedResources.Shared called simply Ports. Each port object here has the HostIP associated with the port. Here's an example from the countdash example:

 curl localhost:4646/v1/allocation/41e46930-6e48-764b-3812-d310aebce3df | jq .AllocatedResources.Shared.Ports
[
  {
    "HostIP": "10.0.2.15",
    "Label": "http",
    "To": 9002,
    "Value": 9002
  },
  {
    "HostIP": "10.0.2.15",
    "Label": "connect-proxy-count-dashboard",
    "To": 30615,
    "Value": 30615
  }
]

I hope this helps. I'm still working through a couple of different issues that were brought up here and will report back with findings. Thank you for your patience and debugging work!

@nickethier
Member

Hey folks, after spending some time on this I've identified the following items and opened issues to track them.

  1. Poor documentation and validation make it very easy to misconfigure/misunderstand host_networks. Host networks only work with group network stanzas, and if you need port mapping it must be done via Nomad's native port mapping using bridge or cni network modes. Host networks are not meant to work with Docker's port_map driver configuration currently; see the sketch after this list. (Nomad host_network documentation and example improvements #8575)

  2. The UI and some CLI fields that reference port addresses aren't using the correct field and thus will not return the host_network-specific IP address. (Some CLI and UI fields relating to a host_network report incorrect address #8576)

  3. Host networks are only fingerprinted once at startup. This means floating addresses, VRRP, etc. will likely not work properly. (host_network should support floating/virtual IP addresses #8577)
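
To illustrate item 1, here is a rough sketch (adapted from the nginx job earlier in this thread) of Nomad-native port mapping with a group network in bridge mode, instead of Docker's port_map:

group "containers" {
  network {
    mode = "bridge"

    port "http" {
      static       = 80
      to           = 80            # Nomad maps the host port into the container
      host_network = "test_inet"
    }
  }

  task "nginx-test" {
    driver = "docker"

    config {
      image = "nginx:1.19.0-alpine-perl"
      # no port_map block; the "to" field above handles the container-side port
    }
  }
}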

@Legogris your following comment needs some more exploration, as I have not experienced this behaviour in testing. Could you please open a new issue with some more details on how you came to that state/conclusion? Thanks!

It seems like host network assignment is arbitrary when not defined for jobs, which means that previously working jobs will break when a new host_network is added for the sake of a specific job. So once enabled on the client, all jobs that may be allocated to that client will need to have host_network explicitly specified to not have unspecified behavior.

Since this has become a bit of a catch-all issue for host networks, I'd like to close it in order to better organize the work. I've opened separate issues for the items above to track them individually. The intent is not to shut down any conversation, so if I missed or didn't address something, please open a new issue and @ me in it.

Cheers!

@Legogris
Author

Great follow-up @nickethier !

I will see if I can make a reproducible config and post it in the relevant issue.

@yields

yields commented Aug 17, 2020

Was this resolved? This is still not working for me:

Nomad v0.12.3 (2db8abd9620dd41cb7bfe399551ba0f7824b3f61)

I also found weird Aliases when looking at jq .NodeResources.NodeNetworks for that particular node:

  {
    "Mode": "host",
    "Device": "eth1",
    "MacAddress": "4a:e8:57:91:00:c5",
    "Speed": 1000,
    "Addresses": [
      {
        "Family": "ipv4",
        "Alias": "name", // shouldn't this be `"test"`?
        "Address": "10.0.0.2",
        "ReservedPorts": "",
        "Gateway": ""
      }
    ]
  },

Nomad client config:

 host_network {
    name = "test"
    interface = "eth1"
 }

Jobs with host_network=test are assigned to the node's public IP.

Edit

Oh man, it seems like it should be host_network "test", not name = "test". I'll see if I can contribute some validation logic.
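
For anyone skimming past this, the labeled-block form (matching the client configs earlier in the thread) would be:

host_network "test" {
  interface = "eth1"
}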

@yields

yields commented Aug 17, 2020

I think I have the networking stuff set up correctly, but I noticed that, for some reason, Nomad registers the public IP in Consul, so service checks and service discovery are basically broken.

Here's the job file (I removed the service check).

job "echo" {
  datacenters = ["ams3"]
  type        = "service"

  group "echo" {
    count = 1

    network {
      mode = "bridge"
      port "http" {
        host_network = "lan"
      }
    }

    service {
      address_mode = "host"
      name         = "echo"
      port         = "http"
    }

    task "echo" {
      driver = "docker"

      config {
        image = "echo-server"
        args  = ["--bind", ":${NOMAD_PORT_http}"]
      }
    }
  }
}

Here's Consul API answer:

[
  {
    "ID": "c39dc71f-0573-4cee-238a-7a09c10fcdfe",
    "Node": "main-001",
    "Address": "10.0.0.2", // => matches `host_network "lan"`
    "Datacenter": "ams3",
    "TaggedAddresses": {
      "lan": "10.0.0.2",
      "lan_ipv4": "10.0.0.2",
      "wan": "10.0.0.2", // doesn't match `host_network "wan"`
      "wan_ipv4": "10.0.0.2" // same ^
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-fd58156b-0ed5-d1a0-00f9-68762c2ea980-group-echo-echo-http",
    "ServiceName": "echo",
    "ServiceTags": [],
    "ServiceAddress": "<public-ip>", 
    "ServiceTaggedAddresses": {
      "lan_ipv4": {
        "Address": "<public-ip>",
        "Port": 25509
      },
      "wan_ipv4": {
        "Address": "<public-ip>",
        "Port": 25509
      }
    },
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {
      "external-source": "nomad"
    },
    "ServicePort": 25509,
    "ServiceEnableTagOverride": false,
    "ServiceProxy": {
      "MeshGateway": {},
      "Expose": {}
    },
    "ServiceConnect": {},
    "CreateIndex": 492,
    "ModifyIndex": 492
  }
]

Same issue with DNS:

root@main-001:~ dig +short echo.service.consul
<public-ip>

Let me know if I should open a new issue or move this to Discuss; happy to provide further details.

@Legogris
Author

Legogris commented Aug 30, 2020

@nickethier Just got back to this - maybe this issue should be reopened? Seems like there's still something here not covered by other open issues.

Given this client configuration on Nomad 0.12.3:

client {
    host_network "default" {
            cidr = "192.168.2.42/32"
            interface = "eth3"
    }
    host_network "other" {
            cidr = "192.168.2.0/29"
            interface = "eth2"
    }
}

And this job spec:

job ".." {
  group ".." {
    network {
      port "api" {
        static = 8081
        host_network = "default"
      }
    }
    service {
      name = "foobar"
      port = "api"
    }
    task "foobar" {
      driver = "docker"
      config {
        network_mode = "host"
      }
    }
  }
}

Consul service doesn't get registered on the expected IP:
[screenshot of the registered Consul service showing an unexpected address omitted]

So this doesn't seem like a UI or CLI issue per se. Note that in this case all interfaces are up and IPs are assigned at Nomad startup, and there are no floating IPs/VRRP in play.
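
A quick way to double-check what actually got registered, independent of the Nomad UI (assuming Consul's default local address; foobar is the service name from the job above):

$ curl -s localhost:8500/v1/catalog/service/foobar | jq '.[] | {ServiceAddress, ServicePort}'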

@yields

yields commented Aug 30, 2020

I can confirm the same issue as @Legogris is happening to me with Nomad 0.12.3.

I'm using host networking, and the service that is registered with Consul has the wrong IP:

https://discuss.hashicorp.com/t/incorrect-service-ip-registered-with-consul/13000

@mbrezovsky

mbrezovsky commented Sep 3, 2020

+1

I can reproduce this issue with the latest release, 0.12.3. Multi-interface networking doesn't work with any of the recommended configurations. Basically, I had a public and a private host_network defined in the Nomad client, each with a /32 CIDR and an interface. In the job definition I tried using a specific host_network in host/bridge mode, with/without Docker port mapping...
The assigned IP address was always the same.

@nickethier nickethier reopened this Sep 4, 2020
@neilmock
Contributor

neilmock commented Sep 5, 2020

We have run into this issue as well; the service is configured with the public IP but should be configured with the private IP, as specified in our client config.

A full gist of our configuration is here:

https://gist.github.com/neilmock/12f075e3b22e5bc52e17ad7591af8b82

@nickethier
Member

nickethier commented Oct 16, 2020

Hey folks! I'm sorry for the long silence on this issue. I just merged what I think is a fix for this in #9095.

I was able to reproduce the Consul registration issue and this fixed it for me. I'd like to see if someone could test against master before I close it.

@urusha

urusha commented Oct 20, 2020

We have problems with host_network and system jobs (where static ports are widely used). Please see the next two jobs. The service job honors the host_network restriction (CNI adds -d ip.of.dev.if/32 to the iptables DNAT rules), but the system job doesn't (DNAT matches all addresses). Might this be fixed by https://github.com/hashicorp/nomad/pull/8822/files ?
Another issue is that CNI doesn't honor Docker's network_mode. If a network_mode is defined (the name of a Docker bridge), the container is configured with that network_mode, but the CNI iptables rules use the default bridge called nomad, so the DNAT rules pass packets to the incorrect IP address.
Should I file another report (or reports)?

job "test" {
  datacenters = ["dc1"]
  type = "service"
  group "group-test" {
    count = 1
    task "job-test" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo"
//        network_mode = "net1"
        args = [ "-text='hello world'" ]
      }
    }
    network {
      mode = "bridge"
      port "http" {
        static       = 1480
        to           = 5678
        host_network = "dev"
      }
    }
  }
}

job "test-sys" {
  datacenters = ["dc1"]
  type = "system"
  group "group-test" {
    task "job-test" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo"
//        network_mode = "net1"
        args = [ "-text='hello world'" ]
      }
    }
    network {
      mode = "bridge"
      port "http" {
        static       = 1480
        to           = 5678
        host_network = "dev"
      }
    }
  }
}

@Legogris
Author

Seems to be resolved as of 1.0.1

@neilmock
Contributor

FYI, 1.0.1 fixes this for us in production so far.

@Davasny

Davasny commented Feb 10, 2021

I have the same issue as @urusha (the second part), which I posted here: https://discuss.hashicorp.com/t/question-how-to-run-task-in-multi-interface-configuration-with-access-to-docker-network/20768
I'm running version 1.0.3.

When I set network_mode in the config, I can't access tasks on public interfaces.

@tgross
Member

tgross commented Feb 10, 2021

@Davasny can you open a new issue with the jobspec, configuration, and expected vs actual? That'll help us resolve that for you.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2022