support SO_REUSEPORT for safe reuse of static ports #15487

Closed
victorstewart opened this issue Dec 7, 2022 · 5 comments · Fixed by #23956
Comments

@victorstewart

On https://developer.hashicorp.com/nomad/docs/job-specification/network#host_network it states:

Since multiple services cannot share a port, the port must be open in order to place your task.

But this is false: when using host networking with SO_REUSEPORT, ports can be shared. Docker supports this, and I assume the containerd plugin does as well.

I just wanted to confirm that Nomad doesn't refuse to start jobs that all use the same static port.

@tgross tgross added theme/docs Documentation issues and enhancements and removed type/bug labels Jan 3, 2023
@tgross tgross self-assigned this Jan 3, 2023
@tgross tgross changed the title Documentation Error? documentation on port reuse doesn't address SO_REUSEPORT Jan 3, 2023
@tgross
Member

tgross commented Jan 3, 2023

Hi @victorstewart! Unfortunately the docs aren't incorrect here, as the scheduler checks for port collisions. So if you run a job like the following:

job with static port
job "httpd" {
  datacenters =  ["dc1"]

  group "web" {

    network {
      mode = "host"
      port "www" {
        static = 8001
      }
    }

    task "http" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      template {
        data        = "<html>hello, world</html>"
        destination = "local/index.html"
      }

      resources {
        cpu    = 128
        memory = 128
      }

    }
  }
}

That'll run fine, and then if you run the same job with a different ID the scheduler will reject it:

$ nomad job plan ./example.nomad
+ Job: "httpd2"
+ Task Group: "web" (1 create)
  + Task: "http" (forces create)

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "web" (failed to place 1 allocation):
    * Resources exhausted on 1 nodes
    * Class "local" exhausted on 1 nodes
    * Dimension "network: reserved port collision www=8001" exhausted on 1 nodes

Job Modify Index: 0
To submit the job with version verification run:

nomad job run -check-index 0 ./example.nomad

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.

The SO_REUSEPORT use case is obviously important for high availability, but it's also behavior specific to not only the task driver but the workload itself. I'm going to retitle this issue and relabel it as a feature request, but I don't have a solid idea of how we'd change the behavior in a way that's expected for users who aren't using SO_REUSEPORT.

@tgross tgross changed the title documentation on port reuse doesn't address SO_REUSEPORT support SO_REUSEPORT for safe reuse of static ports Jan 3, 2023
@tgross tgross added type/enhancement and removed theme/docs Documentation issues and enhancements labels Jan 3, 2023
@tgross tgross removed their assignment Jan 3, 2023
@victorstewart
Author

Okay, no worries. Between now and then I ended up writing my own machine orchestrator, program scheduler, and container runtime... it just ended up being the path of least resistance for me, for many reasons, so I completely sidestepped this issue.

But I do think this is an important feature.

Take an array of QUIC programs that use a load balancer to assist connection establishment, then switch clients over to each server's unique unicast address. All of these servers must listen on the public (read: external-facing) port 443 so that client firewalls, at whatever waypoint, reliably allow the UDP traffic through. But this pattern would be impossible in Nomad today.

(Not everyone load balances or proxies all traffic through CNI meshes.)

Best of luck!

@joliver

joliver commented Jul 18, 2023

In our infrastructure, because 443 is generally only used by HAProxy for load balancing, we opted to NOT register the port with Nomad at all but to still have the application listen on 443. We also created tags that other jobs (that might want to use 443) can use as a negative constraint to avoid scheduling on nodes with HAProxy.

By doing the above, we can take advantage of SO_REUSEPORT for more seamless load balancer reloads.
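The tag-based scheduling exclusion described above could be expressed with a Nomad constraint stanza along these lines. This is a sketch: the `meta.haproxy` node attribute name is hypothetical, standing in for whatever marker the HAProxy nodes actually set in their client configuration.

```hcl
# Hypothetical: assumes HAProxy nodes carry a client meta attribute,
# e.g. `meta { haproxy = "true" }` in their agent config. Jobs that
# also want port 443 add this negative constraint so the scheduler
# keeps them off those nodes.
constraint {
  attribute = "${meta.haproxy}"
  operator  = "!="
  value     = "true"
}
```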

@gulducat
Member

gulducat commented Sep 13, 2024

Edit: the below will be released in Nomad 1.9!


Howdy folks! I have a draft PR #23956 up for this, and I'd love to hear any feedback about whether it would or would not improve the situation for any particular use cases.

Lots of detail in the PR description, but in short: how would being able to specify a port, but ignore collisions, affect your deployments?

group "g" {
    network {
        mode = "host"

        port "http" {
            static = 8000
            to     = 8000

            ignore_collision = true
        }
    }

    task "t" {
        ...
    }
}

@github-project-automation github-project-automation bot moved this from Needs Roadmapping to Done in Nomad - Community Issues Triage Sep 17, 2024

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 16, 2025