Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nomad upgrade documentation could be clearer about service-stanza #9292

Open
kaspergrubbe opened this issue Nov 7, 2020 · 4 comments
Open
Labels
stage/accepted Confirmed, and intend to work on. No timeline committment though. theme/api HTTP API and SDK issues theme/docs Documentation issues and enhancements

Comments

@kaspergrubbe
Copy link

kaspergrubbe commented Nov 7, 2020

Nomad version

0.12.7 / 1.0.0.beta

Reproduction steps

Go to https://www.nomadproject.io/docs/upgrade/upgrade-specific and read about »mbits and Task Network Resource deprecation:

Starting in Nomad 0.12.0 the mbits field of the network resource block has been deprecated. This is in part because we felt that mbits didn't accurately account network bandwidth as a resource.

Additionally the use of the network block inside of a task's resource block is also deprecated. Users are advised to move their network block to the group block. Recent networking features have only been added to group based network configuration. If any usecase or feature which was available with task network resource is not fulfilled with group network configuration, please open an issue detailing the missing capability.

Okay, that means that this old job file:

job "ingress" {
  datacenters = ["eu-test-1"]
  type = "system"

  update {
    max_parallel = 1
    min_healthy_time = "10s"
    healthy_deadline = "3m"
    progress_deadline = "10m"
    auto_revert = false
    canary = 1
  }

  group "loadbalancer" {
    count = 1

    restart {
      interval = "30m"
      delay = "15s"
      mode = "fail"
    }

    ephemeral_disk {
      size = 300
    }

    task "haproxy" {
      driver = "docker"

      kill_timeout = "12s"
      kill_signal = "SIGUSR1" # https://linux.die.net/man/1/haproxy

      template {
        data = <<EOF
global
  maxconn 1024
  daemon

  log 127.0.0.1 local0 debug

resolvers ourdns
  nameserver dns1 8.8.8.8:53
  nameserver dns2 8.8.4.4:53
  nameserver dns3 1.1.1.1:53
  nameserver dns4 1.0.0.1:53

resolvers consuldns
  nameserver consul 127.0.0.1:8600
  accepted_payload_size 8192
  hold valid 5s

listen fe_ingress_stats
  bind *:8887
  option httplog
  mode http
  stats enable
  stats show-legends
  stats show-node
  stats uri /

  acl going_down stopping
  monitor-uri /healthy
  monitor fail if going_down

backend be_ingress_stats
  mode http
  server Local localhost:8887

frontend fe_http
  bind *:8888
  option httplog
  mode http

  use_backend be_ingress_stats
EOF
        destination = "local/haproxy.cfg"
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }

      config {
        image = "haproxy:2.3.0"
        network_mode = "host"

        volumes = [
          "local/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg",
        ]
      }

      resources {
        cpu    = 300
        memory = 100

        network {
          port "admin" {
            static = 8887
          }
          port "http" {
            static = 8888
          }
          port "https" {
            static = 8889
          }
        }
      }

      service {
        name = "haproxy-ui"
        tags = []
        port = "admin"

        check {
          name     = "Ingress HTTP healthcheck"
          path     = "/"
          type     = "http"
          protocol = "http"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

Needs to be updated like this by moving the network-stanza into the group stanza, easy!

--- ingress-nomad-1.nomad.hcl	2020-11-07 23:27:39.000000000 +0000
+++ ingress-nomad-2.nomad.hcl	2020-11-07 23:45:26.000000000 +0000
@@ -14,6 +14,18 @@
   group "loadbalancer" {
     count = 1

+    network {
+      port "admin" {
+        static = 8887
+      }
+      port "http" {
+        static = 8888
+      }
+      port "https" {
+        static = 8889
+      }
+    }
+
     restart {
       interval = "30m"
       delay = "15s"
@@ -90,18 +102,6 @@
       resources {
         cpu    = 300
         memory = 100
-
-        network {
-          port "admin" {
-            static = 8887
-          }
-          port "http" {
-            static = 8888
-          }
-          port "https" {
-            static = 8889
-          }
-        }
       }

       service {

That blows up like this:

Error submitting job: Unexpected response code: 500 (rpc error: 1 error occurred:
	* Task group loadbalancer validation failed: 1 error occurred:
	* Task haproxy validation failed: 1 error occurred:
	* 1 error occurred:
	* port label "admin" referenced by services haproxy-ui does not exist
)

It took me a while to understand that the service stanza needed to be movws to the group stanza (it was in the task-stanza before):

--- ingress-nomad-2.nomad.hcl	2020-11-07 23:45:26.000000000 +0000
+++ ingress-nomad-3.nomad.hcl	2020-11-07 23:47:01.000000000 +0000
@@ -26,6 +26,21 @@
       }
     }

+    service {
+      name = "haproxy-ui"
+      tags = []
+      port = "admin"
+
+      check {
+        name     = "Ingress HTTP healthcheck"
+        path     = "/"
+        type     = "http"
+        protocol = "http"
+        interval = "10s"
+        timeout  = "2s"
+      }
+    }
+
     restart {
       interval = "30m"
       delay = "15s"
@@ -103,21 +118,6 @@
         cpu    = 300
         memory = 100
       }
-
-      service {
-        name = "haproxy-ui"
-        tags = []
-        port = "admin"
-
-        check {
-          name     = "Ingress HTTP healthcheck"
-          path     = "/"
-          type     = "http"
-          protocol = "http"
-          interval = "10s"
-          timeout  = "2s"
-        }
-      }
     }
   }
 }

Did I miss something obvious or are the upgrade path not very clear? I might not be the only one that stumbled upon this.

@kaspergrubbe
Copy link
Author

If the diffs are confusing to read, here's a zip with the three configuration files: ingress-nomad-files.zip

@shoenig shoenig added stage/accepted Confirmed, and intend to work on. No timeline committment though. theme/docs Documentation issues and enhancements labels Nov 9, 2020
@schmichael
Copy link
Member

Worth noting that in addition to the confusing error message and unhelpful docs, we should also fix the http response code to 400 instead of 500. Thanks for the report @kaspergrubbe!

@schmichael schmichael added the theme/api HTTP API and SDK issues label Nov 11, 2020
@dnrce
Copy link

dnrce commented Dec 28, 2020

Big +1 on this, and please note that it is not only the upgrade docs, but https://www.nomadproject.io/docs/job-specification/service as a whole that still promotes putting service inside a task.

Should job -> group -> task -> service be fully deprecated?

@kaspergrubbe
Copy link
Author

kaspergrubbe commented Dec 29, 2020

I thought this deprecation was fixed or decided before the 1.0.0 release, but it seems like it is still here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stage/accepted Confirmed, and intend to work on. No timeline committment though. theme/api HTTP API and SDK issues theme/docs Documentation issues and enhancements
Projects
None yet
Development

No branches or pull requests

4 participants