Nomad can not use consul ingress-gateways because tasks use protocol tcp #8647

Open
spuder opened this issue Aug 12, 2020 · 24 comments
Labels: stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/consul/connect (Consul Connect integration), theme/networking, type/enhancement

Comments

@spuder
Contributor

spuder commented Aug 12, 2020

Nomad 0.11.1
Consul 1.8.2

Consul ingress gateways support tcp and http listeners. HTTP listeners are preferred because they allow multiple services to share a single port, with requests routed by Host header.
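
As a rough sketch of that Host-header routing, an ingress-gateway config entry with two services on one http listener might look like the following. The gateway name, service names, and hostnames here are hypothetical and do not come from this issue.

```hcl
# Hypothetical example: two HTTP services sharing one listener port.
# "example-ingress", "web", and "api" are illustrative names only.
Kind = "ingress-gateway"
Name = "example-ingress"

Listeners = [
  {
    Port     = 8080
    Protocol = "http"
    Services = [
      {
        Name  = "web"
        Hosts = ["web.example.com"]
      },
      {
        Name  = "api"
        Hosts = ["api.example.com"]
      }
    ]
  }
]
```

A tcp listener, by contrast, can route to only one service per port.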

Problem

Nomad jobs default to a service protocol of tcp. There does not appear to be a documented way to change a Nomad job to use http as the service protocol (see https://www.nomadproject.io/docs/job-specification/service). As a result, the user gets the following error when attempting to create an http listener for the service:

Error writing config entry ingress-gateway/ingress-ngproxy: Unexpected response code: 500 (rpc error making call: service "count-dashboard" has protocol "tcp", which does not match defined listener protocol "http")

Steps to reproduce

  1. Submit the standard count-dash example
count-dash.job

job "countdash" {
   datacenters = ["dc1"]
   group "api" {
     network {
       mode = "bridge"
     }

     service {
       name = "count-api"
       port = "9001"

       connect {
         sidecar_service {}
       }
     }

     task "web" {
       driver = "docker"
       config {
         image = "hashicorpnomad/counter-api:v1"
       }
     }
   }

   group "dashboard" {
     network {
       mode ="bridge"
       port "http" {
         static = 9002
         to     = 9002
       }
     }

     service {
       name = "count-dashboard"
       port = "9002"
       # This is slightly modified from the stock count-dash examples
       # By adding an 'http' health check, the hope was to force nomad to use 'http' over 'tcp'
       check  {
         name = "count-dashboard-health"
         type = "http"
         protocol = "http"
         path = "/health"
         port = 9002
         interval = "10s"
         timeout = "5s"

       }
       connect {
         sidecar_service {
           proxy {
             upstreams {
               destination_name = "count-api"
               local_bind_port = 8080
             }
           }
         }
       }
     }

     task "dashboard" {
       driver = "docker"
       env {
         COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
       }
       config {
         image = "hashicorpnomad/counter-dashboard:v1"
       }
     }
   }
 }

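  2. Create an ingress gateway config entry and register it with consul config write
consul config write ingress-service.hcl

# Kind and Name are required by consul config write; the name matches the error below
Kind = "ingress-gateway"
Name = "ingress-service"

Listeners = [
  {
    Port     = 8080
    Protocol = "http"
    Services = [
      {
        Name  = "count-dashboard"
        Hosts = ["count.example.com"]
      }
    ]
  }
]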

Expected result

The service should be added to the ingress controller

Actual result

Consul returns this error:

Error writing config entry ingress-gateway/ingress-service: Unexpected response code: 500 (rpc error making call: service "count-dashboard" has protocol "tcp", which does not match defined listener protocol "http")
@liemle3893

How about adding the following to count-dashboard:

 
sidecar_service {
  proxy {
    config {
      protocol = "http"
    }
  }
}

@spuder
Contributor Author

spuder commented Aug 12, 2020

Great idea. I tried setting protocol, with no change in behavior:

     service {
       name = "count-dashboard"
       port = "9002"
       check  {
         name = "count-dashboard-health"
         type = "http"
         protocol = "http"
         path = "/health"
         port = 8080
         interval = "10s"
         timeout = "5s"

       }
       connect {
         sidecar_service {
           proxy {
             config {
               protocol = "http"
             }
             upstreams {
               destination_name = "count-api"
               local_bind_port = 8080
             }
           }
         }
       }
     }

I believe this is the documentation page that lists the available config options
https://www.consul.io/docs/connect/registration/sidecar-service

@blake
Member

blake commented Aug 12, 2020

By default services deployed within Consul service mesh are configured as tcp services. You can override this on a per-service basis by creating a service-defaults configuration entry, or at the global level by creating a proxy-defaults entry.

Any service you wish to associate with an ingress gateway listener must already be configured to use the same protocol as that listener; otherwise a configuration error is returned.
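
As a minimal sketch of the per-service override described above, a service-defaults entry for the count-dashboard service from this issue could look like this. The file name in the comment is an assumption.

```hcl
# service-defaults for count-dashboard (file name is hypothetical):
#   consul config write count-dashboard-defaults.hcl
Kind     = "service-defaults"
Name     = "count-dashboard"
Protocol = "http"
```

The entry needs to be in place before the ingress-gateway entry that lists the service under an http listener is written.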

@apollo13
Contributor

Hi @spuder there might be some overlap with #8294 (comment) -- apparently Michael got it working there

@spuder
Contributor Author

spuder commented Aug 12, 2020

Good suggestions. I've modified the job to use connect.sidecar_service.proxy.config.protocol=http and connect.sidecar_service.proxy.local_service_port=9002, however I am still unable to register this service in the load balancer as HTTP

       connect {
         sidecar_service {
           proxy {
             config {
               protocol = "http"
             }
             local_service_port = 9002
             upstreams {
               destination_name = "count-api"
               local_bind_port = 8080
             }
           }
         }
       }
countdash.job

job "countdash" {
   datacenters = ["dc1"]
   group "api" {
     network {
       mode = "bridge"
     }

     service {
       name = "count-api"
       port = "9001"

       connect {
         sidecar_service {}
       }
     }

     task "web" {
       driver = "docker"
       config {
         image = "hashicorpnomad/counter-api:v1"
       }
     }
   }

   group "dashboard" {
     network {
       mode ="bridge"
       port "http" {
         static = 9002
         to     = 9002
       }
     }

     service {
       name = "count-dashboard"
       port = "9002"
       check  {
         name = "count-dashboard-health"
         type = "http"
         protocol = "http"
         path = "/health"
         port = 9002
         interval = "10s"
         timeout = "5s"

       }
       connect {
         sidecar_service {
           proxy {
             config {
               protocol = "http"
             }
             local_service_port = 9002
             upstreams {
               destination_name = "count-api"
               local_bind_port = 8080
             }
           }
         }
       }
     }

     task "dashboard" {
       driver = "docker"
       env {
         COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
       }
       config {
         image = "hashicorpnomad/counter-dashboard:v1"
       }
     }
   }
 }

I still am seeing this error when I attempt to register the service

Error writing config entry ingress-gateway/ingress-ngproxy: Unexpected response code: 500 (rpc error making call: service "count-dashboard" has protocol "tcp", which does not match defined listener protocol "http")

I've ensured that the job is completely stopped and the service is removed from Consul before submitting the job again.

@Lucretius

Lucretius commented Aug 14, 2020

@spuder

I was running into this exact same issue and finally was able to get this working by creating a "service-defaults" config entry in Consul (not through Nomad), with the same name as the service - and specifying the protocol there

For example, for your service above (assuming it is named "web"):

resource "consul_config_entry" "web" {
  kind = "service-defaults"
  name = "web"

  config_json = jsonencode({
    Protocol : "http"
  })
}

Stop and start your job in Nomad, then try registering the gateway.

It is unfortunate that when registering the service, Consul seems to ignore the protocol specified inside the Nomad Connect proxy config stanza, making it impossible to accomplish this in a single Nomad configuration. I dug through the source code a little but was unable to find anything that stood out as to why that is. It seems like a bug, but this should provide a workaround in the meantime.

shoenig added a commit that referenced this issue Aug 21, 2020
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.

```hcl
service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      ingress {
        // ingress-gateway configuration entry
      }
    }
  }
}
```

A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.

Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.

Aims to address #8294 and tangentially #8647
@tgross added the theme/consul/connect, type/enhancement, and stage/accepted labels on Aug 24, 2020
@apollo13
Contributor

Mhm, so while the ingress gateway in 0.12.4 seems to work nicely for tcp services (just tried :D), it seems to fail rather horribly for http. Are we missing something @shoenig, or is there simply no support for it yet?

@shoenig
Member

shoenig commented Sep 11, 2020

What's the problem you're seeing @apollo13? Using the http protocol should work, though you do still have to create the service-defaults entry setting the protocol to http before Consul will accept the ingress-gateway config entry.

@apollo13
Contributor

@shoenig, exactly. I was just wondering whether there is already something built into Nomad to set the protocol (in case we are missing something :))

@shoenig
Member

shoenig commented Sep 11, 2020

I think we'll publish a learn guide in the near future detailing the ins-and-outs of running Gateways in Nomad. For now though, here's a little example I've been using:

set service defaults

$ cat ig-service-defaults.json
{
    "Kind": "service-defaults",
    "Name": "uuid-api",
    "Protocol": "http"
}
$ consul config write ig-service-defaults.json

example job file

# $ cat ig-http.nomad

job "ig-http" {

  datacenters = ["dc1"]

  group "ingress-group" {

    network {
      mode = "bridge"
      port "inbound" {
        static = 8080
        to     = 8080
      }
    }

    service {
      name = "my-ingress-service"
      port = "8080"

      connect {
        gateway {
          proxy {
            connect_timeout = "500ms"
          }
          ingress {
            listener {
              port     = 8080
              protocol = "http"
              service {
                name  = "uuid-api"
                hosts = ["example.com", "example.com:8080"]
              }
            }
          }
        }
      }
    }
  }

  group "generator" {
    network {
      mode = "host"
      port "api" {}
    }

    service {
      name = "uuid-api"
      port = "${NOMAD_PORT_api}"

      connect {
        native = true
      }
    }

    task "generate" {
      driver = "docker"

      config {
        image        = "hashicorpnomad/uuid-api:v3"
        network_mode = "host"
      }

      env {
        BIND = "0.0.0.0"
        PORT = "${NOMAD_PORT_api}"
      }
    }
  }
}
$ nomad job run ig-http.nomad

inspect

$ consul config read -kind ingress-gateway -name my-ingress-service
<our config entry>
$ curl -H "Host: example.com" $(dig +short @127.0.0.1 -p 8600 uuid-api.ingress.dc1.consul. ANY):8080
3a9faa28-36bf-46c3-8274-be1c6f0a1978

@apollo13
Contributor

apollo13 commented Sep 11, 2020

Thanks, do you think setting the service type/proto directly in nomad would be in scope in the future, or is that something out of scope for nomad totally?

@shoenig
Member

shoenig commented Sep 11, 2020

It might be possible in the future. We shied away from managing anything but the ingress-gateway config entry type for now because there are issues around the multi-writer problem implied in how Consul makes config entries global in scope. Individual [OSS] Nomad clusters don't communicate with one another, so it's kinda sketchy to be writing config entries from Nomad. We rationalized it's fine for ingress-gateway entries, since it's probably a bug to be trying to define different IGCE's for the same service name regardless of which Nomad cluster it's coming from. But we didn't want to push that rationalization any further than necessary, and so at least for now service defaults still need to be set in Consul out of band from Nomad.

We've discussed internally some possible mechanics Consul can provide to improve the multi-writer story - if that stuff gets implemented then I don't see why Nomad couldn't make use of it. If you don't mind opening a ticket describing your use case, that would definitely help us gauge the interest for that feature.

@apollo13
Contributor

apollo13 commented Sep 12, 2020

Oh, thank you for the extensive explanation. I didn't realize that service defaults are the only way to set the protocol. I thought it would be possible to do that during service registration (I never looked that closely at Consul aside from its Nomad integration).

Not sure if a new ticket makes sense; the use-case is simply providing an ingress gateway to the outside world (well mostly internal infra) so everything is encrypted. As it stands currently most people use traefik or so but then (usually) the traffic between traefik and the services is not encrypted. That said traefik just laid the groundwork to support connect services traefik/traefik@76f42a3

EDIT: To expand on the use case a bit: when I said "simply" I meant something along the lines of "people can just submit a job to Nomad and the rest will be taken care of". I.e., they shouldn't have to know about specific Consul quirks for configuration.

@mister2d

@shoenig Is it possible to gauge interest by a simple thumbs up? I just want to submit a Nomad job that configures an ingress controller to route to internal Connect services via HTTP header.

The use case is very simple and not new conceptually. I've had this methodology working in Docker Swarm for about 3 years and would like to finalize the transition over to Nomad + Consul Connect.

@tunhvn

tunhvn commented Oct 11, 2020

I have the same issue with the proxy config stanza in Nomad. I set protocol = "http" but Consul seems to ignore this setting.
How can I set HTTP as the default for all services?

@3nprob

3nprob commented Feb 4, 2021

> since it's probably a bug to be trying to define different IGCE's for the same service name regardless of which Nomad cluster it's coming from

Counter-example, if I understand it right: there are many P2P applications (notably Ethereum) that expect publicly reachable TCP and UDP on the same port. Today Nomad doesn't make that distinction: specifying a port with a service for the port means both TCP and UDP.

In that scenario one would need two ingresses to the same service, unless I'm missing something.

@spuder
Contributor Author

spuder commented Feb 25, 2021

For future reference, here is our current workaround:

  1. Create a service-defaults with protocol http
  2. Create an ingress proxy with protocol http

Here is an example of how you may configure consul using terraform

resource "consul_config_entry" "ingress-example" {
  name       = "ingress-example"
  kind       = "ingress-gateway"
  depends_on = [consul_config_entry.foo, consul_config_entry.bar ] # <- Note this sets the resource in the proper order
  config_json = jsonencode({
    Listeners = [{
      Port     = 8080
      Protocol = "http"
      Services = [
        {
          Name  = "foo"
          Hosts = ["foo.example.com"]
        },
        {
          Name  = "bar"
          Hosts = ["bar.example.com"]
        }]
    }]
  })

}

resource "consul_config_entry" "foo" {
  name = "foo"
  kind = "service-defaults"

  config_json = jsonencode({
    Protocol = "http"
  })
}

resource "consul_config_entry" "bar" {
  name = "bar"
  kind = "service-defaults"

  config_json = jsonencode({
    Protocol = "http"
  })
}

Note that if you try to change this on a running service, you will get an error because the service will already have the default protocol of tcp. The workaround is to create these Consul config entries before deploying the job with Nomad, or at least to stop the Nomad job, create the config entries, and then start the job again.

@paladin-devops

Any update on this issue? Like others in this thread I would also like to avoid making updates in Consul directly, outside of the Nomad job.

@shoenig
Member

shoenig commented Mar 24, 2022

Now that Consul versions pre-dating ConfigEntry Meta fields have been phased out, it might be reasonable to have Nomad do something clever with regard to automatically managing the prerequisite service-defaults ConfigEntry, with Protocol set to the associated ingress.listener.protocol value for each enumerated service. The idea is that Nomad upserts the service-defaults ConfigEntry only if the existing ConfigEntry for the service contains a nomad_managed: true meta field (or no entry exists yet), which avoids overwriting a ConfigEntry created outside of Nomad. These service-defaults ConfigEntries would be created on submission of the job containing the ingress gateway definition, alongside the ingress-gateway ConfigEntry Nomad already creates.

The global nature of ConfigEntry still implies each discrete Nomad cluster would be re-upserting the same service-defaults ConfigEntry for each service, as is already the case for ingress-gateway ConfigEntry.
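
To illustrate the proposal only, a Nomad-managed service-defaults entry might look something like the sketch below. This is not implemented behavior: the nomad_managed key comes from the description above, and its exact spelling and value format are assumptions, as is the reuse of the uuid-api service name from the earlier example.

```hcl
# Sketch of a proposed (not implemented) Nomad-managed entry.
Kind     = "service-defaults"
Name     = "uuid-api"
Protocol = "http" # would mirror ingress.listener.protocol

# Hypothetical marker telling Nomad it may safely upsert this entry.
Meta = {
  nomad_managed = "true"
}
```

An entry without the marker would be treated as created outside of Nomad and left untouched.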

Certainly open to feedback!

@josegonzalez
Contributor

@shoenig that certainly seems more than fine for us at SeatGeek.

At the moment, we have to configure this and a ServiceResolver separately from registering a Nomad job during a deploy, meaning much more coordination for something that is more or less a unit of work (registering a service against Consul for service discovery). Our initial use case is exactly setting the service protocol correctly so that Consul Connect does the right thing at the proxy level, though there are other things we want to configure as we expand our Consul Connect adoption past the initial phase.

It would even be fine if the logic here was gated behind some sort of beta/technical preview advisory, as has been done with CSI or Remote Task Drivers.

@tgross
Member

tgross commented Oct 5, 2022

See also #14802 for an example of the challenges around updating the configuration.

@suikast42
Contributor

Any progress on this issue? It's really a mess to maintain the services this way if you want to introduce distributed tracing over Envoy.

hashicorp/consul#15515

@thnee

thnee commented Nov 22, 2023

For posterity, this is how to set the proxy defaults as a Terraform resource, which is what was needed to make services work with Consul API Gateway.

resource "consul_config_entry" "proxy_defaults_global" {
  kind = "proxy-defaults"
  name = "global"

  config_json = jsonencode({
    Config = {
      protocol = "http"
    }
  })
}
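
For anyone not using Terraform, the same global default can presumably be written as a plain config entry with consul config write. This is a sketch of the equivalent entry; the file name in the comment is an assumption.

```hcl
# proxy-defaults applies to every proxy; the entry must be named "global".
#   consul config write proxy-defaults.hcl   (file name is hypothetical)
Kind = "proxy-defaults"
Name = "global"

Config {
  protocol = "http"
}
```

A service-defaults entry for an individual service still takes precedence over this global default.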

@ehsannm

ehsannm commented Oct 22, 2024

Any update?
