Allow custom ports for services and checks when using driver address_mode #3619
Some test binaries for the brave: old 2017-02-06 4:10pm Pacific |
9239f64
to
994cc07
Compare
case <-h.doneCh:
	// already closed
default:
	close(h.doneCh)
I ran into a double close panic on the mock driver when testing, so I guarded all of these closes. Not sure why we had never hit it before...
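The guard described above can be sketched in isolation. This is a minimal illustration, not the mock driver's actual code: the `handle` and `onceHandle` types are hypothetical stand-ins, and the sketch assumes `doneCh` is only ever closed, never sent on. The select/default form matches the diff; the `sync.Once` variant is an alternative worth considering if more than one goroutine might perform the close.

```go
package main

import "sync"

// handle is a hypothetical stand-in for the mock driver's task handle.
// Assumption: doneCh is only ever closed, never sent on.
type handle struct {
	doneCh chan struct{}
}

// closeDone closes doneCh unless it is already closed. A receive on a
// closed channel succeeds immediately, so the first case detects a prior
// close. This guard is only safe when closes are not concurrent: two
// goroutines could both reach the default branch at the same time.
func (h *handle) closeDone() {
	select {
	case <-h.doneCh:
		// already closed
	default:
		close(h.doneCh)
	}
}

// onceHandle shows an alternative that stays safe under concurrent callers.
type onceHandle struct {
	doneCh chan struct{}
	once   sync.Once
}

// closeDone closes doneCh exactly once, no matter how often it is called.
func (h *onceHandle) closeDone() {
	h.once.Do(func() { close(h.doneCh) })
}
```

Calling `closeDone` twice on either type does not panic, whereas a bare `close(h.doneCh)` would panic on the second call.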
@@ -133,7 +162,8 @@ func (m *MockDriver) Start(ctx *ExecContext, task *structs.Task) (*StartResponse
}
m.logger.Printf("[DEBUG] driver.mock: starting task %q", task.Name)
go h.run()
return &StartResponse{Handle: &h}, nil

return &StartResponse{Handle: &h, Network: net}, nil
mock_driver now supports DriverNetwork! I should have added this when I added DriverNetwork originally...
allocDir *allocdir.AllocDir
vault *vaultclient.MockVaultClient
consul *consul.MockAgent
consulClient *consul.ServiceClient
Removing this use of *mockConsulServiceClient is what exposed the task runner bug.
There are still a few tests that use that mock because they inspect the order of Consul operations, which the mock makes very easy.
@@ -1079,3 +1083,44 @@ func isNomadService(id string) bool {
const prefix = nomadServicePrefix + "-executor"
return strings.HasPrefix(id, prefix)
}
Below is the core address resolution logic for both services and checks.
jobspec/parse_test.go
Outdated
@@ -580,6 +628,7 @@ func TestParse(t *testing.T) {
for _, d := range pretty.Diff(actual, tc.Result) {
t.Logf(d)
}
//t.Logf(pretty.Sprint(actual))
Remove
@@ -120,6 +122,11 @@ the script will run inside the Docker container. If your task is running in a
chroot, it will run in the chroot. Please keep this in mind when authoring check
scripts.

- `address_mode` `(string: "host")` - Same as `address_mode` on `service`.
I would add a full example for this and link it to this
Added a full section at the bottom of that page with a link
nomad/structs/structs.go
Outdated
@@ -3444,7 +3462,12 @@ func validateServices(t *Task) error {
knownServices[service.Name+service.PortLabel] = struct{}{}

if service.PortLabel != "" {
servicePorts[service.PortLabel] = append(servicePorts[service.PortLabel], service.Name)
if _, err := strconv.Atoi(service.PortLabel); service.AddressMode == "driver" && err == nil {
This is hard to read, and I am not sure the logic is correct. If AddressMode is not driver and strconv.Atoi succeeds (i.e. PortLabel is an integer), should that be allowed, or should it throw an error? If not, why even do the parsing?
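One way to make the intent of that condition explicit is to separate "is the label numeric?" from "is a numeric label allowed here?". The sketch below is one possible answer to the question, not the code that was merged: `validatePortLabel` and `knownLabels` are hypothetical names, and it assumes a numeric port is only acceptable when address_mode is driver, while every other label must be declared in the task's network resources.

```go
package main

import (
	"fmt"
	"strconv"
)

// validatePortLabel is a hypothetical helper, not Nomad's validateServices.
// Assumptions: a numeric port label is valid only with address mode
// "driver", and knownLabels holds the port labels declared in the task's
// network resources.
func validatePortLabel(addressMode, portLabel string, knownLabels map[string]bool) error {
	if portLabel == "" {
		// Nothing to validate when no port is referenced.
		return nil
	}

	_, err := strconv.Atoi(portLabel)
	isNumeric := err == nil

	switch {
	case isNumeric && addressMode == "driver":
		// Literal port inside the driver's network; no label lookup needed.
		return nil
	case isNumeric:
		return fmt.Errorf("port %q is numeric, which requires address mode \"driver\" (got %q)", portLabel, addressMode)
	case !knownLabels[portLabel]:
		return fmt.Errorf("port label %q is not defined in the task's network resources", portLabel)
	}
	return nil
}
```

With the three outcomes spelled out as separate cases, a reader can see at a glance which combinations are rejected and why.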
I have set up a test Nomad job but am not getting working health checks:
and port. `driver` advertises the IP used in the driver (e.g. Docker's
internal IP) and uses the container's port specified in the port map. The
default is `auto` which behaves the same as `host` unless the driver
determines its IP should be used. This setting supported Docker since Nomad
Supported in Docker
default is `auto` which behaves the same as `host` unless the driver
determines its IP should be used. This setting supported Docker since Nomad
0.6 and rkt since Nomad 0.7. It will advertise the container IP if a network
plugin is used (e.g. weave).
I would make each valid option a sub-item, or use a table with option/behavior as headings and one or two lines explaining each. I would also start with the default option, auto.
command/agent/consul/client.go
Outdated
if err != nil {
	return "", 0, fmt.Errorf("invalid port %q: %v", portLabel, err)
}
Should this have a check for port > 0? If a job file misconfiguration leads to someone specifying 0 as the port label, we should not try to use it and should fail fast.
Had some questions and small suggestions to the documentation.
4c41676
to
c68ed09
Compare
@thetooth Thanks for testing! Fixed in f4e341e Since I can see that it's a bit confusing to have Here's another build: |
@schmichael Some usability issues: The following configuration does not work as expected. Since I am not using a port map I tried a deploy without a port label in resources,
The error for this was
Which caused
This worked, and I have working service checks to the Docker network (weave mesh). However, as I hinted at in my earlier comment, since the port label is used, Docker has been instructed to publish those ports to localhost:
determines its IP should be used. This setting is supported in Docker since
Nomad 0.6 and rkt since Nomad 0.7. Nomad will advertise the container IP if a
network plugin is used (e.g. weave). See [below for
details.](#using-driver-address-mode)
Did you see the other comment I had about making each of the modes sub list headers for clarity?
command/agent/consul/client.go
Outdated
if err != nil {
	return "", 0, fmt.Errorf("invalid port %q: %v", portLabel, err)
}
if port == 0 {
Change to <=0
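To make the requested check concrete, here is a rough sketch of how driver-mode port resolution with a `<= 0` guard might look. It is not the code under review: `driverNetwork` and `resolveDriverAddress` are hypothetical names, and the lookup order (port map label first, then a literal number) is an assumption based on the PR description.

```go
package main

import (
	"fmt"
	"strconv"
)

// driverNetwork is a hypothetical, simplified view of the network details
// a driver reports for a task.
type driverNetwork struct {
	IP      string
	PortMap map[string]int
}

// resolveDriverAddress sketches address resolution for address mode
// "driver". Assumption: a label found in the driver's port map wins;
// otherwise the label is parsed as a literal port number.
func resolveDriverAddress(portLabel string, net *driverNetwork) (string, int, error) {
	if net == nil {
		return "", 0, fmt.Errorf("driver address mode requested but the driver reported no network")
	}

	if port, ok := net.PortMap[portLabel]; ok {
		return net.IP, port, nil
	}

	port, err := strconv.Atoi(portLabel)
	if err != nil {
		return "", 0, fmt.Errorf("invalid port %q: %v", portLabel, err)
	}
	// Fail fast on zero and negative ports rather than advertising them.
	if port <= 0 {
		return "", 0, fmt.Errorf("invalid port %q: port must be >0", portLabel)
	}
	return net.IP, port, nil
}
```

Checking `<= 0` rather than `== 0` also rejects labels such as "-1", which `strconv.Atoi` parses without error.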
Two small comments, LGTM otherwise
Fixes #3380 Adds address_mode to checks (but no auto) and allows services and checks to set literal port numbers when using address_mode=driver. This allows SDNs, overlays, etc to advertise internal and host addresses as well as do checks against either.
Rely less on the mockConsulServiceClient because the real consul.ServiceClient needs all the testing it can get!
Previously if only an interpolated variable used in a service or check was changed we interpolated the old and new services and checks with the new variable, so nothing appeared to have changed.
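The interpolation bug described above can be reproduced in miniature. The following sketch is illustrative only: the `service` type, `interpolate`, and both comparison functions are hypothetical and far simpler than the task runner's real code. It shows why interpolating the old and new definitions with the same (new) environment hides a change that lives only in the variable's value.

```go
package main

import "strings"

// service is a hypothetical, minimal service definition.
type service struct {
	Name      string
	PortLabel string
}

// interpolate replaces ${key} placeholders with values from env.
func interpolate(s service, env map[string]string) service {
	pairs := make([]string, 0, len(env)*2)
	for k, v := range env {
		pairs = append(pairs, "${"+k+"}", v)
	}
	r := strings.NewReplacer(pairs...)
	return service{Name: r.Replace(s.Name), PortLabel: r.Replace(s.PortLabel)}
}

// changedBuggy mirrors the old behavior: both definitions are interpolated
// with the new environment, so a change in the variable alone is invisible.
// oldEnv is deliberately ignored here.
func changedBuggy(old, updated service, oldEnv, newEnv map[string]string) bool {
	return interpolate(old, newEnv) != interpolate(updated, newEnv)
}

// changedFixed interpolates each definition with its own environment.
func changedFixed(old, updated service, oldEnv, newEnv map[string]string) bool {
	return interpolate(old, oldEnv) != interpolate(updated, newEnv)
}
```

For a service named `web-${version}` whose definition is unchanged while `version` moves from `1` to `2`, `changedBuggy` reports no change and `changedFixed` reports one.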
Also skip getting an address for script checks which don't use them. Fixed a weird invalid reserved port in a TaskRunner test helper as well as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause other tests to fail, but we were referencing an invalid PortLabel and just not catching it before.
Fixes #3620 Previously we concatenated tags into task service IDs. This could break deregistration when tag names contained double slashes (//), like some Fabio tags. This change breaks service ID backward compatibility, so on upgrade all users' services and checks will be removed and re-added with new IDs. This change has the side effect of including all service fields in the ID's hash, so we no longer have to track PortLabel and AddressMode changes independently.
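As a rough illustration of the difference between the two ID schemes, the sketch below contrasts a concatenated ID with a hashed one. Everything here is an assumption made for illustration: the `service` type, the function names, the ID prefix passed in, the choice of SHA-1, and the field encoding are not taken from the actual change.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// service is a hypothetical, minimal service definition.
type service struct {
	Name        string
	PortLabel   string
	AddressMode string
	Tags        []string
}

// concatID sketches the old approach: tags are appended to the ID verbatim,
// so characters such as "//" in a tag end up inside the ID.
func concatID(prefix string, s service) string {
	id := prefix + "-" + s.Name
	for _, t := range s.Tags {
		id += "-" + t
	}
	return id
}

// hashedID sketches the new approach: every field is fed into a hash, so
// the ID contains only hex digits and changes whenever any field changes.
// Each field is length-prefixed so that adjacent fields cannot collide.
func hashedID(prefix string, s service) string {
	h := sha1.New()
	fields := append([]string{s.Name, s.PortLabel, s.AddressMode}, s.Tags...)
	for _, f := range fields {
		fmt.Fprintf(h, "%d:%s;", len(f), f)
	}
	return prefix + "-" + hex.EncodeToString(h.Sum(nil))
}
```

Because PortLabel and AddressMode feed the hash, a change to either yields a new ID, which is the side effect the commit message mentions.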
a512a8c
to
afd5bca
Compare
@thetooth Thanks for testing. The However, the error you're receiving is entirely unhelpful! The fix is to specify |
Not sure why
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
task_runner.go changes
Fixes a bug found when using fewer mocks in TaskRunner tests: changing only an interpolated variable used in a service or a check wouldn't actually update the service or check.