Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add go-discover support to Nomad clients #4277

Merged
merged 21 commits into from
Jun 1, 2018
Merged

Conversation

chelseakomlo
Copy link
Contributor

No description provided.

@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch 3 times, most recently from d478c00 to 47ef99e Compare May 10, 2018 21:30
@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch 6 times, most recently from ebd6dd4 to 5498a3a Compare May 19, 2018 19:18
@chelseakomlo chelseakomlo changed the title [WIP] Add go-discover support to Nomad clients Add go-discover support to Nomad clients May 21, 2018
@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch 3 times, most recently from 606e3ba to 27151de Compare May 21, 2018 17:39
@chelseakomlo chelseakomlo requested review from preetapan and dadgar May 21, 2018 17:40
@@ -29,6 +30,13 @@ client {
foo = "bar"
baz = "zip"
}
server_join_info {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt

@@ -86,6 +94,12 @@ server {
redundancy_zone = "foo"
upgrade_version = "0.8.0"
encrypt = "abc"
server_join_info {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt

serverEnabled: true,
}

// This is for backwards compatibility, this should be removed in Nomad
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use // COMPAT: Remove in 0.10 and only use ServerJoin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows us to search backwards compatibility hacks

@@ -217,6 +217,9 @@ type ClientConfig struct {
// NoHostUUID disables using the host's UUID and will force generation of a
// random UUID.
NoHostUUID *bool `mapstructure:"no_host_uuid"`

// ServerJoin contains information that is used to attempt to join servers
ServerJoin *ServerJoin `mapstructure:"server_join_info"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why _info?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, would suggest calling this server_join or server_join_config maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is left over from a rename, I'll move to just server_join

@@ -29,6 +30,13 @@ client {
foo = "bar"
baz = "zip"
}
server_join_info {
retry_join = [ "1.1.1.1", "2.2.2.2" ]
start_join = [ "1.1.1.1", "2.2.2.2" ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start_join isn't supported on client?


require.Equal(1, len(output))
require.Equal(stubAddress, output[0])
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do a client and server test

Copy link
Contributor Author

@chelseakomlo chelseakomlo May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed existing tests to be server-specific, but I'll mark it in the test name as such.

@@ -90,6 +90,22 @@ client {
receive work. This may be specified as an IP address or DNS, with or without
the port. If the port is omitted, the default port of `4647` is used.

- `server_join` `(map[string]string` - Specifies the list of server information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think map[string]string is the right type? It should be its own stanza similar to: https://www.nomadproject.io/docs/agent/configuration/acl.html

@@ -148,6 +148,13 @@ server {
made before exiting with a return code of 1. By default, this is set to 0
which is interpreted as infinite retries.

- `server_join` `(map[string]string` - Specifies the list of server information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point this to the new stanza docs

@@ -131,8 +131,8 @@ server {
cluster again when starting. This flag allows the previous state to be used to
rejoin the cluster.

- `retry_join` `(array<string>: [])` - Specifies a list of server addresses to
retry joining if the first attempt fails. This is similar to
- `retry_join` `(array<string>: [])` - Specifies a list of server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document that these should be deprecated in favor of server_join

discover: &discover.Discover{},
errCh: c.retryJoinErrCh,
logger: c.agent.logger,
if config.Server.Enabled && len(config.Server.RetryJoin) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure there is verification that:

  1. On the server either, config.Server.RetryJoin/StartJoin/etc is set or the new ServerJoin is set and not both
  2. On the client that StartupJoin isn't set.

@@ -222,7 +222,7 @@ func (s *HTTPServer) updateServers(resp http.ResponseWriter, req *http.Request)

// Set the servers list into the client
s.agent.logger.Printf("[TRACE] Adding servers %+q to the client's primary server list", servers)
if err := client.SetServers(servers); err != nil {
if _, err := client.SetServers(servers); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why SetServers changed to return the number of endpoints if its not being used here where its called

Copy link
Contributor Author

@chelseakomlo chelseakomlo May 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to change this so it has the same interface as Join for servers, as we both use them as handles to RetryJoiner.

}
go joiner.RetryJoin(serverJoinInfo)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like only one of the two (either config.Server.RetryJoin or config.Server.RetryJoin) should be set? If both are set it should fail validation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in adding a Validate method for all edge cases like this.

// addresses, then the agent will error and exit.
StartJoin []string `mapstructure:"start_join"`

// RetryJoin is a list of addresses to join with retry enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention here that we can use a single value in the array to find multiple servers using go discover syntax?

@@ -217,6 +217,9 @@ type ClientConfig struct {
// NoHostUUID disables using the host's UUID and will force generation of a
// random UUID.
NoHostUUID *bool `mapstructure:"no_host_uuid"`

// ServerJoin contains information that is used to attempt to join servers
ServerJoin *ServerJoin `mapstructure:"server_join_info"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, would suggest calling this server_join or server_join_config maybe?

@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch 2 times, most recently from dbf77d0 to 7dcae2e Compare May 23, 2018 23:34
@@ -29,6 +30,12 @@ client {
foo = "bar"
baz = "zip"
}
server_join {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the indentation

@@ -1055,6 +1088,9 @@ func (a *ServerConfig) Merge(b *ServerConfig) *ServerConfig {
if b.EncryptKey != "" {
result.EncryptKey = b.EncryptKey
}
if b.ServerJoin != nil {
result.ServerJoin = b.ServerJoin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is overriding not merging. You need a merge method on the ServerJoin object

// fields and the server_join stanza are not both set
if config.Server != nil && config.Server.ServerJoin != nil {
if len(config.Server.RetryJoin) != 0 {
return fmt.Errorf("server_join and retry_join cannot both be defined; try defining only server_join")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try defining only -> prefer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

server_join and retry_join cannot both be defined; prefer setting the server_join parameter.

@@ -90,6 +90,9 @@ client {
receive work. This may be specified as an IP address or DNS, with or without
the port. If the port is omitted, the default port of `4647` is used.

- `server_join` <code>([ServerJoin][server_join]: nil)</code> - Specifies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this link clickable?

@@ -196,6 +207,17 @@ unless configured otherwise:
nomad-01.company.local => nomad-01.company.local:4648
```

#### Via the go-discover interface

As of Nomad 0.9, `retry-join` accepts a unified interface using the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.8.4 and you should document it via the new stanza

@@ -90,6 +90,9 @@ client {
receive work. This may be specified as an IP address or DNS, with or without
the port. If the port is omitted, the default port of `4647` is used.

- `server_join` <code>([ServerJoin][server_join]: nil)</code> - Specifies
configuration which is specific to retry joining Nomad servers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifies how the Nomad client will connect to Nomad servers. The start_join field is not supported on the client. The retry_join fields may directly specify the server address or use go-discover syntax for auto-discovery. See the documentation for more detail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the text I wrote

page_title: "Cloud Auto-join"
sidebar_current: "docs-agent-cloud-auto-join"
description: |-
Nomad supports automatic cluster joining using cloud metadata on various providers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on -> from various cloud providers

Nomad supports automatic cluster joining using cloud metadata on various providers.
---

# Cloud Auto-joining
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a call out somewhere that this is only a subset of the cloud providers and for the most up to date documentation see the library itself

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

page_title: "server_join Stanza - Agent Configuration"
sidebar_current: "docs-agent-configuration-server_join"
description: |-
The "server_join" stanza configures the Nomad agent to enable retry_join logic for connecting to Nomad servers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server_join stanza specifies how the Nomad agent will discover and connect to Nomad servers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below too


#### Via the go-discover interface

As of Nomad 0.9, `retry-join` accepts a unified interface using the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.8.4.

This above section looks like a copy of what is on the server page. We shouldn't duplicate. Probably remove that one?

@@ -82,6 +82,33 @@ func (r *retryJoiner) Validate(config *Config) error {
}
}

if config.Server != nil {
dur, err := time.ParseDuration(config.Server.RetryInterval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just change the struct to parse directly into the time value to avoid this?

HeartbeatGrace time.Duration `mapstructure:"heartbeat_grace"`

@@ -90,6 +90,9 @@ client {
receive work. This may be specified as an IP address or DNS, with or without
the port. If the port is omitted, the default port of `4647` is used.

- `server_join` <code>([ServerJoin][server_join]: nil)</code> - Specifies
configuration which is specific to retry joining Nomad servers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the text I wrote

@@ -131,24 +131,26 @@ server {
cluster again when starting. This flag allows the previous state to be used to
rejoin the cluster.

- `server_join` - Specifies the [Server Join][server_join] stanza for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the same syntax as the rest of the parameters. <Name of the parameter> <type> - <Text describing>

@@ -165,59 +167,6 @@ server {
on the format of the string. This field is deprecated in favor of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The address format link is broken

@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch from e377e57 to 82f0a38 Compare May 25, 2018 21:13
@@ -106,7 +106,7 @@ nomad-01.company.local => nomad-01.company.local:4648

#### Via the go-discover interface

As of Nomad 0.9, `retry-join` accepts a unified interface using the
As of Nomad 0.8.4, `retry-join` accepts a unified interface using the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retry-join -> retry_join

@chelseakomlo chelseakomlo force-pushed the f-retry-join-clients branch 2 times, most recently from 5e381ce to 1707b3c Compare May 26, 2018 00:22
@@ -70,7 +70,7 @@ func (r *retryJoiner) Validate(config *Config) error {
return fmt.Errorf("server_join and retry_max cannot both be defined; try defining only server_join")
}
if config.Server.RetryInterval != "0" {
return fmt.Errorf("server_join and retry_interval cannot both be defined; try defining only server_join")
return fmt.Errorf("server_join and retry_interval cannot both be defined; prefer setting the server_join parameter")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update all the errors to the suggested format

return fmt.Errorf("server_join and retry_interval cannot both be defined; prefer setting the server_join parameter")
if config.Server.RetryInterval != "0" && config.Server.RetryInterval != "" {
// 30s is the default value that is set, ignore if this is the case
if config.Server.RetryInterval != "30s" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this is quite odd. I wonder if there is a more elegant way to handle this.

@dadgar dadgar force-pushed the f-retry-join-clients branch from 27610f1 to 4f86d2c Compare May 30, 2018 20:46
chelseakomlo and others added 17 commits May 31, 2018 10:50
clean up styling of new pages
documentation fixes
add additional validation case
This commit:
* Improves how we combine the old retry-* fields and the new stanza and
how it is validated
* Handles the new stanza setting start_join
* Fixes integration test to not bind to the standard port and instead be
randomized.
* Simplifies parsing of the old retry_interval
* Fixes the errors from retry join being masked
* Flags get parsed into new server_join stanza
@dadgar dadgar force-pushed the f-retry-join-clients branch from f9d2bea to 92096d6 Compare May 31, 2018 17:53
$ nomad agent -retry-join "127.0.0.1:4648"
```

Note that `retry_join` can be defined for only servers as a command-line
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reword as : retry-join can be defined as a command line flag only for servers. Clients can configure retry-join only in configuration files.

@dadgar dadgar merged commit d834439 into master Jun 1, 2018
@dadgar dadgar deleted the f-retry-join-clients branch June 1, 2018 16:57
@github-actions
Copy link

github-actions bot commented Mar 3, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants