
Merge pull request #1452 from hashicorp/d-bootstrap
Bootstrapping a production cluster documentation
dadgar authored Jul 25, 2016
2 parents c58d23a + 65f6481 commit e8356bc
Showing 2 changed files with 205 additions and 91 deletions.
1 change: 1 addition & 0 deletions website/source/docs/agent/config.html.md
@@ -431,6 +431,7 @@ configured on server nodes.
  task specifies a `kill_timeout` greater than `max_kill_timeout`,
  `max_kill_timeout` is used. This is to prevent a user from setting an
  unreasonable timeout. If unset, a default is used.
<a id="reserved"></a>
* `reserved`: `reserved` is used to reserve a portion of the node's resources
  from being used by Nomad when placing tasks. It can be used to target
  a certain capacity usage for the node. For example, 20% of the node's CPU
295 changes: 204 additions & 91 deletions website/source/docs/cluster/bootstrapping.html.md
@@ -8,134 +8,190 @@ description: |-

# Creating a cluster

Nomad models infrastructure as regions and datacenters. Regions may contain
multiple datacenters. Servers are assigned to regions; they manage all state for
the region and make scheduling decisions within it. Clients are registered to a
single datacenter and region.

[![Regional Architecture](/assets/images/nomad-architecture-region.png)](/assets/images/nomad-architecture-region.png)

This page will explain how to bootstrap a production-grade Nomad region, both
with and without Consul, and how to federate multiple regions together.

[![Global Architecture](/assets/images/nomad-architecture-global.png)](/assets/images/nomad-architecture-global.png)

For more details on the architecture of Nomad and how it models infrastructure,
see the [Architecture page](/docs/internals/architecture.html).

## Deploying Consul Clusters

A Nomad cluster gains the ability to bootstrap itself as well as provide service
and health check registration to applications when Consul is deployed alongside
Nomad.
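
For reference, the snippet below is a minimal sketch of how a job's task could
register a service and health check with Consul once this integration is in
place; the service name, port label, and check path are illustrative assumptions
rather than values prescribed by this guide.

```
# Inside a task stanza of a job file (name, port label and path are hypothetical).
service {
    name = "web"
    port = "http"

    check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
    }
}
```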

Bootstrapping a Nomad cluster becomes significantly easier if operators use
Consul. The network topology of a Consul cluster is slightly different from
Nomad's: Consul models infrastructure as datacenters, and multiple Consul
datacenters can be connected over the WAN so that clients can discover nodes in
other datacenters. Since Nomad regions can encapsulate many datacenters, we
recommend running a Consul cluster in every Nomad datacenter and connecting them
over the WAN. Please refer to the Consul guides for both
[bootstrapping](https://www.consul.io/docs/guides/bootstrapping.html) a single datacenter and
[connecting multiple Consul clusters over the
WAN](https://www.consul.io/docs/guides/datacenters.html).

## Bootstrapping a Nomad cluster

Nomad supports merging multiple configuration files together on startup. This
makes it possible to define a base configuration that is shared by Nomad servers
and clients. A suggested base configuration is:

```
# Name the region, if omitted, the default "global" region will be used.
region = "europe"

# Persist data to a location that will survive a machine reboot.
data_dir = "/opt/nomad/"

# Bind to all addresses so that the Nomad agent is available both on loopback
# and externally.
bind_addr = "0.0.0.0"

# Advertise an accessible IP address so the server is reachable by other servers
# and clients. The IPs can be materialized by Terraform or be replaced by an
# init script.
advertise {
    http = "${self.ipv4_address}:4646"
    rpc  = "${self.ipv4_address}:4647"
    serf = "${self.ipv4_address}:4648"
}

# Ship metrics to monitor the health of the cluster and to see task resource
# usage.
telemetry {
    statsite_address = "${var.statsite}"
    disable_hostname = true
}

# Enable debug endpoints.
enable_debug = true
```

### With Consul

If a local Consul cluster is bootstrapped before Nomad, on startup the Nomad
servers will register with Consul and discover the other servers. With their set
of peers, they will automatically form a quorum, respecting the
`bootstrap_expect` field. Thus, to form a three-server region, the below
configuration can be used in conjunction with the base config:

```
server {
    enabled          = true
    bootstrap_expect = 3
}
```

And an equally simple configuration can be used for clients:

```
# Replace with the relevant datacenter.
datacenter = "dc1"

client {
    enabled = true
}
```

As you can see, the above configurations make no mention of the other servers to
join or of any Consul configuration. That is because, by default, the following
is merged with the configuration file:

```
consul {
    # The address to the Consul agent.
    address = "127.0.0.1:8500"

    # The service name to register the server and client with Consul.
    server_service_name = "nomad"
    client_service_name = "nomad-client"

    # Enables automatically registering the services.
    auto_advertise = true

    # Enabling the server and client to bootstrap using Consul.
    server_auto_join = true
    client_auto_join = true
}
```

Since the `consul` block is merged by default, bootstrapping a cluster becomes
as easy as running the following on each of the three servers:

```
$ nomad agent -config base.hcl -config server.hcl
```

And on every client in the cluster, the following should be run:

```
$ nomad agent -config base.hcl -config client.hcl
```

With the above configurations and commands, the Nomad agents will automatically
register themselves with Consul and discover the other Nomad servers. If the
agent is a server, it will join the quorum; if it is a client, it will register
itself and join the cluster.

Please refer to the [Consul documentation](/docs/agent/config.html#consul_options)
for the complete set of configuration options.

### Without Consul

When bootstrapping without Consul, Nomad servers and clients must be started
knowing the address of at least one Nomad server.

To join the Nomad servers, we can either encode the address in the server
configs as follows:

```
server {
    enabled          = true
    bootstrap_expect = 3
    retry_join       = ["<known-address>"]
}
```

Alternatively, the address can be supplied after the servers have all been
started by running the [`server-join` command](/docs/commands/server-join.html)
on each server individually to cluster them. All servers can join just one other
server and then rely on the gossip protocol to discover the rest.

```
nomad server-join <known-address>
```

On the client side, the addresses of the servers are expected to be specified
via the client configuration.

```
client {
    enabled = true
    servers = ["10.10.11.2:4648", "10.10.11.3:4648", "10.10.11.4:4648"]
}
```

If servers are added or removed from the cluster, the information will be pushed
to the client. This means that only one server needs to be specified, because
after the initial contact the full set of servers in the client's region will be
pushed to the client.
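
As a minimal sketch of that behavior, a client could be started with just one
seed server in its configuration (the address below is illustrative) and still
learn about the full server set:

```
client {
    enabled = true

    # A single reachable server is sufficient; the remaining servers in the
    # region are discovered after the first contact.
    servers = ["10.10.11.2:4648"]
}
```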

The same commands can be used to start the servers and clients as shown in the
bootstrapping with Consul section.

### Federating a cluster

@@ -152,3 +208,60 @@ nomad server-join 10.10.11.8:4648

Servers across regions discover other servers in the cluster via the gossip
protocol, and hence it is enough to join one known server.

If the Consul clusters in the different Nomad regions are federated, and Consul
`server_auto_join` is enabled, then federation occurs automatically.

## Network Topology

### Nomad Servers

Nomad servers are expected to have sub-10-millisecond network latencies between
each other to ensure liveness and high-throughput scheduling. Nomad servers can
be spread across multiple datacenters if they have low-latency connections
between them to achieve high availability.

For example, on AWS every region comprises multiple zones which have very low
latency links between them, so every zone can be modeled as a Nomad datacenter,
and every zone can run a single Nomad server; these servers can then be
connected to form a quorum and a region.
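
As a sketch of that layout, each zone's server could share the region name while
reporting its own datacenter; the region and datacenter names below are
assumptions for an AWS-style deployment, not required values:

```
# Server running in one availability zone of a hypothetical "us-east-1" region.
region     = "us-east-1"
datacenter = "us-east-1a"

server {
    enabled          = true
    bootstrap_expect = 3
}
```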

Nomad servers use Raft for state replication, and Raft, being highly consistent,
needs a quorum of servers to function; therefore we recommend running an odd
number of Nomad servers in a region, usually three to five. A cluster of three
servers can withstand the failure of one server, and a cluster of five servers
can withstand two failures. Adding more servers to the quorum increases the time
needed to replicate state and hence decreases throughput, so we don't recommend
having more than seven servers in a region.

### Nomad Clients

Nomad clients do not have the same latency requirements as servers since they do
not participate in Raft. Thus clients can have 100+ millisecond latency to their
servers. This allows a single set of Nomad servers to service clients spread
geographically over a continent, or even the world in the case of a single
"global" region with many datacenters.

## Production Considerations

### Nomad Servers

Depending on the number of jobs the cluster will be managing and the rate at
which jobs are submitted, the Nomad servers may need to be run on large machine
instances. We suggest having 8+ cores, 32 GB+ of memory, 80 GB+ of disk and
significant network bandwidth. The core count and network recommendations are to
ensure high throughput, as Nomad relies heavily on network communication and the
servers are managing all the nodes in the region and performing scheduling. The
memory and disk requirements are due to the fact that Nomad stores all state in
memory and will store two snapshots of this data onto disk. Thus disk should be
at least two times the memory available to the server when deploying a high load
cluster (for example, a server with 32 GB of memory should have at least 64 GB
of disk).

### Nomad Clients

Nomad clients support reserving resources on the node that should not be used by
Nomad. This should be used to target a specific resource utilization per node
and to reserve resources for applications running outside of Nomad's
supervision, such as Consul and the operating system itself.
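
As a rough sketch, such reservations are configured in the client block; the
values below are placeholders and should be sized to the node and its
co-located processes:

```
client {
    enabled = true

    reserved {
        # Reserve CPU (MHz), memory (MB) and disk (MB) for Consul and the OS.
        cpu    = 500
        memory = 512
        disk   = 1024

        # Ports that Nomad should not hand out to tasks.
        reserved_ports = "22,8500-8600"
    }
}
```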

Please see the [`reserved` config](/docs/agent/config.html#reserved) for more detail.
