Clean up the HA docs
Ole Markus With committed Jun 17, 2020
1 parent 8c3b4e4 commit 62375e6
Showing 1 changed file with 14 additions and 69 deletions.
83 changes: 14 additions & 69 deletions docs/operations/high_availability.md
@@ -1,96 +1,41 @@
High Availability (HA)
======================
# High Availability (HA)

Introduction
-------------
## Introduction

Kubernetes has two strategies for high availability:
For testing purposes, kubernetes works just fine with a single master. However, when the master becomes unavailable, for example due to an upgrade or instance failure, the kubernetes API will be unavailable. Pods and services that are running on the nodes continue to operate as long as they do not depend on interacting with the API, but operations such as adding nodes, scaling pods, and replacing terminated pods will not work. Running kubectl will also not work.

* Run multiple independent clusters and combine them behind one management plane: [federation](https://kubernetes.io/docs/user-guide/federation/)
* Run a single cluster in multiple cloud zones, with redundant components
kops runs each master in a dedicated autoscaling group (ASG) and stores data on EBS volumes. That way, if a master node is terminated, the ASG will launch a new master instance with the master's volume. Because of the dedicated EBS volumes, each master is bound to a fixed Availability Zone (AZ). If the AZ becomes unavailable, the master instance in that AZ will also become unavailable.

kops has good support for a cluster that runs
with redundant components. kops is able to create multiple kubernetes masters, so in the event of
a master instance failure, the kubernetes API will continue to operate.
For production use, you therefore want to run kubernetes in a HA setup with multiple masters. With multiple master nodes, you will be able to do graceful, zero-downtime upgrades and to survive AZ failures.

However, when running kubernetes with a single master, if the master fails, the kubernetes API will be unavailable, but pods and services that are running on the (unaffected) nodes should continue to operate. In this situation, we won't be able to do anything that involves the API (adding nodes, scaling pods, replacing
terminated pods), and kubectl won't work. However your application should continue to run, and most applications
could probably tolerate an API outage of an hour or more.
If you already have a single-master cluster you would like to convert to a multi-master cluster, read the [single to multi-master](../single-to-multi-master.md) docs.

Moreover, kops runs the masters in an automatic replacement mode. Masters are run in auto-scaling groups, with
the data on an EBS volume. If a master node is terminated, the ASG will launch a new master instance, and kops
will mount the master volume and replace the master.

In short:
## Creating a HA cluster

* A single master kops cluster is still reasonably available; if the master instance terminates it will be automatically
replaced. But the use of EBS binds us to a single AZ, and in the event of a prolonged AZ outage, we might experience
downtime.
* A multi-node kops cluster can tolerate the outage of a single AZ
### Example 1: public topology


Using Kops HA
-------------

We can create HA clusters using kops, but it's important to note that migrating from a single-master
cluster to a multi-master cluster is a complicated operation (described [here](../single-to-multi-master.md)).
If possible, try to plan this at time of cluster creation.

When you first call `kops create cluster`, you specify the `--master-zones` flag listing the zones you want your masters
to run in, for example:
The simplest way to get started with a HA cluster is to run `kops create cluster` as shown below. The `--master-zones` flag lists the zones you want your masters
to run in. By default, kops will create one master per AZ. Since the kubernetes etcd cluster runs on the master nodes, you have to specify an odd number of zones in order to obtain quorum.

```
kops create cluster \
--node-count 3 \
--zones us-west-2a,us-west-2b,us-west-2c \
--master-zones us-west-2a,us-west-2b,us-west-2c \
--node-size t2.medium \
--master-size t2.medium \
--topology private \
--networking kopeio-vxlan \
hacluster.example.com
```
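As a minimal sketch of the odd-number rule, the check below counts the entries in a `--master-zones` value before it is passed to `kops create cluster`. The helper names `count_zones` and `has_odd_zone_count` are hypothetical and not part of kops; the sketch only inspects the string and assumes zone names contain no glob characters.

```shell
#!/bin/sh
# Hypothetical helpers (not part of kops) that sanity-check a
# comma-separated zone list before it is passed to --master-zones.

# Print the number of entries in a comma-separated list.
# The function body runs in a subshell so the IFS change and
# the disabled globbing do not leak into the caller.
count_zones() (
  IFS=,
  set -f
  # Intentionally unquoted: split $1 on commas into positional parameters.
  set -- $1
  echo "$#"
)

# Succeed (exit status 0) only when the list has an odd number of entries.
has_odd_zone_count() {
  [ $(( $(count_zones "$1") % 2 )) -eq 1 ]
}

MASTER_ZONES="us-west-2a,us-west-2b,us-west-2c"
if has_odd_zone_count "$MASTER_ZONES"; then
  echo "ok: $(count_zones "$MASTER_ZONES") master zones"
else
  echo "warning: even or empty master zone list: $MASTER_ZONES" >&2
fi
```

With one master per zone, an odd zone count gives an odd number of etcd members, which is what the quorum requirement calls for.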

Kubernetes relies on a key-value store called "etcd", which uses a quorum approach to consistency,
so it is available only while a majority (more than half) of its members are available.

As a result there are a few considerations that need to be taken into account when using kops with HA:

* Only an odd number of master instances should be created, as an even number is likely _less_ reliable than the next lower odd number.
* Kops has experimental support for running multiple masters in the same AZ, but it should be used carefully.
If we create 2 (or more) masters in the same AZ, then failure of the AZ will likely cause etcd to lose quorum
and stop operating (with 3 nodes). Running in the same AZ therefore increases the risk of cluster disruption,
though it can be a valid scenario, particularly if combined with [federation](https://kubernetes.io/docs/user-guide/federation/).
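The quorum arithmetic behind these considerations can be sketched as follows. This assumes the usual majority rule for etcd, where a cluster of n members needs floor(n/2) + 1 of them available; the function names `quorum` and `tolerated_failures` are illustrative only.

```shell
#!/bin/sh
# Majority quorum for a cluster of n members: floor(n/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still available.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

# Print the figures for small cluster sizes.
for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 members raises the quorum from 2 to 3 but leaves the number of tolerated failures at 1, so the extra member adds another instance that can fail without adding any fault tolerance. The same arithmetic shows why placing 2 of 3 masters in one AZ is risky: losing that AZ leaves 1 member, which is below the quorum of 2.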
### Example 2: private topology


Advanced Example
----------------

Another example `create cluster` invocation for HA with [a private network topology](../topology.md):
Create a cluster using [private network topology](../topology.md):

```
kops create cluster \
--node-count 3 \
--zones us-west-2a,us-west-2b,us-west-2c \
--master-zones us-west-2a,us-west-2b,us-west-2c \
--dns-zone example.com \
--node-size t2.medium \
--master-size t2.medium \
--node-security-groups sg-12345678 \
--master-security-groups sg-12345678,i-abcd1234 \
--topology private \
--networking weave \
--cloud-labels "Team=Dev,Owner=John Doe" \
--image 293135079892/k8s-1.4-debian-jessie-amd64-hvm-ebs-2016-11-16 \
--networking cilium \
${NAME}
```

Notes (Best Practice)
----
* In regions with 2 Availability Zones, deploy the 3 masters in one zone and the nodes can be distributed between the 2
zones. This can be done by specifying the flags:
```
--master-count=3
--master-zones=$MASTER_ZONE
--zones=$NODE_ZONES
```
