Add docs for Envoy shutdown manager
Signed-off-by: Steve Sloka <[email protected]>
stevesloka committed Feb 13, 2020
1 parent 81f421a commit fd372ff
Showing 2 changed files with 75 additions and 0 deletions.
2 changes: 2 additions & 0 deletions site/_data/master-toc.yml
@@ -23,6 +23,8 @@ toc:
link: /resources/upgrading
- page: Enabling TLS between Envoy and Contour
url: /grpc-tls-howto
- page: Envoy Shutdown Manager
url: /shutdown-manager
- title: Guides
subfolderitems:
- page: Cert-Manager
73 changes: 73 additions & 0 deletions site/docs/master/shutdown-manager.md
@@ -0,0 +1,73 @@
# Envoy Shutdown Manager

The Envoy process, the data path component of Contour, occasionally needs to be redeployed.
This could be due to an upgrade, a change in configuration, or a node failure forcing a redeployment.

When implementing this rollout, the following steps should be taken:

1. Stop Envoy from accepting new connections
2. Start draining existing connections in Envoy by sending a `POST` request to the `/healthcheck/fail` endpoint
3. Wait for connections to drain before allowing Kubernetes to `SIGTERM` the pod

## Overview

Contour implements a new `envoy` sub-command which has a `shutdown-manager` whose job is to manage a single Envoy instance's lifecycle for Kubernetes.
The `shutdown-manager` runs as a new container alongside the Envoy container in the same pod.
It exposes two HTTP endpoints: one serves the `livenessProbe` and the other handles the Kubernetes `preStop` event hook.

- **livenessProbe**: Used to validate that the shutdown manager is still running properly. If requests to `/healthz` fail, the container will be restarted.
- **preStop**: Used to keep the container running while waiting for Envoy to drain connections. The `/shutdown` endpoint blocks until the connections are drained.

```yaml
- name: shutdown-manager
  command:
  - /bin/contour
  args:
    - envoy
    - shutdown-manager
  image: docker.io/projectcontour/contour:master
  imagePullPolicy: Always
  lifecycle:
    preStop:
      httpGet:
        path: /shutdown
        port: 8090
        scheme: HTTP
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8090
    initialDelaySeconds: 3
    periodSeconds: 10
```
The Envoy container also needs some configuration to work with the shutdown manager.
First, its `preStop` hook is configured to call the `/shutdown` endpoint, which blocks the container from exiting until the connections are drained.
Second, the pod's `terminationGracePeriodSeconds` is raised to extend the time Kubernetes allows the pod to remain in the `Terminating` state.
If the connections have not drained to the configured amount by the time `terminationGracePeriodSeconds` elapses, Kubernetes forcibly kills the pod's containers with `SIGKILL`.
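
The Envoy side of this configuration could look roughly like the fragment below. This is a sketch rather than the shipped manifest: the container name `envoy` and the `300` second grace period are assumptions chosen for illustration, and the rest of the Envoy container spec is omitted. Because an `httpGet` hook without a `host` targets the pod IP, the request reaches the shutdown manager listening on port 8090 in the same pod. The grace period should be longer than `check-delay` plus the expected drain time; the Kubernetes default of 30 seconds is shorter than the default 60 second `check-delay`.

```yaml
spec:
  # Assumed value: must exceed check-delay plus the time needed to drain.
  terminationGracePeriodSeconds: 300
  containers:
  - name: envoy            # assumed container name
    # image, args, and ports omitted for brevity
    lifecycle:
      preStop:
        httpGet:
          path: /shutdown  # served by the shutdown-manager container
          port: 8090
          scheme: HTTP
```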

### Shutdown Manager Config Options

The shutdown manager has a set of arguments that can be passed to change how it behaves:

- **check-interval:** [duration] Interval at which to poll Envoy for open connections.
  - (Default 5s)
- **check-delay:** [duration] Time to wait before polling Envoy for open connections.
  - (Default 60s)
- **min-open-connections:** [int] Minimum number of open connections when polling Envoy.
  - (Default 0)
- **serve-port:** [int] Port to serve the HTTP server on.
  - (Default 8090)
- **prometheus-path:** [string] The path to query Envoy's Prometheus HTTP endpoint.
  - (Default "/stats/prometheus")
- **prometheus-stat:** [string] Prometheus stat to query.
  - (Default "envoy_http_downstream_cx_active")
- **prometheus-values:** [string array] Prometheus values to look for in prometheus-stat.
  - (Default ["ingress_http", "ingress_https"])
- **envoy-host:** [string] Host of Envoy's stats page.
  - (Default "localhost")
- **envoy-port:** [int] HTTP port for Envoy's stats page.
  - (Default 9001)
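
As an illustration, the options above might be overridden in the shutdown manager container's `args`. The sketch assumes each option is passed as a long-form `--name=value` flag, which this page does not state explicitly, and the values shown are arbitrary examples rather than recommendations.

```yaml
- name: shutdown-manager
  command:
  - /bin/contour
  args:
    - envoy
    - shutdown-manager
    # Assumed flag syntax; example values only.
    - --check-interval=2s          # poll Envoy every 2 seconds
    - --check-delay=30s            # wait 30 seconds before the first poll
    - --min-open-connections=10    # connection count used when polling Envoy
```

If `check-delay` is changed, the pod's `terminationGracePeriodSeconds` should still leave enough time for the delay plus the drain itself.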


