Blog: Kubernetes v1.26: Advancements in Kubernetes Traffic Engineering
Signed-off-by: Andrew Sy Kim <[email protected]>
1 parent 25fdb78, commit 61b0d45
Showing 8 changed files with 97 additions and 0 deletions.
Binary file added
BIN
+75.1 KB
...posts/2022-11-28-advancements-in-traffic-engineering/endpointslice-overview.png
Binary file added
BIN
+76.9 KB
...1-28-advancements-in-traffic-engineering/endpointslice-with-terminating-pod.png
96 changes: 96 additions & 0 deletions
content/en/blog/_posts/2022-11-28-advancements-in-traffic-engineering/index.md
@@ -0,0 +1,96 @@
---
layout: blog
title: "Kubernetes v1.26: Advancements in Kubernetes Traffic Engineering"
date: 2022-11-28
slug: advancements-in-kubernetes-traffic-engineering
---

**Authors:** Andrew Sy Kim (Google)

Kubernetes v1.26 includes significant advancements in network traffic engineering with the graduation of
two features (Service internal traffic policy support, and EndpointSlice terminating conditions) to GA,
and a third feature (Proxy terminating endpoints) to beta. The combination of these enhancements aims
to address shortcomings in traffic engineering that people face today, and unlock new capabilities for the future.

## Traffic Loss from Load Balancers During Rolling Updates

Prior to Kubernetes v1.26, clusters could experience [loss of traffic](https://github.com/kubernetes/kubernetes/issues/85643)
from Service load balancers during rolling updates when setting the `externalTrafficPolicy` field to `Local`.
There are a lot of moving parts at play here, so a quick overview of how Kubernetes manages load balancers might help!

In Kubernetes, you can create a Service with `type: LoadBalancer` to expose an application externally with a load balancer.
The load balancer implementation varies between clusters and platforms, but the Service provides a generic abstraction
representing the load balancer that is consistent across all Kubernetes installations.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app.kubernetes.io/name: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
  type: LoadBalancer
```

Under the hood, Kubernetes allocates a NodePort for the Service, which is then used by kube-proxy to provide a
network data path from the NodePort to the Pod. A controller will then add all available Nodes in the cluster
to the load balancer’s backend pool, using the designated NodePort for the Service as the backend target port.

{{< figure src="traffic-engineering-service-load-balancer.png" caption="Figure 1: Overview of Service Load Balancers" >}}

Oftentimes it is beneficial to set `externalTrafficPolicy: Local` for Services, to avoid extra hops between
Nodes that are not running healthy Pods backing that Service. When using `externalTrafficPolicy: Local`,
an additional NodePort is allocated for health checking purposes, such that Nodes that do not contain healthy
Pods are excluded from the backend pool for a load balancer.
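
As a rough illustration, the Service from the earlier example could opt in to this behavior as sketched below.
The names and ports are carried over from that example; the comment about `healthCheckNodePort` describes the
usual behavior for `type: LoadBalancer` Services and is not something this manifest sets itself.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  # Only deliver external traffic to Pods on the Node that received it,
  # instead of forwarding it to Pods on other Nodes.
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/name: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
  # spec.healthCheckNodePort is normally allocated automatically for
  # LoadBalancer Services with externalTrafficPolicy: Local, so it is
  # left unset here.
```

Once such a Service exists, `kubectl get service my-service -o jsonpath='{.spec.healthCheckNodePort}'`
should show the port that the load balancer probes on each Node.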

{{< figure src="traffic-engineering-lb-healthy.png" caption="Figure 2: Load Balancer Traffic to a Healthy Node" >}}

One scenario where traffic can be lost is when a Node loses all Pods for a Service,
but the external load balancer has not probed the health check NodePort yet. The likelihood of this situation
is largely dependent on the health checking interval configured on the load balancer. The larger the interval,
the more likely this will happen, since the load balancer will continue to send traffic to a Node
even after kube-proxy has removed forwarding rules for that Service. This also occurs when Pods start terminating
during rolling updates. Since Kubernetes does not consider terminating Pods as “Ready”, traffic can be lost
when there are only terminating Pods on any given Node during a rolling update.

{{< figure src="traffic-engineering-lb-unhealthy.png" caption="Figure 3: Load Balancer Traffic to an Unhealthy Node" >}}

Starting in Kubernetes v1.26, kube-proxy enables the `ProxyTerminatingEndpoints` feature by default, which
adds automatic failover and routing to terminating endpoints in scenarios where the traffic would otherwise
be dropped. More specifically, when there is a rolling update and a Node only contains terminating replicas,
kube-proxy will route traffic to the terminating replicas as long as their readiness probes are passing.
By doing so, kube-proxy provides the external load balancer a window of time to gracefully steer traffic
away from the Node before its next health check probe.
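
Because the feature is enabled by default in v1.26, no configuration should be required. For clusters that
set feature gates explicitly, a minimal sketch of the relevant kube-proxy configuration might look like the
following. It assumes kube-proxy is started with a `KubeProxyConfiguration` file (for example through its
`--config` flag); how that file is delivered depends on your cluster tooling.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  # Beta and on by default in v1.26; listed only to make the setting explicit.
  ProxyTerminatingEndpoints: true
```

Setting the gate to `false` in the same place would be one way to fall back to the previous behavior while debugging.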

{{< figure src="traffic-engineering-lb-proxy-terminating.png" caption="Figure 4: Load Balancer Traffic to a Node with Terminating Endpoints" >}}

### EndpointSlice Conditions

In order to support this new capability in kube-proxy, the EndpointSlice API introduced new conditions for endpoints:
`serving` and `terminating`.

{{< figure src="endpointslice-overview.png" caption="Figure 5: Overview of EndpointSlice Conditions" >}}

The `serving` condition is semantically identical to `ready`, except that it can be `true` or `false`
while a Pod is terminating, unlike `ready` which will always be `false` for terminating Pods for compatibility reasons.
The `terminating` condition is `true` for Pods undergoing termination (non-empty `deletionTimestamp`), and `false` otherwise.
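
To make the three conditions concrete, here is a sketch of what an EndpointSlice for `my-service` could look
like partway through a rolling update. The object name, Node names, and Pod IPs are hypothetical; in a real
cluster the EndpointSlice controller generates these objects for you.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc12            # hypothetical generated name
  labels:
    kubernetes.io/service-name: my-service
addressType: IPv4
ports:
  - protocol: TCP
    port: 9376
endpoints:
  # A healthy Pod that is not being deleted.
  - addresses:
      - "10.0.1.5"                  # hypothetical Pod IP
    nodeName: node-a                # hypothetical Node name
    conditions:
      ready: true
      serving: true
      terminating: false
  # A Pod that is terminating but whose readiness probe still passes:
  # ready is forced to false, while serving reflects the probe result.
  - addresses:
      - "10.0.2.7"                  # hypothetical Pod IP
    nodeName: node-b                # hypothetical Node name
    conditions:
      ready: false
      serving: true
      terminating: true
```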

The addition of these two conditions lets consumers of this API observe Pod states that previously could not be represented.
For example, Pods can have long termination grace periods with readiness probes that continue to pass during the termination phase.
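
A Pod of that kind might be declared roughly as follows. This is only a sketch: the image, probe path, and
timing values are hypothetical, and the `preStop` hook assumes the container image ships a `sleep` binary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app.kubernetes.io/name: my-app
spec:
  # Give the Pod up to two minutes to finish in-flight work after deletion.
  terminationGracePeriodSeconds: 120
  containers:
    - name: app
      image: registry.example/my-app:1.0   # hypothetical image
      ports:
        - containerPort: 9376
      readinessProbe:
        httpGet:
          path: /healthz                   # hypothetical probe endpoint
          port: 9376
        periodSeconds: 5
      lifecycle:
        preStop:
          exec:
            # Delay shutdown so the application keeps serving while
            # load balancers and proxies steer traffic away.
            command: ["sleep", "30"]
```

While a Pod like this is terminating and its probe keeps passing, its endpoint would be expected to report
`serving: true` and `terminating: true`.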

{{< figure src="endpointslice-with-terminating-pod.png" caption="Figure 6: EndpointSlice Conditions with a Terminating Pod" >}}

Consumers of the EndpointSlice API, such as kube-proxy and ingress controllers, can now use these conditions to coordinate connection draining
for terminating Pods by disallowing new connections (weight=0), but continuing to forward requests on existing connections
if the readiness probe continues to succeed.

## Optimizing Internal Node-Local Traffic

Similar to how Services can set `externalTrafficPolicy: Local` to avoid extra hops for externally sourced traffic, Kubernetes
now supports `internalTrafficPolicy: Local`, to enable the same optimization for traffic originating within the cluster.
This feature graduated to Beta in Kubernetes v1.24 and is graduating to GA in v1.26.
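
A minimal sketch of a Service using this field is shown below. The Service name is hypothetical, and the
selector and ports are reused from the earlier example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-internal-service   # hypothetical name
spec:
  type: ClusterIP
  # Only route in-cluster traffic to endpoints on the same Node as the client.
  internalTrafficPolicy: Local
  selector:
    app.kubernetes.io/name: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```

Note that with `internalTrafficPolicy: Local`, a client on a Node that has no local endpoints for the Service
gets no backend at all, so this setting tends to fit workloads that run on every Node, such as those managed by a DaemonSet.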
Binary file added
BIN
+69 KB
...22-11-28-advancements-in-traffic-engineering/traffic-engineering-lb-healthy.png
Binary file added
BIN
+61.5 KB
...dvancements-in-traffic-engineering/traffic-engineering-lb-proxy-terminating.png
Binary file added
BIN
+61.4 KB
...-11-28-advancements-in-traffic-engineering/traffic-engineering-lb-unhealthy.png
Binary file added
BIN
+55.5 KB
...vancements-in-traffic-engineering/traffic-engineering-service-load-balancer.png
1 change: 1 addition & 0 deletions
content/en/blog/_posts/2022-MM-DD-service-internal-traffic-policy.md
@@ -0,0 +1 @@
PLACEHOLDER