
Implement topology spread constraints #8168

Closed
jcogilvie opened this issue Mar 29, 2022 · 9 comments

@jcogilvie
Contributor

jcogilvie commented Mar 29, 2022

What problem are you trying to solve?

I want a high-availability linkerd deployment that doesn't break if I want more replicas than I have nodes.

How should the problem be solved?

I noticed that the linkerd-ha values file currently uses pod anti-affinity (keyed on the node hostname) to express that Linkerd control plane pods should not be co-located on the same host. But I believe what we really mean isn't "we can't have more than one per host" so much as "they must be distributed across hosts."

I believe this desire can be more accurately represented by enabling topologySpreadConstraints instead of antiAffinity. Instead of saying "don't schedule with another like pod," we could say "spread evenly across hosts" for the same kind of availability but with more flexibility and less brittleness (e.g. it won't fail if you have a cluster of 3 nodes and want 4 linkerd controller replicas).
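
For illustration, the constraint I have in mind would look roughly like this (a sketch, not actual chart output; the `linkerd.io/control-plane-component: destination` selector is assumed purely for the example):

```yaml
# Hedged sketch: spread the replicas of one control plane component evenly
# across nodes. The label selector below is assumed for illustration.
topologySpreadConstraints:
  - maxSkew: 1                           # tolerate at most 1 pod of imbalance between hosts
    topologyKey: kubernetes.io/hostname  # spread across individual nodes
    # ScheduleAnyway keeps this a soft preference, so 4 replicas can still
    # land on a 3-node cluster; DoNotSchedule would make it a hard rule.
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        linkerd.io/control-plane-component: destination
```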

Any alternatives you've considered?

The currently implemented antiAffinity does prevent pods from stacking on the same nodes, but it suffers from robustness problems when the desired replica count is >= the number of nodes in the cluster. (For instance, in our environment we have few but very large nodes.) Under that condition, the required anti-affinity rules become unsatisfiable, and deployments (and rolling upgrades, which need to surge an extra pod onto an already-occupied node) fail.
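
The failure mode comes from the required form of the rule, which looks roughly like this (again a sketch, not the exact chart output; the label selector is assumed):

```yaml
# Hedged sketch of a hard anti-affinity rule: with this in place, a 4th
# replica on a 3-node cluster stays Pending, and a rolling upgrade that
# surges an extra pod can get stuck for the same reason.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            linkerd.io/control-plane-component: destination
        topologyKey: kubernetes.io/hostname
```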

How would users interact with this feature?

A flag in the Helm chart, parallel to enablePodAntiAffinity, called enableTopologySpread, would cause topology spread constraints to be employed.

N.b. this is only supported on Kubernetes 1.19+, so we may have to take the cluster version into consideration.
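
As a sketch of what the template change could look like (hypothetical: neither the enableTopologySpread value nor this block exists in the chart today, and the component label is assumed):

```yaml
# Hypothetical Helm template fragment gated on a proposed enableTopologySpread value.
{{- if .Values.enableTopologySpread }}
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        # label assumed for illustration
        linkerd.io/control-plane-component: destination
{{- end }}
```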

Would you like to work on this feature?

maybe

@jcogilvie
Contributor Author

Aside: this thought was prompted by this Karpenter issue. Its scheduler understands topology spread, but has trouble with anti-affinity, and pods with anti-affinity rules don't presently get scheduled.

@adleong adleong added this to the stable-2.12.0 milestone Apr 12, 2022
@adleong
Member

adleong commented Apr 12, 2022

This sounds like a great idea. topologySpreadConstraints does sound like a better fit than antiAffinity for this.

@olix0r
Member

olix0r commented Apr 12, 2022

> N.b. this is only supported on Kubernetes 1.19+, so we may have to take the cluster version into consideration.

I think the MSKV (minimum supported Kubernetes version) is already at 1.20 on edge, so this should be fine to rely on.

@kleimkuhler
Contributor

I think it makes sense to just replace antiAffinity with topologySpreadConstraints rather than introducing additional configuration to use one over the other. Our MSKV supports it, and it more accurately represents what HA should do when working with a limited number of nodes, so it should just be replaced in all the templates.

@adleong adleong self-assigned this Jun 30, 2022
@adleong adleong added the priority/triage and priority/P2 (Nice-to-have for Release) labels and removed the priority/triage label Jun 30, 2022
@adleong adleong removed their assignment Jul 1, 2022
mikutas added a commit to mikutas/linkerd2 that referenced this issue Jul 7, 2022
mikutas added a commit to mikutas/linkerd2 that referenced this issue Jul 7, 2022
@adleong
Copy link
Member

adleong commented Jul 14, 2022

In #8826 we discovered that TopologySpreadConstraints have some unfortunate downsides. In particular, the constraints are only evaluated at scheduling time and are not continuously enforced after resource creation.

Given that, I wanted to go back to the drawing board and ask some basic questions about this use case. Why is it desirable to run more control plane replicas than nodes? HA mode is generally not recommended for clusters with a small number of nodes.

@jcogilvie
Contributor Author

> Why is it desirable to run more control plane replicas than nodes?

The case where we run few, very large nodes is sometimes driven by licensing agreements for cluster software that charge per node.

But this issue was prompted by limitations (at the time) of Karpenter:

> this thought was prompted by this Karpenter issue. Its scheduler understands topology spread, but has trouble with anti-affinity, and pods with anti-affinity rules don't presently get scheduled.

That issue, combined with the economic considerations above and the fact that topologySpreadConstraints seemed like a better semantic fit for what we're actually intending to declare, is what led me to open this issue.

@adleong
Member

adleong commented Jul 14, 2022

In that situation, I would recommend not running Linkerd in HA mode. The goal of HA mode is to create a configuration that is highly resilient to node failure, and when you're running with few very large nodes, it's not really possible to provide those guarantees.

On paper, topologySpreadConstraints did sound like a better semantic fit, but after seeing the way that it's implemented, I don't actually think it's such a great fit.

@jcogilvie
Contributor Author

I think what I'm reading is that topologySpreadConstraints and antiAffinity aren't mutually exclusive. Why not add chart support for both and let customers decide how they fit together?
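
As an illustration that they can coexist, a pod spec can carry both a soft anti-affinity preference and a spread constraint at the same time (a sketch under assumed labels, not a proposed chart change):

```yaml
# Hedged sketch: the two mechanisms are independent scheduler inputs, so a pod
# spec can declare soft anti-affinity per host plus an even spread across zones.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              linkerd.io/control-plane-component: destination
          topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        linkerd.io/control-plane-component: destination
```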

@adleong
Member

adleong commented Jul 14, 2022

To the extent possible, I'm trying to avoid the Linkerd charts becoming a grab bag of options and to limit them to a set of configurations that we know make sense and can endorse. For users who want to blaze their own path, there is a wealth of options, including using Kustomize, forking the Linkerd charts, or crafting their own manifests some other way.

@adleong adleong closed this as completed Jul 21, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 21, 2022