karpenter not provisioning new nodes with proper capacity-spread label #3397
Comments
I'm seeing two interesting things in your pod describe output. Specifically:
We improved logging on this very recently: https://github.com/aws/karpenter-core/pull/192/files
This should be available in v0.24: https://github.com/aws/karpenter-core/releases/tag/v0.24.0
I don't think it's this one, given the events that we're emitting.
Not sure what you mean with this one.
I think it's likely that you're running into well-known rollout issues with topology spread (more common at large scale). This is solved by this KEP: https://github.com/denkensk/enhancements/blob/master/keps/sig-scheduling/3243-respect-pod-topology-spread-after-rolling-upgrades/README.md
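For context, a rough sketch of what that KEP adds (the matchLabelKeys field on topologySpreadConstraints), assuming a Kubernetes version where the field is available; the label selector below is a placeholder, not taken from this issue:

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: capacity-spread
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: my-app   # placeholder selector
    matchLabelKeys:
      - pod-template-hash   # skew is evaluated per ReplicaSet, so old pods don't distort the spread during a rollout

With this, the spread is computed per rollout revision rather than across old and new pods together.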
Thank you for your reply. To explain my doubts:
After upgrading Karpenter to v0.24.0 we were able to reproduce the problem on another deployment. The analyzed pod:

namespace1teservicependingapp-main-6569859df4-bqz67   0/2   Pending   0   7m42s

In Karpenter's logs we noticed:

incompatible with provisioner "namespace1-karpenternamespace1teservice-dk5b2mvusd-spot-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:10 2:20 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [1]);
incompatible with provisioner "namespace1-karpenternamespace1teservice-dk5b2mvusd-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:10 2:20 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [2]);

Looking at this part of the log (if I understand it correctly), "key=capacity-spread (counts = map[1:10 2:20 3:0 4:0 5:0])", while the nodeSelector is defined as:

nodeSelector:
  tech.nodegroup/name: karpenternamespace1teservice
  tech.nodegroup/namespace: namespace1

it is interesting that the number of nodes filtered by the label tech.nodegroup/name=karpenternamespace1teservice:

kubectl get node -l tech.nodegroup/name=karpenternamespace1teservice

was 15, while

kubectl get node -l tech.nodegroup/namespace=namespace1,karpenter.sh/initialized=true

(karpenter.sh/initialized=true filters to Karpenter-managed nodes only) returned 30. Inside the namespace, nodes labeled with the key capacity-spread only have values from the list ["1", "2"]. I don't know if I interpret it correctly, but according to its logs Karpenter sees more nodes, and more possible values of the capacity-spread label, than it should after filtering by the nodeSelector.
We are now able to reproduce the issue in an isolated environment with two pairs of provisioners, each pair using a different list of values for the label 'capacity-spread'. Configuration below.

Two provisioners labeled tech.nodegroup/name: testbacground, with capacity-spread values from ['1', '2', '3', '4', '5']:

...
labels:
  tech.nodegroup/namespace: test
  tech.nodegroup/name: testbacground
...
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
      - spot
      - on-demand
  - key: capacity-spread
    operator: In
    values:
      - '1'
      - '2'
      - '3'
...
---
...
labels:
  tech.nodegroup/namespace: test
  tech.nodegroup/name: testbacground
...
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
      - on-demand
  - key: capacity-spread
    operator: In
    values:
      - '4'
      - '5'
...

Deployment for the first provisioner pair:

...
replicas: 5
...
nodeSelector:
  tech.nodegroup/namespace: test
  tech.nodegroup/name: testbacground
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: "capacity-spread"
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: nginx-bacground

This properly created 5 pods/nodes with capacity-spread label values from ['1', '2', '3', '4', '5'].

The second provisioner pair, labeled tech.nodegroup/name: testpending, with capacity-spread values from ['1', '2']:

labels:
  rasp.nodegroup/name: testpending
  rasp.nodegroup/namespace: test
...
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
      - spot
      - on-demand
  - key: capacity-spread
    operator: In
    values:
      - '1'
...
---
...
labels:
  rasp.nodegroup/name: testpending
  rasp.nodegroup/namespace: test
...
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
      - on-demand
  - key: capacity-spread
    operator: In
    values:
      - '2'
...

Deployment for the second provisioner pair:

...
replicas: 5
...
nodeSelector:
  rasp.nodegroup/namespace: test
  rasp.nodegroup/name: testpending
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: "capacity-spread"
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: nginx-pending

Pod status:

pod/nginx-pending-696998c96f-7gbcx   1/1   Running   0   19m
pod/nginx-pending-696998c96f-n7rnr   1/1   Running   0   19m
pod/nginx-pending-696998c96f-4wbbr   0/1   Pending   0   19m
pod/nginx-pending-696998c96f-pd7w6   0/1   Pending   0   19m
pod/nginx-pending-696998c96f-z665x   0/1   Pending   0   19m

Karpenter logs:

incompatible with provisioner "test-testpending-d2pv3ur8dm-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:1 2:1 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [2]);
incompatible with provisioner "test-testpending-d2pv3ur8dm-spot-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:1 2:1 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [1]);
Using a node selector to limit a pod to a particular provisioner won't limit the range of labels that Karpenter sees for that pod. That's why it's failing to schedule: it needs to spread across the domains evenly but is restricted to a provisioner that can't do it. We follow the standard K8s rules here (see https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#example-topologyspreadconstraints-with-nodeaffinity). To accomplish what you're trying to do, you'll need to instead add a required node affinity to the pod that limits it to only scheduling across the capacities that you want it to schedule against. The topology spread will then only be calculated against those capacities.
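For illustration, a minimal sketch of such a required node affinity for the testpending workload, restricting it to the capacity-spread values its two provisioners actually offer ('1' and '2'); treat this as an example, not a drop-in spec:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: capacity-spread
              operator: In
              values:        # limits the topology domains the pod may use
                - "1"
                - "2"

With this restriction on the pod itself, both Karpenter and kube-scheduler compute the spread only across those two domains.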
Fair, but as far as I understand, the kube-scheduler implements a similar filtering mechanism. And yes, we do not set nodeAffinity, but we have configured taints/tolerations that limit the schedulable nodes (even before the scoring step), so in the end, IMHO, Karpenter should look at the label values only of schedulable nodes, as the kube-scheduler does.
Hi, when I look at this doc: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#interaction-with-node-affinity-and-node-selectors, it says the scheduler skips non-matching nodes from the skew calculations when the pod has a nodeSelector or node affinity defined.
Karpenter does honor that for nodes that exist. This case is about nodes that don't exist yet but could be created, where custom topology domains are being provided by provisioners. You can submit a feature request, but I think it would be a complicated change, as we would need to identify the subsets of provisioners that pods can schedule against and treat them as distinct topology groups.
Does the same apply to all keys/constraints, for example if we had provisioners with different sets of AZs?
It's only required if you want to limit a pod to a particular subset of topology domains. The best way to do that is a required node affinity that specifically limits that topology key, since it works for both Karpenter and kube-scheduler, for existing and future nodes, and clearly states the intent.
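The same idea applied to zones might look like the sketch below, assuming the standard topology.kubernetes.io/zone label; the zone names are placeholders, not from this issue:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - eu-west-1a   # placeholder zones the workload may spread across
                - eu-west-1b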
We've changed the pod configuration as below:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: tech.nodegroup/namespace
                operator: In
                values:
                  - test
              - key: tech.nodegroup/name
                operator: In
                values:
                  - testpending
  tolerations:
    - effect: NoSchedule
      key: dedicated
      value: test
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  topologySpreadConstraints:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: nginx-pending
      maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: nginx-pending
      maxSkew: 2
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: nginx-pending
      maxSkew: 1
      topologyKey: capacity-spread
      whenUnsatisfiable: DoNotSchedule

but we still get:

incompatible with provisioner "test-testpending-d2pv3ur8dm-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:1 2:1 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [2]);
incompatible with provisioner "test-testpending-d2pv3ur8dm-spot-on-demand", unsatisfiable topology constraint for topology spread, key=capacity-spread (counts = map[1:1 2:1 3:0 4:0 5:0], podDomains = capacity-spread Exists, nodeDomains = capacity-spread In [1]);
You need to also specify the allowed capacity-spread values in your required node affinity, not just the nodegroup labels.
Thanks for your help, it is working now. One thing: it is not optimal to have to define the list of capacity-spread values in two places (provisioner and deployment), because I need to know this list when I create a deployment.

- key: capacity-spread
  operator: In
  values:
    - "1"
    - "2"
You only need to put it on the workload if you are trying to further restrict the workload's scheduling. The set of all provisioners contains the "universe" of possible labels for capacity-spread, and in this case you are trying to restrict to a subset of those, so it must be on the workload.
I think it's dangerous to require an additional rule to convey a restriction to Karpenter that kube-scheduler understands without the additional rule. I've provided more detailed thoughts in kubernetes-sigs/karpenter#430 (comment) |
Version
Karpenter Version: v0.23.0
Kubernetes Version: v1.22.16
Expected Behavior
Karpenter will identify the unschedulable pods and provision new nodes with the proper capacity-spread label.
Actual Behavior
In our setup we have 2 provisioners:
spot only
on-demand
Both carry an additional capacity-spread requirement, and all deployments enforce that requirement. At some point during a rollout, many pods get stuck in Pending, with no indication of why Karpenter does not scale up new nodes.
Steps to Reproduce the Problem
Change the deployment namespace1controllerpendingapp-main:
in the topologySpreadConstraints entry with topologyKey: capacity-spread,
change whenUnsatisfiable from ScheduleAnyway
to DoNotSchedule (as sketched below).
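For reference, a sketch of the constraint after that change, assuming the deployment's existing label selector (the selector value here is a placeholder):

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: capacity-spread
    whenUnsatisfiable: DoNotSchedule   # previously ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: app-main   # placeholder selector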
In our case we weren't able to reproduce the issue in an isolated environment; anonymized logs from the production environment are attached. Worth mentioning: we have ~1k nodes and about 50 provisioners, and the workload is deployed via fluxcd/helm-controller rollouts, so we presume that either 1) the cluster is too large, 2) flux intervals/timeouts introduce too much noise during rollouts and subsequent rollbacks, or both.
Resource Specs and Logs
awsnodetemplate.txt
provisioner_on-demand.yaml.txt
provisioner_spot_on-demand.yaml.txt
namespace1application_deployment.yaml.txt
nodes_namespace1_list_with_labels.txt
pod_namespace1application-main-7b6c8f6b9f-9fv6r_describe.txt
karpenter_2.23.0.log.txt.tar.gz