
Karpenter doesn't wait for all the daemon sets to become Ready #6691

Closed
zip-chanko opened this issue Aug 9, 2024 · 3 comments
Labels
bug Something isn't working needs-triage Issues that need to be triaged

Comments

@zip-chanko

zip-chanko commented Aug 9, 2024

Description

Observed Behavior:
We are seeing pending pods being scheduled onto a new node before the daemon sets on that node are in the Running state. This does not happen with CAS, where the node is tainted and pods wait until all the daemon sets are ready.

Why:
We use Linkerd, and its CNI daemon set needs to update the CNI config on the node before workload pods can be injected with the sidecar proxy. Each injected pod has an initContainer that checks the CNI config, and it fails because the pending workload is scheduled to the new node before the CNI daemon set has finished updating the config.

Expected Behavior:
Karpenter should make the new node available for scheduling only after all the daemon sets on it are in the Running state.

Reproduction Steps (Please include YAML):

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  amiSelectorTerms:
  - id: ami-0bbb54e1eafc59d8c
  - name: amazon-eks-node-1.26-*
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  role: KarpenterNodeRole-dev-apse2
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: dev-apse2
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: dev-apse2
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
  limits:
    cpu: 1000
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      requirements:
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - spot
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values:
        - c
        - m
        - r
      - key: karpenter.k8s.aws/instance-generation
        operator: Gt
        values:
        - "2"
helm repo add linkerd-edge https://helm.linkerd.io/edge

helm install linkerd-crds linkerd-edge/linkerd-crds \
  -n linkerd --create-namespace

helm install linkerd-cni -n linkerd-cni --create-namespace linkerd-edge/linkerd2-cni

step certificate create root.linkerd.cluster.local ca.crt ca.key \
--profile root-ca --no-password --insecure

step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
--profile intermediate-ca --not-after 8760h --no-password --insecure \
--ca ca.crt --ca-key ca.key

helm install linkerd-control-plane \
  -n linkerd \
  --set cniEnabled=true \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  linkerd-edge/linkerd-control-plane

curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | kubectl apply -f -

kubectl get -n emojivoto deploy -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

Ref:
https://linkerd.io/2.15/tasks/install-helm/
https://linkerd.io/2.15/getting-started/

Versions:

  • Chart Version: 0.37.0
  • Kubernetes Version (kubectl version): 1.26
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@zip-chanko zip-chanko added bug Something isn't working needs-triage Issues that need to be triaged labels Aug 9, 2024
@njtran
Contributor

njtran commented Aug 12, 2024

Seems like this is something you'd want to orchestrate with startupTaints if there's specific startup behavior you want to ensure is completed before pods are scheduled. What taint was being added with Cluster Autoscaler?
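
For reference, a minimal sketch of adding a startupTaint to the NodePool from the reproduction steps above. The taint key here is a hypothetical placeholder; Karpenter applies startupTaints when the node is created but does not remove them, so a daemon set or controller on the node would need to remove the taint once the node-level setup (e.g. the CNI config) is complete.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      # Karpenter adds these taints at node creation and leaves them in place;
      # an agent on the node is expected to remove them once it is ready.
      startupTaints:
      - key: example.com/cni-not-ready   # hypothetical key, not provided by Linkerd
        value: "true"
        effect: NoSchedule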

@zip-chanko
Author

Sorry, my bad. CAS doesn't add any specific taint to wait for the daemon sets to be running, and I'm not sure why we don't hit this issue with CAS. I did a quick search on startupTaints and found a similar thread, #628, related to Cilium. I also noticed that recent versions of Cilium can add the taint node.cilium.io/agent-not-ready=true:NoExecute, so I will check with the Linkerd community whether there is a similar approach.
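
As a rough illustration of the Cilium-style approach in Karpenter terms, the same taint could be declared as a startupTaint so Karpenter applies it at node creation and the Cilium agent removes it once ready. This is only a sketch, assuming the node agent removes the taint; the Linkerd CNI does not do this today.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      startupTaints:
      # Cilium's documented taint: the Cilium agent on the node removes it
      # once it is ready, so workload pods cannot start (or keep running)
      # on the node until then.
      - key: node.cilium.io/agent-not-ready
        value: "true"
        effect: NoExecute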

@zip-chanko
Author

Closing this issue, as it seems Linkerd has a fix in linkerd/linkerd2#11699.
