
Karpenter doesn't wait for all the daemon sets to become Ready #6691

Closed
zip-chanko opened this issue Aug 9, 2024 · 3 comments
Labels
bug Something isn't working needs-triage Issues that need to be triaged

Comments

@zip-chanko

zip-chanko commented Aug 9, 2024

Description

Observed Behavior:
We are seeing pending pods being scheduled onto a new node before the daemon sets on that node are in the Running state. This does not happen with CAS, where the node is tainted and pods wait until all the daemon sets are ready.

Why:
We use Linkerd, and its CNI daemon set needs to update the CNI config on the node before workload pods can be injected with the sidecar proxy. Each injected pod has an initContainer that checks the CNI config, and it fails because the pending workload is scheduled to the new node before the CNI daemon set has finished updating the config.

Expected Behavior:
Karpenter should make the new node available for scheduling only after all the daemon sets on it are in the Running state.

Reproduction Steps (Please include YAML):

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  amiSelectorTerms:
  - id: ami-0bbb54e1eafc59d8c
  - name: amazon-eks-node-1.26-*
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  role: KarpenterNodeRole-dev-apse2
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: dev-apse2
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: dev-apse2
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
  limits:
    cpu: 1000
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      requirements:
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - spot
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values:
        - c
        - m
        - r
      - key: karpenter.k8s.aws/instance-generation
        operator: Gt
        values:
        - "2"
helm repo add linkerd-edge https://helm.linkerd.io/edge

helm install linkerd-crds linkerd-edge/linkerd-crds \
  -n linkerd --create-namespace

helm install linkerd-cni -n linkerd-cni --create-namespace linkerd-edge/linkerd2-cni

step certificate create root.linkerd.cluster.local ca.crt ca.key \
--profile root-ca --no-password --insecure

step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
--profile intermediate-ca --not-after 8760h --no-password --insecure \
--ca ca.crt --ca-key ca.key

helm install linkerd-control-plane \
  -n linkerd \
  --set cniEnabled=true \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  linkerd-edge/linkerd-control-plane

curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | kubectl apply -f -

kubectl get -n emojivoto deploy -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

Ref:
https://linkerd.io/2.15/tasks/install-helm/
https://linkerd.io/2.15/getting-started/

Versions:

  • Chart Version: 0.37.0
  • Kubernetes Version (kubectl version): 1.26
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@zip-chanko zip-chanko added bug Something isn't working needs-triage Issues that need to be triaged labels Aug 9, 2024
@njtran
Contributor

njtran commented Aug 12, 2024

Seems like this is something you'd want to orchestrate with startupTaints if there's specific startup behavior you want to ensure is completed before pods are scheduled. What taint was being added with Cluster Autoscaler?
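
For reference, a minimal sketch of adding a startupTaint to the NodePool from the reproduction steps above. The taint key here is a hypothetical placeholder; Karpenter applies startupTaints when the node is created but does not remove them, so a daemon set or controller on the node would need to remove the taint once the node-level setup (e.g. the CNI config) is complete.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      # Karpenter adds these taints at node creation and leaves them in place;
      # an agent on the node is expected to remove them once it is ready.
      startupTaints:
      - key: example.com/cni-not-ready   # hypothetical key, not provided by Linkerd
        value: "true"
        effect: NoSchedule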

@zip-chanko
Author

Sorry, my bad. CAS doesn't add any specific taint to wait for the daemon sets to be running, and I'm not sure why we don't hit this issue with CAS. I did a quick search on startupTaints and found a similar thread, #628, related to Cilium. I also noticed that recent versions of Cilium can add the taint node.cilium.io/agent-not-ready=true:NoExecute, so I will check with the Linkerd community whether there is a similar approach.
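
As a rough illustration of the Cilium-style approach in Karpenter terms, the same taint could be declared as a startupTaint so Karpenter applies it at node creation and the Cilium agent removes it once ready. This is only a sketch, assuming the node agent removes the taint; the Linkerd CNI does not do this today.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      startupTaints:
      # Cilium's documented taint: the Cilium agent on the node removes it
      # once it is ready, so workload pods cannot start (or keep running)
      # on the node until then.
      - key: node.cilium.io/agent-not-ready
        value: "true"
        effect: NoExecute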

@zip-chanko
Author

Closing this issue, as it seems Linkerd has a fix in linkerd/linkerd2#11699.
