Observed Behavior:
Pending pods are being scheduled onto a newly created node before the daemon set pods on that node are in the Running state. This does not happen with Cluster Autoscaler (CAS), where the node is tainted and waits until all the daemon sets are ready.
Why:
We use Linkerd, and its CNI daemon set needs to update the config on the node before the workload pods are injected with the sidecar proxy. An initContainer checks the CNI config, and it fails because the pending workload is scheduled onto the new node before the CNI daemon set has made the config changes.
Expected Behavior:
Karpenter should release the new node as ready only after all the daemon sets are in the Running state.
Seems like this is something you'd want to orchestrate with startupTaints if there's specific startup behavior you want to ensure is completed before pods are scheduled. What taint was being added with Cluster Autoscaler?
Sorry, my bad. CAS doesn't have any specific taint that waits for the daemon sets to be running. I'm not really sure why we are not facing the issue with CAS. I quickly searched for startupTaints and found a similar thread, #628, related to Cilium. I also noticed that the latest version of Cilium is capable of adding the taint node.cilium.io/agent-not-ready=true:NoExecute. So I think I will check with the Linkerd community as well to see whether there is a similar approach.
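For anyone landing here later, a minimal sketch of what the startupTaints suggestion could look like on a NodePool, assuming the karpenter.sh/v1beta1 API that ships with 0.37.x and the AWS provider. The NodePool name, the EC2NodeClass name, and the single requirement are placeholders, not values from this cluster. The taint shown is the Cilium one mentioned above. For Linkerd, the key would have to be whatever taint a component on the node removes once the CNI config is written, and I don't know of an official one, so treat that part as hypothetical.

```yaml
# Sketch only: names and requirements are placeholders.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default                      # placeholder name
spec:
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default                # placeholder, must match an existing EC2NodeClass
      requirements:
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
      # Applied when the node is created. Karpenter expects another
      # component to remove these taints and does not remove them itself.
      startupTaints:
        - key: node.cilium.io/agent-not-ready   # Cilium taint from the comment above
          value: "true"
          effect: NoExecute
```

Two things to keep in mind with this approach: the CNI daemon set itself has to tolerate the startup taint or it will never be scheduled on the new node, and if nothing removes the taint, workload pods will stay pending.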
Reproduction Steps (Please include YAML):
Ref:
https://linkerd.io/2.15/tasks/install-helm/
https://linkerd.io/2.15/getting-started/
Versions:
Chart Version: 0.37.0
Kubernetes Version (kubectl version): 1.26