Pods with SG for pods are slow in ContainerCreating when a new node is deployed #1252
IIUC, the k8s scheduler and Karpenter are not aware of the pod security group concept. However, this should not cause incorrect scheduling or pod-binding decisions; it is just a matter of time for the node and the extra ENI to become ready to serve the new pods. The question is how long the delay is and why. Can you help confirm that your pods can recover after the ENI is created? This will allow us to focus on the latency issue instead of the correctness issue. I use this command to check when the ENI is online.
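(For illustration only, and not necessarily the exact command referenced above: one way to check, assuming kubectl access, is to look for the extended resource on the node.)

```sh
# Hypothetical check: once the trunk ENI is attached, the node's capacity and
# allocatable include the vpc.amazonaws.com/pod-eni extended resource.
# Replace <node-name> with the node that Karpenter launched.
kubectl get node <node-name> -o yaml | grep pod-eni
```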
Further, it's undesirable to flood the system with errors in the success case. At worst, those events can clog up etcd and slow the cluster down at scale. At best, they may trigger false alarms.
Apparently my above example is a lucky outlier that doesn't show how long the pods will be stuck in the ContainerCreating state. Ideally, the VPC resource controller should perform a resync when the node becomes ready. We will need the VPC resource controller team's help to implement that logic.
@felix-zhe-huang I did experience what you said: sometimes the pods would just unstick themselves and run, with irregular timing of up to 30 minutes. Now it makes sense. I can confirm that whenever a node gets an ENI, pods are able to get into the Running state, but only as long as the node was already running beforehand (or apparently after a sync happens).
This issue is stale because it has been open 25 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Please keep this issue open.
Hello @tomerpeer63, as a workaround you can restrict your Provisioner to instance types which support ENI trunking (https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/master/pkg/aws/vpc/limits.go). Here's an example Provisioner manifest:
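(A minimal sketch, assuming the v1alpha5 Provisioner API of that era; the instance-type list is illustrative and cluster-specific provider settings are omitted.)

```yaml
# Sketch: only launch trunking-capable instance types so the vpc-resource-controller
# can attach a trunk ENI and advertise vpc.amazonaws.com/pod-eni on the node.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values:            # placeholder list; pick trunking-capable types from limits.go
        - m5.large
        - m5.xlarge
        - c5.large
        - c5.xlarge
  # provider/cluster-specific settings (subnets, security groups, etc.) omitted
```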
@dewjam can we add this to Troubleshooting?
Sure thing. Will work on that today.
Thanks, looks like the workaround is working.
Great, glad to hear the workaround is working @tomerpeer63! Quick update on our efforts towards a permanent fix: we were working on a fix in the aws-vpc-resource-controller (aws/amazon-vpc-resource-controller-k8s#103), but have shifted our efforts to removing the "early binding" capability Karpenter uses to assign pods to nodes. Without early binding, SGP would work as expected. I'm in the process of validating this assumption by testing in some experimental code. I'll keep you in the loop.
Closed with #1856
@tzneal - Does this mean that we should remove the workaround from our provisioners?
Yes, this should work.
Going to wait for the next release before upgrading back to v0.12.x, but good to know about the provisioners!
Should this be re-opened as pre-binding was reverted?
Version
Karpenter Version: v0.5.4
Kubernetes: v1.21
Expected Behavior
When scaling up pods that use Security Groups for Pods and Karpenter creates a new node, the pods should be bound to the new node and deploy successfully.
Actual Behavior
The new pods are bound to the new node, but they remain stuck in ContainerCreating even after the node is ready for use. Only when the node is ready and I manually delete and recreate the pods are they able to run. This happens only with pods that use Security Groups for Pods.
If a node is already ready when new pods need to be deployed, this doesn't happen; the problem is reproduced only when a pod is bound to a node that is still coming up.
I think this happens because Karpenter doesn't honor the pod's scheduling restrictions: Karpenter binds a pod that requires the vpc.amazonaws.com/pod-eni resource to a node that does not have the vpc.amazonaws.com/pod-eni resource. The VPC resource controller then ignores such a pod bind event because the node is not yet managed by it (https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/v1.1.0/controllers/core/pod_controller.go#L100). A sketch of the injected resource requirement is shown below.
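(For illustration, roughly what a pod looks like after the webhook mutation; the pod name, container, and image are placeholders.)

```yaml
# Sketch: after the vpc-resource-controller webhook mutates a pod matched by a
# SecurityGroupPolicy, the container carries an extended-resource requirement that
# only a node with a trunk ENI (advertising vpc.amazonaws.com/pod-eni) can satisfy.
apiVersion: v1
kind: Pod
metadata:
  name: krakend-example            # placeholder name
spec:
  containers:
    - name: app                    # placeholder container
      image: example/image:latest  # placeholder image
      resources:
        limits:
          vpc.amazonaws.com/pod-eni: "1"
        requests:
          vpc.amazonaws.com/pod-eni: "1"
```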
Steps to Reproduce the Problem
Use Karpenter together with a deployment whose pods use Security Groups for Pods (VPC CNI), then deploy a new provisioner and scale up new pods (see the SecurityGroupPolicy sketch below).
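(For context, a minimal SecurityGroupPolicy of the kind involved here; the name, labels, and security group ID are placeholders.)

```yaml
# Hypothetical SecurityGroupPolicy: pods whose labels match podSelector get the listed
# security groups via a branch ENI, which requires a trunk ENI on the node.
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: krakend-sgp                 # placeholder name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: krakend                  # placeholder label
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0        # placeholder security group ID
```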
Resource Specs and Logs
2022-01-23T09:40:50, pod default/krakend-deployment-7bd59cd948-zzzrl was created; the vpc-resource-controller's webhook modified it to have the scheduling restriction vpc.amazonaws.com/pod-eni: 1.
2022-01-23T09:40:53, node ip-xxx.eu-central-1.compute.internal was created by Karpenter ahead of time (before the EC2 instance had been initialized).
2022-01-23T09:40:53, Karpenter bound pod default/krakend-deployment-7bd59cd948-zzzrl to node ip-xxx.eu-central-1.compute.internal (even though the node had not yet advertised the vpc.amazonaws.com/pod-eni resource, and the vpc-resource-controller will ignore this pod update since the node is not yet managed by it).
2022-01-23T09:43:44, node ip-xxx.eu-central-1.compute.internal had its trunk ENI attached and was patched by the vpc-resource-controller to have vpc.amazonaws.com/pod-eni resources.
2022-01-23T10:04:39, pod default/krakend-deployment-7bd59cd948-zzzrl was deleted by the replication controller (deletionTimestamp set, but the pod object isn't deleted).
2022-01-23T10:04:40, pod default/krakend-deployment-7bd59cd948-zzzrl was modified to have the branch-ENI annotation after the branch ENI was attached to the node.