/kind bug

What happened?

I believe this is similar to #1163 (and perhaps even #1258), but distinct enough that I thought it warranted another issue. This may also be the wrong repository; I just encountered it specifically with EKS and the aws-ebs-csi-driver.
When a cluster is scaled up in response to new pods with PVCs and a custom volume-attach-limit is set, the restriction is somehow not enforced, leading to all pods being scheduled.
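For context, the custom limit here is the node plugin's `--volume-attach-limit` flag. A minimal sketch of how I set it, assuming the Helm chart exposes it as `node.volumeAttachLimit` (check your chart version):

```yaml
# values.yaml sketch for the aws-ebs-csi-driver Helm chart (assumed key;
# verify against your chart version). This passes --volume-attach-limit=5
# to the ebs-plugin container in the node DaemonSet.
node:
  volumeAttachLimit: 5
```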
What you expected to happen?
Ideally, only the pods that fit within the volume attach limit would be scheduled to the node. I'm unsure if this is something that needs to happen in the Kubernetes NodeVolumeLimits scheduler filter, however.
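For reference, once the node plugin has registered, the limit is advertised to the scheduler via the node's CSINode object. A sketch of how to check it (the jsonpath assumes the driver name ebs.csi.aws.com, and <node-name> is a placeholder):

```sh
# Prints the attach capacity the scheduler sees for this node; with
# --volume-attach-limit=5 this should print 5 once the plugin has registered.
kubectl get csinode <node-name> \
  -o jsonpath='{.spec.drivers[?(@.name=="ebs.csi.aws.com")].allocatable.count}'
```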
How to reproduce it (as minimally and precisely as possible)?
- Create a cluster with aws-ebs-csi-driver and cluster-autoscaler deployed and a node pool (node-pool: my-node-pool) for other pods
- Set --volume-attach-limit=5 on the node driver
- Create 10 pods, each with its own PVC, targeting that node pool (see the sketch after this list)
- Observe that all 10 pods are scheduled on the newly created node even with the attach limit

If I then delete all the pods, only 5 will reschedule to the same node and the other 5 will wait for a new autoscaled node. This issue still occurs even if the PVCs are already provisioned.
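A sketch of a workload for the pod-creation step above; the name, image, and 1Gi request are illustrative, and it assumes the default StorageClass dynamically provisions EBS volumes:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: attach-limit-repro   # illustrative name
spec:
  serviceName: attach-limit-repro
  replicas: 10
  podManagementPolicy: Parallel   # create all 10 pods at once
  selector:
    matchLabels:
      app: attach-limit-repro
  template:
    metadata:
      labels:
        app: attach-limit-repro
    spec:
      nodeSelector:
        node-pool: my-node-pool   # the node pool from the repro steps
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi   # one EBS volume per pod
```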
Anything else we need to know?: N/A
Environment
Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Driver version: 1.5.3
My conclusion from the discussion in #1163 is that my situation is highly likely identical to this issue. Given the excellent reproducibility described by @steved, working this issue is probably easier (as opposed to what seemingly is a mix of issues in #1163).
Driver version: 1.11.2 (and using backwards compatibility mode for the kubernetes.io/aws-ebs provisioner)
steved changed the title from "Race condition with volume attach limits when new node joins the cluster leads oversubscribed node" to "Race condition with volume attach limits when new node joins the cluster leads to oversubscribed node" on Dec 6, 2022
This actually appears to be a duplicate of kubernetes/kubernetes#95911. Closing this for now unless there's further evidence it can be fixed here instead.
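For anyone who lands here, a rough way to observe the window described in kubernetes/kubernetes#95911 on a freshly autoscaled node (a sketch; <new-node> is a placeholder):

```sh
# A new Node can become Ready (and schedulable) before the CSI node plugin
# registers, i.e. before its CSINode object lists ebs.csi.aws.com with an
# allocatable.count. Pods with volumes scheduled during that window are not
# constrained by the attach limit.
kubectl get node <new-node> -o wide
kubectl get csinode <new-node> -o yaml   # drivers list may still be empty here
```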