Karpenter not respecting max volumes limit attached per node on AWS #260
Comments
Can you share your …?
Digging into this a little further, it seems like we are hitting an issue here where we expect the StorageClass .spec.provisioner to match the driver reported in the CSINode's spec.drivers allocatable. As a workaround, could you try updating your storageClass to use the ebs.csi.aws.com CSI provisioner?
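For illustration, here is a minimal sketch of that workaround, assuming the EBS CSI driver is installed in the cluster; the StorageClass name and parameters are placeholders, not taken from this issue:

```yaml
# Sketch only: a StorageClass backed by the EBS CSI driver instead of the
# in-tree kubernetes.io/aws-ebs provisioner. Name and parameters are examples.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi
provisioner: ebs.csi.aws.com            # CSI provisioner rather than kubernetes.io/aws-ebs
volumeBindingMode: WaitForFirstConsumer # let the scheduler pick a node/zone before the volume is created
parameters:
  type: gp3
```

Existing PVs keep whatever StorageClass they were created with, so this mainly affects newly provisioned volumes.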
This issue is compounded by the fact that limits were sometimes not being respected by the …
@jonathan-innis do you have a pointer to the code where you expect the StorageClass .spec.provisioner to match the spec[0].drivers.allocatable? I'm trying to understand Karpenter a little better and it would help me a lot!
@r3kzi Sure, we are doing the volume limit mapping here, then computing the volume usage for pods on a node here. We detect the driver that the pod volume is using here. For context, we are planning to log errors when users are using in-tree drivers, so that there is an alerting mechanism to tell users when Karpenter can't detect limits for auto-scaling. #267 adds this error logging. We'll document this in our upstream docs, but our stance is that in-tree providers have largely been deprecated by upstream Kubernetes and users should migrate to the new CSI drivers, particularly because many of these in-tree drivers will be removed from upstream Kubernetes in upcoming releases.
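To make the matching concrete, here is a hedged sketch of what a CSINode object looks like once the EBS CSI driver is registered; the node name, node ID, and count are example values only. The driver name under spec.drivers is what a StorageClass provisioner gets compared against, and allocatable.count is where the per-node attachable volume limit is reported:

```yaml
# Hypothetical CSINode for illustration; node name, nodeID, and count are example values.
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: ip-10-0-1-23.us-west-2.compute.internal
spec:
  drivers:
    - name: ebs.csi.aws.com                  # driver name matched against StorageClass .provisioner
      nodeID: i-0123456789abcdef0
      allocatable:
        count: 26                            # attachable volume limit reported by the driver
      topologyKeys:
        - topology.ebs.csi.aws.com/zone
```

An in-tree provisioner like kubernetes.io/aws-ebs does not register a driver in this list, which is consistent with limits not being discoverable for it.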
Labeled for closure due to inactivity in 10 days.
If anyone is continuing to encounter this issue on recent versions of Karpenter, make sure you've added the following startup taint to your NodePools, as recommended by the EBS CSI driver:

```yaml
startupTaints:
  - key: ebs.csi.aws.com/agent-not-ready
    effect: NoExecute
```

This taint will be removed by the driver once it's ready, and it prevents a race condition where the kube-scheduler is able to schedule pods before volume limits can be discovered.
Version
Kubernetes version: 1.25
Karpenter version: v0.27.1
Expected Behavior
The maximum number of volumes that can be attached per node on AWS is 26. Whenever a node already has 26 volumes attached, I expect Karpenter to spin up a new node and somehow communicate to the Kubernetes scheduler not to schedule any more pods with PVs onto that node. Pods without PVs can still be scheduled.
Actual Behavior
For some reason this is not happening. Instead, I see "Failed to attach volume" warnings, and the Kubernetes scheduler still tries to schedule new pods with PVs onto nodes that already have the maximum number of volumes attached.
Steps to Reproduce the Problem
I had 100+ Deployments with one pod each, and each pod has a PV. Please note, this is not one Deployment with 100 pods; it is 100+ Deployments where each Deployment has only one pod with a PV.
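For concreteness, a hedged sketch of one such Deployment plus its PVC; the names, image, StorageClass, and size are illustrative placeholders, and the reproduction stamps out 100+ uniquely named copies of this pattern:

```yaml
# Illustrative only: one Deployment with a single replica mounting its own PVC.
# Repeat this pattern with unique names to approximate the reproduction.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-1-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp3-csi             # placeholder StorageClass
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-1
  template:
    metadata:
      labels:
        app: app-1
    spec:
      containers:
        - name: app
          image: public.ecr.aws/docker/library/busybox:1.36   # placeholder image
          command: ["sleep", "infinity"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-1-data
```

Each pod claims its own ReadWriteOnce volume, so every scheduled pod consumes one attachment slot on its node.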
Resource Specs and Logs
Here are some more logs:
And this:
If you look at this one, even though it says SuccessfulAttachVolume at the end, why even try to attach volumes when the maximum number of volumes is already attached to the node?
Here is the csinode spec
Here is the provisioner spec