karpenter disruption taint isn't correct for karpenter v1 #2158
Comments
Wow, this is a very subtle breaking change. Thank you so much @willthames for getting back to me and finding an even deeper root cause. We'll take care of this before our next driver release, and try to find some mechanism to make sure this does not happen a third time.
CSI drivers are not responsible for deleting In Karpenter v1.0.0 and beyond, Karpenter ensures that EBS volumes are properly unmounted/detached ( a) The CSI node For users who have disabled |
I think a nice improvement here would be maintaining the list of common tolerations in our helm chart, that way the pre-stop hook wouldn't need to be relied upon as a fallback. I don't think I would have caught this regression in driver cleanup due to a Karpenter upgrade myself either (it was a subtle line in the migration guide), which means we can alleviate some customer pain by making the driver 'just work' upon add-on creation. This would also help us migrate off of
To be fair, we could do a better job here by documenting these assumptions, because volumes not unmounting during node drains might surprise people who are not subject-matter experts in Kubernetes storage. I'll add an FAQ item in the fix PR, but we should also consider adding a note in our install guide. Either way, thank you @willthames for reporting this pain point.
/kind bug
What happened?
The EBS CSI driver does not clean up after itself when a node is terminating due to Karpenter disruption, after upgrading Karpenter to v1.0.1
What you expected to happen?
The EBS CSI driver removes all volume attachments when its node is being terminated
How to reproduce it (as minimally and precisely as possible)?
Have an EBS volume attached, and don't configure any tolerations on the EBS CSI driver (so it gets terminated before the node terminates)
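The reproduction depends on the driver pod not tolerating the taint that Karpenter places on the node. As a rough illustration of the standard Kubernetes toleration-matching rule, here is a minimal, dependency-free sketch. The `Taint` and `Toleration` structs, the `tolerates` and `toleratesAny` helpers, and the `example.com/disrupted` key are all hypothetical stand-ins for this example, not the driver's or Karpenter's actual code.

```go
package main

import "fmt"

// Taint and Toleration are simplified, hypothetical stand-ins for the
// Kubernetes core/v1 types, reduced to the fields used for matching.
type Taint struct{ Key, Value, Effect string }

type Toleration struct {
	Key      string
	Operator string // "Exists" or "Equal"; empty is treated as "Equal"
	Value    string
	Effect   string // empty matches every effect
}

// tolerates reports whether a single toleration matches a taint,
// following the usual Kubernetes rules: effect and key must match
// when set, and "Exists" ignores the value.
func tolerates(tol Toleration, taint Taint) bool {
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return tol.Value == taint.Value
	}
	return false
}

// toleratesAny reports whether any toleration in the list matches.
func toleratesAny(tols []Toleration, taint Taint) bool {
	for _, tol := range tols {
		if tolerates(tol, taint) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical taint key, used only for illustration.
	taint := Taint{Key: "example.com/disrupted", Effect: "NoSchedule"}

	// A pod with no tolerations does not tolerate the taint.
	fmt.Println(toleratesAny(nil, taint))

	// A blanket "Exists" toleration with no key tolerates every taint.
	fmt.Println(toleratesAny([]Toleration{{Operator: "Exists"}}, taint))
}
```

With no tolerations configured, nothing matches, which is the situation described in the reproduction step above.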
Anything else we need to know?:
The taint during karpenter disruption is:
But IsDisruptedTaint looks for disrupting rather than disrupted because it's using the v1beta1 API rather than the v1 API:
https://github.com/kubernetes-sigs/karpenter/blob/b69e975128ac9a511542f9f1d245a6d4c3f91112/pkg/apis/v1beta1/taints.go#L28
vs
https://github.com/kubernetes-sigs/karpenter/blob/b69e975128ac9a511542f9f1d245a6d4c3f91112/pkg/apis/v1/taints.go#L28
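To make the mismatch concrete, here is a minimal sketch of a check that accepts both taint shapes. It is not the driver's actual fix. The key and value strings are assumptions based on my reading of the two linked files (a `karpenter.sh/disruption` key with value `disrupting` in v1beta1, and a value-less `karpenter.sh/disrupted` key in v1), and `isKarpenterDisruptionTaint` is a hypothetical helper name. Taint effect is deliberately ignored to keep the example short.

```go
package main

import "fmt"

// Taint is a simplified, hypothetical stand-in for the Kubernetes
// core/v1 Taint type, reduced to the fields used here.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

const (
	// Assumed v1beta1 shape: shared key plus a "disrupting" value.
	v1beta1DisruptionKey   = "karpenter.sh/disruption"
	v1beta1DisruptingValue = "disrupting"
	// Assumed v1 shape: dedicated key with no value.
	v1DisruptedKey = "karpenter.sh/disrupted"
)

// isKarpenterDisruptionTaint (hypothetical) reports whether the taint
// looks like a Karpenter disruption taint in either API version.
func isKarpenterDisruptionTaint(t Taint) bool {
	switch t.Key {
	case v1DisruptedKey:
		// v1: the key alone identifies the taint.
		return true
	case v1beta1DisruptionKey:
		// v1beta1: the value must also match.
		return t.Value == v1beta1DisruptingValue
	}
	return false
}

func main() {
	taints := []Taint{
		{Key: v1DisruptedKey, Effect: "NoSchedule"},
		{Key: v1beta1DisruptionKey, Value: v1beta1DisruptingValue, Effect: "NoSchedule"},
		{Key: "node.kubernetes.io/unschedulable", Effect: "NoSchedule"},
	}
	for _, t := range taints {
		fmt.Printf("%s=%s -> %v\n", t.Key, t.Value, isKarpenterDisruptionTaint(t))
	}
}
```

A check that only knows the v1beta1 shape would miss the first taint in the list, which matches the behaviour reported here.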
Environment
kubectl version