Support node start-up taint to avoid race conditions #1069
Comments
/kind feature |
The related fix on EBS is: kubernetes-sigs/aws-ebs-csi-driver#1588 |
I believe I'm encountering a similar issue. I can consistently reproduce it when the Horizontal Pod Autoscaler initiates a scale-up due to load metrics (the pods are receiving traffic). |
team please look into this |
Our team has acknowledged this issue and it's currently on our list. We are actively working on it and aim to have this resolved and released by the end of this month. |
PR for this feature. Will merge this and release in the coming version, most probably by end of this month. |
PR addressing the feature request for supporting node start-up taint to avoid race conditions has been successfully merged. This feature is now included in the EFS CSI Driver as of version v1.7.2. |
Is your feature request related to a problem? Please describe.
In clusters where new nodes join frequently, workloads (application Pods) that require EFS volumes can be scheduled to a new EC2 Node before the efs-csi-node Pod has finished initializing and is ready on that Node. This race between the workload Pod and the efs-csi-node Pod causes the workload Pod to fail to mount the PVC.
We should apply a startup taint to prevent this race condition from occurring. The efs-csi-node DaemonSet's efs-plugin container will apply this taint to the Node during driver initialization, and then remove the taint once it is ready.
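To make the proposal concrete, here is a minimal sketch of the taint lifecycle described above, modelled on plain Python lists rather than a real Kubernetes client. The taint key `efs.csi.aws.com/agent-not-ready`, the `NoExecute` effect, and all function names are assumptions for illustration only, not the driver's confirmed implementation. The scheduling check is deliberately simplified.

```python
from typing import Dict, FrozenSet, List

# Assumed taint key and effect, for illustration only.
STARTUP_TAINT_KEY = "efs.csi.aws.com/agent-not-ready"
STARTUP_TAINT_EFFECT = "NoExecute"

Taint = Dict[str, str]


def add_startup_taint(taints: List[Taint], key: str = STARTUP_TAINT_KEY) -> List[Taint]:
    """Return a new taint list that includes the startup taint exactly once."""
    if any(t.get("key") == key for t in taints):
        return list(taints)
    return list(taints) + [{"key": key, "effect": STARTUP_TAINT_EFFECT}]


def remove_startup_taint(taints: List[Taint], key: str = STARTUP_TAINT_KEY) -> List[Taint]:
    """Return a new taint list without the startup taint (no-op if absent)."""
    return [t for t in taints if t.get("key") != key]


def can_schedule(taints: List[Taint], tolerated_keys: FrozenSet[str] = frozenset()) -> bool:
    """Simplified check: every NoSchedule/NoExecute taint must be tolerated."""
    blocking = ("NoSchedule", "NoExecute")
    return all(
        t["key"] in tolerated_keys
        for t in taints
        if t.get("effect") in blocking
    )


# Hypothetical node lifecycle: tainted while the driver initializes, cleared when ready.
node_taints: List[Taint] = []
node_taints = add_startup_taint(node_taints)
workload_blocked = not can_schedule(node_taints)
driver_allowed = can_schedule(node_taints, frozenset({STARTUP_TAINT_KEY}))
node_taints = remove_startup_taint(node_taints)
workload_allowed = can_schedule(node_taints)
```

In this sketch the efs-csi-node Pod would need to tolerate the taint so that it can start on the Node, while workload Pods without that toleration stay off the Node until the taint is removed.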
Describe the solution you'd like in detail
See the following PR on aws-ebs-csi-driver which implements this feature: kubernetes-sigs/aws-ebs-csi-driver#1581
From the overview of that PR: