Startup Race Condition #670

Closed
tsunamishaun opened this issue Sep 9, 2021 · 3 comments

Comments

@tsunamishaun

Version

Karpenter: v0.3.2

Kubernetes: v1.20

Expected Behavior

A pod with a previously attached volume should have the same volume attached when it is scheduled on a new node provisioned by Karpenter.

Actual Behavior

The pod never gets its volume re-attached because the aws-ebs-csi daemon pod is not running at the time of the binding request. It should ultimately retry, but this pod (along with the aws-vpc-cni pod) is currently on the path to production for our node startup.

Steps to Reproduce the Problem

1. Install the aws-ebs-csi driver
2. Create a pod with a volume attachment
3. Shift the pod somewhere else
4. Karpenter brings up the pod on a new node
5. The aws-ebs-csi pod starts up ~1m later

More details are in the open aws-ebs-csi driver issue 1030. cc @ellistarn: this came out of the meeting; sorry for the delay in creating the issue here.

@tsunamishaun tsunamishaun added the bug Something isn't working label Sep 9, 2021
@JacobGabrielson JacobGabrielson self-assigned this Sep 13, 2021
@bwagner5 bwagner5 added this to the v0.5.0 milestone Sep 23, 2021
@ellistarn
Contributor

We've been exploring this and here's what we've landed on:

  • Short-term solution: Fix the EBS CSI driver to retry and succeed
  • Long-term solution: Write a NoAdmit KEP to enforce this ordering with taints in the kubelet (Support Startup Taints #628)
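
The ordering idea behind startup taints can be sketched as a toy model. This is not Kubernetes or Karpenter code: `Taint`, `Pod`, `Node`, `admits`, and `removeTaint` are simplified, hypothetical stand-ins, and the taint key `example.com/csi-not-ready` is made up for illustration.

```go
package main

import "fmt"

// Taint and Pod are simplified stand-ins for the Kubernetes types; the real
// API has more fields (effect, operator, value).
type Taint struct{ Key string }

type Pod struct {
	Name        string
	Tolerations []string // taint keys this pod tolerates
}

// Node carries the taints currently applied to it.
type Node struct{ Taints []Taint }

// admits reports whether every taint on the node is tolerated by the pod.
func admits(n Node, p Pod) bool {
	for _, t := range n.Taints {
		tolerated := false
		for _, k := range p.Tolerations {
			if k == t.Key {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

// removeTaint returns a copy of the node without the given taint, as would
// happen once the daemon that owns the taint reports ready.
func removeTaint(n Node, key string) Node {
	var kept []Taint
	for _, t := range n.Taints {
		if t.Key != key {
			kept = append(kept, t)
		}
	}
	return Node{Taints: kept}
}

func main() {
	const startupTaint = "example.com/csi-not-ready" // hypothetical key
	node := Node{Taints: []Taint{{Key: startupTaint}}}
	daemon := Pod{Name: "ebs-csi-node", Tolerations: []string{startupTaint}}
	app := Pod{Name: "app"}

	// The daemon tolerates the startup taint; the application pod does not.
	fmt.Println(admits(node, daemon), admits(node, app))

	// Once the taint is removed, the application pod is admitted.
	node = removeTaint(node, startupTaint)
	fmt.Println(admits(node, app))
}
```

The point of the model is only the ordering: workloads that depend on the daemon cannot land on the node until the taint is gone, which removes the race rather than papering over it with retries.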

@ellistarn ellistarn added feature New feature or request and removed bug Something isn't working labels Jan 7, 2022
@ellistarn
Contributor

Hey @tsunamishaun, I was able to get this working with #1015. Specifically, Karpenter needed to apply the SelectedNodeAnnotation itself, since the kube scheduler would normally do this.
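
As a rough illustration of what applying that annotation involves, the sketch below stamps a node name onto a pod's volume claims. It is not the code from #1015: `Claim` and `annotateSelectedNode` are hypothetical, the node name is made up, and the key `volume.kubernetes.io/selected-node` is assumed to be the annotation the comment refers to. Real code would update PersistentVolumeClaim objects through the Kubernetes API.

```go
package main

import "fmt"

// selectedNodeAnnotation is assumed to be the key meant by
// "SelectedNodeAnnotation" above; the volume binder reads it to learn which
// node a claim's pod was placed on.
const selectedNodeAnnotation = "volume.kubernetes.io/selected-node"

// Claim is a simplified stand-in for a PersistentVolumeClaim.
type Claim struct {
	Name        string
	Annotations map[string]string
}

// annotateSelectedNode records nodeName on every claim, which is the step
// the scheduler would normally perform when it picks a node for the pod.
func annotateSelectedNode(claims []*Claim, nodeName string) {
	for _, c := range claims {
		if c.Annotations == nil {
			c.Annotations = map[string]string{}
		}
		c.Annotations[selectedNodeAnnotation] = nodeName
	}
}

func main() {
	claims := []*Claim{{Name: "data"}}
	annotateSelectedNode(claims, "node-a") // hypothetical node name
	fmt.Println(claims[0].Annotations[selectedNodeAnnotation])
}
```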

Can you try this in v0.5.4?

@ellistarn
Contributor

ellistarn commented Feb 10, 2022

Closing this as volume support is released.
