
Karpenter should consider in-flight capacity when scaling out #1044

Closed
olemarkus opened this issue Dec 24, 2021 · 5 comments
Labels: consolidation, feature (New feature or request)

Comments

@olemarkus (Contributor)

Tell us about your request
What do you want us to build?

If two apps scale up in succession, Karpenter first reacts to app one's requirements and provisions instances accordingly. There will often be spare capacity on the provisioned nodes, especially if topology spread constraints are involved.
When app two scales up, that spare capacity is not considered, and Karpenter may provision an excessive number of additional instances.
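
To make the effect concrete, here is a minimal, purely illustrative sketch (not Karpenter code; the node size, pod size, and replica counts are made-up assumptions) comparing packing each scale-up in isolation against reusing the spare slots left on already-provisioned nodes:

```python
# Illustrative sketch only: models why ignoring spare capacity on freshly
# provisioned ("in-flight") nodes leads to excess instances.
# All sizes below are hypothetical assumptions, not Karpenter defaults.
import math

NODE_CPU = 4.0                             # hypothetical instance size (vCPU)
POD_CPU = 1.5                              # hypothetical per-pod request (vCPU)
PODS_PER_NODE = int(NODE_CPU // POD_CPU)   # 2 pods fit per node

def nodes_ignoring_spare(app1_pods: int, app2_pods: int) -> int:
    """Each scale-up is packed in isolation, as if no spare capacity exists."""
    return math.ceil(app1_pods / PODS_PER_NODE) + math.ceil(app2_pods / PODS_PER_NODE)

def nodes_using_spare(app1_pods: int, app2_pods: int) -> int:
    """App two's pods first fill the slots left over on app one's nodes."""
    app1_nodes = math.ceil(app1_pods / PODS_PER_NODE)
    spare_slots = app1_nodes * PODS_PER_NODE - app1_pods
    remaining = max(0, app2_pods - spare_slots)
    return app1_nodes + math.ceil(remaining / PODS_PER_NODE)

if __name__ == "__main__":
    print(nodes_ignoring_spare(5, 5))  # 6 nodes when each app is packed alone
    print(nodes_using_spare(5, 5))     # 5 nodes when spare slots are reused
```

Topology spread constraints amplify the gap, since spreading app one's pods across more nodes leaves more partially filled nodes whose spare slots are then ignored.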

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

The consequence of the above is a large number of under-utilised nodes.

Are you currently working around this issue?
Regularly rotating nodes will compact the cluster, but this causes a lot of unnecessary bouncing of Pods.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
olemarkus added the feature (New feature or request) label on Dec 24, 2021
@ellistarn (Contributor)

One of the benefits of Karpenter's current implementation is that it is almost entirely stateless. Introducing new concepts such as awareness of in-flight instances or delayed pod binding significantly increases Karpenter's complexity surface.

The Cluster Autoscaler has gone down this road, attempting to be smart about what it thinks might happen in the future, and it relies on various timeouts to reconcile the built-up state with what actually happened. These time periods inevitably change over time and require configuration for the performance characteristics of different environments.

There's definitely a tradeoff to be made, but just because we can do or know something doesn't mean we should. I'm open to being convinced that it's worth taking on this complexity; that's probably easiest to discuss at the working group or on Slack.

@matti

matti commented Jan 7, 2022

What about bringing https://github.com/kubernetes-sigs/descheduler into the mix? Descheduler can detect these kinds of conditions and then delete/drain those nodes.

@olemarkus (Contributor, Author)

I want to avoid pods being rescheduled multiple times during a normal deploy, so descheduler, just like defrag, is not a viable solution.

@ellistarn (Contributor)

This is also useful for Spark workloads, which do not create all of their pods at the same time. #1291

@tzneal (Contributor)

tzneal commented May 3, 2022

Fixed with #1727

@tzneal tzneal closed this as completed May 4, 2022