add a max pod batch to prevent large build ups #823

Merged 2 commits into aws:main on Nov 19, 2021

Contributor

@bwagner5 bwagner5 commented Nov 19, 2021

1. Issue, if available:
N/A

2. Description of changes:

  • During scale tests, Karpenter ran out of memory and restarted while a large batch of pods had built up. The backlog then caused Karpenter to fall over repeatedly. The same thing can happen if Karpenter is started when there is already a large backlog of pods.
  • This change adds a MaxPodsBatch limit, initially set to 2,000 pods. That is far more than an average cluster configuration needs and is still adequate for large scale-ups.
  • If Karpenter does come back online with a huge batch of pods, it can also make progress a little more quickly, since a full batch does not need to wait for the 1-second idle timeout:
karpenter-controller-7b69b7ff89-rpcbg manager 2021-11-19T19:13:06.789Z	INFO	controller.provisioning	Batched 2000 pods in 32.78526ms	{"commit": "19d476f", "provisioner": "default"}
  • It may be a good idea to make this configurable in CLI parameters at some point. If the reviewers feel strongly about doing that now, I can do that before we merge this.

3. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify bot commented Nov 19, 2021

✔️ Deploy Preview for karpenter-docs-prod canceled.

🔨 Explore the source changes: 65a200e

🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/619802e8c819e60007f1e75d

@bwagner5 bwagner5 requested a review from ellistarn November 19, 2021 19:18
@JacobGabrielson JacobGabrielson merged commit c78f3ab into aws:main Nov 19, 2021