AWS EKS cluster autoscaler only using ASGs that have at least one node #1676
Comments
That's probably because the actual node's memory is slightly different from the memory predicted from the machine type (due to kernel reservation, which is specific to a given OS/machine combination). It follows that the least-waste expander prefers existing groups, as it believes they have less memory. #1643 will somewhat improve the prediction by caching node templates, but it's not a complete solution: if there has been any node in a group since the last autoscaler restart, that node's resources will be used in simulations. Some options to fix this:
I would probably go with (1); the others are more complicated.
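To make the least-waste behaviour concrete, here is a minimal, self-contained sketch (not the autoscaler's actual code; the struct, field names and memory figures below are made up for illustration) of why the score keeps favouring the group that already has a node: the real node advertises slightly less memory than the template built from the instance type, so its wasted fraction comes out lower.

```go
package main

import "fmt"

// option is a stand-in for a scale-up candidate: how much memory the
// autoscaler attributes to one new node in the group, and how much the
// pending pods would request on it. Types and numbers are illustrative,
// not the autoscaler's real data structures.
type option struct {
	name        string
	nodeMemMiB  int64 // memory the autoscaler believes a new node provides
	requestsMiB int64 // memory requested by the pods placed on that node
}

// wastedFraction mirrors the idea behind the least-waste expander: the
// share of the node's memory that would sit idle after the scale-up.
func wastedFraction(o option) float64 {
	return float64(o.nodeMemMiB-o.requestsMiB) / float64(o.nodeMemMiB)
}

func main() {
	// ASG that already has a node: capacity comes from the real Node
	// object, which is a bit smaller than the raw instance size because
	// of kernel/system reservations.
	withNode := option{name: "m5_large_az1 (real node)", nodeMemMiB: 7877, requestsMiB: 7000}

	// Empty ASG: capacity comes from a template derived from the machine
	// type, so it reports the full advertised instance memory.
	empty := option{name: "m5_large_az2 (template)", nodeMemMiB: 8192, requestsMiB: 7000}

	for _, o := range []option{withNode, empty} {
		fmt.Printf("%-26s waste=%.3f\n", o.name, wastedFraction(o))
	}
	// The real node reports less total memory, so its wasted fraction is
	// lower and least-waste keeps picking the ASG that already has a node.
}
```

With these example numbers the waste comes out around 0.11 for the group backed by a real node versus around 0.15 for the empty group, so the non-empty ASG wins every time.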
Thanks @aleksandra-malinowska. Focusing on (1), running another test with a pod that requests most of a node's memory seems to produce a larger difference between actual and predicted memory:
The rest of the suggested options sound promising, though. I'll think on it.
@aleksandra-malinowska, in revisiting this I came up with an option 4 that, although not perfect, seems simple to implement and configure, and reliable: the autoscaler could compare a label value on each node (e.g. a well-known instance-type label). Does this seem reasonable?
Seems so; @MaciekPytel, WDYT? Although I wonder if it could be more generic than hard-coding a constant for each cloud provider. For example, why not make the key used for node group comparison configurable?
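As a rough sketch of what a configurable comparison key could look like (purely illustrative; the flag name, types and helper below are hypothetical and not part of cluster-autoscaler):

```go
package main

import "fmt"

// nodeGroup is a hypothetical stand-in for a node group and the labels of
// its template (or example) node; none of this is cluster-autoscaler's
// real API.
type nodeGroup struct {
	id     string
	labels map[string]string
}

// similar reports whether two groups carry the same value for an
// operator-supplied comparison key, instead of comparing computed memory.
func similar(a, b nodeGroup, comparisonKey string) bool {
	av, aok := a.labels[comparisonKey]
	bv, bok := b.labels[comparisonKey]
	return aok && bok && av == bv
}

func main() {
	// e.g. a hypothetical flag such as
	// --node-group-comparison-label=beta.kubernetes.io/instance-type
	key := "beta.kubernetes.io/instance-type"

	az1 := nodeGroup{id: "m5_large_az1", labels: map[string]string{key: "m5.large"}}
	az2 := nodeGroup{id: "m5_large_az2", labels: map[string]string{key: "m5.large"}}

	// Both groups run m5.large, so they compare equal even if their
	// detected and template memory differ slightly.
	fmt.Println(similar(az1, az2, key)) // true
}
```

The appeal of making the key a flag is that the cloud-provider code stays free of hard-coded label constants, and operators can pick whatever label actually distinguishes their groups.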
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
@MaciekPytel would you mind having a look at this issue? I've forked and applied the fix I proposed, so let me know if a PR would be preferable.
This issue extends beyond detected memory capacity and template-estimated memory capacity. The issues I've found so far which preclude even scale-ups from 0 include:
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
CloudProvider: AWS EKS
Kubernetes version: 1.11.5
Cluster Autoscaler version: 1.3.6
With the autoscaler options shown below and multiple identical ASGs (other than AZ), new nodes always end up being created using an ASG that has at least 1 node. Looking at the logs, autoscaler considers these ASGs to have the least memory wastage:
If I manually set the desired count on the `m5_large_az2` ASG to 1, I then see:
I believe this started happening after I added taints to the nodes, and tolerations and nodeSelectors to the pods.
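For reference, and assuming I'm reading the AWS cloudprovider README correctly: when an ASG is empty, labels and taints only show up on the simulated template node if the ASG carries the corresponding node-template tags, along the lines of the following (the label/taint names are placeholders, and the taint value format should be verified against the CA version in use):

```
# ASG tags read by the AWS cloud provider when building a template node for an empty group
k8s.io/cluster-autoscaler/node-template/label/role = worker               # placeholder label
k8s.io/cluster-autoscaler/node-template/taint/dedicated = true:NoSchedule # value:effect (verify against the README)
```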
Autoscaler options