cluster-autoscaler 1.2.2 not scaling AWS ASG to zero #1555
Comments
I will have a look at this issue. |
I cannot reproduce this issue easily. It looks like you changed the nodegroup min/max to 0/0 in CA, but I am not sure what your ASG min/max setting is. Based on the logs and the config map status, your blue nodegroup has 3 nodes, which cannot be smaller than or equal to nodeGroup.MinSize(). Could you also share your ASG settings? BTW, explicit node settings do have some issues, and not just in v1.2.x but in all versions. I am not sure whether that is by design or a bug. I submitted issue #1559 and can make a quick fix once a maintainer confirms it; that should help reduce confusion when users set min/max in CA. I also think the way you are using it may not be the most elegant approach: CA should manage all the node groups based on utilization, and setting a group to 0/0 doesn't make sense to me. |
/sig aws |
@Jeffwan I change the ASG min/max settings through the terraform-aws-eks module as well. It did not seem to matter whether I used CA autodiscovery or static nodegroups - the instances did not scale down. The blue nodegroup shows the discrepancy because the ASG setting and the CA setting were both changed but no nodes were scaled down. The ASG uses protect_from_scale_in on the instances to make sure that the ASG does not stomp on CA when scaling up/down. FYI, in the output below the blue nodegroup instances were manually terminated.
CA should support scaling to/from 0 based on this issue: #166 |
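For readers unfamiliar with the scale-in protection mentioned above: roughly speaking, it is what keeps the ASG itself from terminating instances, so that CA is the only component removing nodes. A minimal sketch with the AWS CLI; the group name and instance ID are placeholders:

```sh
# Protect instances launched by this ASG in the future from ASG-driven scale-in.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-eks-blue \
  --new-instances-protected-from-scale-in

# Protect instances that are already running.
aws autoscaling set-instance-protection \
  --auto-scaling-group-name my-eks-blue \
  --instance-ids i-0123456789abcdef0 \
  --protected-from-scale-in
```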
Sadly I don't really have an idea why the target size and the ASG could be out of sync; it might be caused by the explicitly defined max being below the number of running instances. Did you try running with |
@johanneswuerbach I have also tried this same scenario with autodiscovery, and a maxSize of 0 is likewise not respected as a way to scale a nodegroup to 0. I would expect that setting maxSize on the ASG (which is how I am telling CA to scale down) would terminate all instances in that ASG. That's an interesting point, though. Using your suggestion (leave maxSize alone and set minSize to 0), I've been able to get a nodegroup to scale down to zero (using autodiscovery) by cordoning the appropriate nodes so they are detected as unneeded. I don't think this really solves the problem because it's still one step too many in terms of automating scale-up and scale-down. IMO, it should be enough to specify a maxSize of 0 in the ASG and have CA terminate all instances in that group. Here are the logs from this test:
The config in this case is:
And the status configmap:
|
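For reference, the cordon workaround described above boils down to marking every node in the old group unschedulable. A sketch, assuming the eks_worker_group=blue label mentioned in this issue:

```sh
# Cordon all nodes in the blue worker group; per the comment above, CA
# then detects them as unneeded and scales the group down.
kubectl get nodes -l eks_worker_group=blue -o name | xargs -n1 kubectl cordon
```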
Yes, lowering the max size doesn't force instances to be terminated (unlike on AWS). Why do you need to scale up and down manually? CA should scale down automatically once there is enough free capacity to move pods to other nodes, and manual intervention shouldn't be required. Could you explain a bit further what you are trying to do? |
@hobbsh Could you help confirm that once your ASG is set to 0:0, the logs are still the same after a few minutes? Try restarting and checking the logs. |
In #1555 (comment) @hobbsh mentioned that he is using |
@johanneswuerbach maxSize not forcing all instances to terminate is definitely the crux of the issue. I want to maintain blue/green worker groups where at least one of the two is always scaled down to zero. The main reason for this is to roll out new AMIs and other worker configuration changes, and it has proven to be a very painless way of updating workers except for this last hitch with CA. To force all the nodes to terminate with the existing functionality, I would have to cordon the old worker group, but I feel that step could be avoided if CA simply interpreted a maxSize of 0 to mean all instances should be terminated, since this is how the ASG settings work.
I feel like there is definitely some missing functionality here, or a bug. IMO, CA should at least behave the same as the ASG when min/max is set to 0. @Jeffwan I have waited up to 25 minutes for CA to scale down after setting 0:0 and the logs are still the same, even after restarting the pod. |
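For comparison, the ASG-side behaviour referred to above can be reproduced with the AWS CLI: forcing the group itself to 0/0/0 terminates its instances (unless they are protected from scale-in), which is the semantics being requested from CA's maxSize. The group name below is a placeholder:

```sh
# On the ASG itself, dropping min/max/desired to zero terminates the
# instances (instances protected from scale-in are left alone).
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-eks-blue \
  --min-size 0 --max-size 0 --desired-capacity 0
```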
Tbh I'm not sure whether CA is the best place to add such functionality, as AWS would currently be the only cloud provider supporting this, and there might be various edge cases: what happens if there isn't enough space to fit the currently running pods, what if this is actually not the behaviour the user intended, etc. But maybe @aleksandra-malinowska could give a 👍 / 👎 on that. |
Limits constrain only the autoscaler, not the user. E.g. if there's a spike way outside the expected operating range, the user can manually resize the group without giving the autoscaler permission to do the same, or having to disable it. Having the autoscaler enforce its own minimum and maximum limits would only make such an intervention more difficult. As for the use case of using the autoscaler to drain a node group that is being manually removed, that doesn't sound related to autoscaling at all. On the other hand, the autoscaler already does some housekeeping tasks (removing unregistered nodes etc.), so perhaps it would be a reasonable feature as well. E.g. an annotation/taint essentially saying "this node is about to be removed, find space for all its pods and drain". @MaciekPytel WDYT?
Why? Please note that Cluster Autoscaler supports diverse environments and not all of them make the same assumptions. |
Having CA look for a specific tag on the ASG to scale to zero would be ideal for this use case; otherwise, telling CA what taint to observe would probably be more complicated than just cordoning a node group. I understand that force termination may not fall under the umbrella of CA's purpose, and if that is truly the case then this doc sure adds a lot of confusion to the mix. When CA integrates into an AWS environment, is it not intended to replace the functionality that the ASG provides in terms of handling scaling events? If so, then IMO it should support setting maxSize to zero as a way of terminating all instances in a node group. I also think this practice (blue/green ASGs) is relatively common with EKS. |
This logic would have to be in CA core. Triggering it in an environment-independent way sounds more realistic, although cloud provider code could probably artificially set this on nodes as well.
No, it's intended to autoscale Kubernetes clusters in a scheduling-aware way (which is what traditional metric-based autoscalers are missing). Using more than one autoscaler to scale the same resource usually ends badly, especially if they have completely different logic - one removes nodes, the other adds them back, the first one removes them again, and so on. This is why we recommend disabling all other autoscalers. The linked section refers to the case when a node group's size is already at 0. Then there are no existing nodes to use as a template, and CA has no idea what a new node will look like (and therefore, whether it makes sense to add it at all). To avoid a situation where a pod without the toleration causes a scale-up in an empty node group with a taint, that taint needs to be in the template. This way it's included in the simulation and CA won't add an unnecessary node. |
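In the AWS provider, the template for an empty group is built from node-template tags on the ASG, so the taint (and any labels) discussed above have to be expressed there. A sketch based on the tag keys documented for the AWS cloud provider; the group name, label value, and taint are illustrative:

```sh
# Tags describing what a node from this (possibly empty) ASG would look
# like, so CA can include its labels and taints in the simulation.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-eks-blue,ResourceType=auto-scaling-group,PropagateAtLaunch=true,Key=k8s.io/cluster-autoscaler/node-template/label/eks_worker_group,Value=blue" \
  "ResourceId=my-eks-blue,ResourceType=auto-scaling-group,PropagateAtLaunch=true,Key=k8s.io/cluster-autoscaler/node-template/taint/dedicated,Value=true:NoSchedule"
```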
@aleksandra-malinowska Thanks for the clarification. Maybe the phrasing in the doc should be changed. It would be great if something in the cloud provider code (or something generic enough for core) could accommodate this, if making a maxSize of zero terminate a node group is out of the question. |
"Scaling to 0" here means "removing the last node from the node group because it's empty/underutilized". It's trivial, the only problem is scaling back from 0 - it needs to be implemented so it's not one way only. |
As for improving it, PRs are always welcome. Note that this is part of AWS cloud provider documentation and not really supported by any of the regular maintainers right now. |
I'd be glad to take a stab at it, if nothing else to understand the internals more - however I have no Go experience so I may not be the best one to get a PR out anytime soon. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
We got bitten by this too (CA will not scale an ASG up from 0). I feel this needs to be documented clearly. We use auto-discovery and had a pool which got scaled down to zero but would never scale back up. Is there any workaround for this? I would like a solution other than running an idle node all the time (minSize: 1) |
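Not a confirmed fix for the version discussed here, but the usual auto-discovery setup is sketched below (cluster and group names are placeholders); combined with the node-template tags shown earlier, later CA releases can scale a discovered group back up from zero:

```sh
# Tags that make the ASG discoverable by cluster-autoscaler.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-eks-blue,ResourceType=auto-scaling-group,PropagateAtLaunch=true,Key=k8s.io/cluster-autoscaler/enabled,Value=true" \
  "ResourceId=my-eks-blue,ResourceType=auto-scaling-group,PropagateAtLaunch=true,Key=k8s.io/cluster-autoscaler/my-cluster,Value=owned"

# Matching cluster-autoscaler flag:
#   --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
```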
@hobbsh I'm looking into doing the same thing that you're trying to do: blue/green ASGs to deploy new VM image changes. Did you come up with a solution for this using cluster-autoscaler's out-of-the-box behavior, or did you have to build custom behavior to drain the ASG when you wanted it to scale to zero for reasons other than utilization? |
@aackerman I started down the path of using Zalando's kube node drainer systemd unit which basically just does a |
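Independent of the exact Zalando unit, the core of that approach is draining the old group's nodes before their instances go away. A rough manual equivalent, again assuming the eks_worker_group=blue label from this issue:

```sh
# Evict pods from the blue group's nodes so they reschedule onto the
# other worker group before the instances are terminated.
kubectl get nodes -l eks_worker_group=blue -o name \
  | xargs -n1 kubectl drain --ignore-daemonsets --delete-local-data
```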
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/reopen |
@rverma-nikiai: You can't reopen an issue/PR unless you authored it or you are a collaborator. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/remove-lifecycle rotten |
@rverma-nikiai Do you still see this issue in version > 1.2.2? |
I had created another issue, we can leave this closed. |
Hi,
I have been unable to get the cluster-autoscaler to scale one of my two autoscaling groups to zero. In this scenario, the `blue` worker group should be scaled to zero. It's very likely I missed something, but I have been unable to track down what that might be based on the documentation and information available on the internet. I have tagged both ASGs with `k8s.io/cluster-autoscaler/node-template/label/eks_worker_group: [blue|green]`, and the nodes have the label `eks_worker_group: [blue|green]`. The nodes are also tagged with the same tag on the EC2 side. Running with the following options:
Here is status:
And this is what I see in the logs, no smoking gun that I can see: