Support adding correct Cluster Autoscaler tags to a nodegroup without granting self-scaling permissions #2263

Open
TBBle opened this issue May 29, 2020 · 2 comments
Labels
area/autoscaler · kind/feature (New feature or request) · priority/backlog (Not staffed at the moment. Help wanted.) · priority/important-longterm (Important over the long term, but may not be currently staffed and/or may require multiple releases)

Comments

@TBBle

TBBle commented May 29, 2020

Why do you want this feature?

Currently, the correct Cluster Autoscaler tags are only created for a NodeGroup if that NodeGroup also has the Instance Role created to run the Cluster Autoscaler in that group.

This has a few drawbacks:

  • Only the nodegroup where the Cluster Autoscaler is running needs those permissions, but the documentation implies that --asg-access (the command-line equivalent of the IAM instance role addon for CA) is the way to mark nodegroups that will be auto-scaled (or rather, it doesn't usefully canvass the alternative of adding those tags yourself).
  • When using IRSA, none of the nodegroups will have the IAM instance role, and so none get the tags they need.
  • You cannot run the Cluster Autoscaler in a NodeGroup that is not set up for autoscaling, because the same flag does both.

The current workaround is to manually insert the tags into each NodeGroup, while avoiding typos and bitrot, e.g., the pair of tags with a comment in the IRSA example config. The presence of the comment suggests this is not a clear UX.
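As a rough sketch of that workaround, the config below tags a worker nodegroup by hand while only the nodegroup hosting the Cluster Autoscaler gets the IAM addon policy. The cluster name, region, nodegroup names, and sizes are placeholders chosen for illustration; the tag keys follow the Cluster Autoscaler's usual AWS auto-discovery convention, and the second key must embed your actual cluster name.

```yaml
# Illustrative only: names, region, and sizes are assumptions, not taken from a real cluster.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster
  region: us-west-2

nodeGroups:
  # Hosts the Cluster Autoscaler: receives the IAM policy and,
  # per the behaviour described in this issue, the discovery tags as well.
  - name: ng-autoscaler-host
    minSize: 1
    maxSize: 3
    desiredCapacity: 1
    iam:
      withAddonPolicies:
        autoScaler: true

  # Scaling target only: no extra IAM permissions, so the tags
  # must currently be written out by hand.
  - name: ng-workers
    minSize: 0
    maxSize: 10
    desiredCapacity: 1
    tags:
      k8s.io/cluster-autoscaler/enabled: "true"
      # The key suffix must match metadata.name exactly; a typo here silently breaks discovery.
      k8s.io/cluster-autoscaler/example-cluster: "owned"
```

The hand-written tag keys are where typos and bitrot creep in, for example after renaming the cluster.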

See also #1481 (comment) which has separately discovered that it's weird to have the "can be autoscaled" setup only be done for "can host autoscaler" nodegroups, since in the scale-up-from-zero case, you'd clearly never try and host the autoscaler in a nodegroup that can reach zero running nodes.

The scaling targets do not need any additional IAM permissions, and the best practice would be to not grant this permission where it is not needed.

What feature/behavior/change do you want?

An extra config flag for NodeGroup, something like enableAutoscaling which would add the appropriate tags. The name needs to make clear that it makes that NodeGroup a candidate for auto-scaling, but does not set up any permissions that might be needed to run the Cluster Autoscaler.

Comments in #1481 suggested tagForAutoScaler as the config option name.

Enabling the IAM instance role addon for ClusterAutoscaler would also enable this config flag for that NodeGroup, preserving existing behaviour. (Alternatively, the tags could be added in the presence of either; it's the same to the user either way.)
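To make the request concrete, here is one way the config might look if such a flag existed. This is hypothetical: tagForAutoScaler is only the name suggested in #1481, not a field eksctl is known to accept, and the cluster and nodegroup names are placeholders.

```yaml
# Hypothetical sketch of the proposed feature; tagForAutoScaler is NOT an existing eksctl field.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster
  region: us-west-2

nodeGroups:
  # Existing behaviour preserved: the addon policy would imply the tags.
  - name: ng-autoscaler-host
    minSize: 1
    maxSize: 3
    iam:
      withAddonPolicies:
        autoScaler: true

  # Proposed: tags only, with no IAM permissions to scale other nodegroups.
  - name: ng-workers
    minSize: 0
    maxSize: 10
    tagForAutoScaler: true  # hypothetical flag name
```

Under this sketch, eksctl would derive the cluster-specific tag key from metadata.name, so the user never types it.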

I'm not sure how that would look on the command-line. It's already weird that --asg-access both marks a NodeGroup for scaling, and applies permissions to allow running CA (and hence any pod on those nodes can go scale other node groups in the cluster, if it so wishes). I haven't used the command-line options for defining resources, so I don't have strong feelings about this.

Another option is for NodeGroups to always be tagged for Cluster Autoscaler discovery. This is already the case with ManagedNodeGroups. It seems reasonable that if I have min/max/desired values set to cover a range, and install the Cluster Autoscaler, those values would be honoured by default.

I don't currently know of a use case where you would want those min/max/desired values to cover a range yet be ignored by the Cluster Autoscaler, though I haven't gone looking for one either.

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Jan 18, 2021
@dougbyrne

I still desire this feature.
