Support adding correct Cluster Autoscaler tags to a nodegroup without granting self-scaling permissions #2263
Labels:
area/autoscaler
kind/feature (New feature or request)
priority/backlog (Not staffed at the moment. Help wanted.)
priority/important-longterm (Important over the long term, but may not be currently staffed and/or may require multiple releases)
Why do you want this feature?
Currently, the correct Cluster Autoscaler tags are only created for a NodeGroup if that NodeGroup also has the Instance Role created to run the Cluster Autoscaler in that group.
This leads to a few problems:
`--asg-access` (the command-line equivalent of the IAM instance role addon for CA) is the way to mark nodegroups that will be auto-scaled (or rather, it doesn't usefully canvas the alternative of adding those tags yourself).
The current workaround is to manually insert the tags into each NodeGroup, while avoiding typos and bitrot, e.g., the pair of tags with a comment at the IRSA example config. The presence of the comment suggests this is not a clear UX.
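For reference, the manual workaround looks roughly like the following. This is a minimal sketch: the cluster name, region, nodegroup name and sizes are placeholders I've made up, and the tag keys are the ones Cluster Autoscaler auto-discovery conventionally looks for.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster   # placeholder cluster name
  region: us-west-2       # placeholder region

nodeGroups:
  - name: ng-scalable     # placeholder nodegroup name
    minSize: 0
    maxSize: 5
    desiredCapacity: 1
    # Note: no iam.withAddonPolicies.autoScaler here, so nodes in this
    # group are not granted permission to scale ASGs themselves.
    tags:
      # Tags for Cluster Autoscaler auto-discovery. The second key has to
      # embed metadata.name exactly, which is where typos and bitrot creep in.
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/example-cluster: "owned"
```

Renaming the cluster without updating the second tag key would silently drop the nodegroup from auto-discovery, which is the kind of breakage a dedicated option would avoid.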
See also #1481 (comment) which has separately discovered that it's weird to have the "can be autoscaled" setup only be done for "can host autoscaler" nodegroups, since in the scale-up-from-zero case, you'd clearly never try and host the autoscaler in a nodegroup that can reach zero running nodes.
What feature/behavior/change do you want?
An extra config flag for `NodeGroup`, something like `enableAutoscaling`, which would add the appropriate tags. The name needs to make clear that it makes that NodeGroup a candidate for auto-scaling, but does not set up any permissions that might be needed to run the Cluster Autoscaler.
Comments in #1481 suggested `tagForAutoScaler` as the config option name.
Enabling the IAM instance role addon for ClusterAutoscaler would also enable this config flag for that NodeGroup, preserving existing behaviour. (Or alternatively, the tags could be added in the presence of either, it's the same to the user either way.)
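To make the proposal concrete, here is a sketch of how the config might read. `tagForAutoScaler` is hypothetical: it is only the name suggested in #1481 and is not an existing eksctl option. The nodegroup names and sizes are placeholders as well.

```yaml
nodeGroups:
  # Can be scaled by Cluster Autoscaler, but should not host it:
  # tags only, no ASG permissions on the instance role.
  - name: ng-workers
    minSize: 0
    maxSize: 10
    tagForAutoScaler: true   # HYPOTHETICAL field proposed in this issue

  # Hosts Cluster Autoscaler: the existing addon policy, which would
  # keep implying the tags so that current behaviour is preserved.
  - name: ng-system
    minSize: 1
    maxSize: 3
    iam:
      withAddonPolicies:
        autoScaler: true
```

Under this sketch, a scale-from-zero group such as `ng-workers` gets discovered without any pod on its nodes being able to scale other nodegroups.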
I'm not sure how that would look on the command-line. It's already weird that `--asg-access` both marks a NodeGroup for scaling and applies permissions to allow running CA (and hence any pod on those nodes can go scale other node groups in the cluster, if it so wishes). I haven't used the command-line options for defining resources, so I don't have strong feelings about this.
Another option is that NodeGroups are always tagged for Cluster Autoscaler discovery. This is already the case with ManagedNodeGroups. It seems reasonable that if I have min/max/desired values set to cover a range, and install the Cluster Autoscaler, those values would be honoured by default.
I don't currently know of a use-case where you might want those min/max/desired values set to cover a range, but be ignored by the Cluster Autoscaler, but I haven't gone looking for one either.