Which component are you using?:
I'm using cluster-autoscaler's balance-similar-node-groups feature (on EKS-based clusters)
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.
I am facing the same issue described here.
We use AWS Spot Node Groups with many instance types.
For a cluster, we have 3 node groups (1 per Availability Zone), and I get an uneven balance of nodes across AZs because some node groups are not considered similar.
This happens because the difference in capacity.memory is too high. I understand why MaxCapacityMemoryDifferenceRatio should not be too high, but in this case we want to run spot instances with as many instance types in the mix as possible, so that we avoid running out of options (and having the cluster shrink because the cloud provider reclaims the instances).
It seems that in my case I end up with nodes of relatively similar capacity (the m5d.xlarge and t3.xlarge instance types are both supposed to have 4 vCPUs and 16 GiB of memory) whose capacity.memory difference ratio is higher than the currently implemented 1.5% MaxCapacityMemoryDifferenceRatio, as specified in this part of the code.
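To make the threshold concrete, here is a minimal Go sketch of the kind of check described above. It assumes the ratio is taken relative to the larger of the two capacities; the function name `memoryWithinTolerance` and the KiB figures in the usage note are hypothetical and are not taken from the autoscaler source or from real nodes.

```go
package main

// maxCapacityMemoryDifferenceRatio mirrors the 1.5% threshold mentioned above.
const maxCapacityMemoryDifferenceRatio = 0.015

// memoryWithinTolerance reports whether two memory capacities (in KiB) differ
// by no more than maxRatio of the larger one.
// Assumption: the difference is measured relative to the larger value.
func memoryWithinTolerance(aKiB, bKiB int64, maxRatio float64) bool {
	larger, smaller := float64(aKiB), float64(bKiB)
	if smaller > larger {
		larger, smaller = smaller, larger
	}
	return larger-smaller <= larger*maxRatio
}
```

With made-up capacities of 16,000,000 KiB and 16,300,000 KiB, the difference is roughly 1.8% of the larger value, so `memoryWithinTolerance` returns false at the 1.5% threshold, and two nominally "16 GiB" node groups would not be treated as similar.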
Example with an m5d.xlarge node vs. a t3.xlarge node:
m5d.xlarge:
t3.xlarge:
Describe the solution you'd like.:
Allow changing this maxDifferenceRatio to fit cloud-specific or user-specific use cases in a cloud-independent way.
I suggest making this value configurable through an optional flag specified when starting cluster-autoscaler. For example:
Describe any alternative solutions you've considered.:
Alternatively, create a new NodeInfoComparator that trusts the end user's judgment.
For instance, we could always consider two node groups similar if their nodes have a specific label in common.
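A rough Go sketch of that label-based idea follows. Everything here is an assumption for illustration: `nodeTemplate` is a simplified stand-in for the autoscaler's node information, `similarityLabel` is a hypothetical label key, and "in common" is interpreted as both nodes carrying the same key with the same value. It is not the signature of the real NodeInfoComparator.

```go
package main

// similarityLabel is a hypothetical label key that users would set on nodes
// they want the autoscaler to treat as belonging to similar node groups.
const similarityLabel = "example.com/similar-node-group"

// nodeTemplate is a simplified stand-in for a node group's template node.
type nodeTemplate struct {
	Labels map[string]string
}

// similarByLabel reports whether both templates carry the given label key
// with the same value. Capacity differences are deliberately ignored.
func similarByLabel(a, b nodeTemplate, key string) bool {
	va, okA := a.Labels[key]
	vb, okB := b.Labels[key]
	return okA && okB && va == vb
}
```

Under this sketch, two node groups whose nodes are both labeled `example.com/similar-node-group=workers` would be balanced together regardless of their memory difference, while a group missing the label would fall back to not being considered similar.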
Example: