
feat: set IgnoreDaemonSetsUtilization per nodegroup for AWS #5672

Conversation

vadasambar (Member)

What type of PR is this?

/kind feature

What this PR does / why we need it:

We want to support the ability to specify IgnoreDaemonSetsUtilization per nodegroup through ASG tags, to cater for the following case (among other possible cases):

  1. IgnoreDaemonSetsUtilization is set to false globally because we want to consider daemonsets when calculating node resource utilization. This works well for ASGs/nodegroups with large nodes, where daemonset utilization doesn't contribute much to node resource utilization, so node scale-down works without problems.
  2. But we also have some ASGs/nodegroups with small nodes, where daemonset utilization contributes a lot to node resource utilization. Because of this, the nodes might not get scaled down, since node resource utilization doesn't fall below ScaleDownUtilizationThreshold.

To solve this problem, we want to support setting IgnoreDaemonSetsUtilization per ASG/nodegroup through a tag like `k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization: true`, which would override the global value of IgnoreDaemonSetsUtilization (false in this example).
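For illustration only, here is a minimal, self-contained Go sketch of how a node-template autoscaling-options tag could be mapped onto per-nodegroup options. The struct, helper name, and parsing rules are assumptions for the sketch, not the exact code this PR adds to the AWS provider's GetOptions:

```go
// Illustrative sketch only: shows how an ASG tag under the
// "k8s.io/cluster-autoscaler/node-template/autoscaling-options/" prefix
// could override the global defaults for a single nodegroup.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// NodeGroupAutoscalingOptions mirrors (a subset of) the cluster-autoscaler options struct.
type NodeGroupAutoscalingOptions struct {
	ScaleDownUtilizationThreshold float64
	IgnoreDaemonSetsUtilization   bool
}

const optionsTagPrefix = "k8s.io/cluster-autoscaler/node-template/autoscaling-options/"

// applyOptionsTags overrides the global defaults with values found in ASG tags.
func applyOptionsTags(defaults NodeGroupAutoscalingOptions, tags map[string]string) NodeGroupAutoscalingOptions {
	opts := defaults
	for key, value := range tags {
		if !strings.HasPrefix(key, optionsTagPrefix) {
			continue
		}
		switch strings.TrimPrefix(key, optionsTagPrefix) {
		case "ignoredaemonsetsutilization":
			if v, err := strconv.ParseBool(value); err == nil {
				opts.IgnoreDaemonSetsUtilization = v
			}
		case "scaledownutilizationthreshold":
			if v, err := strconv.ParseFloat(value, 64); err == nil {
				opts.ScaleDownUtilizationThreshold = v
			}
		}
	}
	return opts
}

func main() {
	defaults := NodeGroupAutoscalingOptions{ScaleDownUtilizationThreshold: 0.5, IgnoreDaemonSetsUtilization: false}
	tags := map[string]string{optionsTagPrefix + "ignoredaemonsetsutilization": "true"}
	fmt.Printf("%+v\n", applyOptionsTags(defaults, tags))
}
```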

More details: #5399

Which issue(s) this PR fixes:

Fixes #5399

Special notes for your reviewer:

Nothing in particular.

Does this PR introduce a user-facing change?

Added: support to ignore daemonset resource utilization per nodegroup for AWS using the tag `k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization`

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 10, 2023
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 10, 2023
@vadasambar vadasambar force-pushed the feat/5399/ignore-daemonsets-utilization-per-nodegroup branch from b192159 to 7fa229d Compare April 10, 2023 06:09
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 10, 2023
@vadasambar (Member Author):

@drmorr0, @feiskyer
This is a WIP implementation. I will mention the reviewers here once I am done. Meanwhile, if you have any feedback on the PR, I would love to have it. Please note that at the time of writing, this is a quick, dirty, and I think incomplete implementation of the solution. It needs more polishing and thinking (I have outright replaced an interface).

@vadasambar vadasambar changed the title feat: set IgnoreDaemonSetsUtilization per nodegroup feat: set IgnoreDaemonSetsUtilization per nodegroup for AWS Apr 10, 2023
@vadasambar (Member Author) commented Apr 10, 2023:

There are many cloud providers that have GetOptions implemented and many that don't. We might want to support this feature for all cloud providers that implement GetOptions and support setting tags on their specific nodegroup implementations. I'm not sure if doing it in this PR is a good idea. I think it is, but I wonder if we'd need reviews from the maintainers of all those cloud providers, which might explode the scope of this PR. If anyone has any feedback or opinion on this, I would love to hear it.

suraj@suraj:~/sandbox/autoscaler/cluster-autoscaler$ grep -R ") GetOptions" ./cloudprovider -A 2 --exclude-dir=vendor
./cloudprovider/rancher/rancher_nodegroup.go:func (ng *nodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/rancher/rancher_nodegroup.go-   return nil, cloudprovider.ErrNotImplemented
./cloudprovider/rancher/rancher_nodegroup.go-}
--
./cloudprovider/hetzner/hetzner_node_group.go:func (n *hetznerNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/hetzner/hetzner_node_group.go-  return nil, cloudprovider.ErrNotImplemented
./cloudprovider/hetzner/hetzner_node_group.go-}
--
./cloudprovider/tencentcloud/tencentcloud_auto_scaling_group.go:func (asg *tcAsg) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/tencentcloud/tencentcloud_auto_scaling_group.go-        return nil, nil
./cloudprovider/tencentcloud/tencentcloud_auto_scaling_group.go-}
--
./cloudprovider/aws/aws_cloud_provider.go:func (ng *AwsNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/aws/aws_cloud_provider.go-      if ng.asg == nil || ng.asg.Tags == nil || len(ng.asg.Tags) == 0 {
./cloudprovider/aws/aws_cloud_provider.go-              return &defaults, nil
--
./cloudprovider/test/test_cloud_provider.go:func (tng *TestNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/test/test_cloud_provider.go-    return tng.opts, nil
./cloudprovider/test/test_cloud_provider.go-}
--
./cloudprovider/magnum/magnum_nodegroup.go:func (ng *magnumNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/magnum/magnum_nodegroup.go-     return nil, cloudprovider.ErrNotImplemented
./cloudprovider/magnum/magnum_nodegroup.go-}
--
./cloudprovider/digitalocean/digitalocean_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/digitalocean/digitalocean_node_group.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/digitalocean/digitalocean_node_group.go-}
--
./cloudprovider/scaleway/scaleway_node_group.go:func (ng *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/scaleway/scaleway_node_group.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/scaleway/scaleway_node_group.go-}
--
./cloudprovider/cherryservers/cherry_node_group.go:func (ng *cherryNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/cherryservers/cherry_node_group.go-     return nil, cloudprovider.ErrNotImplemented
./cloudprovider/cherryservers/cherry_node_group.go-}
--
./cloudprovider/mocks/NodeGroup.go:func (_m *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/mocks/NodeGroup.go-     ret := _m.Called(defaults)
./cloudprovider/mocks/NodeGroup.go-
--
./cloudprovider/bizflycloud/bizflycloud_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/bizflycloud/bizflycloud_node_group.go-  return nil, cloudprovider.ErrNotImplemented
./cloudprovider/bizflycloud/bizflycloud_node_group.go-}
--
./cloudprovider/kamatera/kamatera_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/kamatera/kamatera_node_group.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/kamatera/kamatera_node_group.go-}
--
./cloudprovider/brightbox/brightbox_node_group.go:func (ng *brightboxNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/brightbox/brightbox_node_group.go-      return nil, cloudprovider.ErrNotImplemented
./cloudprovider/brightbox/brightbox_node_group.go-}
--
./cloudprovider/cloudstack/cloudstack_node_group.go:func (asg *asg) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/cloudstack/cloudstack_node_group.go-    return nil, cloudprovider.ErrNotImplemented
./cloudprovider/cloudstack/cloudstack_node_group.go-}
--
./cloudprovider/azure/azure_scale_set.go:func (scaleSet *ScaleSet) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/azure/azure_scale_set.go-       template, err := scaleSet.getVMSSFromCache()
./cloudprovider/azure/azure_scale_set.go-       if err != nil {
--
./cloudprovider/azure/azure_kubernetes_service_pool.go:func (agentPool *AKSAgentPool) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/azure/azure_kubernetes_service_pool.go- return nil, cloudprovider.ErrNotImplemented
./cloudprovider/azure/azure_kubernetes_service_pool.go-}
--
./cloudprovider/azure/azure_agent_pool.go:func (as *AgentPool) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/azure/azure_agent_pool.go-      return nil, cloudprovider.ErrNotImplemented
./cloudprovider/azure/azure_agent_pool.go-}
--
./cloudprovider/externalgrpc/externalgrpc_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/externalgrpc/externalgrpc_node_group.go-        ctx, cancel := context.WithTimeout(context.Background(), grpcTimeout)
./cloudprovider/externalgrpc/externalgrpc_node_group.go-        defer cancel()
--
./cloudprovider/packet/packet_node_group.go:func (ng *packetNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/packet/packet_node_group.go-    return nil, cloudprovider.ErrNotImplemented
./cloudprovider/packet/packet_node_group.go-}
--
./cloudprovider/ionoscloud/ionoscloud_cloud_provider.go:func (n *nodePool) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/ionoscloud/ionoscloud_cloud_provider.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/ionoscloud/ionoscloud_cloud_provider.go-}
--
./cloudprovider/baiducloud/baiducloud_cloud_provider.go:func (asg *Asg) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/baiducloud/baiducloud_cloud_provider.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/baiducloud/baiducloud_cloud_provider.go-}
--
./cloudprovider/gce/gce_cloud_provider.go:func (mig *gceMig) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/gce/gce_cloud_provider.go-      return mig.gceManager.GetMigOptions(mig, defaults), nil
./cloudprovider/gce/gce_cloud_provider.go-}
--
./cloudprovider/civo/civo_node_group.go:func (n *NodeGroup) GetOptions(autoscaler.NodeGroupAutoscalingOptions) (*autoscaler.NodeGroupAutoscalingOptions, error) {
./cloudprovider/civo/civo_node_group.go-        return n.getOptions, nil
./cloudprovider/civo/civo_node_group.go-}
--
./cloudprovider/huaweicloud/huaweicloud_auto_scaling_group.go:func (asg *AutoScalingGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/huaweicloud/huaweicloud_auto_scaling_group.go-  return nil, cloudprovider.ErrNotImplemented
./cloudprovider/huaweicloud/huaweicloud_auto_scaling_group.go-}
--
./cloudprovider/vultr/vultr_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/vultr/vultr_node_group.go-      return nil, cloudprovider.ErrNotImplemented
./cloudprovider/vultr/vultr_node_group.go-}
--
./cloudprovider/exoscale/exoscale_node_group_sks_nodepool.go:func (n *sksNodepoolNodeGroup) GetOptions(_ config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/exoscale/exoscale_node_group_sks_nodepool.go-   return nil, cloudprovider.ErrNotImplemented
./cloudprovider/exoscale/exoscale_node_group_sks_nodepool.go-}
--
./cloudprovider/exoscale/exoscale_node_group_instance_pool.go:func (n *instancePoolNodeGroup) GetOptions(_ config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/exoscale/exoscale_node_group_instance_pool.go-  return nil, cloudprovider.ErrNotImplemented
./cloudprovider/exoscale/exoscale_node_group_instance_pool.go-}
--
./cloudprovider/linode/linode_node_group.go:func (n *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/linode/linode_node_group.go-    return nil, cloudprovider.ErrNotImplemented
./cloudprovider/linode/linode_node_group.go-}
--
./cloudprovider/clusterapi/clusterapi_nodegroup.go:func (ng *nodegroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/clusterapi/clusterapi_nodegroup.go-     return nil, cloudprovider.ErrNotImplemented
./cloudprovider/clusterapi/clusterapi_nodegroup.go-}
--
./cloudprovider/oci/oci_instance_pool.go:func (ip *InstancePoolNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/oci/oci_instance_pool.go-       return nil, cloudprovider.ErrNotImplemented
./cloudprovider/oci/oci_instance_pool.go-}
--
./cloudprovider/alicloud/alicloud_auto_scaling_group.go:func (asg *Asg) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/alicloud/alicloud_auto_scaling_group.go-        return nil, cloudprovider.ErrNotImplemented
./cloudprovider/alicloud/alicloud_auto_scaling_group.go-}
--
./cloudprovider/kubemark/kubemark_linux.go:func (nodeGroup *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/kubemark/kubemark_linux.go-     return nil, cloudprovider.ErrNotImplemented
./cloudprovider/kubemark/kubemark_linux.go-}
--
./cloudprovider/ovhcloud/ovh_cloud_node_group.go:func (ng *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
./cloudprovider/ovhcloud/ovh_cloud_node_group.go-       // If node group autoscaling options nil, return defaults
./cloudprovider/ovhcloud/ovh_cloud_node_group.go-       if ng.Autoscaling == nil {

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 11, 2023
}

-type utilizationThresholdGetter interface {
+type nodeGroupConfigGetter interface {
vadasambar (Member Author):

I have renamed utilizationThresholdGetter -> nodeGroupConfigGetter here because the old name doesn't make sense after adding GetIgnoreDaemonSetsUtilization. My understanding is that we use this interface only to expose the required functions from NodeGroupConfigProcessor (ref1, ref2).
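As a rough, self-contained sketch of that idea (placeholder types stand in for *context.AutoscalingContext and cloudprovider.NodeGroup; the real interface in the PR may expose a different set of methods):

```go
// Sketch only: a narrowed view of NodeGroupConfigProcessor exposing just the
// per-nodegroup settings the scale-down code needs.
package main

import "fmt"

// Stand-ins for the real cluster-autoscaler types (assumptions for this sketch).
type AutoscalingContext struct{}

type NodeGroup interface{ Id() string }

// nodeGroupConfigGetter exposes only the methods actually used by the caller.
type nodeGroupConfigGetter interface {
	GetScaleDownUtilizationThreshold(ctx *AutoscalingContext, ng NodeGroup) (float64, error)
	GetIgnoreDaemonSetsUtilization(ctx *AutoscalingContext, ng NodeGroup) (bool, error)
}

// fixedConfig is a toy implementation returning constant values.
type fixedConfig struct {
	threshold float64
	ignoreDS  bool
}

func (f fixedConfig) GetScaleDownUtilizationThreshold(*AutoscalingContext, NodeGroup) (float64, error) {
	return f.threshold, nil
}

func (f fixedConfig) GetIgnoreDaemonSetsUtilization(*AutoscalingContext, NodeGroup) (bool, error) {
	return f.ignoreDS, nil
}

func main() {
	var g nodeGroupConfigGetter = fixedConfig{threshold: 0.5, ignoreDS: true}
	ignore, _ := g.GetIgnoreDaemonSetsUtilization(&AutoscalingContext{}, nil)
	fmt.Println("ignore daemonset utilization:", ignore)
}
```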

return simulator.UnexpectedError, nil
}

gpuConfig := context.CloudProvider.GetNodeGpuConfig(node)
vadasambar (Member Author):

The code from lines 141 to 146 here is just a copy-paste of lines 121 to 126 here. I have moved it down because I want to use the nodegroup when calling GetIgnoreDaemonSetsUtilization to get the per-ASG value of IgnoreDaemonSetsUtilization, and then pass it to utilization.Calculate on line 142 below.
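A self-contained toy illustrating the described order of operations, assuming a simplified stand-in for utilization.Calculate (the real function and its argument list live in the cluster-autoscaler code and differ):

```go
// Sketch only: fetch the per-nodegroup IgnoreDaemonSetsUtilization value first,
// then pass it into the utilization calculation, as described above.
// Types, names, and the toy calculate function are illustrative assumptions.
package main

import "fmt"

type podInfo struct {
	cpuRequest  float64
	isDaemonSet bool
}

// calculate mimics the effect of the ignoreDaemonSetsUtilization argument:
// daemonset pods are skipped when it is true.
func calculate(pods []podInfo, cpuCapacity float64, ignoreDaemonSetsUtilization bool) float64 {
	var requested float64
	for _, p := range pods {
		if ignoreDaemonSetsUtilization && p.isDaemonSet {
			continue
		}
		requested += p.cpuRequest
	}
	return requested / cpuCapacity
}

func main() {
	// Per-nodegroup values, e.g. resolved from ASG tags via GetOptions.
	perNodeGroup := map[string]bool{"small-asg": true, "large-asg": false}

	pods := []podInfo{{cpuRequest: 0.5}, {cpuRequest: 1, isDaemonSet: true}}

	for _, ng := range []string{"small-asg", "large-asg"} {
		// 1. Look up the per-nodegroup (per-ASG) setting.
		ignoreDS := perNodeGroup[ng]
		// 2. Pass it into the utilization calculation.
		fmt.Printf("%s utilization: %.0f%%\n", ng, calculate(pods, 2, ignoreDS)*100)
	}
}
```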


allTestCases := testCases

// run all test cases again with `IgnoreDaemonSetsUtilization` set to true
vadasambar (Member Author):

This is so that we test against both IgnoreDaemonSetsUtilization: true and IgnoreDaemonSetsUtilization: false.

Member (reviewer):

This will cause different test cases to have the same desc, making them harder to debug. For the sake of readability, I'd just have one long list of test cases to check, but if you have to generate them programmatically like this, please update desc as well.

vadasambar (Member Author):

Thank you for the suggestion.

I have now started adding a suffix, like `<test-case-desc> IgnoreDaemonSetsUtilization=true` and `<test-case-desc> IgnoreDaemonSetsUtilization=false`.
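For illustration, one way such duplication with unique descriptions might look (field names are illustrative, not the PR's exact test structs):

```go
// Sketch only: run every test case under both values of IgnoreDaemonSetsUtilization
// while keeping descriptions unique so failures stay easy to identify.
package main

import "fmt"

type testCase struct {
	desc                        string
	ignoreDaemonSetsUtilization bool
}

func withBothDaemonSetModes(base []testCase) []testCase {
	var all []testCase
	for _, tc := range base {
		off := tc
		off.ignoreDaemonSetsUtilization = false
		off.desc += " IgnoreDaemonSetsUtilization=false"
		on := tc
		on.ignoreDaemonSetsUtilization = true
		on.desc += " IgnoreDaemonSetsUtilization=true"
		all = append(all, off, on)
	}
	return all
}

func main() {
	for _, tc := range withBothDaemonSetModes([]testCase{{desc: "node below threshold"}}) {
		fmt.Println(tc.desc, tc.ignoreDaemonSetsUtilization)
	}
}
```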

@@ -137,14 +158,17 @@ func TestFilterOutUnremovable(t *testing.T) {
}
}

-type staticThresholdGetter struct {
-	threshold float64
+type staticNodeGroupConfigProcessor struct {
vadasambar (Member Author):

I have replaced staticThresholdGetter here with staticNodeGroupConfigProcessor because I think it is a better implementation: we don't hard-code the threshold value to 0.5 but instead use the default values set here.

// from NodeGroupConfigProcessor interface
type actuatorNodeGroupConfigGetter interface {
// GetIgnoreDaemonSetsUtilization returns IgnoreDaemonSetsUtilization value that should be used for a given NodeGroup.
GetIgnoreDaemonSetsUtilization(context *context.AutoscalingContext, nodeGroup cloudprovider.NodeGroup) (bool, error)
vadasambar (Member Author):

This interface is to limit the functions that can be used from the processors.NodeGroupConfigProcessor interface. Since I only need GetIgnoreDaemonSetsUtilization here, I don't see the need to expose all the functions from processors.NodeGroupConfigProcessor.

@vadasambar (Member Author):

There are many cloud providers that have GetOptions implemented and many that don't. We might want to support this feature for all cloud providers that implement GetOptions and support setting tags on their specific nodegroup implementations. I'm not sure if doing it in this PR is a good idea. I think it is, but I wonder if we'd need reviews from the maintainers of all those cloud providers, which might explode the scope of this PR. If anyone has any feedback or opinion on this, I would love to hear it.
grep -R ") GetOptions" ./cloudprovider -A 2 --exclude-dir=vendor

I think this PR is becoming too big. I will be supporting this tag only for AWS in this PR.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2023
@vadasambar vadasambar force-pushed the feat/5399/ignore-daemonsets-utilization-per-nodegroup branch from bcbd017 to ca74c9c Compare April 19, 2023 04:31
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 19, 2023
},
{
nodeGroup: testNg2,
testCases: getStartDeletionTestCases(testNg2),
@vadasambar (Member Author) commented Apr 19, 2023:

I am running the test cases twice, once for each nodegroup defined above: in the first nodegroup IgnoreDaemonSetsUtilization is false, and in the second nodegroup IgnoreDaemonSetsUtilization is true. This is to test the IgnoreDaemonSetsUtilization option on the nodegroup, which is read here and reached from StartDeletion (the function we are trying to test here) via deleteAsyncDrain and deleteAsyncEmpty.
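A rough sketch of that test setup, with illustrative types in place of the real test helpers and node group structs:

```go
// Sketch only: generate the same StartDeletion scenarios for two node groups that
// differ only in IgnoreDaemonSetsUtilization. Names are illustrative assumptions.
package main

import "fmt"

type nodeGroupOptions struct {
	IgnoreDaemonSetsUtilization bool
}

type nodeGroup struct {
	name string
	opts nodeGroupOptions
}

// getStartDeletionTestCases would, in the real tests, build the full scenario list
// for the given node group; here it just returns scenario names.
func getStartDeletionTestCases(ng nodeGroup) []string {
	return []string{ng.name + ": empty node deletion", ng.name + ": drained node deletion"}
}

func main() {
	testNg1 := nodeGroup{name: "ng1", opts: nodeGroupOptions{IgnoreDaemonSetsUtilization: false}}
	testNg2 := nodeGroup{name: "ng2", opts: nodeGroupOptions{IgnoreDaemonSetsUtilization: true}}
	for _, ng := range []nodeGroup{testNg1, testNg2} {
		for _, tc := range getStartDeletionTestCases(ng) {
			fmt.Println(tc, "ignoreDS:", ng.opts.IgnoreDaemonSetsUtilization)
		}
	}
}
```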

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2023
@vadasambar vadasambar force-pushed the feat/5399/ignore-daemonsets-utilization-per-nodegroup branch from 0cf8581 to 4400d80 Compare April 20, 2023 06:17
@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 26, 2023
@vadasambar vadasambar force-pushed the feat/5399/ignore-daemonsets-utilization-per-nodegroup branch from 97ce06c to c55f018 Compare July 4, 2023 06:47
@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jul 4, 2023
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 5, 2023
Signed-off-by: vadasambar <[email protected]>

fix: test cases failing for actuator and scaledown/eligibility
- abstract default values into `config`
Signed-off-by: vadasambar <[email protected]>

refactor: rename global `IgnoreDaemonSetsUtilization` -> `GlobalIgnoreDaemonSetsUtilization` in code
- there is no change in the flag name
- rename `thresholdGetter` -> `configGetter` and tweak it to accomodate `GetIgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <[email protected]>

refactor: reset help text for `ignore-daemonsets-utilization` flag
- because per nodegroup override is supported only for AWS ASG tags as of now
Signed-off-by: vadasambar <[email protected]>

docs: add info about overriding `--ignore-daemonsets-utilization` per ASG
- in AWS cloud provider README
Signed-off-by: vadasambar <[email protected]>

refactor: use a limiting interface in actuator in place of `NodeGroupConfigProcessor` interface
- to limit the functions that can be used
- since we need it only for `GetIgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <[email protected]>

fix: tests failing for actuator
- rename `staticNodeGroupConfigProcessor` -> `MockNodeGroupConfigGetter`
- move `MockNodeGroupConfigGetter` to test/common so that it can be used in different tests
Signed-off-by: vadasambar <[email protected]>

fix: go lint errors for `MockNodeGroupConfigGetter`
Signed-off-by: vadasambar <[email protected]>

test: add tests for `IgnoreDaemonSetsUtilization` in cloud provider dir
Signed-off-by: vadasambar <[email protected]>

test: update node group config processor tests for `IgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <[email protected]>

test: update eligibility test cases for `IgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <[email protected]>

test: run actuation tests for 2 NGS
- one with `IgnoreDaemonSetsUtilization`: `false`
- one with `IgnoreDaemonSetsUtilization`: `true`
Signed-off-by: vadasambar <[email protected]>

test: add tests for `IgnoreDaemonSetsUtilization` in actuator
- add helper to generate multiple ds pods dynamically
- get rid of mock config processor because it is not required
Signed-off-by: vadasambar <[email protected]>

test: fix failing tests for actuator
Signed-off-by: vadasambar <[email protected]>

refactor: remove `GlobalIgnoreDaemonSetUtilization` autoscaling option
- not required
Signed-off-by: vadasambar <[email protected]>

fix: warn message `DefaultScaleDownUnreadyTimeKey` -> `DefaultIgnoreDaemonSetsUtilizationKey`
Signed-off-by: vadasambar <[email protected]>

refactor: use `generateDsPods` instead of `generateDsPod`
Signed-off-by: vadasambar <[email protected]>

refactor: `globaIgnoreDaemonSetsUtilization` -> `ignoreDaemonSetsUtilization`
Signed-off-by: vadasambar <[email protected]>
- instead of passing all the processors (we only need `NodeGroupConfigProcessor`)
Signed-off-by: vadasambar <[email protected]>
- add suffix to tests with `IgnoreDaemonSetsUtilization` set to `true` and `IgnoreDaemonSetsUtilization` set to `false`
Signed-off-by: vadasambar <[email protected]>
@vadasambar vadasambar force-pushed the feat/5399/ignore-daemonsets-utilization-per-nodegroup branch from 38a9a4d to e1a22da Compare July 6, 2023 05:20
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 6, 2023

for _, testSet := range testSets {
for tn, tc := range testSet {
t.Run(tn, func(t *testing.T) {
vadasambar (Member Author):

Code from this line and below is just a copy and paste of the older code (shifted by some lines). I haven't changed anything in the code until https://github.com/kubernetes/autoscaler/pull/5672/files#r1174841347

@vadasambar (Member Author):

@x13n I've addressed all the review comments. Can you please review this PR again 🙏

@vadasambar vadasambar requested a review from x13n July 6, 2023 09:57
@vadasambar (Member Author):

@x13n I've addressed all the review comments. Can you please review this PR again 🙏

@x13n, I wanted to remind you about this ^

@x13n (Member) left a comment:

Thanks for the changes, looks good to me!

/lgtm

@@ -67,6 +67,15 @@ func BuildTestPod(name string, cpu int64, mem int64) *apiv1.Pod {
return pod
}

// BuildDSTestPod creates a DaemonSet pod with cpu and memory.
func BuildDSTestPod(name string, cpu int64, mem int64) *apiv1.Pod {
Member (reviewer):

Separate PR makes a lot of sense, thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 12, 2023
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MaciekPytel, vadasambar, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@x13n (Member) commented Jul 12, 2023:

Oh, and the hold was meant for me, so removing:

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 12, 2023
@k8s-ci-robot k8s-ci-robot merged commit c6893e9 into kubernetes:master Jul 12, 2023
@seh commented Jul 12, 2023:

I read #5399 first, and then the discussion here, and it's still not clear to me what problem you encounter by ignoring DaemonSet utilization globally. The description notes:

IgnoreDaemonSetsUtilization is set false globally because we want to consider daemonsets when calculating node resource utilization. This works well for ASGs/nodegroups which have large nodes and daemonsets utilization doesn't contribute much to node resource utilization. Due to this, we have no problem during node scale down.

If the nodes are large enough and the contribution from DaemonSets is proportionally small, what harm comes from ignoring their contribution on these large nodes too? What is the benefit of noting their contribution there?

@vadasambar (Member Author):

@seh

One problem with this (I think) is that scale-down wouldn't consider node resource utilization accurately, because it would always ignore daemonset resource utilization.

#5399 (comment)

If the nodes are large enough and the contribution from DaemonSets is proportionally small, what harm comes from ignoring their contribution on these large nodes too? What is the benefit of noting their contribution there?

If daemonset consumption is really small, I think ignoring daemonset utilization makes sense, e.g., only 1 CPU consumed by daemonsets on a 32-CPU node. It might not matter as much.

But consider relatively larger consumption by daemonsets, say 1 CPU on a 10-CPU node. For example, imagine a scenario like this:

  • IgnoreDaemonSetsUtilization is set to true (globally)
  • Cluster has 2 nodegroups
    • nodegroup 1: 10 CPUs
    • nodegroup 2: 2 CPUs
  • Daemonsets running on each node use 1 CPU
  • --scale-down-utilization-threshold is set to 0.7 (i.e., scale down nodes with total CPU requests less than 70% of the node's CPU capacity)
  • nodegroup 1 runs application workloads which in total use 6 CPUs
  • nodegroup 2 runs short-lived workloads which in total use 0.5 CPU
  1. CPU utilization in nodegroup 1 according to CA (cluster-autoscaler): 6/10 => 60% (since we are ignoring daemonsets)
  2. CPU utilization in nodegroup 2 according to CA: 0.5/2 => 25%

In case 2, we want the node to get scaled down. But in case 1, we might not want the node to get scaled down: ignoring daemonset utilization for nodegroup 1 means CA would scale down nodes in nodegroup 1 even though the node is actually using 70% of its CPU (if you include daemonset utilization). This means the pods would need to find a new home, which can create potential disruption (application downtime until a new node is brought up by CA).
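To make the arithmetic explicit, here is a tiny sketch plugging in the numbers from this example (a toy comparison, not cluster-autoscaler code):

```go
// Toy calculation for the example above; mirrors how --scale-down-utilization-threshold
// is compared against utilization, with and without daemonset usage counted.
package main

import "fmt"

func main() {
	const threshold = 0.7 // --scale-down-utilization-threshold

	// capacity, application requests, daemonset requests (CPUs)
	groups := []struct {
		name              string
		capacity, app, ds float64
	}{
		{"nodegroup 1", 10, 6, 1},
		{"nodegroup 2", 2, 0.5, 1},
	}

	for _, g := range groups {
		ignoringDS := g.app / g.capacity
		includingDS := (g.app + g.ds) / g.capacity
		fmt.Printf("%s: ignoring DS %.0f%% (scale down: %t), including DS %.0f%% (scale down: %t)\n",
			g.name, ignoringDS*100, ignoringDS < threshold, includingDS*100, includingDS < threshold)
	}
}
```

With a single global setting, either both nodes are scale-down candidates (ignoring daemonsets) or neither is (counting them); the per-nodegroup tag lets the small-node group ignore daemonset utilization while the large-node group does not.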

@seh commented Jul 12, 2023:

Thank you for the explanation. Regarding case one, the target utilization of 0.7 (70%) is higher than I've ever used, and I still don't see why you want to count the DaemonSet's contribution there. If you want the node to survive your 70% threshold, it either needs 10% more "real" workload, or your actual threshold should be closer to 60% to get what you want. Counting the DaemonSet as part of that utilization is falsifying the baseline atop which you sum the rest of the non-daemon workload's utilization. You could adjust the threshold instead.
