feat: disable ASG AZRebalance of EKS by default #127

Closed
wants to merge 4 commits into from

maxsxu
Member

@maxsxu maxsxu commented Feb 27, 2024

Motivation

By default, AWS rebalances Auto Scaling groups when there are more nodes in one zone than in another. For Kubernetes clusters where Cluster Autoscaler already manages the Auto Scaling group (ASG), this can have undesired effects.

For example, suppose an ASG is unbalanced and Cluster Autoscaler removes a node from one zone or adds a node to another zone, which would restore the balance on its own. If AZRebalance reacts to the same imbalance, the two mechanisms take corrective actions at the same time and produce an unexpected result.

Ideally, we have only one controller that decides when to expand or shrink the node pool.

Modifications

  • Suspend AZRebalance via terraform_data resource
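
The diff itself is not shown in this conversation, so the following is only a minimal sketch of how such a suspension could look, not the code in this PR. It assumes Terraform 1.4 or later (for `terraform_data`), an AWS CLI with credentials available on the machine that runs Terraform, and a hypothetical `asg_name` variable holding the name of the ASG behind the node group.

```hcl
# Hypothetical input: name of the ASG that backs the EKS node group.
# How this name is obtained is an assumption left outside the sketch.
variable "asg_name" {
  type        = string
  description = "Name of the Auto Scaling group to suspend AZRebalance on."
}

resource "terraform_data" "suspend_az_rebalance" {
  # Replace the resource, and so re-run the provisioner, when the ASG name changes.
  triggers_replace = [var.asg_name]

  # Suspend only the AZRebalance process; other scaling processes stay active.
  provisioner "local-exec" {
    command = "aws autoscaling suspend-processes --auto-scaling-group-name ${var.asg_name} --scaling-processes AZRebalance"
  }
}
```

A provisioner runs only when the resource is created or replaced, so a process that is resumed out of band would not be suspended again until the trigger changes.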

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Documentation

Check the box below.

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

References

Currently, AWS doesn't support manipulating the ASGs that EKS managed node groups create implicitly. There are some related discussions:

@maxsxu maxsxu self-assigned this Feb 27, 2024
Contributor

@maxsxu: Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@github-actions github-actions bot added doc-info-missing This pr needs to mark a document option in description and removed doc-info-missing This pr needs to mark a document option in description labels Feb 27, 2024
Contributor

@maxsxu: Thanks for providing doc info!

@github-actions github-actions bot added the no-need-doc This pr does not need any document label Feb 27, 2024
@maxsxu maxsxu closed this May 4, 2024
@maxsxu maxsxu deleted the max/disable-asg-azrebalance branch May 4, 2024 07:35
Labels
no-need-doc This pr does not need any document