AWS: cache instance requirements #6245
Merged
k8s-ci-robot
merged 1 commit into
kubernetes:master
from
alexanderConstantinescu:aws-cache-instance-requirements
Dec 4, 2023
k8s-ci-robot
added
kind/bug
Categorizes issue or PR as related to a bug.
cncf-cla: yes
Indicates the PR's author has signed the CNCF CLA.
labels
Nov 3, 2023
k8s-ci-robot
added
area/cluster-autoscaler
size/M
Denotes a PR that changes 30-99 lines, ignoring generated files.
labels
Nov 3, 2023
gjtempleton
reviewed
Nov 16, 2023
/assign @gjtempleton
alexanderConstantinescu
force-pushed
the
aws-cache-instance-requirements
branch
from
November 20, 2023 11:14
7837664
to
705143a
Compare
Thanks!
k8s-ci-robot
added
the
lgtm
"Looks good to me", indicates that a PR is ready to be merged.
label
Dec 4, 2023
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexanderConstantinescu, gjtempleton The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
k8s-ci-robot
added
the
approved
Indicates a PR has been approved by an approver from all required OWNERS files.
label
Dec 4, 2023
lyoung-confluent
pushed a commit
to lyoung-confluent/autoscaler
that referenced
this pull request
Mar 12, 2024
…ache-instance-requirements AWS: cache instance requirements
k8s-ci-robot
added a commit
that referenced
this pull request
Mar 12, 2024
…ease-1.28 Backport #6245 [CA] AWS: cache instance requirements into CA 1.28
k8s-ci-robot
added a commit
that referenced
this pull request
Mar 12, 2024
…ease-1.27 Backport #6245 [CA] AWS: cache instance requirements into CA 1.27
k8s-ci-robot
added a commit
that referenced
this pull request
Mar 12, 2024
…ease-1.26 Backport #6245 [CA] AWS: cache instance requirements into CA 1.26
This was referenced Oct 11, 2024
Labels
approved
Indicates a PR has been approved by an approver from all required OWNERS files.
area/cluster-autoscaler
area/provider/aws
Issues or PRs related to aws provider
cncf-cla: yes
Indicates the PR's author has signed the CNCF CLA.
kind/bug
Categorizes issue or PR as related to a bug.
lgtm
"Looks good to me", indicates that a PR is ready to be merged.
size/M
Denotes a PR that changes 30-99 lines, ignoring generated files.
What type of PR is this?
We can discuss this, but I am going to go ahead and label this:
/kind bug
since it forces users of the CA to reconfigure it with a sub-optimal workaround; see below for more information.
What this PR does / why we need it:
The cluster-autoscaler on AWS currently builds a representation of the Node object during each scale up/down computation. How often this computation runs depends on the configured `scan-interval`. Part of this computation requires setting the Node capacity, which is built by fetching instance requirements from the AWS API. For clusters which use a mixed instances policy and specify a launch template, the API call made is `DescribeLaunchTemplateVersions`. This information does not need to be fetched during each scale up/down computation; it can instead be cached along with all the other information required by the cluster-autoscaler running on AWS, which reduces the number of API calls performed. If the `scan-interval` is 10s, the number of API calls to this endpoint will be 6 calls per minute, or 8640 calls per day. This incurs unnecessary costs for people operating AWS clusters at scale, and it leaves them with only one way to reduce the number of API calls: increasing the `scan-interval`. That is not great, since it has secondary impacts on how quickly the CA can react to changes on the cluster. The current `refreshInterval` is set to one minute and defines the interval at which the CA refreshes its cache, so by caching the instance requirements we can make sure that this API endpoint is only called once per minute, regardless of the `scan-interval` chosen.

This PR therefore caches the instance requirements, as is done with all the other `MixedInstancesPolicy` data, and uses the cached data when executing `buildNodeFromTemplate`.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: