-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement dynamically adjustment of NodeDeleteDelayAfterTaint based on round trip time between CA and api-server #6019
Conversation
83ef332
to
e472bb1
Compare
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
e472bb1
to
dd3258a
Compare
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
dd3258a
to
f20b65c
Compare
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like gofmting is required:
Please run hack/update-gofmt.sh to fix the following files:
./cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
10577b4
to
e76699e
Compare
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker_test.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker_test.go
Show resolved
Hide resolved
8a8dfac
to
e3bb2ce
Compare
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker_test.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/core/scaledown/actuation/update_latency_tracker_test.go
Outdated
Show resolved
Hide resolved
732a966
to
7c0bdec
Compare
Implement dynamically adjustment of NodeDeleteDelayAfterTaint based on round trip time between ca and apiserver
7c0bdec
to
42691d3
Compare
Come to think about it, the observed latency is something that may be worth having a metric for. But maybe that can be done as a follow-up change. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: damikag, x13n The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Implement dynamically adjustment of NodeDeleteDelayAfterTaint based on round trip time between CA and api-server
What type of PR is this?
/kind bug
What this PR does / why we need it:
Since CA and Scheduler are not synced, there could be race condition because scheduler might put pods on tainted nodes for deletion on scale down. Currently the probability of a race condition is reduced by waiting
NodeDeleteDelayAfterTaint
amount of time before deleting a node after tainting. The value ofNodeDeleteDelayAfterTaint
affects scale down throughput so this PR makes it dynamic based on the round trip time between CA and api-server.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
When
--dynamic-node-delete-delay-after-taint-enabled
set to true CA will adjustnodeDeleteDelayAfterTaint
based on the round-trip time between CA and api-server.nodeDeleteDelayAfterTaint
will be set to 3 times the round-trip time between CA and api-server.release-note
A new flag(
--dynamic-node-delete-delay-after-taint-enabled
) is introduced which enables dynamic adjustment ofnodeDeleteDelayAfterTaint
based on the round-trip time between CA and api-server.Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: