Add rate limiter to the Vertical Pod autoscaler updater component #2326

guillaumebreton · 2019-09-11T13:49:45Z

This patch adds a rate limiter to the vertical pod autoscaler updater.
It adds two flags:

- eviction-rate-limit to control the number of pods that can be evicted per second.
- eviction-rate-limit-burst to control the burst of pods that can be evicted immediately.

k8s-ci-robot · 2019-09-11T13:49:46Z

Welcome @guillaumebreton!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

bskiba · 2019-09-16T16:58:57Z

@guillaumebreton thanks! This looks OK. I have some minor comments, but first I wanted to clarify that this is what we want.
@milesbxf does a global setting like this solve your issues, or is there a need for more fine-grained control (for example, the ability to set this per deployment)?

bandesz · 2019-09-17T14:09:57Z

vertical-pod-autoscaler/pkg/updater/logic/updater.go

@@ -144,6 +149,11 @@ func (u *updater) RunOnce() {
 			if !evictionLimiter.CanEvict(pod) {
 				continue
 			}
+			err := u.evictionRateLimiter.Wait(ctx)


~~Wait always asks for one token, so it doesn't make much sense to set the burst rate to anything else than 1. If I'm correct then we should remove the eviction-rate-limit-burst parameter.~~

Sorry, the golang.org/x/time/rate documentation is not really clear. Looking at the code, burst is also the maximum number of tokens you can accrue over time. This is important because it means VPA can't suddenly evict lots of pods after not evicting anything for a long time.

Without knowing the autoscaler well, I'm a little worried that Wait might block here for a long time (e.g. minutes). The state of the cluster might change significantly during that time.
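To make the point about burst concrete, here is a minimal token-bucket model that uses only the standard library. It is a simplified stand-in for illustration, not the golang.org/x/time/rate implementation, and `bucket`, `allowAt` and `allowedAfterIdle` are hypothetical names. It shows that idle time can bank at most `burst` tokens, so a long quiet period is not followed by a flood of evictions.

```go
package main

import (
	"fmt"
	"time"
)

// bucket is a simplified token bucket: it refills at rate tokens per
// second and never holds more than burst tokens.
type bucket struct {
	rate   float64
	burst  float64
	tokens float64
	last   time.Time
}

// allowAt reports whether one event may happen at time now and consumes
// a token if so.
func (b *bucket) allowAt(now time.Time) bool {
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	b.last = now
	if b.tokens > b.burst {
		// Idle time cannot bank more than burst tokens.
		b.tokens = b.burst
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// allowedAfterIdle drains a fresh bucket, idles for idleSeconds, and then
// counts how many of attempts back-to-back events are allowed.
func allowedAfterIdle(ratePerSec float64, burst, idleSeconds, attempts int) int {
	start := time.Unix(0, 0)
	b := &bucket{rate: ratePerSec, burst: float64(burst), tokens: float64(burst), last: start}
	for b.allowAt(start) {
		// Use up the initial burst.
	}
	later := start.Add(time.Duration(idleSeconds) * time.Second)
	n := 0
	for i := 0; i < attempts; i++ {
		if b.allowAt(later) {
			n++
		}
	}
	return n
}

func main() {
	// 1 eviction per second, burst of 3, one hour of idleness, 10 attempts.
	fmt.Println(allowedAfterIdle(1, 3, 3600, 10))
}
```

In this model, an hour of idleness at one token per second still allows only three immediate evictions, because the bucket is capped at the burst size.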

An idea for solving the Wait()/blocking problem:

1. Let's collect all pods to be evicted in a list.
2. Randomize the list (to give all pods a fair chance to be evicted in a RunOnce run).
3. Go through the list and evict a pod if u.evictionRateLimiter.Allow() returns true.
4. If u.evictionRateLimiter.Allow() returns false then return from the RunOnce function. Any remaining pods to be evicted will be retried in the next RunOnce run.
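The non-blocking idea above can be sketched roughly as follows. This is an illustration only, not code from the PR: `allower`, `fixedBudget` and `evictUntilLimited` are hypothetical names, and the shuffle step is left out. The `allower` interface has the same shape as the `Allow() bool` method on `*rate.Limiter` from golang.org/x/time/rate, so a real limiter could be passed in its place.

```go
package main

import "fmt"

// allower is the one method this sketch needs.
type allower interface {
	Allow() bool
}

// fixedBudget is a hypothetical stand-in limiter that allows a fixed
// number of calls and then denies everything.
type fixedBudget struct{ remaining int }

func (f *fixedBudget) Allow() bool {
	if f.remaining <= 0 {
		return false
	}
	f.remaining--
	return true
}

// evictUntilLimited evicts pods in order and stops at the first denied
// token. It returns the evicted pods and the pods left for the next run.
func evictUntilLimited(pods []string, lim allower, evict func(string)) (evicted, deferred []string) {
	for i, pod := range pods {
		if !lim.Allow() {
			return evicted, pods[i:]
		}
		evict(pod)
		evicted = append(evicted, pod)
	}
	return evicted, nil
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c", "pod-d"}
	evicted, deferred := evictUntilLimited(pods, &fixedBudget{remaining: 2}, func(p string) {
		fmt.Println("evicting", p)
	})
	fmt.Println(len(evicted), "evicted;", len(deferred), "left for the next RunOnce")
}
```

The trade-off is that the loop never blocks, but any pods left over wait for the next tick even if tokens become available sooner.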

Thanks for your input 🙂

I'm not really sure why the Wait is an issue here:

A context with a timeout is passed to the Wait function, so the Wait will be cancelled if the delay is larger than the tick interval.

As far as I understand it, the state of the cluster is updated by the recommender, and the admission controller applies it. The updater is only responsible for eviction here, and when the pod restarts, the latest values from the VPA are applied by the admission controller.

In your proposed solution, we would hit the rate limiter quickly and exit. So we would "waste" n seconds of the ticker loop doing nothing. Running on fresher data is an excellent idea, but the only option I see would be to refresh the pod list between every Wait, which doesn't seem ideal to me.

The default tick for the updater is 1 minute, so as @guillaumebreton writes, the Wait will be canceled after that time at most (unless someone uses a custom tick). So we can basically end up looking at info that is up to 1 minute stale, which is also true if the updater has a lot of work to do. I'm inclined to agree this solution is acceptable.

As to randomizing the list of pods to evict - the updater has a priority mechanism to select which pods to evict first based on multiple factors (including how far away they are from their recommended resources). Shuffling the list wouldn't play well with that.
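The bounded wait discussed in this thread can be sketched as below, using only the standard library. This is an assumption-laden illustration, not the PR's code: `waiter`, `slowLimiter`, `waitWithinTick` and `timedOut` are hypothetical names. The `waiter` interface matches the shape of `Wait(ctx) error` on `*rate.Limiter`. The stand-in simply blocks until the context is done, which is a simplification: per its documentation, the real limiter also returns an error when the expected wait time exceeds the context's deadline.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waiter is the one method this sketch needs.
type waiter interface {
	Wait(ctx context.Context) error
}

// slowLimiter is a hypothetical stand-in whose next token is delay away.
type slowLimiter struct{ delay time.Duration }

func (s slowLimiter) Wait(ctx context.Context) error {
	select {
	case <-time.After(s.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// waitWithinTick blocks for a token but gives up once the tick elapses.
func waitWithinTick(lim waiter, tick time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), tick)
	defer cancel()
	return lim.Wait(ctx)
}

// timedOut reports whether a token that is delayMs away misses a tick of tickMs.
func timedOut(delayMs, tickMs int) bool {
	lim := slowLimiter{delay: time.Duration(delayMs) * time.Millisecond}
	err := waitWithinTick(lim, time.Duration(tickMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// Token arrives well inside the tick.
	fmt.Println(timedOut(1, 200))
	// Tick expires before the token is available.
	fmt.Println(timedOut(2000, 20))
}
```

With a timeout equal to the tick interval, a blocked eviction loop is released at the latest when the next run would start, which is the bound on staleness described above.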

milesbxf · 2019-09-20T09:40:36Z

@guillaumebreton thanks! This looks OK. I have some minor comments, but first I wanted to clarify that this is what we want.
@milesbxf does a global setting like this solve your issues, or is there a need for more fine-grained control (for example, the ability to set this per deployment)?

Yes, it does! 🙌 FYI @guillaumebreton and I are on the same team, so we've spoken about this offline 🙂

bskiba · 2019-09-20T11:37:02Z

@milesbxf Thanks! In that case I'll get back to this early next week. I appreciate your patience; I am a bit short on time atm :)

guillaumebreton · 2019-09-20T12:23:48Z

@bskiba No worries 👍 Thank you for reviewing it.

bskiba · 2019-10-03T11:25:17Z

vertical-pod-autoscaler/pkg/updater/logic/updater.go

+func getRateLimiter(evictionRateLimit float64, evictionRateLimitBurst int) *rate.Limiter {
+	var evictionRateLimiter *rate.Limiter
+	if evictionRateLimit == -1 || evictionRateLimit == 0 {
+		evictionRateLimiter = rate.NewLimiter(rate.Inf, evictionRateLimitBurst)


I haven't dived into the rate package docs. Is this equivalent to having no rate limiter at all, i.e. will Wait() return immediately?

Also, is the Burst completely ignored?

Indeed, if we set the rate limit to rate.Inf, the burst is completely ignored and Wait will return immediately.
-> Documentation about burst being ignored if the rate equals rate.Inf: https://github.com/golang/time/blob/master/rate/rate.go#L37
-> Wait() returning immediately: https://github.com/golang/time/blob/master/rate/rate.go#L307
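As a rough model of the behaviour described here (standard library only, not the golang.org/x/time/rate code): `unlimited` stands in for rate.Inf, and `effectiveRate` and `canEvictNow` are hypothetical names. The mapping of 0 and -1 to an unlimited rate follows the condition in the diff above.

```go
package main

import (
	"fmt"
	"math"
)

// unlimited stands in for rate.Inf in this model.
var unlimited = math.Inf(1)

// effectiveRate maps the flag value to a rate: 0 and -1 disable limiting.
func effectiveRate(evictionRateLimit float64) float64 {
	if evictionRateLimit == 0 || evictionRateLimit == -1 {
		return unlimited
	}
	return evictionRateLimit
}

// canEvictNow models a single limiter decision. With an unlimited rate the
// token count, and therefore the burst, is never consulted.
func canEvictNow(ratePerSec, tokens float64) bool {
	if math.IsInf(ratePerSec, 1) {
		return true
	}
	return tokens >= 1
}

func main() {
	// Disabled limiter: allowed even with an empty bucket.
	fmt.Println(canEvictNow(effectiveRate(-1), 0))
	// 10 evictions per second with an empty bucket: must wait for a refill.
	fmt.Println(canEvictNow(effectiveRate(10), 0))
}
```

This is why the burst argument has no effect in the disabled case: the decision is made before the bucket is looked at.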

bskiba · 2019-10-03T11:29:50Z

vertical-pod-autoscaler/pkg/updater/logic/updater_test.go

+	}{
+		{0.0, 1, rate.NewLimiter(rate.Inf, 1)},
+		{-1.0, 2, rate.NewLimiter(rate.Inf, 2)},
+		{10.0, 3, rate.NewLimiter(rate.Every(time.Duration(1.0/10*float64(time.Second))), 3)},


From @bandesz's comment above, looks like this can be simplified.

bskiba · 2019-10-03T11:37:12Z

Thanks for your patience and apologies for the delay.

I've added some comments. My main concern is to make sure that the current behavior stays the same (i.e. if we are not using the rate limiter). Other comments are mostly nits.

bskiba

This looks good, only minor comments left. Can you address them and squash your commits into one?

Thanks for answering my questions and getting this done!

guillaumebreton · 2019-10-22T15:35:04Z

@bskiba What would be the next step with this PR? Should I ask @jbartosik and @mwielgus for a review?

bskiba · 2019-10-22T15:41:58Z

I'll review tomorrow, sorry for the wait!

guillaumebreton · 2019-10-23T07:52:14Z

@bskiba No worries :) Github was showing you already approved it so I wasn't sure

bskiba · 2019-10-24T08:50:53Z

vertical-pod-autoscaler/pkg/updater/main.go

+		`Number of pods that can be evicted per seconds. A rate limit set to 0 or -1 will disable
+		the rate limiter.`)
+
+	evictionRateBurst = flag.Int("eviction-rate-limit-burst", 1, `Burst of pods that can be evicted.`)


can you also rename the flag to eviction-rate-burst?
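For reference, a minimal sketch of how the two flags could be parsed with the standard `flag` package once the rename is applied. This is an illustration under assumptions, not the updater's main.go: `parseEvictionFlags` is a hypothetical helper, and the default of -1 for eviction-rate-limit is assumed here because the diff above does not show it.

```go
package main

import (
	"flag"
	"fmt"
)

// parseEvictionFlags parses the two flags from args. The default of -1 for
// eviction-rate-limit is an assumption of this sketch.
func parseEvictionFlags(args []string) (limit float64, burst int, err error) {
	fs := flag.NewFlagSet("updater", flag.ContinueOnError)
	fs.Float64Var(&limit, "eviction-rate-limit", -1,
		"Number of pods that can be evicted per second. A rate limit set to 0 or -1 will disable the rate limiter.")
	fs.IntVar(&burst, "eviction-rate-burst", 1, "Burst of pods that can be evicted.")
	err = fs.Parse(args)
	return limit, burst, err
}

func main() {
	limit, burst, err := parseEvictionFlags([]string{"--eviction-rate-limit=0.5", "--eviction-rate-burst=3"})
	if err != nil {
		panic(err)
	}
	fmt.Println(limit, burst)
}
```

With these example values the updater would be limited to one eviction every two seconds on average, with up to three evictions allowed back to back.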

bskiba · 2019-10-24T08:51:43Z

One last small comment (sorry, I missed this in the previous review :( )


guillaumebreton · 2019-10-25T12:50:00Z

@bskiba Thank you 👍 I missed it and I fixed it.

bskiba · 2019-10-25T13:07:46Z

Thanks!
/lgtm
/approve

k8s-ci-robot · 2019-10-25T13:08:15Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bskiba

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~vertical-pod-autoscaler/OWNERS~~ [bskiba]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

guillaumebreton force-pushed the eviction-rate-limiter branch from 19844a2 to 4ae1ddd September 11, 2019 13:50

k8s-ci-robot requested review from jbartosik and mwielgus September 11, 2019 13:50

k8s-ci-robot removed the do-not-merge/invalid-commit-message label Sep 11, 2019

guillaumebreton marked this pull request as ready for review September 11, 2019 14:01

k8s-ci-robot removed the do-not-merge/work-in-progress label Sep 11, 2019

losipiuk added the area/vertical-pod-autoscaler label Sep 15, 2019

bandesz reviewed Sep 17, 2019

bskiba reviewed Oct 3, 2019

guillaumebreton force-pushed the eviction-rate-limiter branch 2 times, most recently from 3e88a62 to e380765 October 9, 2019 14:48

guillaumebreton requested a review from bskiba October 9, 2019 15:17

bskiba approved these changes Oct 11, 2019

guillaumebreton force-pushed the eviction-rate-limiter branch from 02cd198 to b60628f October 11, 2019 14:20

bskiba reviewed Oct 24, 2019

guillaumebreton force-pushed the eviction-rate-limiter branch from d7b7727 to 59df00f October 25, 2019 12:46

k8s-ci-robot assigned bskiba Oct 25, 2019

k8s-ci-robot added the lgtm label Oct 25, 2019

k8s-ci-robot added the approved label Oct 25, 2019

k8s-ci-robot merged commit 0e425f3 into kubernetes:master Oct 25, 2019

evnsio mentioned this pull request Mar 30, 2020

Eviction Rate Limiter on latest VPA releases #2996

Closed

jabdoa2 mentioned this pull request Aug 1, 2022

VPA restarts my pods but does not modify CPU or memory settings #4667

Closed
