Update random pod scaledown KEP for beta #2691

Merged: 1 commit into kubernetes:master on May 12, 2021

Conversation

@damemi (Contributor) commented May 7, 2021:

Ref: #2185

@k8s-ci-robot added labels: cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), kind/kep (Categorizes KEP tracking issues and PRs modifying the KEP directory.), sig/apps (Categorizes an issue or PR as relevant to SIG Apps.) on May 7, 2021
@k8s-ci-robot requested review from kow3ns and mattfarina on May 7, 2021 18:20
@k8s-ci-robot added the size/M label (Denotes a PR that changes 30-99 lines, ignoring generated files.) on May 7, 2021

@damemi (Contributor, PR author) left a comment:
/cc @alculquicondor @soltysh
I've tried updating this with the PRR answers required for beta. Though, imo many of them seem irrelevant since this is a minor change to a previously-undocumented behavior. For example, I don't think we need a metric to expose this behavior (but it could be easy to add a simple boolean, or in some way measure the logarithmic performance of cumulative scaledowns on the cluster). Please let me know if you feel there are better responses we could provide.

@alculquicondor (Member):
You also need to update the file in /keps/prod-readiness

@wojtek-t (Member):
/assign

@@ -263,37 +263,36 @@ _This section must be completed when targeting alpha to a release._
_This section must be completed when targeting beta graduation to a release._

* **How can a rollout fail? Can it impact already running workloads?**
Try to be as paranoid as possible - e.g., what if some components will restart
mid-rollout?
Rollouts should have no effect, as this only relates to scaledown of replicasets

Member:
One can imagine a bug where the new logic panics in some cases.

[agree that it should never impact running workloads]

@damemi (Contributor, PR author):
@wojtek-t maybe you can clarify, because I'm a bit confused by the rollout/rollback terminology: is this referring to a rollout as in a new deployment, or rolling out a cluster upgrade?

Describe manual testing that was done and the outcomes.
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
N/A

Member:
It's always applicable.

Every single feature should be upgrade->downgrade->upgrade tested.

@damemi (Contributor, PR author):
@alculquicondor could you please provide manual testing for this?

Member:
Sure, I can do it once the code changes are in.

Member:
Please indicate the steps that should be applied in the manual test (probably just create a ReplicaSet and scale it down?)
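
(A minimal sketch of that manual step, assuming a small client-go driver; the ReplicaSet name, namespace, image and replica counts are all illustrative, and the equivalent kubectl create / kubectl scale commands work just as well.)

```go
package main

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	ctx := context.Background()

	// Create a ReplicaSet with 5 pause pods.
	rs := &appsv1.ReplicaSet{
		ObjectMeta: metav1.ObjectMeta{Name: "scaledown-test", Namespace: "default"},
		Spec: appsv1.ReplicaSetSpec{
			Replicas: int32Ptr(5),
			Selector: &metav1.LabelSelector{MatchLabels: map[string]string{"app": "scaledown-test"}},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{"app": "scaledown-test"}},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{Name: "pause", Image: "k8s.gcr.io/pause:3.2"}},
				},
			},
		},
	}
	if _, err := client.AppsV1().ReplicaSets("default").Create(ctx, rs, metav1.CreateOptions{}); err != nil {
		panic(err)
	}

	// ...wait for the pods to come up and note their creation timestamps...

	// Scale down to 2 and record which pods get deleted; repeat across the
	// upgrade -> downgrade -> upgrade sequence and compare the victims chosen
	// with and without this KEP's feature gate enabled.
	scale, err := client.AppsV1().ReplicaSets("default").GetScale(ctx, "scaledown-test", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	scale.Spec.Replicas = 2
	if _, err := client.AppsV1().ReplicaSets("default").UpdateScale(ctx, "scaledown-test", scale, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("scaled scaledown-test down to 2 replicas; inspect which pods were removed")
}
```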

@damemi (Contributor, PR author):
> Sure, I can do it once the code changes are in.

Would that just be the beta promotion change kubernetes/kubernetes#101767? I'm also confused about how this should be tested with the upgrade->downgrade->upgrade path.

Member:
Please add text stating that you will test this before the feature graduates to beta.
Also - please update the KEP within the release cycle [see https://github.com//pull/2538 as an example where we did that].

@alculquicondor (Member) commented May 11, 2021:
So manual testing is literal: you create a cluster, you upgrade, you run something, you downgrade, run the same thing, upgrade again, report.

Now, the question is, what is the "something" that we should run? Ideally the results can be observed in metrics, but I don't think we can do that for this feature.

EDIT: I just noticed the recommended metric below :)

- [ ] Other (treat as last resort)
- Details:
- [x] Other (treat as last resort)
- Details: The health of this feature is dependent on the overall health of the

Member:
This isn't helpful - it's true for almost any feature.

What we need is a signal to determine if the kcm is doing what it should be doing.

@damemi (Contributor, PR author):
@soltysh are you aware of any KCM metrics (or where I could find a list) that would be good indicators for this feature?

Maybe something measuring successful deletions, or deletions-vs-recreations to detect errant victim selection?

Contributor:
I doubt we have any metrics around that; that particular piece of code doesn't have any specific metrics. Looking at the overall deletions vs creations won't help either, since the summary numbers should not change: the new algorithm affects which pods we pick, not how many we pick.

Member:
Can we come up with some metrics for this? Even if whitebox-y, that's still much better than nothing.

Maybe we can expose a histogram of something like:

  • |age of deleted pod| / |age of youngest pod|
    [We expect all values to be within [1,2], so having values outside of it is a good signal that something is wrong.]

[Well - technically it will collide with https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/2255-pod-cost so maybe we additionally need a metric label indicating whether such an annotation was set or not]
[And then the SLO would be that there are no values >2 when pod_cost_set=false]
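
(A minimal sketch of how such a histogram could look if wired into kube-controller-manager, assuming a plain Prometheus client; the metric name, buckets and the pod_cost_set label are illustrative, not an agreed API.)

```go
package replicaset

import "github.com/prometheus/client_golang/prometheus"

// deletedPodAgeRatio is a hypothetical histogram of
// |age of deleted pod| / |age of youngest pod| observed per scale-down victim.
// Values are expected to fall within [1, 2]; the buckets above 2 exist so
// that violations show up instead of being clipped into one catch-all bucket.
var deletedPodAgeRatio = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "replicaset_controller_deleted_pod_age_ratio",
		Help:    "Ratio of the deleted pod's age to the youngest pod's age at scale-down time.",
		Buckets: []float64{1.0, 1.25, 1.5, 1.75, 2.0, 2.5, 4.0, 8.0},
	},
	// pod_cost_set marks scale-downs where the pod deletion cost annotation
	// (KEP-2255) was present and intentionally overrides the age ordering.
	[]string{"pod_cost_set"},
)

func init() {
	prometheus.MustRegister(deletedPodAgeRatio)
}

// recordScaleDownVictim records one observation per deleted pod.
func recordScaleDownVictim(ratio float64, podCostSet bool) {
	label := "false"
	if podCostSet {
		label = "true"
	}
	deletedPodAgeRatio.WithLabelValues(label).Observe(ratio)
}
```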

@damemi (Contributor, PR author):
> |age of deleted pod| / |age of youngest pod|

are these currently available, or is there work needed to add them to KCM? @soltysh where are the KCM metrics set up so that we can start work on that?

- Details: The health of this feature is dependent on the overall health of the
control plane services, specifically kube-controller-manager but also to an extent
services like API server, kubelet, etc.
We could add a metric to KCM to indicate that this feature gate is enabled

* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

Member:
Please fill in these two questions too.

Contributor:
@damemi suggestion: the previous deletion-vs-recreation rates are identical to the current rates, or close to them.

Member:
I understand the intention, but I don't think it's a good SLO - SLI/SLO needs precise definition and this value will heavily depend on the load (and can change over time).

That may be a good metric informing rollback though...

@damemi (Contributor, PR author):
@wojtek-t do you want to use the metric you suggested above in #2691 (comment)?

Member:
I think we should build SLO on that metric - yes.

@@ -369,18 +356,11 @@ details). For now, we leave it here.
_This section must be completed when targeting beta graduation to a release._

* **How does this feature react if the API server and/or etcd is unavailable?**
It would be irrelevant, because the feature relies on Pod timestamps. If the API server

Member:
kcm won't be able to scale-up/down replicasets, so it doesn't matter.

The important thing is that it's not a feature of "running workloads".


* **What are other known failure modes?**
For each of them, fill in the following information by copying the below template:

Member:
add n/a (if none)

Member:
same below

@@ -263,37 +263,39 @@ _This section must be completed when targeting alpha to a release._
_This section must be completed when targeting beta graduation to a release._

* **How can a rollout fail? Can it impact already running workloads?**
Try to be as paranoid as possible - e.g., what if some components will restart
mid-rollout?

Contributor:
Missing answer here?

@damemi (Contributor, PR author):
I'm waiting for some clarification on #2691 (comment)

Contributor:
Maybe something along:

It shouldn't impact already running workloads. One possible problem during rollout is a panic in the algorithm which might crash controller-manager.

@@ -369,27 +359,21 @@ details). For now, we leave it here.
_This section must be completed when targeting beta graduation to a release._

* **How does this feature react if the API server and/or etcd is unavailable?**
N/a - this is not a feature of running workloads

Contributor:
The controller won't work if the API server and/or etcd is unavailable, thus it can't delete and create pods.


* **What are the SLIs (Service Level Indicators) an operator can use to determine
the health of the service?**
- [ ] Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- [ ] Other (treat as last resort)
- Details:
- [x] Other (treat as last resort)

Member:
This is not "other". It's a metric, just one that needs to be added.

@wojtek-t (Member) left a comment:
Some suggestions/answers - hope these are useful.

- [ ] Metrics
- Metric name:
- [x] Metrics
- Metric name: deleted_pod_age_ratio

Member:
If this is going to be the metric that I proposed in: #2691 (comment)

can you please define it somewhere in the KEP (and probably add a beta criterion that this metric is implemented)?

[If it's going to be a different metric, please define that too :) ]

- 99% percentile over day of absolute value from (job creation time minus expected
job creation time) for cron job <= 10%
- 99,9% of /health requests per day finish with 200 code
There should be no values `>2` in the above metric when the Pod Cost annotation is set

Member:
s/set/unset/, right?

[Or to be more specific - if this is unset for all pods of the replicaset...]

Basically (it's a comment toward my ask above to define the metric in the KEP), I think that the metric should have a field of "pod-cost-annotation-exists" (better name welcome :-) ) that the replicaset controller will be computing and setting when exporting that metric (and then the check is trivial).
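
(A rough sketch of how the replicaset controller could compute that label when exporting the metric, assuming the controller.kubernetes.io/pod-deletion-cost annotation key from KEP-2255; the helper name and signature are illustrative.)

```go
package replicaset

import (
	"time"

	v1 "k8s.io/api/core/v1"
)

// podCostAnnotation is the pod deletion cost annotation introduced by KEP-2255.
const podCostAnnotation = "controller.kubernetes.io/pod-deletion-cost"

// ageRatioSample returns the histogram value and the pod_cost_set label for a
// single deleted pod: the ratio of its age to the youngest pod's age, plus
// whether any pod of the ReplicaSet carries the deletion-cost annotation (in
// which case the [1,2] expectation intentionally no longer applies).
func ageRatioSample(deletedPodAge, youngestPodAge time.Duration, pods []*v1.Pod) (ratio float64, podCostSet bool) {
	for _, p := range pods {
		if _, ok := p.Annotations[podCostAnnotation]; ok {
			podCostSet = true
			break
		}
	}
	if youngestPodAge <= 0 {
		// Guard against dividing by zero for a pod created "just now".
		return 0, podCostSet
	}
	return deletedPodAge.Seconds() / youngestPodAge.Seconds(), podCostSet
}
```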

@wojtek-t (Member):
/assign @soltysh [we will need SIG approval too]

@soltysh (Contributor) left a comment:
One suggestion for that one missing bit; for the rest:
/lgtm
/approve


@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on May 12, 2021
@damemi force-pushed the random-downscale-beta branch from 1fe373c to 95abb6b on May 12, 2021 14:23
@k8s-ci-robot removed the lgtm label on May 12, 2021

@damemi (Contributor, PR author) commented May 12, 2021:
Updated with the latest feedback, and squashed.

One note about the metric (mentioned this to @alculquicondor offline) is that the defined range of [1,2] is only valid for pods whose deletion was based on sorting by timestamp.

Similar to how this conflicts with the Pod Cost Annotation, there are other criteria which could put a pod older than 2 * youngest_pod_age as higher priority for deletion, such as a Pending status phase. I've tried to make this clear in the definitions of the new metric.

@soltysh (Contributor) commented May 12, 2021:
/lgtm

@k8s-ci-robot added the lgtm label on May 12, 2021

@wojtek-t (Member):
/lgtm
/approve

Thanks!

@k8s-ci-robot:
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damemi, soltysh, wojtek-t


@k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on May 12, 2021
@k8s-ci-robot merged commit e0f4972 into kubernetes:master on May 12, 2021
@k8s-ci-robot added this to the v1.22 milestone on May 12, 2021
@damemi deleted the random-downscale-beta branch on May 12, 2021 15:08