
Add description of replica controller scaledown sort logic #26993

Merged: 1 commit into kubernetes:dev-1.21 on Mar 22, 2021

Conversation

damemi (Contributor) commented Mar 10, 2021

@k8s-ci-robot k8s-ci-robot added this to the 1.21 milestone Mar 10, 2021
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 10, 2021
k8sio-netlify-preview-bot (Collaborator) commented Mar 10, 2021

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit 10162f7

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/6052698d4a5dfa0007b4f7f4

@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 10, 2021
PI-Victor (Member):

@damemi thank you for opening this!

/assign
/milestone 1.21
/sig apps

@k8s-ci-robot k8s-ci-robot added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Mar 10, 2021
damemi (Contributor, author) commented Mar 10, 2021

/cc @soltysh @alculquicondor

is unknown, and a pod whose phase is unknown comes before a running pod.
3. If exactly one of the pods is ready, the pod that is not ready comes
before the ready pod.
4. If controller.kubernetes.io/pod-deletion-cost annotation is set, then
alculquicondor (Member) commented Mar 10, 2021

FYI @ahg-g

Maybe we can have a single docs PR for both KEPs

Member:

sounds good.

Member:

Actually, let's keep them separate; the other PR is ready for review.

Contributor:

BTW I'd be happy to see this level of detail covered in the reference section of the docs.

@@ -310,6 +310,33 @@ assuming that the number of replicas is not also changed).
A ReplicaSet can be easily scaled up or down by simply updating the `.spec.replicas` field. The ReplicaSet controller
ensures that a desired number of Pods with a matching label selector are available and operational.

When scaling down, the ReplicaSet controller chooses which pods to delete by sorting the available pods to
Member:

I would argue that this is too detailed to be useful to users. Also we should avoid locking the criteria in a way that users start relying on it, making it harder to change in the future. In particular, we should state that these are preferences and not something that is guaranteed.

Important downscale selection criteria IMO:

  1. Pending (including unscheduled) pods.
  2. pod-deletion-cost annotation (probably worth stating the default)
  3. too many pods in a node (this is the rank, although not sure if we should include it in the documentation).
  4. lower running time (bucketed)

And if they all match, selection is random (the UIDs are 100% an implementation detail that shouldn't go in the documentation).
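The preference order sketched in this comment could be expressed, under simplifying assumptions, as a sort key like the following. This is a hypothetical illustration, not the actual kube-controller-manager code; the `Pod` record and field names are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Pod:
    name: str
    pending: bool        # includes unscheduled pods
    deletion_cost: int   # controller.kubernetes.io/pod-deletion-cost; defaults to 0 when unset
    pods_on_node: int    # how many replicas share this pod's node (the "rank")
    runtime_s: int       # running time; the real controller buckets this logarithmically

def downscale_key(pod: Pod):
    """Pods that sort first are deleted first on scale-down."""
    return (
        0 if pod.pending else 1,  # 1. pending/unscheduled pods go first
        pod.deletion_cost,        # 2. lower deletion cost goes first
        -pod.pods_on_node,        # 3. more colocated replicas goes first
        pod.runtime_s,            # 4. shorter running time goes first
    )                             # remaining ties are broken effectively at random

pods = [
    Pod("a", False, 0, 1, 100),
    Pod("b", True, 0, 1, 100),
    Pod("c", False, -5, 1, 100),
    Pod("d", False, 0, 3, 100),
    Pod("e", False, 0, 1, 10),
]
deletion_order = [p.name for p in sorted(pods, key=downscale_key)]
```

With this data, the pending pod `b` is chosen first, then the negative-cost pod `c`, then `d` (most replicas on its node), then the younger `e` before `a`.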

@damemi damemi force-pushed the dev-1.21-random-downscale branch from d238302 to 9655963 Compare March 17, 2021 17:59
@damemi
Copy link
Contributor Author

damemi commented Mar 17, 2021

Updated this description to be less implementation detailed following @alculquicondor's suggestion in #26993 (comment)

When scaling down, the ReplicaSet controller chooses which pods to delete by sorting the available pods to
prioritize scaling down pods based on the following general algorithm:
1. Pending (and unschedulable) pods are scaled down first
2. If controller.kubernetes.io/pod-deletion-cost annotation is set, then
Contributor:

We should also list controller.kubernetes.io/pod-deletion-cost in https://kubernetes.io/docs/reference/labels-annotations-taints/

Member:

I will add that in #26739

@alculquicondor we should do that for the indexed job annotation too.

Member:

Opened #27106

1. Pending (and unschedulable) pods are scaled down first
2. If controller.kubernetes.io/pod-deletion-cost annotation is set, then
the pod with the lower value will come first.
3. If the pods' ranks differ, the pod with greater rank comes before the pod
Member:

rank is an internal detail to the controller utils. The rank we use in replicaset controller is the number of pods in a node.
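As a rough illustration of that point (not the controller's actual code), the "number of pods in a node" rank could be computed with a hypothetical helper like this:

```python
from collections import Counter

def pods_per_node_rank(pod_to_node: dict) -> dict:
    """Hypothetical helper: rank each pod by how many pods share its node.

    A higher rank means the pod sits on a more crowded node, so it is a
    better candidate for deletion on scale-down.
    """
    counts = Counter(pod_to_node.values())  # pods per node
    return {pod: counts[node] for pod, node in pod_to_node.items()}

ranks = pods_per_node_rank({"a": "node-1", "b": "node-1", "c": "node-2"})
```

Here `a` and `b` share `node-1`, so both get rank 2, while `c` alone on `node-2` gets rank 1.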

Contributor (author):

Would this be better as just something like 3. Pods on nodes with greater replicas come before pods on nodes with fewer replicas?

Member:

Sounds good. But should it be greater or more? Or perhaps greater number?

Contributor (author):

Updated with more to keep it clear :)

@damemi damemi force-pushed the dev-1.21-random-downscale branch from 9655963 to 10162f7 Compare March 17, 2021 20:41
alculquicondor (Member):

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 17, 2021
k8s-ci-robot (Contributor):

LGTM label has been added.

Git tree hash: 7f583552244a1cf8783d3487be4c5fc82f89c75d

PI-Victor (Member):

Hi @damemi, thank you for having your doc PR ready for review. A friendly reminder about the upcoming doc-related dates for the 1.21 release:

  • Docs Ready for Review deadline is March 24
  • Docs Ready to Merge deadline is March 31

damemi (Contributor, author) commented Mar 22, 2021

@PI-Victor is there anything else this needs to be Ready to Merge? It looks like concerns were all addressed and /lgtm from sig-scheduling. @soltysh does this look good from apps?

PI-Victor (Member):

from my perspective it looks like all concerns were addressed.
@reylejano or @sftim can you approve this?

reylejano (Member):

Thank you for the updates.
/approve

k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: reylejano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 22, 2021
@k8s-ci-robot k8s-ci-robot merged commit fddac2d into kubernetes:dev-1.21 Mar 22, 2021
8 participants