KEP-1282: Requeue Strategy #1311
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
Hi @nstogner. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/ok-to-test
keps/1282-requeue-strategy/README.md (Outdated)

> Possible settings:
>
> * `UseEvictionTimestamp` (Back of queue)
Isn't it the other way round? `UseEvictionTimestamp` -> front of the queue?
I meant `UseEvictionTimestamp` as the timestamp to consider for entry into the queue. Because the eviction would have occurred more recently than the initial `.metadata.creationTimestamp`, this would put the Workload at the "back of the queue" immediately following eviction. If you read this differently, perhaps it would be good to change the terminology.
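To illustrate the intent, here is a minimal sketch of the timestamp selection I have in mind (the names are hypothetical, not Kueue's actual code):

```go
// Hypothetical sketch only: shows which timestamp would drive queue ordering
// under each setting. Names are made up for illustration.
package sketch

import "time"

type workloadInfo struct {
	CreationTime     time.Time
	LastEvictionTime time.Time // zero if the workload has never been evicted
}

// entryTimestamp returns the time used to order the workload in the queue.
// With UseEvictionTimestamp, a freshly evicted workload carries a timestamp
// newer than that of any workload waiting since creation, so it sorts behind
// them ("back of the queue").
func entryTimestamp(w workloadInfo, useEvictionTimestamp bool) time.Time {
	if useEvictionTimestamp && !w.LastEvictionTime.IsZero() {
		return w.LastEvictionTime
	}
	return w.CreationTime
}
```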
/release-note-none
keps/1282-requeue-strategy/README.md (Outdated)

> > The case of preemption might be more tricky: we want the preempted jobs to be readmitted as soon as possible. But if a job is waiting for more than one job to be preempted and the queueing strategy is BestEffortFIFO, we don't want the preempted pods to take the head of the queue.
> > Maybe we need to hold them until the preemptor job is admitted, and then they should use the regular priority.
>
> The `pkg/queue` package could have the existing `queueOrderingFunc()` modified to add sorting based on who is the preemptor (might need to add a condition to the Workload to track this).
It looks like my concern was unfounded.
We handle this problem here:
    return cq.requeueIfNotPresent(wInfo, reason == RequeueReasonFailedAfterNomination || reason == RequeueReasonPendingPreemption)
Basically, if the workload has preempted any workloads, it's put back in the queue immediately.
Since nothing changes about it and it will have a higher priority than the preempted workloads, it should be at the head of the queue.
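For reference, the ordering this relies on is roughly the following (a simplified sketch that assumes priority is compared before the timestamp; it is not the actual `queueOrderingFunc()`):

```go
// Simplified sketch of a priority-then-timestamp ordering; not Kueue's code.
package sketch

import "time"

type queuedWorkload struct {
	Priority  int32
	Timestamp time.Time // creation or eviction time, per the configured strategy
}

// less reports whether a should be dequeued before b. Priority is compared
// first, so a requeued preemptor with higher priority stays ahead of the
// lower-priority workloads it evicted, regardless of which timestamp those
// workloads carry.
func less(a, b queuedWorkload) bool {
	if a.Priority != b.Priority {
		return a.Priority > b.Priority
	}
	return a.Timestamp.Before(b.Timestamp)
}
```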
keps/1282-requeue-strategy/README.md (Outdated)

> # ...
> requeueStrategy:
>   priorityPreemption: UseEvictionTimestamp
>   podsReadyTimeout: UseCreationTimestamp
We consider multiple aspects for sorting, the timestamp being one of them. Having `UseCreationTimestamp` as the name of the option seems to imply that we ignore any other aspects.
Maybe it should be something like:

    requeueStrategy:
      podsReadyTimeout:
        timestampSource: Creation | Eviction

That is assuming we are going to offer configuration for every eviction mechanism; otherwise, I would put the setting under the existing `waitForPodsReady`:

    waitForPodsReady:
      requeuingTimestamp: Creation | Eviction
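To make the second option concrete, the field could look roughly like this in the Configuration API types (a hypothetical Go sketch; the names, placement, and JSON tag are assumptions, not the final API):

```go
// Hypothetical sketch of where the proposed field could live in the
// Configuration API. Names and placement are assumptions for illustration.
package config

// RequeuingTimestamp selects which timestamp orders a workload when it is
// requeued after an eviction.
type RequeuingTimestamp string

const (
	// CreationTimestamp orders requeued workloads by .metadata.creationTimestamp.
	CreationTimestamp RequeuingTimestamp = "Creation"
	// EvictionTimestamp orders requeued workloads by the time of their last eviction.
	EvictionTimestamp RequeuingTimestamp = "Eviction"
)

// WaitForPodsReady sketches only the new field; the existing fields of this
// section are omitted here.
type WaitForPodsReady struct {
	// RequeuingTimestamp selects the timestamp used to order a workload that
	// is requeued after a PodsReady timeout eviction.
	RequeuingTimestamp *RequeuingTimestamp `json:"requeuingTimestamp,omitempty"`
}
```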
I'm struggling to grasp how varying eviction methods could result in distinct types of queue sorting. Can you share some cases?
keps/1282-requeue-strategy/README.md (Outdated)

> #### Integration tests
>
> - Add integration test that matches user story 1.
> - Add an integration test to detect if flapping associated with preempted workloads being readmitted before the preemptor workload when `priorityPreemption: UseEvictionTimestamp` is set.
Suggested change:

- Add an integration test to detect if flapping associated with preempted workloads being readmitted before the preemptor workload when `priorityPreemption: UseEvictionTimestamp` is set.
+ Add an integration test to detect if flapping associated with preempted workloads being readmitted before the preemptor workload when `requeuingTimestamp: Creation` is set.

?
Good catch, updated.
keps/1282-requeue-strategy/README.md (Outdated)

> ## Drawbacks
>
> When used with `StrictFIFO`, the `requeuingTimestamp: Creation` (front of queue) policy could lead to a blocked queue. This was called out in the issue that set the hardcoded [back-of-queue behavior](https://github.com/kubernetes-sigs/kueue/issues/599). This could be mitigated by recommending administrators select `BestEffortFIFO` when using this setting.
There are other drawbacks. For example, if you want to use waitForPodsReady to prevent jobs with invalid images from blocking the queue, then you would continuously block the queue until we have a proper mechanism to fail pods in this situation: kubernetes/kubernetes#122300
Good callout, added a bullet point.
/lgtm
/approve
LGTM label has been added. Git tree hash: b0a5586f1b484b10a88ae5316c68c95ca66dce39
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, nstogner

The full list of commands accepted by this bot can be found here. The pull request process is described here.
@alculquicondor @nstogner I apologize for being late with my comment. I guess that infinite requeueing could happen when node resources are balanced (which means not bin-packing).

What would you like to happen when the threshold is reached? In a way, it seems independent of what place in the queue the workload should have after being requeued.

When we use …

I think it's simpler to abandon the job, but again, it seems like a separate feature. But if you have a proposal based on a user story, maybe you can send an update to this KEP?
Yes, sure. I can update this KEP.
@tenzen-y I was talking about this with @mwielgus, and we came to the conclusion that it might be better to introduce a form of backoff: for some period of time, the workload doesn't enter the queue. And the backoff could be exponential. This might be better than reducing priority, as that would make the workload more susceptible to preemptions once admitted.

@alculquicondor Totally SGTM. Will we work on the backoff mechanism in the first iteration? Also, I still think we should introduce a timeout mechanism, as I said above. For example, we could avoid jobs without proper image credentials staying in the queue forever. WDYT?
I think it's doable. Let me get back to you tomorrow.
It becomes less important once we have an exponential backoff. Maybe we can add a max number of retries to the backoff, after which the Workload is considered failed.
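A minimal sketch of how an exponential backoff with a retry cap could behave (all names and constants here are assumptions for illustration, not a settled design):

```go
// Hypothetical sketch of an exponential requeue backoff with a retry cap.
// The parameters and the failure handling are assumptions, not Kueue's API.
package sketch

import "time"

const (
	baseDelay  = 10 * time.Second
	maxDelay   = 10 * time.Minute
	maxRetries = 10
)

// nextRequeue returns how long to keep a workload out of the queue after its
// n-th requeue, and whether it should instead be marked failed because the
// retry budget is exhausted.
func nextRequeue(retries int) (delay time.Duration, failed bool) {
	if retries >= maxRetries {
		return 0, true
	}
	delay = baseDelay << uint(retries) // exponential: baseDelay * 2^retries
	if delay > maxDelay {
		delay = maxDelay
	}
	return delay, false
}
```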
That makes sense. I prefer …

@alculquicondor Is there any progress on the iterations for the backoff mechanism?

@nstogner will first work on what is already designed and then he will work on a backoff mechanism. Hopefully we can have both for the release. If you have some time to finalize the design, feel free to open a PR.

That makes sense. Thanks!

I will update this KEP for the backoff mechanism.
* Add kep-1282
* Update toc
* Update based on comments
* Address comments
What type of PR is this?
/kind documentation
What this PR does / why we need it:
Document how we could add support for configurable queue ordering strategies upon eviction.
Which issue(s) this PR fixes:
Part of #1282
Special notes for your reviewer:
Does this PR introduce a user-facing change?
NONE