Add MaxWaitingTime in case of job starvation of lower priority #754
Comments
What would happen when the workload hits the maxWaitingTime? |
It will be popped out for scheduling whatever the priority. |
You mean it will go to the head of the ClusterQueue? It seems a bit aggressive. Another way this can be modeled is that the priority actually goes up. And how to prevent abuse? |
Only moving the workload to the head of the queue will be pointless when preemption is enabled; it will just be evicted at the next scheduler cycle. |
I like this.
Maybe, ClusterQueue? |
Yes, a bit aggressive, but straightforward if some workloads have a deadline to run. I'm not quite sure about this, though; I think we can wait for more feedback from the community.
But how much to go up each time?
localQueue or clusterQueue both make sense to me.
I think it's hard to answer. This is configurable, but considering that the localQueue should be created by an admin, can we be optimistic about this? |
If resources are insufficient, this is a problem. |
Hi, |
Yes, you can take it if you like, but we haven't reached an agreement about the implementation: a maxWaitingTime, increasing the priority per scheduling cycle, or something else. We should also consider preemption. |
Sure, thanks! I will read this thread in detail and clarify it. |
I would also consider adding an interface somewhere that allows us to implement different policies for ordering/priorities. If you have a concrete use case, that would help us understand what the API can look like. |
Hi, I did some research this weekend and summarized my proposal. I am considering adding the following fields to clusterQueue.spec:
spec:
  timeoutStrategy:
    labels:
      environment: prod
      system: recommendation
    maxWaitingTime: 1hour
    strategy: Prioritize
    addingPriority: 10
Use cases: What do you think? As discussed, since it is set for clusterQueue, developers wouldn't abuse this. |
Maybe we should consider adding a separate object
basePriority: 100
dynamicPolicy:
  delay: 1h
  action: PriorityDelta
  value: 10
Open questions:
Definitely worth a KEP: https://github.com/kubernetes-sigs/kueue/tree/main/keps |
Hi @alculquicondor, Thanks for your response. I have a question.
I think we should consider implementing another action that moves the workload to the head of the ClusterQueue if it times out, which was the initial idea @kerthcet mentioned.
I don't have a strong opinion on this, but I think it's better to update the priority in the object. If so, admins can easily change the priority manually.
I agree. I will create it. |
Yes, I don't think it's realistic that all jobs in a ClusterQueue would have the same behavior.
I'm not too sure, because this workload could be immediately preempted in the next iteration. It's better if its priority changes.
Yes, but also to my point above about not being easily preempted. Can you look around in other queueing systems if they have dynamic priority and how it works? |
Thanks for your explanation! @alculquicondor
I have one question about preemption.
I agree. It's important.
Sure. I will look around and rethink the design based on your explanation. |
Exactly. That's probably desired? Which means we should probably add a |
Sounds good! I wasn't quite clear on the intention behind So, if I understand correctly, |
Almost there. So after 1h, we have |
I see. Thanks. Then I will create a KEP (and also investigate other queueing systems). /assign |
I created a simplified version of this feature request #973 |
I think we can leverage the workload priority to mitigate this issue. The general idea is that when the waiting time ends, we raise the workload priority to a higher value and the job is re-enqueued. In the next scheduling cycle, the job will not be preempted, as designed. cc @B1F030 |
More generally, we will be needing a policy for dynamic priority. Another use case that popped out is to decrease priority based on evictions #1311 (comment) |
For this case, maybe we can be more straightforward here like:
Intuitive and easy to control the job queueing. |
Here I drew a simple design of this mechanism: The And based on the But the second design may be a little complex, and I think we should not expose the What do you think? |
Let's take a step back... do you have a real requirement from an end-user that you can share? What is the behavior they expect, regardless of what the API looks like? To me, the idea of a Job getting the highest priority after some timeout sounds odd. Is the job really the most important of all? Additionally, this implies that the job is potentially never going to be preempted. Is this acceptable? Have you looked into older systems for inspiration? |
The goal is to prevent job starvation, which is also part of fair sharing: a low-priority job will stay pending in the queue forever if higher-priority jobs keep being enqueued. Volcano has a similar design https://github.com/volcano-sh/volcano/blob/master/docs/design/sla-plugin.md, but its mechanism is that, on hitting the max waiting time, it admits the job and tries to reserve resources for it until it is ready. Some of our users are using Volcano, and we hope to provide a smooth migration for them. At the least, I think this demand still sounds reasonable; I hope I'm not the only one. But we can think more about the API design.
This is tricky... but we're somewhat of a preemption-based system. There seems to be no better way than this design, I mean #754 (comment). |
Complex +1. Let's discuss the rationale first. |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/lifecycle froze |
/lifecycle frozen |
We enqueue jobs mainly by priority, which can starve lower-priority jobs. We can provide a mechanism to avoid this.
What would you like to be added:
A new field MaxWaitingTime to avoid job starvation.
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.