Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow skipping over tasks in schedulingQueue if wait limits hit #23470

Conversation

losipiuk
Copy link
Member

@losipiuk losipiuk commented Sep 17, 2024

Description

If we are iterating over tasks in scheduling queue we may see that
there are already many tasks waiting for node lease for given requirements.
In such case we are skipping a task and move to the next one, so one node does not block
scheduling other tasks in queue which may have different node requirements.

We are limiting how many tasks may be skipped to 20. This is to avoid pessimistic case that we
are iterating over whole scheduling queue on each change in the cluster when it is highly utilized.

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Sep 17, 2024
@github-actions github-actions bot added the docs label Sep 17, 2024
@losipiuk losipiuk requested review from wendigo and sopel39 September 17, 2024 22:27
@losipiuk
Copy link
Member Author

I have a suspicion it could be written more simply

@losipiuk losipiuk force-pushed the lukaszos/allow-skipping-over-tasks-in-schedulingqueue-if-wait-limits-hit-f2b8e5 branch 2 times, most recently from 45ae8f2 to 7f0ce9a Compare September 18, 2024 16:16
@@ -1913,6 +1905,14 @@ public Set<Map.Entry<ScheduledTask, PreSchedulingTaskContext>> listContextEntrie
return contexts.entrySet();
}

private long getWaitingForNodeTasksCount(TaskExecutionClass executionClass)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: getTasksWaitingForNodeCount

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets split into two methods later and Tasks is dropped from the name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I did some renames. PTAL

@losipiuk losipiuk force-pushed the lukaszos/allow-skipping-over-tasks-in-schedulingqueue-if-wait-limits-hit-f2b8e5 branch from 7f0ce9a to ba98dc3 Compare September 19, 2024 10:27
@losipiuk losipiuk requested a review from wendigo September 19, 2024 10:27
If we are iterating over tasks in scheduling queue we may see that
thera are already many tasks waiting for node lease for given requirements.
In such case we are skipping a task and move to the next one, so one node does not block
scheduling other tasks in queue which may have different node requirements.

We are limiting how many tasks may be skipped to 20. This is to avoid pessimistic case that we
are iterating over whole scheduling queue on each change in the cluster when it is highly utilized.
@losipiuk losipiuk force-pushed the lukaszos/allow-skipping-over-tasks-in-schedulingqueue-if-wait-limits-hit-f2b8e5 branch from ba98dc3 to f3d7900 Compare September 19, 2024 12:55
@losipiuk
Copy link
Member Author

losipiuk commented Oct 7, 2024

Replaced with #23562

@losipiuk losipiuk closed this Oct 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants