scheduler: set job on system stack for CSI feasibility check #15372

Merged Nov 23, 2022 (1 commit)

@tgross (Member) commented Nov 23, 2022

Fixes #15094

When the scheduler checks the feasibility of each node, it creates a "stack" that carries the attributes of the job and task group being checked. The system and sysbatch schedulers use a different stack than service and batch jobs. That stack was missing the calls that set the job ID and namespace for the CSI check, which prevented CSI volumes from being scheduled for system jobs whenever the volume was in a non-default namespace.

This PR sets the job ID and namespace on the system stack to match the generic scheduler.
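
To illustrate the shape of the bug, here is a minimal, self-contained sketch. It is not the Nomad source: `Job`, `CSIVolumeChecker`, `SystemStack`, `SetJobBuggy`, and `newSystemStack` are hypothetical stand-ins, and the fallback of an unset namespace to `default` is an assumption chosen to match the symptom described above (only non-default namespaces were affected).

```go
package main

import "fmt"

// Job holds only the fields this sketch needs (hypothetical type).
type Job struct {
	ID        string
	Namespace string
}

// volumeKey identifies a volume by namespace and volume ID.
type volumeKey struct {
	Namespace string
	ID        string
}

// CSIVolumeChecker is a hypothetical stand-in for the CSI feasibility check.
type CSIVolumeChecker struct {
	namespace string
	jobID     string // stored for parity with the fix; unused in this sketch
	volumes   map[volumeKey]bool
}

func (c *CSIVolumeChecker) SetNamespace(ns string) { c.namespace = ns }
func (c *CSIVolumeChecker) SetJobID(id string)     { c.jobID = id }

// Feasible reports whether the volume exists in the checker's namespace.
// Assumption: an unset namespace falls back to "default".
func (c *CSIVolumeChecker) Feasible(volumeID string) bool {
	ns := c.namespace
	if ns == "" {
		ns = "default"
	}
	return c.volumes[volumeKey{Namespace: ns, ID: volumeID}]
}

// SystemStack models the stack used by system and sysbatch jobs.
type SystemStack struct {
	csi *CSIVolumeChecker
}

// SetJobBuggy models the pre-fix behavior: the job's ID and namespace
// never reach the CSI checker.
func (s *SystemStack) SetJobBuggy(job *Job) {}

// SetJob models the fix: pass the job ID and namespace to the CSI checker.
func (s *SystemStack) SetJob(job *Job) {
	s.csi.SetNamespace(job.Namespace)
	s.csi.SetJobID(job.ID)
}

// newSystemStack builds a stack that knows one volume in the "prod" namespace.
func newSystemStack() *SystemStack {
	return &SystemStack{csi: &CSIVolumeChecker{
		volumes: map[volumeKey]bool{
			{Namespace: "prod", ID: "shared-data"}: true,
		},
	}}
}

func main() {
	job := &Job{ID: "log-shipper", Namespace: "prod"}

	before := newSystemStack()
	before.SetJobBuggy(job)
	fmt.Println("before fix:", before.csi.Feasible("shared-data"))

	after := newSystemStack()
	after.SetJob(job)
	fmt.Println("after fix:", after.csi.Feasible("shared-data"))
}
```

In this sketch the pre-fix stack looks the volume up in `default` and reports it infeasible, while the fixed stack looks in `prod` and finds it.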

@tgross force-pushed the b-csi-set-namespace-and-job-id-system-stack branch from 405a0e1 to 637fcb2 November 23, 2022 20:42
@tgross added this to the 1.4.4 milestone Nov 23, 2022
@tgross force-pushed the b-csi-set-namespace-and-job-id-system-stack branch from 637fcb2 to e4b4336 November 23, 2022 20:58
@tgross (Member, Author) commented Nov 23, 2022

This is currently failing the linter check because of #15373, but once that's merged I can rebase this on main (or we can merge as-is if everything else is green). Edit: fixed.

@lgfa29 (Contributor) left a comment

Oh nice catch! I was investigating this, but got sidetracked and didn't have the chance to get back to it. Thanks for taking it over!

@tgross added the backport/1.2.x, backport/1.3.x, and backport/1.4.x labels Nov 23, 2022
@tgross merged commit 8018dd0 into main Nov 23, 2022
@tgross deleted the b-csi-set-namespace-and-job-id-system-stack branch November 23, 2022 21:47
tgross added a commit that referenced this pull request Nov 23, 2022
…#15376)
tgross added a commit that referenced this pull request Nov 23, 2022
…#15376) (#15376)
Co-authored-by: Tim Gross <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 24, 2023
Labels: backport/1.2.x, backport/1.3.x, backport/1.4.x, theme/storage, type/bug
Development

Successfully merging this pull request may close these issues.

sysbatch/system type jobs fail to be scheduled when using a multi-node-multi-writer CSI volume