You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Add the ability for the leader Pod to be in its own affinity group when using the subgroup feature. For example, when deploying a leader Pod that should be scheduled on a CPU-only VM and worker Pods that should be scheduled on multiple TPU slices:
What would you like to be added:
Add the ability for the leader Pod to be in its own affinity group when using the subgroup feature. For example, when deploying a leader Pod that should be scheduled on a CPU-only VM and worker Pods that should be scheduled on multiple TPU slices:
Currently the leader Pod is put in subgroup 0 which causes it to have the same affinity key as the workers in subgroup 0: https://github.com/kubernetes-sigs/lws/blob/main/pkg/webhooks/pod_webhook.go#L132. This causes the leader Pod in my example to be unscheduable because of the CPU instance type node selectors.
Why is this needed:
To support deploying leader-worker architectures where the leader should be scheduled in separate topologies from the worker groups.
Completion requirements:
An option in
subGroupPolicy
that causes the leader to have its own affinity key.This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.
The text was updated successfully, but these errors were encountered: