Add docs for pod integration #1103

Merged 4 commits on Sep 29, 2023
Changes from 2 commits
1 change: 1 addition & 0 deletions site/content/en/docs/tasks/_index.md
@@ -36,6 +36,7 @@ As a batch user, you can learn how to:
Kueue supports MPIJob v2beta1, PyTorchJob, TFJob, and XGBoostJob.
- [Run a Kueue managed KubeRay RayJob](/docs/tasks/run_rayjobs).
- [Submit Kueue jobs from Python](/docs/tasks/run_python_jobs).
- [Run a Kueue managed plain Pod](/docs/tasks/run_plain_pods).

### Platform developer

81 changes: 81 additions & 0 deletions site/content/en/docs/tasks/run_plain_pods.md
@@ -0,0 +1,81 @@
---
title: "Run A Plain Pod"
date: 2023-09-27
weight: 6
description: >
  Run a Kueue scheduled Pod.
---

This page shows how to leverage Kueue's scheduling and resource management capabilities when running plain Pods.

This guide is for [batch users](/docs/tasks#batch-user) who have a basic understanding of Kueue. For more information, see [Kueue's overview](/docs/overview).

## Before you begin

1. By default, the integration for `v1/pod` is not enabled.
Learn how to [install Kueue with a custom manager configuration](/docs/installation/#install-a-custom-configured-released-version)
and enable the `pod` integration.

Example `integrations` section of the manager configuration with the pod integration enabled:
```yaml
integrations:
  frameworks:
  - "pod"
  podOptions:
    # You can change namespaceSelector to define in which
    # namespaces kueue will manage the pods.
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: NotIn
        values: [ kube-system, kueue-system ]
    # Kueue uses podSelector to manage pods with particular
    # labels. The default podSelector will match all the pods.
    podSelector:
      matchExpressions:
      - key: kueue-job
        operator: In
        values: [ "true", "True", "yes" ]
```

Contributor: Add the top level fields, like kind

Contributor Author: Should I create a minimal working configuration, or just add some of the top level fields?

Should we do the same thing for the following pages?
run_mpijobs.md
run_jobsets.md

Contributor: just the top level fields, to have a good context of where in the configuration integrations fit.

Yes, we should probably do the same for others. You can do that in a follow up.

Contributor Author: Updated example configuration in adfe10d

achernevskii marked this conversation as resolved.

Contributor: Mention that Pods that belong to some CRDs that we manage are excluded from being queued.

And that kueue adds a label to indicate which nodes are managed.

2. Check [Administer cluster quotas](/docs/tasks/administer_cluster_quotas) for details on the initial Kueue setup.
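
As an illustration of how the example `integrations` configuration in step 1 selects Pods, the following sketch shows a Pod that would match both selectors: it lives outside `kube-system` and `kueue-system`, and it carries a `kueue-job` label with one of the listed values. The `default` namespace, the `selected-sleep-` name prefix, and the CPU request are assumptions made for this example only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  # Hypothetical name prefix, chosen for this sketch.
  generateName: selected-sleep-
  # Assumed namespace; any namespace other than kube-system
  # or kueue-system satisfies the example namespaceSelector.
  namespace: default
  labels:
    # Satisfies the example podSelector (kueue-job In ["true", "True", "yes"]).
    kueue-job: "true"
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
  - name: sleep
    image: busybox
    command:
    - sleep
    args:
    - 3s
    resources:
      requests:
        cpu: 1
  restartPolicy: OnFailure
```

With the example selectors, a Pod without the `kueue-job` label, or one in an excluded namespace, would not be selected.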

## Pod definition

When running Pods on Kueue, consider the following aspects:

### a. Queue selection

The target [local queue](/docs/concepts/local_queue) should be specified in the `metadata.labels` section of the Pod configuration.

```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
```
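
The label refers to a LocalQueue that must already exist in the Pod's namespace. A minimal sketch of such a queue could look like the following; the `default` namespace and the `cluster-queue` ClusterQueue name are assumptions for illustration, so substitute the names used in your cluster.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  # Assumed namespace; it should be the namespace the Pod is created in.
  namespace: default
  name: user-queue
spec:
  # Assumed ClusterQueue name; it must refer to an existing ClusterQueue.
  clusterQueue: cluster-queue
```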

### b. Configure the resource needs

The resource needs of the workload can be configured in `spec.containers`.

```yaml
- resources:
    requests:
      cpu: 3
```

### c. Limitations

- A Kueue managed Pod cannot be created in the `kube-system` or `kueue-system` namespaces.

Contributor: Mention that preemption actually terminates and deletes the pod.

When mentioning preemption, link to the documentation related to it.

## Example Pod

Here is a sample Pod that just sleeps for a few seconds:

{{< include "examples/pods-kueue/kueue-pod.yaml" "yaml" >}}

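```sh
# Create the pod. Use `create` rather than `apply`, because the
# manifest sets `generateName`, which `kubectl apply` does not support.
kubectl create -f kueue-pod.yaml
```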
19 changes: 19 additions & 0 deletions site/static/examples/pods-kueue/high-prio-pod.yaml
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: high-prio-1m-sleep
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
  - name: sleep
    image: busybox
    command:
    - sleep
    args:
    - 1m
    resources:
      requests:
        cpu: 9
  restartPolicy: OnFailure
  priorityClassName: high-prio
18 changes: 18 additions & 0 deletions site/static/examples/pods-kueue/kueue-pod.yaml
@@ -0,0 +1,18 @@
apiVersion: v1
kind: Pod
metadata:
  generateName: kueue-sleep-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
  - name: sleep
    image: busybox
    command:
    - sleep
    args:
    - 3s
    resources:
      requests:
        cpu: 3
  restartPolicy: OnFailure
19 changes: 19 additions & 0 deletions site/static/examples/pods-kueue/low-prio-pod.yaml
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: low-prio-1m-sleep
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
  - name: sleep
    image: busybox
    command:
    - sleep
    args:
    - 1m
    resources:
      requests:
        cpu: 9
  restartPolicy: OnFailure
  priorityClassName: low-prio
17 changes: 17 additions & 0 deletions site/static/examples/pods-kueue/prios.yaml
@@ -0,0 +1,17 @@
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-prio
value: 1000000
globalDefault: false
description: "high priority"

---

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-prio
value: 100
globalDefault: true
description: "low priority"