Add support for pod groups #1319
Conversation
Skipping CI for Draft Pull Request.
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
cc @trasc
/test all
Force-pushed from d59e250 to 9e63490.
/test all
```go
	}); err != nil {
		return err
	}

	return jobframework.SetupWorkloadOwnerIndex(ctx, indexer, gvk)
}

func (p *Pod) Finalize(ctx context.Context, c client.Client) error {
```
Finalizers are tricky.
What sometimes happens is that you can enter this Finalize function before all the pods are there, and we are left with a Pod with a stuck finalizer.
But if we see this happen in integration tests, we can fix it in a follow-up.
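For context, a minimal sketch of what a group-aware Finalize can look like, and where the stuck-finalizer risk comes from. The label key, finalizer name, and the `Pod` wrapper fields here are assumptions for illustration, not necessarily what this PR uses:

```go
package pod

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Assumed names; the PR may use different constants.
const (
	GroupNameLabel = "kueue.x-k8s.io/pod-group-name"
	PodFinalizer   = "kueue.x-k8s.io/managed"
)

// Pod is a stand-in for the PR's composable-job wrapper around a pod.
type Pod struct {
	pod corev1.Pod
}

// Finalize strips the finalizer from every pod currently in the group.
// The failure mode discussed above: pods that haven't been created yet
// simply don't appear in the list, so a partially created group can
// still end up with a stuck finalizer on the pods that do exist.
func (p *Pod) Finalize(ctx context.Context, c client.Client) error {
	var podsInGroup corev1.PodList
	if err := c.List(ctx, &podsInGroup,
		client.InNamespace(p.pod.Namespace),
		client.MatchingLabels{GroupNameLabel: p.pod.Labels[GroupNameLabel]},
	); err != nil {
		return err
	}
	for i := range podsInGroup.Items {
		pod := &podsInGroup.Items[i]
		if controllerutil.RemoveFinalizer(pod, PodFinalizer) {
			if err := c.Update(ctx, pod); err != nil {
				return err
			}
		}
	}
	return nil
}
```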
```go
	var resultPodSets []kueue.PodSet

	for _, podInGroup := range podsInGroup.Items {
```
we need to ignore the Pods with phase Failed.
Actually, we can't fully ignore the Pods with phase Failed, otherwise we would think that the Workload object no longer matches.
One possible workaround is to ignore the Failed pods, but change the logic of equivalentToWorkload a little, to allow the number of pods to be lower than the counter saved in the Workload object.
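Roughly, the relaxed check could look like this. A sketch only: the real `equivalentToWorkload` presumably also compares pod specs per PodSet; only the counting rule under discussion is shown here:

```go
package pod

import (
	corev1 "k8s.io/api/core/v1"
	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

// equivalentToWorkload (sketch): treat Failed pods as gone, but accept a
// live-pod count lower than the counter stored in the Workload, so a
// group waiting for replacement pods still matches its Workload.
func equivalentToWorkload(pods []corev1.Pod, wl *kueue.Workload) bool {
	active := 0
	for i := range pods {
		if pods[i].Status.Phase != corev1.PodFailed {
			active++
		}
	}
	saved := 0
	for _, ps := range wl.Spec.PodSets {
		saved += int(ps.Count)
	}
	// Fewer live pods than recorded is acceptable (failed pods awaiting
	// replacement); more than recorded is not.
	return active <= saved
}
```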
Changed this behavior in 09c09ce
Force-pushed from 92a15d7 to 676cecf.
Force-pushed from 0d878c0 to 4c035ab.
```go
}

if wl != nil && p.isGroup {
	if evCond := apimeta.FindStatusCondition(wl.Status.Conditions, kueue.WorkloadEvicted); evCond != nil && evCond.Status == metav1.ConditionTrue {
```
why do we need to check for WorkloadEvicted inside Stop? Wouldn't Stop only be called when the workload is evicted or unadmitted?
Stop could be called when:
In all of those cases, we should issue Pod deletes, if the Pods don't already have DeletionTimestamps.
That's what we're doing. On the other hand, if the workload is evicted and all pods are stopped, or at least one pod is running, we don't stop the group.
Ok, the logic is slightly off:
- If the wl is deleted or it doesn't exist: issue deletes for everything, if we didn't already.
- Otherwise (eviction): delete anything that isn't suspended. There is no need to return an error if there is a running Pod, as we won't remove the reservation yet:
`if !job.IsActive() {`
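In code, the suggested shape might be something like this, continuing the assumed wrapper from the sketch above. The `p.list` field, the scheduling-gate name, and the `hasSchedulingGate` helper are illustrative assumptions, not the PR's actual API:

```go
// Stop (sketch): the two branches described in the comment above.
func (p *Pod) Stop(ctx context.Context, c client.Client, wl *kueue.Workload) (bool, error) {
	for i := range p.list.Items {
		pod := &p.list.Items[i]
		if !pod.DeletionTimestamp.IsZero() {
			continue // a delete was already issued for this pod
		}
		if wl == nil || !wl.DeletionTimestamp.IsZero() {
			// Workload deleted or missing: delete everything.
			if err := c.Delete(ctx, pod); err != nil {
				return false, err
			}
			continue
		}
		// Eviction: delete anything that isn't suspended. A still-running
		// pod is not an error; the quota reservation is kept until the
		// pods actually terminate.
		if !hasSchedulingGate(pod) {
			if err := c.Delete(ctx, pod); err != nil {
				return false, err
			}
		}
	}
	return true, nil
}

// hasSchedulingGate is a stand-in for however the PR detects a suspended
// (gated) pod; the gate name is an assumption.
func hasSchedulingGate(pod *corev1.Pod) bool {
	for _, g := range pod.Spec.SchedulingGates {
		if g.Name == "kueue.x-k8s.io/admission" {
			return true
		}
	}
	return false
}
```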
```go
		return true, nil
	}
} else {
	if p.isStopped() {
```
We can't remove the Workload or Pod finalizers even if all the pods have a deletion timestamp.
Otherwise the Workload would be removed and it wouldn't be possible to send replacement pods.
The finalizers are removed here only if the workload is nil and all the pods have a deletion timestamp. There's nothing to replace if the workload has already been removed.
But this code needs a change anyway: FindMatchingWorkloads could return some workloads to delete, and in that case we shouldn't finalize the pods.
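A guard along those lines, assuming `FindMatchingWorkloads` returns the match plus a list of Workloads to delete, as the comment implies (sketch, not the PR's actual code):

```go
// Only strip the finalizers when nothing is left to reconcile:
// no matching Workload and no Workloads still pending deletion.
match, toDelete, err := jobframework.FindMatchingWorkloads(ctx, c, p)
if err != nil {
	return err
}
if match == nil && len(toDelete) == 0 && p.isStopped() {
	if err := p.Finalize(ctx, c); err != nil {
		return err
	}
}
```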
Ah gotcha. However, it looks like we are duplicating code. We already finalize the job in the following scenarios:
- If the workload is finished: `if err := r.finalizeJob(ctx, job); err != nil {`
- If there is no matching Workload: `if err := r.finalizeJob(ctx, job); err != nil {`
So it looks like this logic is unnecessary here.
This code is there for the case when the wl is deleted.
The first reconcile will hit this branch, stop the job, finalize the wl, and return:
`if wl != nil && !wl.DeletionTimestamp.IsZero() {`
The second reconcile will finalize the pods after the ComposableJob.Load call.
We could finalize the pods on the first reconcile, but I don't think it's right from the interface perspective: Stop != Delete for the generic job; that's only true for pods. That's why the logic to finalize the pods when wl is nil lives in ComposableJob.Load and not in the generic reconciler.
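A sketch of that division of labor, in the same assumed package as the earlier sketches. `Load`'s real signature in the PR, the `finalizePods` return, and `findWorkloadForGroup` are all hypothetical here:

```go
// Reconcile #1: wl exists but has a deletion timestamp. The generic
// reconciler hits `if wl != nil && !wl.DeletionTimestamp.IsZero()`,
// stops the job, and finalizes the Workload.
//
// Reconcile #2: the Workload is gone. ComposableJob.Load notices and
// the pod finalizers are removed from there, so Stop keeps meaning
// "stop", not "delete", for every other job type.
func (p *Pod) Load(ctx context.Context, c client.Client, key types.NamespacedName) (finalizePods bool, err error) {
	if err := c.List(ctx, &p.list, client.InNamespace(key.Namespace),
		client.MatchingLabels{GroupNameLabel: key.Name}); err != nil {
		return false, err
	}
	wl, err := findWorkloadForGroup(ctx, c, key) // hypothetical helper
	if err != nil {
		return false, err
	}
	// No Workload left: tell the caller to finalize the pods.
	return wl == nil, nil
}
```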
We could finalize the pods on the first reconcile.
Please let me know what you think about this idea.
I see what you mean for the case when the workload is deleted. However, the case when the Workload is finished is already covered.
Then, we need a single line:
`return wl != nil && p.isStopped(), nil`
And accompany it with a comment about why we need to finalize in that case.
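For instance, one possible wording for that comment, with the reasoning lifted from this thread (a sketch, not the final code):

```go
// The wl == nil case is handled in ComposableJob.Load, which finalizes
// the pods when no Workload is left. Here we only report done when a
// Workload still exists and every pod in the group has stopped, so the
// generic reconciler can finalize the job.
return wl != nil && p.isStopped(), nil
```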
```go
	},
	workloadCmpOpts: defaultWorkloadCmpOpts,
},
"workload is not deleted if one of the pods in the finished group is deleted": {
```
if ALL of the pods are deleted (have a finalizer and a deletionTimestamp)
```go
	wantWorkloads:   []kueue.Workload{},
	workloadCmpOpts: defaultWorkloadCmpOpts,
	deleteWorkloads: true,
},
```
Some of the added tests are not accurate. Left some comments.
```go
gomega.Consistently(func(g gomega.Gomega) {
	g.Expect(k8sClient.Get(ctx, pod1LookupKey, createdPod1)).To(gomega.Succeed())
	g.Expect(k8sClient.Get(ctx, pod2LookupKey, createdPod2)).To(gomega.Succeed())
}, util.ConsistentDuration, util.Interval).Should(gomega.Succeed())
```
No need for this. But we might want to check that the Pods get a DeletionTimestamp.
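For example, with the same helpers the test already uses (a sketch; `util.Timeout` is assumed to be the suite's usual timeout constant):

```go
// Instead of Consistently asserting the pods still exist, assert that
// they have been marked for deletion. With finalizers in place, Get
// still succeeds while the DeletionTimestamp is set.
gomega.Eventually(func(g gomega.Gomega) {
	g.Expect(k8sClient.Get(ctx, pod1LookupKey, createdPod1)).To(gomega.Succeed())
	g.Expect(createdPod1.DeletionTimestamp.IsZero()).To(gomega.BeFalse())
	g.Expect(k8sClient.Get(ctx, pod2LookupKey, createdPod2)).To(gomega.Succeed())
	g.Expect(createdPod2.DeletionTimestamp.IsZero()).To(gomega.BeFalse())
}, util.Timeout, util.Interval).Should(gomega.Succeed())
```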
```go
ginkgo.By("creating the replacement pod and readmitting the workload will unsuspend the replacement", func() {
	gomega.Expect(k8sClient.Create(ctx, replacementPod)).Should(gomega.Succeed())

	gomega.Expect(k8sClient.Get(ctx, wlLookupKey, createdWorkload)).To(gomega.Succeed())
```
worth checking that this is the same Workload that was initially created. The UID should match
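For example (sketch):

```go
// Capture the UID when the Workload is first created...
originalUID := createdWorkload.UID

// ...and after readmitting, assert the re-fetched Workload is the same
// object, not a deleted-and-recreated one.
gomega.Expect(k8sClient.Get(ctx, wlLookupKey, createdWorkload)).To(gomega.Succeed())
gomega.Expect(createdWorkload.UID).To(gomega.Equal(originalUID))
```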
/approve
Ready for squash :)
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: achernevskii, alculquicondor. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Force-pushed from ca3354b to c264f4c.
/lgtm
LGTM label has been added. Git tree hash: 9ddd6799a46ce66840939683fe2b40f79063d2ec
* Introduce new ComposableJob interface for jobs which have to be composed of different API objects.
* Add custom get: a composable job can get all of its elements at the beginning of the reconcile.
* Add ComposableJob implementation for pod groups.
* Add webhook checks for pod group labels and annotations.
* Update the Finished method for pod groups.
* The IsSuspended and Stop methods of the pod controller now interact with all the pods at once.
* Update the IsActive function to check whether at least one pod in the group is running.
* Change the podSuspended method.
* Skip stopping pods in the group that already have a deletion timestamp.
* Add IsComposableJobActive.
* Add UnretryableError, an error that doesn't require a reconcile retry.
* Add a ValidateLabelAsCRDName call for the pod-group label and make the label immutable.
* Add unit tests for the pod group integration.
Force-pushed from c264f4c to 4f3eb4f.
/lgtm
LGTM label has been added. Git tree hash: d7e27626543786f8d09179a504306d418cb4d139
What type of PR is this?
/kind feature
What this PR does / why we need it:
Introduce a new ComposableJob interface for jobs which have to be composed of different API objects.
Add ComposableJob implementation for pod groups.
Add webhook checks for pod group labels and annotations.
Which issue(s) this PR fixes:
Related issue: #976
Special notes for your reviewer:
Does this PR introduce a user-facing change?