Keep adjusted workload resources in sync with limitRanges and runtimeClasses #653

Merged

Conversation

trasc
Contributor

@trasc trasc commented Mar 20, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

Keeps the adjusted workload resource needs in sync with the cluster context (LimitRanges and RuntimeClasses). The adjusted needs are computed from the workload's spec, any applicable cluster RuntimeClass definition, and any applicable namespace LimitRange definition.

In addition, the admission status struct now records the workload's resource needs at admission time; these values are the source for computing the resource needs of admitted workloads.
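As a rough illustration of the adjustment described above, the sketch below computes a pod set's total requests from its container requests, LimitRange-style default requests, and a RuntimeClass-style per-pod overhead. Everything in it is an assumption made for illustration: `Resources`, `PodSet`, `applyDefaults`, and `AdjustedUsage` are hypothetical names, plain integers stand in for `corev1.ResourceList` quantities, and the real handling in Kueue covers more cases than this.

```go
package main

import "fmt"

// Resources maps a resource name to an integer quantity (for example CPU in
// millicores). It is a simplified stand-in for corev1.ResourceList.
type Resources map[string]int64

// PodSet is a hypothetical, trimmed-down pod set: the requests of each
// container in the pod template and the number of pods.
type PodSet struct {
	Count      int64
	Containers []Resources
}

// applyDefaults returns the container's requests with LimitRange-style
// defaults filled in. Requests set explicitly on the container win.
func applyDefaults(container, defaults Resources) Resources {
	out := Resources{}
	for name, q := range defaults {
		out[name] = q
	}
	for name, q := range container {
		out[name] = q
	}
	return out
}

// AdjustedUsage returns the total requests of the pod set:
// (sum of defaulted container requests + per-pod overhead) * Count.
func AdjustedUsage(ps PodSet, defaults, overhead Resources) Resources {
	perPod := Resources{}
	for _, c := range ps.Containers {
		for name, q := range applyDefaults(c, defaults) {
			perPod[name] += q
		}
	}
	for name, q := range overhead {
		perPod[name] += q
	}
	total := Resources{}
	for name, q := range perPod {
		total[name] = q * ps.Count
	}
	return total
}

func main() {
	ps := PodSet{
		Count: 3,
		Containers: []Resources{
			{"cpu": 500}, // explicit request
			{},           // no request, so the default applies
		},
	}
	defaults := Resources{"cpu": 200} // assumed LimitRange default request
	overhead := Resources{"cpu": 100} // assumed RuntimeClass pod overhead

	// Per pod: 500 + 200 + 100 = 800; for 3 pods: 2400.
	fmt.Println(AdjustedUsage(ps, defaults, overhead)) // prints map[cpu:2400]
}
```

Storing the result at admission time, as the description says, means a later change to the defaults or the overhead does not silently alter what an already admitted workload is counted as using.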

Which issue(s) this PR fixes:

Fixes #611

Special notes for your reviewer:

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. labels Mar 20, 2023
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and kerthcet March 20, 2023 13:16
@netlify

netlify bot commented Mar 20, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit a956f88
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/641c6b8849b9bd0008234cc1

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 20, 2023
@k8s-ci-robot
Contributor

Hi @trasc. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 20, 2023
@alculquicondor
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 20, 2023
@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch from 0490283 to ff53e85 Compare March 20, 2023 19:39
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 20, 2023
@trasc trasc marked this pull request as ready for review March 20, 2023 19:39
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 20, 2023
@trasc
Contributor Author

trasc commented Mar 20, 2023

/cc @alculquicondor
/cc @mwielgus

@k8s-ci-robot k8s-ci-robot requested a review from mwielgus March 20, 2023 19:40
apis/kueue/v1beta1/workload_types.go (outdated, resolved)
apis/kueue/v1beta1/workload_types.go (outdated, resolved)
@@ -71,6 +71,12 @@ type PodSetFlavors struct {

// Flavors are the flavors assigned to the workload for each resource.
Flavors map[corev1.ResourceName]ResourceFlavorReference `json:"flavors,omitempty"`
Contributor


We still have time to rename PodSetFlavors to something else, given that it's no longer just about flavors.

Maybe PodSetAssignments is a bit more generic?

And we could have a single list holding both the quantity and the flavor, instead of two maps. I'm not too sure about this one. How does the calculation of workload.Info change if we do this?

Contributor Author


The first part is done.

For the second one, it would mean adding two additional conversion methods, and the result would look strange. Also, since ResourceList is "standard", I'd keep it as is. (The time it would take to rework the tests again might not be negligible, either.)
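To make the trade-off in this thread concrete, here is a minimal sketch of the two shapes and the pair of conversion methods the reply refers to. It is an assumption-laden illustration, not code from the PR: `TwoMaps`, `Assignment`, `ToList`, and `FromList` are hypothetical names, and plain `string` and `int64` values stand in for `ResourceFlavorReference` and `corev1.ResourceList` quantities.

```go
package main

import (
	"fmt"
	"sort"
)

// TwoMaps mirrors the shape kept in the PR: one map for flavors and one for
// quantities, both keyed by resource name.
type TwoMaps struct {
	Flavors       map[string]string
	ResourceUsage map[string]int64
}

// Assignment is the hypothetical single-list alternative: one entry per
// resource holding both the flavor and the quantity.
type Assignment struct {
	Resource string
	Flavor   string
	Quantity int64
}

// ToList converts the two-map form into a list sorted by resource name.
func ToList(m TwoMaps) []Assignment {
	out := make([]Assignment, 0, len(m.Flavors))
	for res, flavor := range m.Flavors {
		out = append(out, Assignment{
			Resource: res,
			Flavor:   flavor,
			Quantity: m.ResourceUsage[res],
		})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Resource < out[j].Resource })
	return out
}

// FromList converts the list form back into two maps.
func FromList(list []Assignment) TwoMaps {
	m := TwoMaps{
		Flavors:       map[string]string{},
		ResourceUsage: map[string]int64{},
	}
	for _, a := range list {
		m.Flavors[a.Resource] = a.Flavor
		m.ResourceUsage[a.Resource] = a.Quantity
	}
	return m
}

func main() {
	m := TwoMaps{
		Flavors:       map[string]string{"cpu": "on-demand", "memory": "spot"},
		ResourceUsage: map[string]int64{"cpu": 2400, "memory": 512},
	}
	list := ToList(m)
	fmt.Println(list)
	fmt.Println(FromList(list).ResourceUsage["cpu"])
}
```

With the list form, any code that wants to add or subtract quantities with the usual resource-list helpers has to go through a conversion like `FromList` first, which is the extra step the reply argues against.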

@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch 2 times, most recently from ff0c7e3 to 36f230d Compare March 21, 2023 15:02
apis/kueue/v1beta1/workload_types.go (outdated, resolved)
apis/kueue/v1beta1/workload_types.go (outdated, resolved)
pkg/workload/workload.go (outdated, resolved)
pkg/controller/core/workload_controller.go (outdated, resolved)
pkg/controller/core/workload_controller.go (outdated, resolved)
pkg/controller/core/indexer/indexer.go (outdated, resolved)
pkg/controller/core/indexer/indexer.go (outdated, resolved)
pkg/cache/snapshot_test.go (outdated, resolved)
pkg/scheduler/preemption/preemption_test.go (outdated, resolved)
@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch 2 times, most recently from f2a40c7 to 7c4469a Compare March 22, 2023 07:43
@trasc
Contributor Author

trasc commented Mar 22, 2023

/retest

pkg/controller/core/workload_controller.go (outdated, resolved)
pkg/controller/core/workload_controller.go (outdated, resolved)
pkg/controller/core/workload_controller.go (outdated, resolved)
pkg/controller/core/workload_controller.go (outdated, resolved)
test/util/util.go (outdated, resolved)
test/integration/scheduler/workload_controller_test.go (outdated, resolved)
test/integration/scheduler/workload_controller_test.go (outdated, resolved)
test/integration/scheduler/workload_controller_test.go (outdated, resolved)
@alculquicondor
Contributor

just nits

@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch from 77073b5 to 39c14be Compare March 23, 2023 09:39
Contributor

@alculquicondor alculquicondor left a comment


/approve

Just a couple of typos

test/integration/scheduler/workload_controller_test.go (outdated, resolved)
test/integration/scheduler/workload_controller_test.go (outdated, resolved)
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, trasc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 23, 2023
@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch from 39c14be to f9017ca Compare March 23, 2023 14:15
@alculquicondor
Contributor

/label tide/merge-method-squash
/lgtm

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 23, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 23, 2023
@alculquicondor
Contributor

/lgtm cancel

It looks like this is introducing a flaky test. Please investigate.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 23, 2023
@alculquicondor
Contributor

The testgrid at https://testgrid.k8s.io/sig-scheduling#periodic-kueue-test-integration-main looks fine, so I don't think this is a problem in the current test suite.

gomega.Expect(k8sClient.Create(ctx, localQueue)).To(gomega.Succeed())
})
ginkgo.AfterEach(func() {
ginkgo.By("Resource consumption should be 0", func() {
Contributor


Suggested change
- ginkgo.By("Resource consumption should be 0", func() {
+ ginkgo.By("Resource consumption should be 0 and no pending workloads", func() {

//
// Besides what is provided in the podSet's specs, this calculation takes into account
// the LimitRange defaults and RuntimeClass overheads at the moment of admission.
ResourceUsage corev1.ResourceList `json:"resourceUsage,omitempty"`
Contributor


Can you make a PR that includes the field name changes without adding ResourceUsage?

I would like to merge that before cutting the release.

If this PR ends up taking longer, we can leave it to the next release.

@trasc trasc force-pushed the keep_adjusted_workload_resources_in_sync branch from f9017ca to a956f88 Compare March 23, 2023 15:08
@alculquicondor
Contributor

passed 1/3

/test pull-kueue-test-integration-main

@alculquicondor
Contributor

passed 2/3

/test pull-kueue-test-integration-main

/milestone v0.3

@alculquicondor alculquicondor added this to the v0.3 milestone Mar 23, 2023
@alculquicondor
Contributor

passed 3/3

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 23, 2023
@k8s-ci-robot k8s-ci-robot merged commit db8e38f into kubernetes-sigs:main Mar 23, 2023
@trasc trasc deleted the keep_adjusted_workload_resources_in_sync branch April 20, 2023 06:00
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Development

Successfully merging this pull request may close these issues.

Resync the workload resource values upon LimitRange changes
3 participants