[Hierarchical Cohorts] Define Cohort API #2693
//+kubebuilder:validation:MaxLength=253
//+kubebuilder:validation:Pattern="^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
//
Parent string `json:"parent,omitempty"`
This probably should be a pointer, right?
CQ also defines it like this:
Cohort string `json:"cohort,omitempty"`
I see, an empty string plays the role of no parent. That's fine, consistency is important.
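To make the convention discussed above concrete, here is a minimal sketch of how the empty string can stand for "no parent" while non-empty values are checked against the same length and pattern constraints as the kubebuilder markers. The `CohortSpec` below is a trimmed-down stand-in, and `HasParent` and `ValidateParent` are hypothetical helpers written for illustration; they are not part of Kueue.

```go
package main

import (
	"fmt"
	"regexp"
)

// parentPattern mirrors the kubebuilder Pattern marker quoted in the diff.
var parentPattern = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// maxParentLength mirrors the MaxLength=253 marker.
const maxParentLength = 253

// CohortSpec is a trimmed-down stand-in for the type under review.
type CohortSpec struct {
	Parent string `json:"parent,omitempty"`
}

// HasParent treats the empty string as "no parent" (hypothetical helper).
func (s CohortSpec) HasParent() bool { return s.Parent != "" }

// ValidateParent is a hypothetical helper: an empty Parent is always
// accepted, anything else must satisfy the length and pattern limits.
func ValidateParent(s CohortSpec) error {
	if !s.HasParent() {
		return nil
	}
	if len(s.Parent) > maxParentLength {
		return fmt.Errorf("parent must be at most %d characters, got %d", maxParentLength, len(s.Parent))
	}
	if !parentPattern.MatchString(s.Parent) {
		return fmt.Errorf("parent %q is not a valid name", s.Parent)
	}
	return nil
}

func main() {
	for _, p := range []string{"", "root-cohort", "team.a", "Bad_Name"} {
		fmt.Printf("%q -> %v\n", p, ValidateParent(CohortSpec{Parent: p}))
	}
}
```

Using a plain string with `omitempty` keeps the field consistent with ClusterQueue's `Cohort` field; the cost is that "unset" and "explicitly empty" cannot be told apart, which is acceptable here because both mean the same thing.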
//+kubebuilder:resource:scope=Cluster

// Cohort is the Schema for the cohorts API
type Cohort struct {
I think I would prefer to have some prototype, or at least minimal functionality, implemented before merging the API. It would increase confidence in the API. Still, we can merge it as a separate PR, but it would be good to see the implementation closer on the horizon. WDYT @tenzen-y?
That is possible, but my intention was to keep the PRs as small as possible for reviewability. Also, this is in v1alpha1, and not yet cut into a minor release, so I'd argue we can change it freely for some time.
I would like us to pick one of these approaches:
1. A single big-bang PR containing all implementations and APIs.
2. Multiple small PRs, with the API-change PRs merged in the final phase.
TBH, I would prefer opt 2, since an opt 1 PR is challenging to review.
In the case of opt 2, we should expose APIs in the last phase since it is better not to expose unimplemented APIs.
I will need this type for development of the rest of the features. In this case, should I just define it in pkg for now, and then move it to api later?
> ... my intention was to keep the PRs as small as possible for reviewability.
That's for sure. I was also thinking about merging this PR separately, but only after seeing some PoC implementation, to increase confidence that the API can be released.
> Also, this is in v1alpha1
Good point, and the first iteration of the API looks quite minimal.
The API was discussed and merged in https://github.com/kubernetes-sigs/kueue/tree/main/keps/79-hierarchical-cohorts. I'm not sure that adding a basic implementation to this PR (which has very little to do with the API; it is mostly about scheduling and fitting the workloads) would make it more convincing.
> I guess that you can learn how we should go here: #1714
Yeah, with the caveat that fair sharing only introduced new fields to the API rather than a new API. For a feature that requires a new API, it might be harder to develop its logic without that API merged.
> The API was discussed and merged in https://github.com/kubernetes-sigs/kueue/tree/main/keps/79-hierarchical-cohorts.
Indeed. Given that, and the fact that this is still an alpha API, I would lean toward merging it to reduce the burden of rebases.
The only downside is that if we need to release 0.9 urgently, we will release a hollow alpha API. Are we good with this, @mwielgus @tenzen-y?
> The only downside is that in case we need to release 0.9 urgently we will release hollow alpha API. Are we good with this @mwielgus @tenzen-y ?
That is my primary concern, as I mentioned above.
> In the case of opt 2, we should expose APIs in the last phase since it is better not to expose the unimplemented APIs.
In the case of an urgent minor release, let's revert all PRs related to Hierarchical Cohorts...
I hope that we never face that situation...
> In the case of an urgent minor release, let's revert all PRs related to Hierarchical Cohorts... I hope that we never face that situation...
I guess it will depend on how complete the feature is at the moment of releasing 0.9.
I synced with @gabesaba and we are OK with rolling back the PRs related to the new API if the feature is still vastly unfinished when we release 0.9.
// CohortSpec defines the desired state of Cohort
type CohortSpec struct {
// Parent references the name of the Cohort's parent, if
// any. It satisfies one of three cases:
What about a cycle? What happens in that case?
We disable all members of the Cohort graph. I updated documentation.
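As an illustration of the behavior described in this reply, the sketch below finds every Cohort whose parent chain never reaches a root, which covers both the Cohorts on a cycle and those hanging off it. This is only an assumption about how such a check could look; the function name `cohortsInCyclicGraphs` and the `map[string]string` representation (Cohort name to parent name, with `""` meaning no parent) are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// cohortsInCyclicGraphs returns the set of cohorts whose parent chain runs
// into a cycle instead of ending at a root. A parent that is missing from
// the map is treated as a root, because the lookup yields "".
// This walk is quadratic in the worst case; it favors clarity over speed.
func cohortsInCyclicGraphs(parents map[string]string) map[string]bool {
	disabled := make(map[string]bool)
	for name := range parents {
		seen := map[string]bool{}
		cur := name
		for cur != "" && !seen[cur] {
			seen[cur] = true
			cur = parents[cur]
		}
		// The walk stopped on a revisited cohort, so the chain loops.
		if cur != "" {
			disabled[name] = true
		}
	}
	return disabled
}

func main() {
	parents := map[string]string{
		"a": "b", "b": "c", "c": "a", // a -> b -> c -> a is a cycle
		"d": "a", // d hangs off the cycle
		"e": "",  // e is a root
		"f": "e", // f is a child of e
	}
	disabled := cohortsInCyclicGraphs(parents)
	names := make([]string, 0, len(disabled))
	for n := range disabled {
		names = append(names, n)
	}
	sort.Strings(names)
	fmt.Println(names) // prints [a b c d]
}
```

Because each Cohort has at most one parent, every member of a connected graph that contains a cycle eventually walks into that cycle, so following parent chains is enough to identify all members to disable.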
apis/kueue/v1alpha1/cohort_types.go
Outdated
//
// BorrowingLimit limits how much members of this Cohort
// subtree can borrow from the parent subtree. This limit must
// only be set when the Cohort has a parent.
What happens otherwise? Do we catch it at the validation phase, invalidate the Cohort, or let it be?
This will be validated by the webhook, and we will reject the create/update. Updated the documentation.
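The rule stated here (reject a create/update that sets BorrowingLimit on a Cohort without a parent) could be checked along the following lines. This is a sketch under stated assumptions: the `ResourceQuota` and `CohortSpec` shapes are simplified stand-ins using plain integers, and `validateBorrowingLimit` is a hypothetical function; the actual webhook is deferred to a follow-up PR and may look quite different.

```go
package main

import "fmt"

// ResourceQuota is a simplified, hypothetical stand-in for a quota entry.
type ResourceQuota struct {
	Name           string
	NominalQuota   int64
	BorrowingLimit *int64 // nil means "not set"
}

// CohortSpec is a simplified, hypothetical stand-in for the reviewed type.
type CohortSpec struct {
	Parent string // empty string means "no parent"
	Quotas []ResourceQuota
}

// validateBorrowingLimit returns one error per quota that sets a
// BorrowingLimit while the Cohort has no parent to borrow from.
func validateBorrowingLimit(spec CohortSpec) []error {
	if spec.Parent != "" {
		return nil
	}
	var errs []error
	for i, q := range spec.Quotas {
		if q.BorrowingLimit != nil {
			errs = append(errs, fmt.Errorf(
				"spec.quotas[%d].borrowingLimit: must not be set when the Cohort has no parent", i))
		}
	}
	return errs
}

func main() {
	limit := int64(10)
	quotas := []ResourceQuota{{Name: "cpu", NominalQuota: 20, BorrowingLimit: &limit}}

	root := CohortSpec{Quotas: quotas}
	child := CohortSpec{Parent: "root", Quotas: quotas}

	fmt.Println(len(validateBorrowingLimit(root)), len(validateBorrowingLimit(child))) // prints 1 0
}
```

Rejecting the request at admission time, rather than marking the Cohort inactive afterwards, gives the user immediate feedback on the invalid field.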
}

// CohortStatus defines the observed state of Cohort
type CohortStatus struct {
Do you expect it to be left empty? If yes - drop the struct for now. If not - please add the expected content.
Added the Conditions field, and the CohortActive field. One note: I modified the condition from the KEP slightly, to match ClusterQueue.
KEP: CohortActive = "CohortActive"
This PR: CohortActive = "Active"
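For readers unfamiliar with the conditions pattern, the following sketch shows how a status could carry a single condition of type `Active`. It uses only the standard library: `Condition` is a cut-down imitation of the usual Kubernetes condition shape, and `setCondition` plus the reason strings `Ready` and `CycleDetected` are hypothetical names chosen for this example, not values from the PR.

```go
package main

import "fmt"

// CohortActive is the condition type as renamed in this PR.
const CohortActive = "Active"

// Condition is a cut-down, hypothetical imitation of a Kubernetes condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// CohortStatus is a simplified stand-in holding only the conditions.
type CohortStatus struct {
	Conditions []Condition
}

// setCondition replaces the condition with the same Type, or appends it
// if no such condition exists yet, so each Type appears at most once.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

func main() {
	var status CohortStatus
	status.Conditions = setCondition(status.Conditions, Condition{
		Type: CohortActive, Status: "True", Reason: "Ready",
	})
	status.Conditions = setCondition(status.Conditions, Condition{
		Type: CohortActive, Status: "False", Reason: "CycleDetected",
		Message: "the Cohort is part of a graph that contains a cycle",
	})
	fmt.Println(len(status.Conditions), status.Conditions[0].Status) // prints 1 False
}
```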
/lgtm
LGTM label has been added. Git tree hash: ccee3899d6dcb722430ab44b1a0a1b7f90a9ffe5
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: gabesaba, mimowo
thanks for the reviews!
/release-note-edit
/remove-kind api-change
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
Define Cohort API #79
Special notes for your reviewer:
Webhook and Reconciliation logic will be defined in follow-up PRs, to keep this PR small.
Does this PR introduce a user-facing change?