add job priority for kube-batch scheduling #141 #45
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
job_controller/job.go
Outdated
if jc.Config.EnableGangScheduling {
	minAvailableReplicas := getTotalReplicas(replicas)
	priorityClassName := getPriorityClassName(runPolicy)
	//_, err := pc.SyncPodGroup(job, minAvailableReplicas)
Why keep this?
I referenced the EnableGangScheduling code in other operators and added the priority attribute and function.
job_controller/api/v1/types.go
Outdated
@@ -188,4 +188,8 @@ type RunPolicy struct {
// job, for example `minAvailable` for gang-scheduling.
type SchedulingPolicy struct {
	MinAvailable *int32 `json:"minAvailable,omitempty"`

	//add PriorityClassName
We can remove this line
job_controller/job.go
Outdated
@@ -40,7 +40,7 @@ func (jc *JobController) deletePodsAndServices(runPolicy *apiv1.RunPolicy, job i
	}
	return nil
}

// TTL means time to live
This line, too.
@@ -299,3 +309,14 @@ func (jc *JobController) cleanupJob(runPolicy *apiv1.RunPolicy, jobStatus apiv1.
	jc.WorkQueue.AddRateLimited(key)
	return nil
}
func getPriorityClassName(runPolicy *apiv1.RunPolicy) string {
	priorityClassName := *runPolicy.SchedulingPolicy.PriorityClassName
Runtime error here: this dereference panics when SchedulingPolicy or PriorityClassName is nil.
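The runtime error flagged above is a nil-pointer dereference. A minimal sketch of a guarded version follows. The `RunPolicy` and `SchedulingPolicy` types here are simplified stand-ins, and their shapes (pointer fields, the `priorityClassName` JSON tag) are assumptions for illustration rather than the PR's actual definitions. Returning an empty string when nothing is set is also an assumption about the desired fallback.

```go
package main

import "fmt"

// SchedulingPolicy is a simplified stand-in for apiv1.SchedulingPolicy.
// The PriorityClassName field shape is assumed from the dereference in the diff.
type SchedulingPolicy struct {
	MinAvailable      *int32  `json:"minAvailable,omitempty"`
	PriorityClassName *string `json:"priorityClassName,omitempty"`
}

// RunPolicy is a simplified stand-in for apiv1.RunPolicy (assumed shape).
type RunPolicy struct {
	SchedulingPolicy *SchedulingPolicy `json:"schedulingPolicy,omitempty"`
}

// getPriorityClassName returns the configured priority class name, or ""
// when any pointer along the path is nil, instead of panicking.
func getPriorityClassName(runPolicy *RunPolicy) string {
	if runPolicy == nil ||
		runPolicy.SchedulingPolicy == nil ||
		runPolicy.SchedulingPolicy.PriorityClassName == nil {
		return ""
	}
	return *runPolicy.SchedulingPolicy.PriorityClassName
}

func main() {
	// No scheduling policy set: the guarded version returns "" safely.
	fmt.Printf("%q\n", getPriorityClassName(&RunPolicy{}))

	// Hypothetical priority class name, used only for this example.
	name := "high-priority"
	rp := &RunPolicy{SchedulingPolicy: &SchedulingPolicy{PriorityClassName: &name}}
	fmt.Printf("%q\n", getPriorityClassName(rp))
}
```

With this shape the caller can treat an empty result as "no priority class requested" and skip setting it on the PodGroup.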
func getTotalReplicas(replicas map[apiv1.ReplicaType]*apiv1.ReplicaSpec) int32 {
	jobReplicas := int32(0)
	for _, r := range replicas {
		jobReplicas += *r.Replicas
I think we should check if r.Replicas is nil here, too.
Got it. I'll try to correct it.
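The nil check suggested for `r.Replicas` could look roughly like the sketch below. `ReplicaType` and `ReplicaSpec` are simplified stand-ins for the `apiv1` types, and skipping unset entries (counting them as zero) is an assumption; a project that defaults unset replicas to 1 would add 1 instead.

```go
package main

import "fmt"

// ReplicaType and ReplicaSpec are simplified stand-ins for the apiv1 types.
type ReplicaType string

type ReplicaSpec struct {
	Replicas *int32
}

// getTotalReplicas sums the replica counts across all replica types.
// Assumption: nil specs and nil Replicas pointers are skipped rather than
// dereferenced, so they contribute zero to the total.
func getTotalReplicas(replicas map[ReplicaType]*ReplicaSpec) int32 {
	jobReplicas := int32(0)
	for _, r := range replicas {
		if r == nil || r.Replicas == nil {
			continue
		}
		jobReplicas += *r.Replicas
	}
	return jobReplicas
}

func main() {
	two := int32(2)
	// Replica type names here are illustrative only.
	replicas := map[ReplicaType]*ReplicaSpec{
		"Worker": {Replicas: &two},
		"Master": {Replicas: nil},
		"PS":     nil,
	}
	fmt.Println(getTotalReplicas(replicas))
}
```

Because the result feeds `minAvailable` for gang scheduling, the choice between skipping and defaulting changes how many pods the scheduler waits for, so it is worth settling explicitly.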
Hmm, I just asked the author of the other PR in MPI operator to submit the changes here: kubeflow/mpi-operator#141. Are you on the same team? It would be better if you could send a note before submitting the PR to avoid redundant work in the future. Anyway, thanks for your contribution!
cc @4everming
@terrytangyuan Yes, we are on the same team. I will be more careful next time. Thank you for reminding me.
@terrytangyuan Since the MPI operator does not use the common operator now, should we keep similar logic in these two repos?
Yes, I think it's fine to do that for now. Types like RunPolicy are used in MPI operator, though, so they should be reused.
Yeah, I think so. I'd really appreciate it if you could open an issue to explain why we need the change and how to implement it in several operators. Personally, I know that there is a queue in PodGroup and we can set it to define the priority. And the default value is
OK, I will open an issue about the PR.
That's great, I'd like to contribute too :)
Sure thing. Thanks a lot.
@4everming and @YesterdayxD are working on a proposal; @ me and @k82cn if you think it is ready for review.
/close
@terrytangyuan: Closed this PR.
Closing this due to inactivity. Feel free to re-open if this is picked up again.