Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor schedule type #967

Merged
merged 1 commit into from
Nov 29, 2021
Merged

Refactor schedule type #967

merged 1 commit into from
Nov 29, 2021

Conversation

dddddai
Copy link
Member

@dddddai dddddai commented Nov 15, 2021

Signed-off-by: dddddai [email protected]

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
As discussed with @RainbowMango at last meeting, it's not a good idea to branch by schedule types because it's tricky for us to extend scheduler behavior

Get ready for #829

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@karmada-bot karmada-bot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Nov 15, 2021
@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 15, 2021
@RainbowMango
Copy link
Member

Many thanks to @dddddai , I'll look into it ASAP.
After a quick look, I see you removed AvoidSchedule and Unknown in this PR, is that right?

@dddddai
Copy link
Member Author

dddddai commented Nov 16, 2021

After a quick look, I see you removed AvoidSchedule and Unknown in this PR, is that right?

These types were used to choose branches, I removed them since they are no longer needed
Other schedule types still remain for metrics, for exmaple:

metrics.BindingSchedule(string(FirstSchedule), metrics.SinceInSeconds(start), err)

@RainbowMango
Copy link
Member

cc @Garrybest @mrlihanbo

@Garrybest
Copy link
Member

I'll check it later.

@Garrybest
Copy link
Member

/assign


clusterPolicyName := util.GetLabelValue(clusterResourceBinding.Labels, policyv1alpha1.ClusterPropagationPolicyLabel)
start := time.Now()
if !helper.IsBindingReady(&rb.Status) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi, the Scheduled condition is just to indicate that the ResourceBinding or ClusterResourceBinding has been scheduled once, not to indicate that the status of every scheduling process, right?

Copy link
Member Author

@dddddai dddddai Nov 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please refer to #821 (comment), and I'm not sure if #988 meets what @RainbowMango said

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is out of my expectation, I used to think that this condition is to indicate the status of every scheduling process. If first schedule, rescedule or scale schedule failed, it should be set to False. For now, we will keep the schedule result of last success scheduling if reschedule failed. But for users, they don't know why schedule failed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @RainbowMango Is it the expected behavior of this condition?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need a condition to reflect the schedule failed state. I thought it might be too complicated, so we didn't handle the failed case in #821.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh-huh, it seems like the "First" schedule is a kind of special "Reschedule"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh there is a problem: for example, if FailoverSchedule failed, then Scheduled condition would be false, then scheduler would regard it as FirstSchedule whose behavior is different from FailoverSchedule

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about to do it like:

func (s *Scheduler) doScheduleBinding(namespace, name string) error {
	rb, err := s.bindingLister.ResourceBindings(namespace).Get(name)
	if err != nil {
		if apierrors.IsNotFound(err) {
			// the binding does not exist, do nothing
			return nil
		}
		return err
	}

	start := time.Now()

	defer func() {
		if err != nil {
			// set "Scheduled" condition false
		} else {
			// set "Scheduled" condition true
		}
	}()

	policyPlacement, policyPlacementStr, err := s.getPlacement(rb)
	if err != nil {
		return err
	}
	appliedPlacement := util.GetLabelValue(rb.Annotations, util.PolicyPlacementAnnotation)
	if policyPlacementStr != appliedPlacement {
		// policy placement changed, need reschedule
		klog.Infof("Reschedule ResourceBinding(%s/%s) as placement changed", namespace, name)
		err = s.scheduleResourceBinding(rb)
		metrics.BindingSchedule(string(ReconcileSchedule), metrics.SinceInSeconds(start), err)
		return err
	}

Scheduled condition is to indicate the schedule result of every schedule process. If schedule failed, we won't update the applied-placement annation, but will set the Scheduled condition as false. The scheduler behavior will be consistent. The schedule will scheduleResourceBinding once applied policy is not the same as current policy. Otherwise, perform failover schedule or scale schedule.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it sounds reasonable, i think we can get this one merged then do this job in #988

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree

@mrlihanbo
Copy link

/lgtm
/cc @Garrybest

@karmada-bot karmada-bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Nov 26, 2021
@dddddai dddddai force-pushed the sched-type branch 2 times, most recently from c0872e0 to de55101 Compare November 26, 2021 09:57
@Garrybest
Copy link
Member

Generally LGTM.

Signed-off-by: dddddai <[email protected]>
@Garrybest
Copy link
Member

Thanks!
/lgtm

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 27, 2021
@RainbowMango
Copy link
Member

/approve

We can get benefits from this path, like:

Thanks @dddddai.

@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 29, 2021
@karmada-bot karmada-bot merged commit 668eaee into karmada-io:master Nov 29, 2021
@dddddai dddddai deleted the sched-type branch November 29, 2021 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants