-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scheduler: fix TestIncomingPodsMetrics unit test #120434
scheduler: fix TestIncomingPodsMetrics unit test #120434
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc |
7e2fb1e
to
7a5000e
Compare
/retest |
addUnschedulablePodBackToUnschedulablePods = func(logger klog.Logger, queue *PriorityQueue, pInfo *framework.QueuedPodInfo) { | ||
// To simulate the pod is failed in scheduling in the real world, Pop() the pod from activeQ before AddUnschedulableIfNotPresent() below. | ||
queue.activeQ.Add(queue.newQueuedPodInfo(pInfo.Pod)) | ||
if p, err := queue.Pop(); err != nil || p.Pod != pInfo.Pod { | ||
expectNoError("Add", queue.activeQ.Add(queue.newQueuedPodInfo(pInfo.Pod))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have preferred that myself. But t
is not passed down into these operations. I could change that, but then the diff becomes larger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or did you mean not having a expectNoError
function (DAMP)?
I find expectNoError
more readable than an if check. But if that's just me, then I can change to if ... t.Fatal
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both done (using t.Fatal
and removal of expectNoError
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! More importantly I wanted you to use t.Fatal
.
But I also don't like adding "assert" libraries.
expectNoError("Add", queue.activeQ.Add(queue.newQueuedPodInfo(pInfo.Pod))) | ||
p, err := queue.Pop() | ||
expectNoError("Pop", err) | ||
if p.Pod != pInfo.Pod { | ||
panic(fmt.Sprintf("Expected: %v after Pop, but got: %v", pInfo.Pod.Name, p.Pod.Name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uhm... this one slipped, please use t.Fatal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if err != nil { | ||
panic(fmt.Sprintf("Unexpected error from %s: %v", what, err)) | ||
} | ||
} | ||
addUnschedulablePodBackToUnschedulablePods = func(logger klog.Logger, queue *PriorityQueue, pInfo *framework.QueuedPodInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's change the function name to: popAndRequeueAsUnschedulable
. Similarly for the other one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
// A concurrent event forces the pod into the backoff queue instead of the unschedulable queue. | ||
queue.MoveAllToActiveOrBackoffQueue(logger, NodeAdd, nil, nil, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the fact that, to understand what each test case is doing, I have to go back and read this function.
I would prefer if we would just have:
operations: []{pop, moveAll, addUnschedulable, ...}
But up to you if you want to do it or leave it for someone else to follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone else, please 😓
7a5000e
to
4ffc9a4
Compare
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
LGTM label has been added. Git tree hash: 7ad32faa35c49b86862d918ca068b3f5f9b2c9c1
|
/test pull-kubernetes-node-e2e-containerd |
The tests pass in their branch, but not when merged into master. That's because of 0d3eafd: together with that, some pods in the unit tests no longer go to the expected queues:
I'll rebase and check whether the unit tests made some invalid assumptions about how the queue works. |
addUnschedulablePodBackToBackoffQ happened to put the pod into the backoff queue because - the pod was not popped earlier and thus not in flight - the PodInfo had UnschedulablePlugins set - determineSchedulingHintForInFlightPod has code for "if UnschedulablePlugins is set and pod not in flight -> internal error, use backoff" Relying on such special code is not good. A better way to force backoff is by recording some concurrent event. isPodWorthRequeuing then calls the queueHintReturnQueueAfterBackoff function and the pod goes to the backoff queue.
4ffc9a4
to
819edda
Compare
The relevant difference is that pods with no unschedulable plugins now go into the backoff queue. To simulate putting a pod into the unschedulable queue, diff --git a/pkg/scheduler/internal/queue/scheduling_queue_test.go b/pkg/scheduler/internal/queue/scheduling_queue_test.go
index 13a9a89bb02..b17e5a3c3cc 100644
--- a/pkg/scheduler/internal/queue/scheduling_queue_test.go
+++ b/pkg/scheduler/internal/queue/scheduling_queue_test.go
@@ -2288,6 +2288,8 @@ var (
}
popAndRequeueAsUnschedulable = func(t *testing.T, logger klog.Logger, queue *PriorityQueue, pInfo *framework.QueuedPodInfo) {
// To simulate the pod is failed in scheduling in the real world, Pop() the pod from activeQ before AddUnschedulableIfNotPresent() below.
+ // UnschedulablePlugins will get cleared by Pop, so make a copy first.
+ unschedulablePlugins := pInfo.UnschedulablePlugins.Clone()
if err := queue.activeQ.Add(queue.newQueuedPodInfo(pInfo.Pod)); err != nil {
t.Fatalf("Unexpected error during Add: %v", err)
}
@@ -2298,6 +2300,8 @@ var (
if p.Pod != pInfo.Pod {
t.Fatalf("Expected: %v after Pop, but got: %v", pInfo.Pod.Name, p.Pod.Name)
}
+ // Simulate plugins that are waiting for some events.
+ p.UnschedulablePlugins = unschedulablePlugins
if err := queue.AddUnschedulableIfNotPresent(logger, p, 1); err != nil {
t.Fatalf("Unexpected error during AddUnschedulableIfNotPresent: %v", err)
}
@@ -2314,10 +2318,7 @@ var (
if p.Pod != pInfo.Pod {
t.Fatalf("Expected: %v after Pop, but got: %v", pInfo.Pod.Name, p.Pod.Name)
}
-
- // A concurrent event forces the pod into the backoff queue instead of the unschedulable queue.
- queue.MoveAllToActiveOrBackoffQueue(logger, NodeAdd, nil, nil, nil)
-
+ // When there is no known unschedulable plugin, pods always go to the backoff queue.
if err := queue.AddUnschedulableIfNotPresent(logger, p, 1); err != nil {
t.Fatalf("Unexpected error during AddUnschedulableIfNotPresent: %v", err)
} |
/approve cancel |
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
LGTM label has been added. Git tree hash: eeb1b2270bc6d52f11ae720e7eb784398d93b2e8
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pohly, sanposhiho The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@pohly: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
addUnschedulablePodBackToBackoffQ happened to put the pod into the backoff queue because
Relying on such special code is not good. A better way to force backoff is by recording some concurrent event. isPodWorthRequeuing then calls the queueHintReturnQueueAfterBackoff function and the pod goes to the backoff queue.
Which issue(s) this PR fixes:
Related-to: #120413 (comment)
Does this PR introduce a user-facing change?