-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle update network status failed after add and del attachments #237
Handle update network status failed after add and del attachments #237
Conversation
3d1610d
to
0320270
Compare
0320270
to
c1a00ec
Compare
This pr will cancel(delete) the added attachment in two cases:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this feature makes sense. I don't know if it would be possible to add unit tests for it?
ok, I will try. |
61ec648
to
c2829d1
Compare
Right now, it will return an err if we want to delet net1 in ut. If we change this, some previous ut will failed. So my ut expect |
79be62f
to
1c54179
Compare
1c54179
to
24c5dbe
Compare
pkg/controller/pod.go
Outdated
if shouldDeleteattachmentsAdded { | ||
attachmentsToCancel = attachmentsToAdd | ||
} else if addfailedIndex != -1 { | ||
attachmentsToCancel = append(attachmentsToCancel, attachmentsToAdd[addfailedIndex]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attachmentsToCancel = append(attachmentsToCancel, attachmentsToAdd[addfailedIndex]) | |
attachmentsToCancel = append(attachmentsToCancel, attachmentsToAdd[:addfailedIndex]) |
Why having reverted this changes? All non failing attachment added must be reverted no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good point ...
I agree with @LionelJouin; shouldn't you delete all the attachments you've just added ? i.e. setting the pod back to its original state ?
otherwise, the pod's network-status and the pod's actual interfaces will not be consistent.
pkg/controller/pod_test.go
Outdated
Eventually(<-eventRecorder.Events).Should(Equal(expectedEventPayload)) | ||
}) | ||
|
||
It("the pod network-status dosen't change", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It("the pod network-status dosen't change", func() { | |
It("the pod network-status doesn't change", func() { |
pkg/controller/pod_test.go
Outdated
Eventually(<-eventRecorder.Events).Should(Equal(expectedEventPayload)) | ||
|
||
expectedEventPayload = fmt.Sprintf( | ||
"Warning FailedRemovingInterface pod [%s]: failed removing interface %s from network: %s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add networkConfig(multuscni.CmdDel, "net1", macAddr),
to the fake multus client line 126 to test the simple case if you want (the simple case I mean: everything works except status update).
pkg/controller/pod_test.go
Outdated
Eventually(<-eventRecorder.Events).Should(Equal(expectedEventPayload)) | ||
}) | ||
|
||
It("the pod network-status dosen't change", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this is required as the status update has been prevented, so this might test Kubernetes.
expectedError := errors.New("someerror") | ||
k8sClient.PrependReactor("update", "pods", func(_ k8stesting.Action) (bool, runtime.Object, error) { | ||
return true, nil, expectedError | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be moved to the JustBeforeEach
13eaa7b
to
2cd1a8d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR. I agree we should keep the pod's interfaces consistent with the reported network-status at all times. Extra thanks for complying and adding the unit test @LionelJouin requested.
I'm requesting some changes to improve the overall code quality, and I also wonder if your approach (only deleting one interface) wouldn't break the pod interface / pod network status consistency.
pkg/controller/pod.go
Outdated
defer func() { | ||
pnc.handleResult(err, podNamespacedName, pod, results, attachmentsToAdd, netnsPath, podSandboxID, addFailedIndex) | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like how this looks.
How about we define a Rollback
interface and pass that to the handle result func ?
It would be something like
type interface Rollback {
Rollback() error
}
type asd struct {
netnsPath string
podSandboxID string
attachmentsToRollback []nadv1.NetworkSelectionElement
}
func (a *asd) Rollback() error {
// ... invoke DEL for attachments on asd.attachments
}
Then, on handleResult
you just invoke rollback if err != nil
pkg/controller/pod.go
Outdated
if shouldDeleteattachmentsAdded { | ||
attachmentsToCancel = attachmentsToAdd | ||
} else if addfailedIndex != -1 { | ||
attachmentsToCancel = append(attachmentsToCancel, attachmentsToAdd[addfailedIndex]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good point ...
I agree with @LionelJouin; shouldn't you delete all the attachments you've just added ? i.e. setting the pod back to its original state ?
otherwise, the pod's network-status and the pod's actual interfaces will not be consistent.
13eaa7b
to
2aa70ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good.
@LionelJouin could you please review ?
@oshoval could you help me out in this one ?
I think this is close to a merge, thanks ! And sorry it took so long for me to review it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks nice
just a nit so far
Signed-off-by: zhuanlan <[email protected]>
Signed-off-by: zhuanlan <[email protected]>
13eaa7b
to
b2044a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pending CI results, this looks good.
I'll take a last quick review on Monday, then, assuming I don't see anything weird, I'll proceed with the merge.
Huge thanks for the contribution, and for your patience. I'm aware this took more than anticipated ...
I see your PR now fails the linter tests ... Sorry about that. Let me take a quick look; FWIW, I'm considering merging it and fixing it myself afterwards. Really sorry this is taking so long. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's bring this in.
I'll fix the main branch on a follow up PR.
Huge thanks.
What this PR does / why we need it:
This pr delete all attachments added in this
processNextWorkItem
when set NetworkStatus failed inhandleResult
. Because DEL is Idempotence we can always return to the state before dynamic handle.Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #235
Special notes for your reviewer (optional):