Fix duplicate IP case for NetworkPolicy #3467
Codecov Report
@@             Coverage Diff              @@
##             main    #3467       +/-   ##
===========================================
- Coverage   65.34%   53.51%   -11.83%
===========================================
  Files         268      239      -29
  Lines       26901    34253    +7352
===========================================
+ Hits        17578    18331     +753
- Misses       7415    14145    +6730
+ Partials     1908     1777     -131
After a Node restarts, the IP allocation storage used by the host-local plugin is reset and all Pods on the Node are recreated. Two Pods can then have the same IP in the K8s API if the Pod that previously owned the IP hasn't been recreated yet and another Pod is assigned that IP when it is recreated.
The NetworkPolicy rule reconciler didn't handle this case correctly because it calculated the added and removed Pods first, then derived the IPs to add and remove from those Pods. For example, if both Pod A and Pod B have IP 1.1.1.1, removing Pod B would cause IP 1.1.1.1 to be removed from the dataplane even though Pod A still uses it.
This patch calculates the IPs to add and remove directly from the difference between the old IPs and the new IPs.
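The difference between the two approaches can be illustrated with a minimal Go sketch. It is not the reconciler's actual code: `podIPs`, `diffByPods`, and `diffByIPs` are hypothetical names, and a Pod is reduced to a name mapped to a single IP.

```go
package main

import (
	"fmt"
	"sort"
)

// podIPs maps a Pod name to its IP. Hypothetical stand-in for the group
// members the reconciler tracks; not Antrea's real types.
type podIPs map[string]string

// ipSet collects the distinct IPs of all Pods.
func ipSet(pods podIPs) map[string]bool {
	ips := make(map[string]bool, len(pods))
	for _, ip := range pods {
		ips[ip] = true
	}
	return ips
}

// sortedDiff returns the keys of a that are not in b, sorted.
// Passing a nil map for b returns all keys of a.
func sortedDiff(a, b map[string]bool) []string {
	var out []string
	for k := range a {
		if !b[k] {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// diffByPods mirrors the old behaviour described above: diff the Pods
// first, then map the added and removed Pods to IPs.
func diffByPods(oldPods, newPods podIPs) (add, del []string) {
	addSet, delSet := map[string]bool{}, map[string]bool{}
	for name, ip := range newPods {
		if _, ok := oldPods[name]; !ok {
			addSet[ip] = true
		}
	}
	for name, ip := range oldPods {
		if _, ok := newPods[name]; !ok {
			delSet[ip] = true
		}
	}
	return sortedDiff(addSet, nil), sortedDiff(delSet, nil)
}

// diffByIPs mirrors the patched behaviour: diff the IP sets directly.
func diffByIPs(oldPods, newPods podIPs) (add, del []string) {
	oldIPs, newIPs := ipSet(oldPods), ipSet(newPods)
	return sortedDiff(newIPs, oldIPs), sortedDiff(oldIPs, newIPs)
}

func main() {
	// Pod A and Pod B share 1.1.1.1; then Pod B is removed.
	oldPods := podIPs{"pod-a": "1.1.1.1", "pod-b": "1.1.1.1"}
	newPods := podIPs{"pod-a": "1.1.1.1"}

	_, del := diffByPods(oldPods, newPods)
	fmt.Println("pod-based delete:", del)
	_, del = diffByIPs(oldPods, newPods)
	fmt.Println("IP-based delete:", del)
}
```

In this sketch the Pod-based diff asks for 1.1.1.1 to be deleted, while the IP-based diff produces no deletion because the IP is still present in the new set.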
b7d1738 to f96733c
/test-all
Changes LGTM. Thanks Quan for the quick fix. I do think there's something we can follow up on for this issue, as one of the causes is that when updating OpenFlow rules, we add addresses first and then delete addresses, without checking for address duplication. The TODO at https://github.com/antrea-io/antrea/blob/main/pkg/agent/controller/networkpolicy/reconciler.go#L787 might still be valid, and @GraysonWu and I will look into simplifying the OF interfaces.
LGTM. Sure, let's look into that, @Dyanngg.
@Dyanngg @GraysonWu thanks for your review and follow-up. This issue may still happen even if the TODO is resolved: two Pods with the same IP can first be in one AddressGroup, and then one of them is removed. So calculating the difference between the realized IPs and the desired IPs is necessary.
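The two cases raised in this thread can be sketched roughly as follows. This is an illustration only, not the OpenFlow client code: `applyDelta` and `diffSets` are hypothetical helpers, and the dataplane is modelled as a plain set of IP strings.

```go
package main

import "fmt"

// applyDelta simulates updating a dataplane address set: add first, then
// delete. With dedupe set, an IP that is also in the add list is not deleted.
// The input set is copied, not modified.
func applyDelta(realized map[string]bool, add, del []string, dedupe bool) map[string]bool {
	out := make(map[string]bool, len(realized))
	for ip := range realized {
		out[ip] = true
	}
	added := make(map[string]bool, len(add))
	for _, ip := range add {
		out[ip] = true
		added[ip] = true
	}
	for _, ip := range del {
		if dedupe && added[ip] {
			continue
		}
		delete(out, ip)
	}
	return out
}

// diffSets returns the IPs to add (desired but not realized) and to delete
// (realized but not desired). The order of the results is not defined.
func diffSets(realized, desired map[string]bool) (add, del []string) {
	for ip := range desired {
		if !realized[ip] {
			add = append(add, ip)
		}
	}
	for ip := range realized {
		if !desired[ip] {
			del = append(del, ip)
		}
	}
	return add, del
}

func main() {
	ip := "1.1.1.1"
	realized := map[string]bool{ip: true}

	// Case 1: Pod A (1.1.1.1) is replaced by Pod B (1.1.1.1).
	// A Pod-based delta is add [1.1.1.1], delete [1.1.1.1].
	fmt.Println("case 1, no dedupe, IP kept:", applyDelta(realized, []string{ip}, []string{ip}, false)[ip])
	fmt.Println("case 1, dedupe, IP kept:", applyDelta(realized, []string{ip}, []string{ip}, true)[ip])

	// Case 2: Pods A and B both hold 1.1.1.1, then Pod B is removed.
	// A Pod-based delta is add [], delete [1.1.1.1]; dedupe has nothing to match.
	fmt.Println("case 2, dedupe, IP kept:", applyDelta(realized, nil, []string{ip}, true)[ip])

	// Diffing realized against desired IPs yields an empty delta in case 2.
	add, del := diffSets(realized, map[string]bool{ip: true})
	fmt.Println("case 2, set diff sizes:", len(add), len(del))
}
```

Under these assumptions, de-duplicating the add and delete lists repairs the first case but not the second, where the delete list is the only non-empty one; comparing realized and desired IPs covers both.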
/skip-conformance
/test-integration
Fixes #3468
Signed-off-by: Quan Tian [email protected]