Cannot access ClusterIP service if the endpoint is on local Node when both AntreaProxy and Egress are enabled #2330
Comments
@jianjuns Is the above flow necessary to support Egress? Why does it need to bypass the normal L3 flows for local Pods?
I'm not sure whether this is also an issue on Windows: since the flow is installed unconditionally on Windows, would it prevent local Pods from accessing each other via ClusterIP?
The flow is added to bypass:
Do you have any good ideas? I can only think of an extra table for MAC rewrite.
How about changing the priorities of the MAC rewrite flows and/or the SNAT-skipping (195?) flow? It is a little strange to send local Service traffic to the TTL table (or do you think such packets should have their TTL decremented?).
This flow exists only on Windows, because we want to perform SNAT on packets that are sent to external destinations. Since we can't predict the destination of external traffic, we have to give local traffic a higher priority than external packets. However, this flow and the MAC rewrite flow currently have the same priority, so it can prevent the MAC rewrite actions from being applied to a packet whose source and destination are both on the same Node. I would prefer to lower this flow's priority a bit (below 200), while keeping it higher than the SNAT flow's priority. Or could we give the MAC rewrite flow a higher priority?
Thanks for your input @jianjuns @wenyingd. @jianjuns I think a normal LB/router decrements the TTL when forwarding traffic; should we keep the same behavior? I think the TTL is decremented in the kube-proxy case too. @wenyingd The flow exists on Linux too when the Egress feature is enabled, since Egress requires some flows to SNAT Pod-to-external traffic. Making the MAC rewrite flow higher priority than the SNAT-skipping flow is doable, but it would introduce a fourth priority, while we normally use only low, normal, and high priorities. I'm thinking of a solution that keeps them at the same priority and makes the SNAT-skipping flow apply only to traffic that is not marked for MAC rewrite. The flows would look like this:
With the above flows, all traffic to local Pods is handled properly at priority 200 using the same match fields, regardless of where the traffic comes from. I feel it's easier to understand. What do you think?
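The idea of making the two flows mutually exclusive can be checked with a small model. The following is only a sketch under stated assumptions: `Packet`, `Flow`, `overlaps`, and the match predicates are hypothetical simplifications, not Antrea's actual match fields or code. It enumerates every packet in the simplified space and reports whether two flows can both match the same packet.

```go
package main

import "fmt"

// Packet is a hypothetical abstraction of the two properties that matter here.
type Packet struct {
	ToLocalPod     bool // destination is a Pod on this Node
	MACRewriteMark bool // packet was marked earlier as needing a MAC rewrite
}

// Flow is a hypothetical stand-in for an OpenFlow entry (priority omitted
// because all three flows below are assumed to share priority 200).
type Flow struct {
	Name  string
	Match func(Packet) bool
}

// Assumed match conditions, for illustration only.
var (
	rewrite      = Flow{"mac-rewrite", func(p Packet) bool { return p.ToLocalPod && p.MACRewriteMark }}
	skipCurrent  = Flow{"skip-snat (current)", func(p Packet) bool { return p.ToLocalPod }}
	skipProposed = Flow{"skip-snat (proposed)", func(p Packet) bool { return p.ToLocalPod && !p.MACRewriteMark }}
)

// overlaps reports whether some packet is matched by both flows.
func overlaps(a, b Flow) bool {
	for _, local := range []bool{false, true} {
		for _, mark := range []bool{false, true} {
			p := Packet{ToLocalPod: local, MACRewriteMark: mark}
			if a.Match(p) && b.Match(p) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(overlaps(rewrite, skipCurrent))  // true: same-priority conflict possible
	fmt.Println(overlaps(rewrite, skipProposed)) // false: at most one flow matches
}
```

Because the proposed matches are disjoint in this model, the two flows can share one priority without the outcome depending on which one the switch happens to pick.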
But why did you remove this flow?
I tested on Windows and confirmed that the issue does apply to Windows, even when the Egress feature is not enabled (the feature cannot currently be enabled on Windows). But the access didn't always fail: it depended on which flow was matched first. Even when two flows have the same priority, one of them is applied first and the other is skipped. For example, in the case below there is no problem. That may explain some flaky tests on Windows. @wenyingd @lzhecheng
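The order dependence described above is consistent with OpenFlow, which leaves the choice undefined when several matching entries share the highest priority. A minimal sketch of that effect follows; the `Flow` type, the flow names, and the first-in-slice tie-break are assumptions made for illustration, not how OVS or Antrea actually selects a flow.

```go
package main

import "fmt"

// Packet and Flow are hypothetical, simplified types for illustration only.
type Packet struct {
	ToLocalPod bool // destination IP belongs to a Pod on this Node
}

type Flow struct {
	Name     string
	Priority int
	Match    func(Packet) bool
}

// lookup returns the name of the highest-priority matching flow. On a tie it
// keeps whichever tied flow comes first in the slice; this tie-break is an
// assumption of the sketch, since a real switch makes no such promise.
func lookup(table []Flow, p Packet) string {
	name, best := "miss", -1
	for _, f := range table {
		if f.Match(p) && f.Priority > best {
			name, best = f.Name, f.Priority
		}
	}
	return name
}

func toLocal(p Packet) bool { return p.ToLocalPod }

func main() {
	rewrite := Flow{"mac-rewrite", 200, toLocal}
	skip := Flow{"skip-snat", 200, toLocal}
	p := Packet{ToLocalPod: true}

	fmt.Println(lookup([]Flow{rewrite, skip}, p)) // mac-rewrite: access works
	fmt.Println(lookup([]Flow{skip, rewrite}, p)) // skip-snat: MAC left unchanged
}
```

The same packet and the same two flows give different results depending only on table order, which is the kind of behavior that would show up as a flaky test.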
Describe the bug
While debugging the test failure in #2306, I found that the "TestClusterIP" case failed consistently in the Windows+Linux mixed e2e test but never failed in the Linux e2e test, even though the failure in the former case was between two Linux Pods.
The failure was not related to #2306. It was first caught in that PR because the Windows CI was down for a few hours, so the PR that added the test case, #2318, was not tested on the Windows testbed. However, the issue was not introduced by #2318 either; that PR just added the test case that can catch the problem.
The real cause is that a flow that is only added when the Egress feature is enabled prevents packets DNATed by the AntreaProxy flows from being delivered to the local destination Pod.
When a local Pod accesses another local Pod via ClusterIP, the packet is DNATed by OpenFlow flows and is supposed to hit a rule in table 70 that rewrites its destination MAC to the target Pod's MAC. But because of the second flow above, it jumps directly to the next table with the destination MAC unchanged (i.e. still the antrea-gw0 MAC). The packet is then output to antrea-gw0, and the host network routes it back to OVS, messing up the connection's ct state.
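To make the packet path concrete, here is a toy walk-through of the forwarding decision. It is a sketch under assumptions: the MAC and IP addresses, port names, and the `forward` helper are hypothetical placeholders, and the pipeline is reduced to a single optional MAC rewrite followed by output-port selection by destination MAC.

```go
package main

import "fmt"

// All names and addresses below are hypothetical placeholders.
const (
	gatewayMAC = "aa:aa:aa:aa:aa:01" // stands in for the antrea-gw0 MAC
	podMAC     = "aa:aa:aa:aa:aa:02" // stands in for the target Pod's MAC
)

type Packet struct {
	DstIP  string
	DstMAC string
}

// portByMAC models L2 forwarding: the output port is chosen by destination MAC.
var portByMAC = map[string]string{gatewayMAC: "antrea-gw0", podMAC: "pod-port"}

// macByPodIP models the per-Pod destination MAC rewrite rules.
var macByPodIP = map[string]string{"10.0.0.2": podMAC}

// forward runs a DNATed packet through the simplified L3 stage and returns the
// output port. When skipRewrite is true, the packet bypasses the MAC rewrite,
// which is the effect attributed to the extra flow in this issue.
func forward(p Packet, skipRewrite bool) string {
	if !skipRewrite {
		if mac, ok := macByPodIP[p.DstIP]; ok {
			p.DstMAC = mac
		}
	}
	return portByMAC[p.DstMAC]
}

func main() {
	// After DNAT the destination IP is the Pod's, but the MAC is still the gateway's.
	p := Packet{DstIP: "10.0.0.2", DstMAC: gatewayMAC}
	fmt.Println(forward(p, false)) // pod-port
	fmt.Println(forward(p, true))  // antrea-gw0
}
```

In this model the only difference between success and failure is whether the rewrite step runs before the output port is chosen.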
The code that adds the second flow above:
antrea/pkg/agent/openflow/pipeline.go
Lines 1809 to 1815 in 4405aaa
It was only caught by windows-e2e because antrea-agent was not redeployed between TestEgress and TestClusterIP in windows-e2e (many tests were skipped), so TestClusterIP ran with the Egress feature enabled.
To Reproduce
Expected
The access should succeed.
Actual behavior
The access failed.
Versions:
Please provide the following information: