Kubernetes LoadBalancer Service + Requests towards Load Balancer external IP from within same cluster = Connection refused #6244
Comments
@tnn-simon thank you for the report. As we discussed in Slack, the problem here is that Antrea relies on the PodCIDR of Node objects to differentiate whether traffic comes from local Pods or not, but AKS doesn't set it in this mode. So the solution we may apply is to use a more generic way to differentiate local-Pod-generated traffic. But I want to mention that, with that solution, Pod-to-LoadBalancerIP traffic will be processed by Antrea and DNATed to a random endpoint of the Service in the cluster, instead of going to the external load balancer (if there is one) to perform load balancing. Will that work for you?
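As an aside, a quick way to confirm the missing field this refers to (on AKS in this mode, the PODCIDR column is expected to be empty):

```sh
# List Nodes with their spec.podCIDR; Antrea's short-circuit flow depends on this field
kubectl get nodes -o custom-columns='NAME:.metadata.name,PODCIDR:.spec.podCIDR'
```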
@tnqn: Sounds like something that could work, but I'd still have to test it to be completely certain. What is the rationale behind not sending the traffic through the external load balancer?
It's the default behavior we assume most users would expect, as it's the shorter path, and it's also how kube-proxy handles it. It's configurable and can be disabled by setting `antreaProxy.proxyLoadBalancerIPs` to false. When you tried `proxyLoadBalancerIPs: false`, what was the result?
If I disable Antrea and reboot the nodes, the issue disappears - just cross-checking. I'm testing using Helm (chart version 1.15.1) with these values:

```yaml
trafficEncapMode: "networkPolicyOnly"
nodeIPAM:
  enable: false
antreaProxy:
  proxyLoadBalancerIPs: false
```

When running with `proxyLoadBalancerIPs: false`, I inspected the dump of the OVS flows, but I'm not sure what it indicates. My cluster does not have any network policy resources. I'll attach a complete dump (dump.log).
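For what it's worth, such a dump can be collected like this (the Pod name is a placeholder for the antrea-agent Pod on the Node under test; `br-int` is Antrea's default OVS bridge):

```sh
# Dump all OpenFlow entries programmed by antrea-agent on this Node
kubectl -n kube-system exec <antrea-agent-pod> -c antrea-ovs -- ovs-ofctl dump-flows br-int
```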
@tnqn: I'm attempting to get my head around this issue. Just out of curiosity, can you point to the source showing that kube-proxy influences the traffic path for egress traffic destined for IPs assigned in `.status.loadBalancer.ingress`? As you mentioned, one will typically send traffic through the external load-balancer to piggyback on features implemented by the load-balancer - in our case, cluster-independent DNS. If one doesn't want the traffic to flow through the external load-balancer, wouldn't one use the ClusterIP of the Service instead? I also struggle to understand why the connection times out when `proxyLoadBalancerIPs` is set to false.
You can find how it handles "locally-originated pod -> external destination" by combining the following code:
My understanding is that the handling assumes users' applications are given a unified address (the LB IP) for a Service regardless of the location of the clients, but the destination is always the same Endpoints, so kube-proxy short-circuits the traffic since it would come back into the cluster anyway. However, since K8s 1.29, a feature called `LoadBalancerIPMode` makes it possible for a cloud provider to disable this short-circuiting behavior per load balancer.
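For reference, that feature adds an `ipMode` field to the Service's load-balancer status; a sketch of what a cloud provider opting out of the kube-proxy bypass would set (the IP is an example):

```yaml
status:
  loadBalancer:
    ingress:
    - ip: 203.0.113.10   # example LB IP
      ipMode: Proxy      # "VIP" (the default) keeps the short-circuit; "Proxy" forces traffic through the LB
```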
externalTrafficPolicy and internalTrafficPolicy apply to Service traffic based on the destination address; you can find the explanations in their API spec.
I saw the CIDR of the Pod IPs has changed - is this the same cluster? By disabling Antrea, do you mean removing Antrea from the cluster, or disabling Antrea's `proxyLoadBalancerIPs`?
This is a different cluster from the one referred to in Slack; it is based on the Azure CNI Overlay network plugin, but the issue is the same. I created it to get a dedicated test environment for this issue. By disabling Antrea, I mean removing Antrea from the cluster and rebooting the nodes. The dump.log was collected when running Antrea with `proxyLoadBalancerIPs: false`, and this does not work - the connection times out. Thank you so much for sharing your insights on the kube-proxy behaviour. I can confirm that traffic originating in the same cluster does not flow through the external load-balancer, regardless of whether Antrea is installed. Regarding your first suggestion: I think this sounds even better now that I know more about kube-proxy. I guess the suggested solution is to ignore the externalTrafficPolicy - which, from our experience, is the current behaviour of kube-proxy for traffic originating internally. Still digging for the root cause of the timeouts.
Before this commit, in AntreaProxy, to respect short-circuiting, when installing flows for an external Service, an extra flow with higher priority was installed to match traffic sourced from local (local Pods or the local Node) and destined for the external Service. This is achieved by matching the local Pod CIDR obtained from the local Node object. However, when Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node object is nil, resulting in the failure to install the extra flow mentioned above. To fix the issue, a new reg mark `FromLocalRegMark`, identifying traffic from local Pods or the local Node, is introduced to mark traffic from local sources. This reg mark can be used in all traffic modes. Fix antrea-io#6244 Signed-off-by: Hongliang Liu <[email protected]>
Yes, Antrea is implemented to behave identically to kube-proxy for most Service features. Other modes already work this way; it's just that networkPolicyOnly mode lacks the PodCIDR required to implement it.
Could you share the iptables output on the Node?
Here is the output: iptables.log. IP of the LB Service: 10.34.208.4
Thanks for the output. I figured out why it's dropped by iptables: like Antrea, kube-proxy needs to know whether traffic comes from local Pods in order to short-circuit it. In this cluster, it detects locally-originated Pod traffic by checking the interface name prefix. If you look at the kube-proxy ConfigMap, there should be a field like the following:
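The field itself was elided above; based on the upstream KubeProxyConfiguration API, it plausibly looks like this (the `azv` prefix matches Azure CNI's Pod interface naming):

```yaml
detectLocalMode: "InterfaceNamePrefix"
detectLocal:
  interfaceNamePrefix: "azv"
```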
However, in Antrea networkPolicyOnly mode, the Pod interfaces are not connected to the host network directly but via an OVS bridge (so Antrea can enforce NetworkPolicy), and locally-originated Pod traffic arrives at the host network via "antrea-gw0", instead of the "azv+" prefix expected by kube-proxy. The relevant rules are as below:
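The rule listing was also elided; an illustrative sketch of the pattern kube-proxy generates in this mode (the Service name and chain hashes are hypothetical):

```
# Traffic entering from an interface matching "azv+" is treated as locally-originated
# Pod traffic and short-circuited to the full endpoint set:
-A KUBE-EXT-HASH1234 -m comment --comment "pod traffic for default/my-svc external destinations" -i azv+ -j KUBE-SVC-HASH1234
# Anything else must honor externalTrafficPolicy: Local and is sent to the
# local-endpoints-only chain, which drops when the Node hosts no endpoints:
-A KUBE-EXT-HASH1234 -j KUBE-SVL-HASH1234
```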
I think updating the kube-proxy configuration so that traffic from "antrea-gw0" is also detected as local could work around it.
Thanks for the clarification! Will give this some thought. Kube-proxy is managed by Azure AKS and is not something we will tamper with without proper research.
Describe the bug
Traffic does not reach the external IP associated with a v1.Service (type: LoadBalancer) when the source workloads and the v1.Service are hosted in the same cluster, unless the source workloads run on the same Nodes as the endpoints associated with the v1.Service. This happens despite running Antrea in `networkPolicyOnly` mode.

To Reproduce
1. Create an AKS cluster with network plugin `azure` and network policy `none`.
2. Install Antrea in `networkPolicyOnly` mode.
3. Create a Service of type `LoadBalancer` with externalTrafficPolicy `local` and the annotation `service.beta.kubernetes.io/azure-load-balancer-internal: "true"` (see the sketch after this list).
4. Send requests towards `.status.loadBalancer.ingress[0].ip`; add the verbose flag if using curl.
5. Observe `connection refused`.
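A minimal Service manifest matching steps 3-4, and a way to send the request (the name, selector, and ports are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: echo   # placeholder
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: echo  # placeholder
  ports:
  - port: 80
    targetPort: 8080
```

```sh
# Resolve the external IP from the Service status and probe it verbosely
LB_IP=$(kubectl get svc echo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -v "http://$LB_IP/"
```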
Expected
Expected Antrea not to drop traffic destined for an IP outside both the Pod CIDR and the Service CIDR (ClusterIP) of the cluster, regardless of the externalTrafficPolicy being set to `local`.

Actual behavior
Traffic gets dropped when the Node of the source workload does not host any endpoints associated with the target Service.
Versions:
Antrea version: 1.15.1
Kubernetes version: 1.28.5
Container runtime: 1.7.14-1
Linux kernel version: 5.15.0-1059-azure