ChecksumOffloadBroken autodetection doesn't necessarily detect all cases #4727
Comments
VXLAN offload works with many 10G NICs, disabling by default will hurt performance for those, and each card can have different offload toggle, for the qede driver + IPIP you need to disable all offload, not just |
Good point, but the issue at hand is completely limited to vSphere infrastructure, so the fix would/should also only apply to the specific type of NIC used there (VMXNET3). The goal is not to solve all known issues related to Calico IPIP or VXLAN but to restore compatibility with what is undoubtedly a very mainstream and widespread infrastructure. |
Thanks @janeczku. So IIUC there is a workaround to disable hardware offloading on those specific NICs that can be done prior to installing Calico for Windows. cc @song-jiang |
Is there a good way to detect these NICs? If so, we could arrange for ChecksumOffloadBroken to be set in that case: https://github.com/projectcalico/felix/blob/master/iptables/feature_detect.go#L116 Note: Calico feature detection can be overridden by setting an override in the FelixConfiguration resource:
|
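As an illustration of the override mentioned above, here is a minimal Go sketch that builds a JSON merge patch for a FelixConfiguration. It assumes the field is named `featureDetectOverride` and takes a comma-separated list of `Name=value` pairs with no spaces; check the Felix configuration reference for your Calico version before relying on it. The helper name `overridePatch` is hypothetical and not part of Calico.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// overridePatch builds a JSON merge patch that sets featureDetectOverride
// on a FelixConfiguration. The field name and the comma-separated
// "Name=true|false" format are assumptions taken from the Felix docs.
// Feature names are sorted so the output is deterministic.
func overridePatch(features map[string]bool) (string, error) {
	names := make([]string, 0, len(features))
	for name := range features {
		names = append(names, name)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%t", name, features[name]))
	}

	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"featureDetectOverride": strings.Join(pairs, ","),
		},
	}
	out, err := json.Marshal(patch)
	return string(out), err
}

func main() {
	patch, err := overridePatch(map[string]bool{"ChecksumOffloadBroken": true})
	if err != nil {
		panic(err)
	}
	fmt.Println(patch)
}
```

The printed patch could then be applied to the `default` FelixConfiguration, for example with `kubectl patch felixconfiguration default --type merge -p '<patch>'`, assuming the resource is reachable through kubectl in your installation.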
It should either be documented or the workaround should be applied automatically in Felix using the approach described by @fasaxc above. |
Yes, they can be detected by determining NIC model and hw revision via ethtool syscalls |
The bug is actually in the new Linux driver for vmxnet3. So instead of detecting the specific hardware revision (which I am not sure is exposed over ethtool), it would probably be enough to detect that it uses the buggy driver version. |
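A rough Go sketch of driver-based detection, for illustration only. Instead of ethtool syscalls it reads the sysfs symlink `/sys/class/net/<iface>/device/driver`, whose target ends in the driver name. It checks the driver name only and does not attempt the driver-version comparison discussed above. The function names and the default interface `eth0` are hypothetical, and treating `vmxnet3` as the only affected driver is an assumption based on this thread.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// driverFromLink extracts the driver name from the target of the sysfs
// symlink /sys/class/net/<iface>/device/driver, for example
// "../../../bus/pci/drivers/vmxnet3" yields "vmxnet3".
func driverFromLink(target string) string {
	return filepath.Base(target)
}

// needsChecksumWorkaround reports whether the driver is the one this
// thread identifies as affected. Matching on the name alone is an
// assumption; a real check would likely also consider the version.
func needsChecksumWorkaround(driver string) bool {
	return driver == "vmxnet3"
}

func main() {
	iface := "eth0" // hypothetical default interface name
	if len(os.Args) > 1 {
		iface = os.Args[1]
	}
	// Virtual interfaces have no device/driver link, so this can fail.
	target, err := os.Readlink(filepath.Join("/sys/class/net", iface, "device", "driver"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine driver:", err)
		os.Exit(1)
	}
	driver := driverFromLink(target)
	fmt.Printf("%s: driver=%s workaround=%v\n", iface, driver, needsChecksumWorkaround(driver))
}
```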
Sometimes the bug is with the driver + firmware combination, it's endless. |
@fasaxc, et al., I have an issue where pods can't communicate with one another across nodes. I've concluded that it's related to this issue. I was able to verify that on a brand new k3s cluster install adding I have calico installed via the tigera operator v1.23.1 (calico v3.21.0) on k3s v1.21.5+k3s2. OS is Ubuntu 20.04. -robodude666 |
I'm hitting this issue on Azure (requires VXLAN) However, Calico does not allow configuring Felix directly when using the operator: https://projectcalico.docs.tigera.io/reference/felix/configuration It would be great if we could either:
|
Hm, that's a bummer that the auto-detection isn't working on newer kernels.
If you're using the operator, you should look at https://projectcalico.docs.tigera.io/reference/resources/felixconfig to use REST API-based configuration instead of environment variables. You should be able to modify the
|
@caseydavenport that's what I'm doing for now and it seems to make the tests happy: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/1a1fa22e8947ba7805e029a279c85af325c2e32b/templates/addons/calico/felix-override.yaml
Do you know if there is a way to do this directly via the Helm chart though? It'd be easier if I could set the
After doing some research across many GitHub issues on this kernel bug I found https://github.com/rancher/rke2-charts/blob/main-source/packages/rke2-calico/generated-changes/overlay/templates/felixconfig.yaml, seems like rancher folks are doing some sort of overlay to extend the upstream calico template to allow configuring Felix in
Thanks so much for the answer and for all your work on the project btw, I've gone through a lot of Calico issues the past few days and your comments were very helpful! |
Thanks for the pointer to that overlay file! I didn't realize that. However, this line . . . Looks like #6412 strikes again!
It definitely would, and were it not for the problems discussed in the above issue I'd probably just do that right now. To be honest I'm tempted to do it anyway since the default FelixConfiguration is a singleton and this would be a nice UX improvement and would actually be abstracted behind helm's values.yaml "API" anyway... I will mull on that :)
You're very welcome! and I really appreciate the kind words 😸 |
Hey @caseydavenport have you given this any more thought? Looks like others are running into this as well from issue mentions |
this only works for VXLAN, not for IPIP; |
@fredkan see above, we decided to disable it by default in more recent versions. |
Expected Behavior
Pod-pod and pod-service communication across nodes should work.
Current Behavior
All traffic between pods across nodes is dropped (with the exception of ICMP).
Possible Solution
VMware recommends to either:
Since a port change is not feasible for Calico Windows (which requires 4789), disabling the hardware offload feature is the only feasible solution. Since earlier Linux versions did not even support this feature for that particular NIC device, disabling it has no performance impact.
Given that NIC firmware configuration is not something most users are used to managing, I suggest implementing a transparent solution in Calico that disables the offload feature when Calico configures VXLAN on host interfaces backed by a VMXNET3 device.
To that effect: It looks like Calico already configures NIC driver settings: https://github.com/projectcalico/felix/blob/master/ethtool/ethtool.go
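To make the proposal concrete, the following Go sketch builds and runs an `ethtool -K` command that turns off UDP tunnel offload on a given interface. This is not Felix's implementation. The feature names `tx-udp_tnl-segmentation` and `tx-udp_tnl-csum-segmentation` are the ones commonly cited for this workaround, but whether they are the right toggles for a given driver is an assumption. The helper `disableOffloadArgs` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// tunnelOffloadFeatures lists the ethtool feature names assumed to
// control UDP tunnel (VXLAN) offload. Other NICs may need other toggles.
var tunnelOffloadFeatures = []string{
	"tx-udp_tnl-segmentation",
	"tx-udp_tnl-csum-segmentation",
}

// disableOffloadArgs returns the ethtool arguments that switch the
// features above off for iface, e.g. ["-K", "ens192", "<feature>", "off", ...].
func disableOffloadArgs(iface string) []string {
	args := []string{"-K", iface}
	for _, feature := range tunnelOffloadFeatures {
		args = append(args, feature, "off")
	}
	return args
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: disable-offload <interface>")
		os.Exit(2)
	}
	args := disableOffloadArgs(os.Args[1])
	fmt.Println("running: ethtool", args)

	// Requires root privileges and the ethtool binary on PATH.
	out, err := exec.Command("ethtool", args...).CombinedOutput()
	if err != nil {
		fmt.Fprintf(os.Stderr, "ethtool failed: %v\n%s", err, out)
		os.Exit(1)
	}
}
```

Settings changed this way do not persist across reboots, so a transparent fix would need to reapply them whenever the interface is configured.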
Steps to Reproduce (for bugs)
Context
VXLAN packets are dropped in the Linux network stack due to incorrect checksums of inner packets. These incorrect checksums occur when enabling VXLAN hardware offload on the VMXNET3 interface (which recent Linux versions do by default) and creating a VXLAN overlay network in the guest OS on ports other than 8472 (when NSX is not used) or 4789 (when NSX is used).
References:
Your Environment