custom iptables version monitor plugin #844
Conversation
/assign @thockin @danwinship @SergeyKanzhelev
"incompatible" and "can cause problems" are overstating things. As long as your rules are self-contained and aren't trying to interact with or override another component's rules, it generally doesn't matter whether you're using the same iptables mode as anyone else. (Unless the system only has kernel support for one of the two modes, but this PR doesn't help people running into that.) I think we got a little bit confused about this back when iptables-1.8 first came out, because (a) we were also dealing with various bugs in iptables-1.8.[0-2] at the same time, without fully understand what was going on; (b) at the time, kube-proxy depended on iptables chains that were created by kubelet, so kube-proxy and kubelet specifically needed to be using the same iptables mode; (c) kube-proxy purports to poke holes in the system firewall, which can only work if kube-proxy and the firewall both use the same iptables mode. The bugs eventually got fixed, KEP-3178 made kube-proxy no longer depend on kubelet's rules, and the "poking holes in the system firewall" use case turns out to be slightly dubious and probably unnecessary. (Some distros have moved to So I'm not sure it really makes sense to call this out as a "problem"... |
well, at least if there are two systems using different versions, they can override each other's rules ... I don't know exactly what the problems are today, but I remember having segfaults because of mixed versions https://bugzilla.netfilter.org/show_bug.cgi?id=1476 We want this plugin to run just to prevent any possible problems, as it is impossible to say that there is zero risk
That's not "segfaults because of mixed versions", that's "segfaults because of a bug in the iptables binaries that you happened to run into on a node with mixed versions".
It seems to me that this is more likely to generate false positives and warn people that there is a problem when everything is actually working fine. |
interesting, should I add a threshold to the number of lines?
Based on Dan's feedback and the npd docs, I'm going to update this to only report events and metrics to get signal; based on that feedback we can assess the criticality of the problem with real data and update it to a Condition if necessary.
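As a rough sketch of what "only report events and metrics" could look like in node-problem-detector's custom-plugin config (the source name, reason, and script path below are illustrative, not the PR's actual values), a "temporary" rule generates events and, with metricsReporting on, metrics, but no node condition:

```json
{
  "plugin": "custom",
  "pluginConfig": {
    "invoke_interval": "24h",
    "timeout": "5s"
  },
  "source": "iptables-mode-custom-plugin-monitor",
  "metricsReporting": true,
  "conditions": [],
  "rules": [
    {
      "type": "temporary",
      "reason": "IptablesDuplicateVersions",
      "path": "./config/plugin/check_iptables_mode.sh",
      "timeout": "5s"
    }
  ]
}
```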
iptables has two kernel backends, legacy and nft. Quoting https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables: "It is also important to note that while iptables-nft can supplant iptables-legacy, you should never use them simultaneously." However, we don't want to block node operations for this reason, as there is not enough evidence that this is causing big issues in the wild, so we just signal and warn about this situation. Once we have more information we can revisit this decision and either keep it as is or move it to permanent.
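To check a node by hand, the two backends can be dumped independently with the mode-specific binaries that ship with iptables 1.8+ (a quick sketch; binary availability and paths vary by distro):

```sh
# Count actual rules ("-A" lines) seen by each backend; a node using a
# single backend should report rules on only one side.
iptables-legacy-save 2>/dev/null | grep -c '^-A'
iptables-nft-save 2>/dev/null | grep -c '^-A'
```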
/test pull-npd-e2e-test
at first sight it does not look related to my PR
OK. I don't really know much about node-problem-detector...
/lgtm
if [ "$num_legacy_lines" -gt 0 ] && [ "$num_nft_lines" -gt 0 ]; then | ||
echo "Found rules from both versions, iptables-legacy: ${num_legacy_lines} iptables-nft: ${num_nft_lines}" | ||
echo $NONOK | ||
elif [ "$num_legacy_lines" -gt 0 ] && [ "$num_nft_lines" -eq 0 ]; then |
I'm wondering if the following conditions are necessary. From my understanding, finding no rules from one backend doesn't mean the node is using the other backend..
In a Kubernetes cluster, kubelet always installs an iptables rule to solve this specific problem: https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/3178-iptables-cleanup#iptables-wrapper
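Concretely, KEP-3178 has kubelet create a hint chain (KUBE-IPTABLES-HINT, in the mangle table) that other components can probe to learn which mode the system is using; a small sketch of that probe:

```sh
# The hint chain exists only in the backend kubelet actually used,
# so whichever probe succeeds tells us the system's iptables mode.
if iptables-nft -t mangle -S KUBE-IPTABLES-HINT >/dev/null 2>&1; then
  echo "system iptables mode: nft"
elif iptables-legacy -t mangle -S KUBE-IPTABLES-HINT >/dev/null 2>&1; then
  echo "system iptables mode: legacy"
else
  echo "no kubelet hint chain found; falling back to rule counting"
fi
```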
This mechanism of detecting the backend by counting rules has proven to be the most effective and is the most widely used.
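For reference, the iptables-wrapper linked at the end applies roughly this heuristic when it has no kubelet hint chain to go on (a simplified sketch, not the wrapper's exact code):

```sh
# Pick whichever backend already has more rules loaded; new rules
# written through the wrapper then land in that same backend.
num_legacy=$(iptables-legacy-save 2>/dev/null | grep -c '^-A')
num_nft=$(iptables-nft-save 2>/dev/null | grep -c '^-A')
if [ "$num_legacy" -gt "$num_nft" ]; then
  mode=legacy
else
  mode=nft
fi
echo "selected iptables mode: ${mode}"
```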
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aojea, hakman, vteratipally. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
iptables has two kernel backends, legacy and nft.
Quoting https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables:

> It is also important to note that while iptables-nft
> can supplant iptables-legacy, you should never use them simultaneously.

However, we don't want to block node operations for this
reason, as there is not enough evidence that this is causing big issues
in the wild, so we just signal and warn about this situation.
Once we have more information we can revisit this decision and
either keep it as is or move it to permanent.
The plugin runs only once per day to avoid causing load on systems with large rulesets.
https://github.com/kubernetes-sigs/iptables-wrappers/blob/97b01f43a8e8db07840fc4b95e833a37c0d36b12/iptables-wrapper-installer.sh