-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: cleanup leaked iptables rules #15407
Conversation
ddcc9f1
to
ca1b229
Compare
ca1b229
to
e3c9854
Compare
This PR adds a secondary path for cleaning up iptables created for an allocation when the normal CNI library fails to do so. This typically happens when the state of the pause container is unexpected - e.g. deleted out of band from Nomad. Before, the iptables rules would be leaked which could lead to unexpected nat routing behavior later on (in addition to leaked resources). With this change, we scan for the rules created on behalf of the allocation being GC'd and delete them. Fixes #6385
e3c9854
to
99fb7d6
Compare
Should we backport this? It's like on the fence between bug / feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Should we backport this? It's like on the fence between bug / feature.
I agree it's on the fence, but yeah we should probably backport it.
var ( | ||
// ipRuleRe is used to parse a postrouting iptables rule created by nomad, e.g. | ||
// -A POSTROUTING -s 172.26.64.191/32 -m comment --comment "name: \"nomad\" id: \"6b235529-8111-4bbe-520b-d639b1d2a94e\"" -j CNI-50e58ea77dc52e0c731e3799 | ||
ipRuleRe = regexp.MustCompile(`-A POSTROUTING -s (\S+) -m comment --comment "name: \\"nomad\\" id: \\"([[:xdigit:]-]+)\\"" -j (CNI-[[:xdigit:]]+)`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My kingdom for an iptables library that parses the rules into a sensible struct instead of returning a list of strings! 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙏
// remove the jump rule | ||
ok := true | ||
if err = ipt.Delete(natTable, postRoutingChain, toDel...); err != nil { | ||
c.logger.Warn("failed to remove iptables nat.POSTROUTING rule", "alloc_id", allocID, "chain", chainID, "error", err) | ||
ok = false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not return the error here? If this fails, will we ever be able to clear and delete the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I dunno, my thinking is we're already in a "just close our eyes and try deleting stuff" state. If someone ever reports an error here we'll need client logs surrounding the delete command anyway; the error alone is next to useless.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR adds a secondary path for cleaning up iptables created for an allocation
when the normal CNI library fails to do so. This typically happens when the state
of the pause container is unexpected - e.g. deleted out of band from Nomad. Before,
the iptables rules would be leaked which could lead to unexpected nat routing
behavior later on (in addition to leaked resources). With this change, we scan
for the rules created on behalf of the allocation being GC'd and delete them.
Fixes #6385