client: cleanup leaked iptables rules #15407

Merged: 1 commit, Nov 28, 2022

@shoenig (Member) commented Nov 28, 2022

This PR adds a secondary path for cleaning up the iptables rules created for an allocation
when the normal CNI library fails to do so. This typically happens when the state
of the pause container is unexpected, e.g. deleted out of band from Nomad. Before,
the iptables rules would be leaked, which could lead to unexpected NAT routing
behavior later on (in addition to leaked resources). With this change, we scan
for the rules created on behalf of the allocation being GC'd and delete them.

Fixes #6385
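
A minimal sketch of the "secondary path" idea in the description: run the normal teardown first and fall back to manual cleanup only when it fails. The names `teardown`, `primary`, `fallback`, and `errPauseGone` are hypothetical and are not taken from this PR's diff; the real client code may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errPauseGone is a hypothetical error standing in for a CNI teardown failure,
// e.g. when the pause container was deleted out of band.
var errPauseGone = errors.New("pause container not found")

// teardown runs the primary cleanup and invokes the fallback only if the
// primary fails. Both parameters are illustrative stand-ins, not Nomad APIs.
func teardown(primary, fallback func() error) error {
	err := primary()
	if err == nil {
		return nil // normal path worked; nothing was leaked
	}
	if fbErr := fallback(); fbErr != nil {
		return fmt.Errorf("teardown failed (%v) and fallback failed: %w", err, fbErr)
	}
	return nil // fallback cleaned up what the primary path could not
}

func main() {
	err := teardown(
		func() error { return errPauseGone },
		func() error { fmt.Println("scanning for leaked rules"); return nil },
	)
	fmt.Println("result:", err)
}
```

In this sketch a successful fallback hides the original error; whether to surface it anyway is a design choice the sketch does not settle.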

@shoenig marked this pull request as ready for review November 28, 2022 16:01
@shoenig requested review from tgross and angrycub November 28, 2022 16:02
@shoenig added this to the 1.5.0 milestone Nov 28, 2022
@shoenig changed the title from "client: manually cleanup leaked iptables rules" to "client: cleanup leaked iptables rules" Nov 28, 2022

@shoenig (Member Author) commented Nov 28, 2022

Should we backport this? It's like on the fence between bug / feature.

@tgross (Member) left a comment

LGTM!

> Should we backport this? It's like on the fence between bug / feature.

I agree it's on the fence, but yeah we should probably backport it.

var (
// ipRuleRe is used to parse a postrouting iptables rule created by nomad, e.g.
// -A POSTROUTING -s 172.26.64.191/32 -m comment --comment "name: \"nomad\" id: \"6b235529-8111-4bbe-520b-d639b1d2a94e\"" -j CNI-50e58ea77dc52e0c731e3799
ipRuleRe = regexp.MustCompile(`-A POSTROUTING -s (\S+) -m comment --comment "name: \\"nomad\\" id: \\"([[:xdigit:]-]+)\\"" -j (CNI-[[:xdigit:]]+)`)

@tgross (Member):

My kingdom for an iptables library that parses the rules into a sensible struct instead of returning a list of strings! 😀

@shoenig (Member Author):

🙏
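
As a rough illustration of the "sensible struct" wish above, the sketch below applies the `ipRuleRe` pattern from the diff to a list of rule strings and collects the captures into a struct. The type `natRule`, the helpers `findAllocRule` and `deleteSpec`, and the constant `sampleRule` are hypothetical names for this example only. The unescaped comment token in `deleteSpec` assumes the rule is later passed as separate arguments rather than as one shell string.

```go
package main

import (
	"fmt"
	"regexp"
)

// ipRuleRe is the pattern quoted in the diff; it captures the source CIDR,
// the allocation ID, and the CNI chain name of a nomad POSTROUTING rule.
var ipRuleRe = regexp.MustCompile(`-A POSTROUTING -s (\S+) -m comment --comment "name: \\"nomad\\" id: \\"([[:xdigit:]-]+)\\"" -j (CNI-[[:xdigit:]]+)`)

// sampleRule is the example rule from the code comment in the diff.
const sampleRule = `-A POSTROUTING -s 172.26.64.191/32 -m comment --comment "name: \"nomad\" id: \"6b235529-8111-4bbe-520b-d639b1d2a94e\"" -j CNI-50e58ea77dc52e0c731e3799`

// natRule is a hypothetical struct holding the parsed fields of one rule.
type natRule struct {
	CIDR    string
	AllocID string
	Chain   string
}

// findAllocRule scans listed rules and returns the parsed rule whose comment
// carries allocID. Lines that do not match the pattern are skipped.
func findAllocRule(rules []string, allocID string) (natRule, bool) {
	for _, rule := range rules {
		subs := ipRuleRe.FindStringSubmatch(rule)
		if len(subs) != 4 {
			continue // not a rule in the expected nomad format
		}
		if subs[2] == allocID {
			return natRule{CIDR: subs[1], AllocID: subs[2], Chain: subs[3]}, true
		}
	}
	return natRule{}, false
}

// deleteSpec rebuilds the rule as argument tokens. Assumption: the comment is
// passed as a single unescaped argument when the rule is deleted.
func (r natRule) deleteSpec() []string {
	return []string{
		"-s", r.CIDR, "-m", "comment", "--comment",
		fmt.Sprintf(`name: "nomad" id: "%s"`, r.AllocID), "-j", r.Chain,
	}
}

func main() {
	rules := []string{`-P POSTROUTING ACCEPT`, sampleRule}
	if r, ok := findAllocRule(rules, "6b235529-8111-4bbe-520b-d639b1d2a94e"); ok {
		fmt.Printf("%+v\n", r)
		fmt.Println(r.deleteSpec())
	}
}
```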

Comment on lines +308 to +313
// remove the jump rule
ok := true
if err = ipt.Delete(natTable, postRoutingChain, toDel...); err != nil {
	c.logger.Warn("failed to remove iptables nat.POSTROUTING rule", "alloc_id", allocID, "chain", chainID, "error", err)
	ok = false
}

@tgross (Member):

Why not return the error here? If this fails, will we ever be able to clear and delete the chain?

@shoenig (Member Author):

Yeah I dunno, my thinking is we're already in a "just close our eyes and try deleting stuff" state. If someone ever reports an error here we'll need client logs surrounding the delete command anyway; the error alone is next to useless.
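
The trade-off discussed in this thread (log and keep going versus return on the first error) can be sketched as below. This is an illustration under assumptions, not the merged code: `ruleDeleter` is a hypothetical interface whose method shapes mirror the `Delete`, `ClearChain`, and `DeleteChain` methods of the coreos/go-iptables client, `fakeIPT` is a test double, and the clear-then-delete ordering of the later steps is inferred from the review question rather than shown in the excerpt.

```go
package main

import (
	"errors"
	"fmt"
)

// ruleDeleter is a hypothetical interface for this sketch; it is not a type
// from Nomad or from any iptables library.
type ruleDeleter interface {
	Delete(table, chain string, rulespec ...string) error
	ClearChain(table, chain string) error
	DeleteChain(table, chain string) error
}

// forceCleanup attempts every step even when an earlier one fails, reporting
// each failure through warn and returning one summary error at the end.
func forceCleanup(ipt ruleDeleter, cniChain string, jumpRule []string, warn func(msg string, err error)) error {
	ok := true
	if err := ipt.Delete("nat", "POSTROUTING", jumpRule...); err != nil {
		warn("failed to remove nat.POSTROUTING rule", err)
		ok = false
	}
	if err := ipt.ClearChain("nat", cniChain); err != nil {
		warn("failed to clear CNI chain", err)
		ok = false
	}
	if err := ipt.DeleteChain("nat", cniChain); err != nil {
		warn("failed to delete CNI chain", err)
		ok = false
	}
	if !ok {
		return errors.New("failed to clean up iptables rules")
	}
	return nil
}

// fakeIPT records calls; failDelete simulates a failing jump-rule delete.
type fakeIPT struct {
	failDelete bool
	calls      []string
}

func (f *fakeIPT) Delete(table, chain string, rulespec ...string) error {
	f.calls = append(f.calls, "delete")
	if f.failDelete {
		return errors.New("simulated delete failure")
	}
	return nil
}

func (f *fakeIPT) ClearChain(table, chain string) error {
	f.calls = append(f.calls, "clear")
	return nil
}

func (f *fakeIPT) DeleteChain(table, chain string) error {
	f.calls = append(f.calls, "delete-chain")
	return nil
}

func main() {
	ipt := &fakeIPT{failDelete: true}
	chain := "CNI-50e58ea77dc52e0c731e3799"
	err := forceCleanup(ipt, chain, []string{"-s", "172.26.64.191/32", "-j", chain},
		func(msg string, err error) { fmt.Println("WARN:", msg, err) })
	fmt.Println(ipt.calls, err)
}
```

With the best-effort approach the later steps still run after a failed jump-rule delete; returning early would skip them and leave the per-step log lines as the only record of what was tried.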

@github-actions commented:

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions (bot) locked as resolved and limited conversation to collaborators Mar 30, 2023
Labels
backport/1.2.x (backport to 1.1.x release line)
backport/1.3.x (backport to 1.3.x release line)
backport/1.4.x (backport to 1.4.x release line)
Development

Successfully merging this pull request may close these issues.

iptables entries are not reconciled
2 participants