Questions about the implementation of conntrack in pcn-iptables #244

Closed
whl739 opened this issue Nov 25, 2019 · 2 comments

whl739 commented Nov 25, 2019

Hi guys,
I came across this project through the paper Securing_Linux_with_a_Faster_and_Scalable_Iptables,
and I have some questions about the implementation of conntrack in pcn-iptables.

In section 4.5 Conntrack entry creation:

To identify the connection associated to a packet, bpf-iptables uses the packet 5-tuple (i.e., src/dst IP address, L4 protocol, src/dst L4 port) as key in the conntrack table.
...
This process allows to create a single entry in the conntrack table for both directions, speeding up the lookup process. In addition, together with the new connection state, the Conntrack Update module stores into the conntrack table two additional flags, ip reverse (ipRev) and port reverse (portRev) indicating if the IPs and the L4 ports have been reversed compared to the current packet 5-tuple.
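A minimal sketch of how one entry can serve both directions. The excerpt above elides how the key is built, so the ordering rule used here (smaller IP first, smaller port first) is an assumption, and all struct and function names are hypothetical rather than taken from pcn-iptables:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet 5-tuple; field names are illustrative only. */
struct five_tuple {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  l4proto;
};

/* Direction-independent key: IPs and ports are each stored in sorted order. */
struct ct_key {
    uint32_t ip_lo, ip_hi;
    uint16_t port_lo, port_hi;
    uint8_t  l4proto;
};

struct ct_lookup {
    struct ct_key key;
    bool ip_rev;    /* IPs were swapped relative to the packet */
    bool port_rev;  /* ports were swapped relative to the packet */
};

/* Assumed rule: swap whenever the source value is the larger one. */
static struct ct_lookup ct_normalize(const struct five_tuple *t)
{
    struct ct_lookup out = {0};

    out.ip_rev = t->src_ip > t->dst_ip;
    out.port_rev = t->src_port > t->dst_port;

    out.key.ip_lo = out.ip_rev ? t->dst_ip : t->src_ip;
    out.key.ip_hi = out.ip_rev ? t->src_ip : t->dst_ip;
    out.key.port_lo = out.port_rev ? t->dst_port : t->src_port;
    out.key.port_hi = out.port_rev ? t->src_port : t->dst_port;
    out.key.l4proto = t->l4proto;
    return out;
}

/* Field-wise comparison, so struct padding never matters. */
static bool ct_key_equal(const struct ct_key *a, const struct ct_key *b)
{
    return a->ip_lo == b->ip_lo && a->ip_hi == b->ip_hi &&
           a->port_lo == b->port_lo && a->port_hi == b->port_hi &&
           a->l4proto == b->l4proto;
}
```

Under this assumed rule, a packet and its reply produce the same key but opposite flags, so comparing a packet's flags with the ones stored in the entry tells which direction the packet travels in.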

The conntrack process creates a single entry for both directions. If egress and ingress packets of the same flow are handled at the same time, both of them will update that entry. If the packets are handled on the same CPU there is no problem, but if they are handled on different CPUs, can this leave the entry in an inconsistent state?

In section 4.5 TCP state machine and Conntrack Cleanup:

Finally, when the connection reaches the TIME_WAIT state, only a timeout event or a new SYN will trigger a state change. In the first case the entry is deleted from the conntrack table, otherwise the current packet direction is marked as forward and the new state becomes SYN_SENT.
Conntrack Cleanup. bpf-iptables implements the cleanup of conntrack entries in the control plane, where a dedicated thread checks the presence of expired sessions.
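The TIME_WAIT rule quoted above can be sketched as a small transition function. This is only an illustration of the quoted behaviour: the type names, the flag handling, and the choice to check the timeout before the SYN are assumptions, not the project's code.

```c
#include <stdbool.h>

/* Only the states needed for this sketch; names are hypothetical. */
enum tcp_ct_state { CT_SYN_SENT, CT_ESTABLISHED, CT_TIME_WAIT };

struct ct_tcp_entry {
    enum tcp_ct_state state;
    bool ip_rev;    /* flags of the direction currently treated as forward */
    bool port_rev;
};

/*
 * Handles an event for an entry in TIME_WAIT.
 * Returns false when the entry should be deleted, true when it is kept.
 */
static bool ct_time_wait_step(struct ct_tcp_entry *e, bool timed_out,
                              bool is_syn, bool pkt_ip_rev, bool pkt_port_rev)
{
    if (e->state != CT_TIME_WAIT)
        return true;            /* not our case, leave the entry alone */

    if (timed_out)
        return false;           /* timeout: the entry is removed */

    if (is_syn) {
        /* New SYN: the current packet direction becomes "forward". */
        e->ip_rev = pkt_ip_rev;
        e->port_rev = pkt_port_rev;
        e->state = CT_SYN_SENT;
    }
    return true;                /* any other packet leaves TIME_WAIT unchanged */
}
```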

I couldn't find the cleanup code for conntrack entries. Did I miss something?
Another question: suppose the control plane finds an expired entry and decides to delete it, but while it is doing so the datapath reuses the connection and updates the entry, and only then does the control plane delete it. How is this prevented?

Thank you.

@sebymiano
Collaborator

Hi @whl739,
thanks a lot for the interest in this work.
Just a small note: the version of the paper that you are referencing is outdated. I suggest you have a look at this version, which is the published (and therefore complete) version on the CCR website.

Below you can find my answers to your questions:

The conntrack process creates a single entry for both directions. If egress and ingress packets of the same flow are handled at the same time, both of them will update that entry. If the packets are handled on the same CPU there is no problem, but if they are handled on different CPUs, can this leave the entry in an inconsistent state?

You are absolutely right, nice catch!
To avoid inconsistency between the ingress and egress pipelines (and also between the ConntrackLabel and ConntrackTableUpdate programs), the only option we have is to take a lock when updating the conntrack value, using bpf_spin_lock, which was introduced recently in the kernel [1].

Unfortunately, I do not have time to implement it now, but if you would like to do it and open a PR, I would be happy to review it.
To support bpf_spin_lock we first need to update the BCC version that we are using, together with LLVM to version 9, which is required to use BTF annotations in the maps.
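To make the locking idea concrete, here is a user-space model of a lock-protected conntrack update. It is a sketch under stated assumptions: a C11 `atomic_flag` stands in for the `struct bpf_spin_lock` that an eBPF program would embed in the map value and take with `bpf_spin_lock()` / `bpf_spin_unlock()`, and the struct layout and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical conntrack value; the flag models an embedded spin lock. */
struct ct_value {
    atomic_flag lock;
    uint8_t     state;
    uint64_t    last_seen_ns;
};

static void ct_lock(struct ct_value *v)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&v->lock, memory_order_acquire))
        ;
}

static void ct_unlock(struct ct_value *v)
{
    atomic_flag_clear_explicit(&v->lock, memory_order_release);
}

/*
 * Moves the entry from state `from` to state `to` as one critical section,
 * so two CPUs handling opposite directions cannot interleave their
 * read-check-write sequences. Returns false if the state had already changed.
 */
static bool ct_transition(struct ct_value *v, uint8_t from, uint8_t to,
                          uint64_t now_ns)
{
    bool ok = false;

    ct_lock(v);
    if (v->state == from) {
        v->state = to;
        v->last_seen_ns = now_ns;
        ok = true;
    }
    ct_unlock(v);
    return ok;
}
```

A caller that gets `false` back knows another CPU won the race and can re-read the entry instead of overwriting the newer state.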

I couldn't find the cleanup code for conntrack entries. Did I miss something?
Another question: suppose the control plane finds an expired entry and decides to delete it, but while it is doing so the datapath reuses the connection and updates the entry, and only then does the control plane delete it. How is this prevented?

You are absolutely right here too: the cleanup code was there before, but we decided to remove it for exactly the reason you describe.
Instead, we use an LRU map, so entries that are no longer accessed can simply be evicted from the map.
When a packet arrives for a flow that is already in the map, we check whether its timestamp is too old and, if so, we treat the flow as if it were not in the map.
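The timestamp check described above could look roughly like this. It is a hedged sketch, not the missing project code: the timeout value, the struct layout, and the function names are assumptions, and in an eBPF program the current time would typically come from a helper such as `bpf_ktime_get_ns()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timeout; the real value would depend on the connection state. */
#define CT_TIMEOUT_NS (120ULL * 1000000000ULL)

struct ct_entry {
    uint8_t  state;
    uint64_t last_seen_ns;  /* time of the last packet seen on this flow */
};

/* An entry is usable only if it exists and was touched recently enough. */
static bool ct_is_fresh(const struct ct_entry *e, uint64_t now_ns)
{
    if (e == NULL)
        return false;
    if (now_ns < e->last_seen_ns)
        return true;        /* updated "after" now on another CPU: still fresh */
    return now_ns - e->last_seen_ns <= CT_TIMEOUT_NS;
}

/*
 * Lookup with lazy expiry: a stale entry is reported as a miss, so the
 * caller handles the packet as the first packet of a new flow.
 */
static bool ct_lookup(struct ct_entry *e, uint64_t now_ns)
{
    if (!ct_is_fresh(e, now_ns))
        return false;
    e->last_seen_ns = now_ns;   /* refresh the timestamp on every hit */
    return true;
}
```

Because expiry is decided by the datapath at lookup time, no separate thread ever deletes an entry that the datapath might be updating, which sidesteps the delete-versus-update race raised in the question.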

I have noticed that the code that checks for expired entries is missing (we probably made a mistake when porting the code from one branch to another). I'll restore it ASAP; again, if you would like to submit a PR for that as well, it would be appreciated.

Please let me know if anything is still unclear.

whl739 commented Nov 28, 2019

Thanks for your detailed answers, I got it.
Since I'm still a newbie to BCC and BPF, I may not be able to complete the PR, sorry about that.

whl739 closed this as completed Nov 28, 2019