Unexpected subflow management behaviour, following endpoint add after subflow failure #242
This has been fixed in our export branch thanks to the modifications done by @pabeni:
@pgreenland do you mind checking this new version? Please re-open this ticket if you still have issues.
Hi,
Following on from the initial discussion on the mailing list: Original Mailing List Email.
If we add and remove endpoint addresses, everything appears to work well: subflows are connected and disconnected as expected.
However, if a subflow fails, for example due to reception of a TCP RST from an upstream network device, this seems to cause problems.
The next time an endpoint is removed (a link other than the failing one), the associated subflow is disconnected as expected.
When the endpoint is re-added, more often than not the failed subflow (the one stopped by a TCP RST) is re-connected and the newly added endpoint address is not used.
As suggested I've managed to capture an occurrence of the above, with hopefully enough details to reproduce.
My test setup is as follows:
Two MPTCP-enabled VMs act as client and server, connected by a (non-MPTCP-enabled) VM acting as a router.
The three links between the client and router are implemented via virtual network interfaces, connected between VMs via their own dedicated bridges, mimicking independent networks.
The client and server applications are simple Python TCP programs, with the client trickling approximately 2KB of data (from urandom) per second and blocking on send until it completes. The server listens on port 6666 and clients connect to the server's IP 192.168.2.2.
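For anyone wanting to reproduce this without the original scripts, here is a minimal sketch of what such a client and server could look like. It is not the code used in the report: the function names, the fallback to plain TCP and the `ready` callback are assumptions for illustration. Only the port (6666), the server address (192.168.2.2) and the roughly 2KB per second of urandom data come from the description above.

```python
import os
import socket
import time

# Linux protocol number for MPTCP; socket.IPPROTO_MPTCP exists on Python 3.10+.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def make_socket(use_mptcp=True):
    """Create a stream socket, falling back to plain TCP if MPTCP is unavailable."""
    if use_mptcp:
        try:
            return socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
        except OSError:
            pass  # kernel without (or with disabled) MPTCP support
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)


def serve(host="0.0.0.0", port=6666, use_mptcp=True, ready=None):
    """Accept a single client, discard its data and return the byte count."""
    received = 0
    with make_socket(use_mptcp) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        if ready is not None:
            ready(srv.getsockname()[1])  # report the bound port (hypothetical hook)
        conn, _ = srv.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # client closed the connection
                    break
                received += len(data)
    return received


def trickle(host="192.168.2.2", port=6666, chunk=2048, interval=1.0,
            count=None, use_mptcp=True):
    """Send `chunk` random bytes every `interval` seconds; return bytes sent."""
    sent = 0
    rounds = 0
    with make_socket(use_mptcp) as cli:
        cli.connect((host, port))
        while count is None or rounds < count:
            cli.sendall(os.urandom(chunk))  # blocks until the whole chunk is queued
            sent += chunk
            rounds += 1
            time.sleep(interval)
    return sent
```

Running `serve()` on the server VM and `trickle()` on the client VM should give a long-lived connection whose subflows can then be inspected while endpoints are added and removed.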
Client machine config:
Router config (on which packets were captured, network interface names should match in the attached pcap):
Server machine config:
Issue described can be reproduced as follows, with the attached packet capture taken during the session below with associated commands:
mptcp remove and add subflow.pcapng.zip
Connect client application and observe that a subflow has been created over each configured endpoint :-).
Server machine as expected shows the same subflows.
Remove endpoint address and observe associated subflow disconnecting.
Subflow has been disconnected here too :-).
Add the endpoint back and observe a new subflow being established.
Server now shows full complement of flows again.
Reject traffic being forwarded from adapter on client machine 192.168.5.0/30 network, causing subflow to fail, from the client's point of view.
We see that the subflow that was originating from 192.168.4.2 is now gone, following its reception of a RST.
Server is blissfully unaware of the situation and still shows the subflow as established.
Flush iptables rules, stopping RST transmission.
As before remove endpoint address and observe its associated subflow being disconnected.
Server follows along and the subflow disconnects from its point of view.
Re-add the endpoint address, expecting a new subflow to be created using it. However, we instead see that the new subflow was created using a different endpoint, the previously failed 192.168.4.2, with the newly added 192.168.5.2 endpoint left unused.
The server corroborates the story, showing two subflows originating from 192.168.4.2 (the original failed one and the newly created one), but no flows from 192.168.5.2 (the newly added address).
I believe this behaviour originates from the `select_local_address` function. When triggered as part of a call chain from the netlink endpoint add handler, it selects the first available address with no subflow, in this case 192.168.4.2 rather than the added 192.168.5.2. Upon successful reconnection using the old address, the event handler `mptcp_pm_nl_subflow_established` looks to be called, which will attempt to trigger the next subflow creation. However, this is prevented by the condition `local_addr_used < local_addr_max`, as `local_addr_used` will not have been decremented following the forced failure of the original subflow.

Thanks,
Phil
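
To make the suspected sequence easier to follow, here is a toy model of it in Python. This is not the kernel code and makes no claim about how the in-kernel path manager is actually structured: the class, its methods and the `reproduce` helper are hypothetical, and only the names `select_local_address`, `local_addr_used` and `local_addr_max` are borrowed from the analysis above. The model assumes that a reset subflow frees its address without decrementing the counter, which is the behaviour suspected in this report.

```python
class PathManagerModel:
    """Toy model of the endpoint/subflow accounting described in the report."""

    def __init__(self, local_addr_max):
        self.local_addr_max = local_addr_max
        self.local_addr_used = 0
        self.endpoints = []    # configured endpoint addresses, in insertion order
        self.subflows = set()  # addresses that currently carry a subflow

    def select_local_address(self):
        """Return the first endpoint with no subflow, or None."""
        for addr in self.endpoints:
            if addr not in self.subflows:
                return addr
        return None

    def create_subflows(self):
        """Create subflows while the counter allows; return addresses used."""
        created = []
        while self.local_addr_used < self.local_addr_max:
            addr = self.select_local_address()
            if addr is None:
                break
            self.subflows.add(addr)
            self.local_addr_used += 1
            created.append(addr)
        return created

    def add_endpoint(self, addr):
        self.endpoints.append(addr)
        return self.create_subflows()

    def remove_endpoint(self, addr):
        self.endpoints.remove(addr)
        if addr in self.subflows:
            self.subflows.discard(addr)
            self.local_addr_used -= 1

    def subflow_reset(self, addr):
        # Assumption under test: the subflow goes away on RST,
        # but local_addr_used is NOT decremented.
        self.subflows.discard(addr)


def reproduce():
    """Replay the steps from the report; return (pm, addresses used on re-add)."""
    pm = PathManagerModel(local_addr_max=2)
    pm.add_endpoint("192.168.4.2")
    pm.add_endpoint("192.168.5.2")
    pm.subflow_reset("192.168.4.2")      # RST kills the 192.168.4.2 subflow
    pm.remove_endpoint("192.168.5.2")    # endpoint removed as in the session
    created = pm.add_endpoint("192.168.5.2")
    return pm, created
```

In this model, `reproduce()` re-adds 192.168.5.2 but the single free slot is consumed by 192.168.4.2, because it is the first address without a subflow; the counter then reaches `local_addr_max` again and 192.168.5.2 is never tried, matching what the client and server showed.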