Unexpected subflow management behaviour, following endpoint add after subflow failure #242

pgreenland · 2021-11-20T17:01:35Z

Hi,

Follow on from initial discussion on the mailing list: Original Mailing List Email.

If we add and remove endpoint addresses, everything appears to work well, subflows are connected and disconnected as expected.

However if a subflow fails, for example due to reception of a TCP RST from an upstream network device this seems to cause problems.

The next time an endpoint is removed (a link other than the failing one), the associated subflow is disconnected as expected.

When the endpoint is re-added, more often than not the failed subflow (the one stopped by a TCP RST) it re-connected and the newly added endpoint address is not used.

As suggested I've managed to capture an occurrence of the above, with hopefully enough details to reproduce.

My test setup is as follows:

MPTCP enabled VMs acts as a client and server, connected by a (non MPTCP enabled) VM acting as a router.

The three links between the client and router are implemented via virtual network interfaces, connected between VMs via their own dedicated bridges, mimicking independent networks.

The client and server applications, implemented in python act as simple python TCP client and servers, tricking approximately 2KB of data (from urandom) per second. Both are pretty simple, with the client application blocking on send until it completes. The server listens on port 6666 and clients connect to the server's IP 192.168.2.2.

Client machine config:

ip route
default via 192.168.3.1 dev ens18 metric 1 onlink 
default via 192.168.4.1 dev ens19 metric 2 onlink 
default via 192.168.5.1 dev ens20 metric 3 onlink 
192.168.3.0/30 dev ens18 proto kernel scope link src 192.168.3.2 
192.168.4.0/30 dev ens19 proto kernel scope link src 192.168.4.2 
192.168.5.0/30 dev ens20 proto kernel scope link src 192.168.5.2 

ip rule
0:      from all lookup local
32763:  from 192.168.5.2 lookup 3
32764:  from 192.168.4.2 lookup 2
32765:  from 192.168.3.2 lookup 1
32766:  from all lookup main
32767:  from all lookup default

ip route show table 1
default via 192.168.3.1 dev ens18 
192.168.3.0/30 dev ens18 scope link 

ip route show table 2
default via 192.168.4.1 dev ens19 
192.168.4.0/30 dev ens19 scope link 

ip route show table 3
default via 192.168.5.1 dev ens20 
192.168.5.0/30 dev ens20 scope link

ip mptcp endpoint show
192.168.3.2 id 1 subflow 
192.168.4.2 id 2 subflow 
192.168.5.2 id 3 subflow

Router config (on which packets were captured, network interface names should match in the attached pcap):

ip route
default via 192.168.1.254 dev ens18 
192.168.1.0/24 dev ens18 proto kernel scope link src 192.168.1.190 
192.168.2.0/30 dev ens19 proto kernel scope link src 192.168.2.1 
192.168.3.0/30 dev ens20 proto kernel scope link src 192.168.3.1 
192.168.4.0/30 dev ens21 proto kernel scope link src 192.168.4.1 
192.168.5.0/30 dev ens22 proto kernel scope link src 192.168.5.1

Server machine config:

ip route
default via 192.168.2.1 dev ens18 onlink 
192.168.2.0/30 dev ens18 proto kernel scope link src 192.168.2.2

ip rule
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

ip mptcp endpoint show

<No endpoints configured>

Issue described can be reproduced as follows, with the attached packet capture taken during the session below with associated commands:

mptcp remove and add subflow.pcapng.zip

On client machine

Connect client application and observe that a subflow has been created over each configured endpoint :-).

<Client application connects>

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State        
tcp        0      0 192.168.4.2:46775       192.168.2.2:6666        ESTABLISHED
tcp        0    512 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:54835       192.168.2.2:6666        ESTABLISHED

On server machine

Server machine as expected shows the same subflows.

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.5.2:54835       ESTABLISHED

On client machine

Remove endpoint address and observe associated subflow disconnecting.

ip mptcp endpoint del id 3

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0    128 192.168.4.2:46775       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:54835       192.168.2.2:6666        TIME_WAIT

On server machine

Subflow has been disconnected here too :-).

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED

On client machine

Add the endpoint back and observer a new subflow being established.

ip mptcp endpoint add 192.168.5.2 id 3 subflow

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0    128 192.168.4.2:46775       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:54835       192.168.2.2:6666        TIME_WAIT  
tcp        0      0 192.168.5.2:38537       192.168.2.2:6666        ESTABLISHED

On server machine

Server now shows full complement of flows again.

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.5.2:38537       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED

On router

Reject traffic being forwarded from adapter on client machine 192.168.5.0/30 network, causing subflow to fail, from the client's point of view.

iptables -t filter -A FORWARD -i ens21 -p TCP -j REJECT --reject-with tcp-reset

On client machine

We see that subflow that was originating from 192.168.4.2 is now gone, following its reception of a RST.

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:54835       192.168.2.2:6666        TIME_WAIT  
tcp        0      0 192.168.5.2:38537       192.168.2.2:6666        ESTABLISHED

On server machine

Server is blissfully unaware of the situation and still shows the subflow as established.

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.5.2:38537       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED

On router

Flush iptables rules, stopping RST transmission.

iptables -t filter -F FORWARD

On client machine

As before remove endpoint address and observe its associated subflow being disconnected.

ip mptcp endpoint del id 3

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0    512 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:38537       192.168.2.2:6666        TIME_WAIT

On server machine

Server follows along and the subflow disconnects from its point of view.

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED

On client machine

Re-add the endpoint address, expecting a new subflow to be created using it. However instead we see that the new subflow was created using a different endpoint, the previously failed 192.168.4.2. With the newly added 192.168.5.2 endpoint unused.

ip mptcp endpoint add 192.168.5.2 id 3 subflow

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 192.168.3.2:49424       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.4.2:49517       192.168.2.2:6666        ESTABLISHED
tcp        0      0 192.168.5.2:38537       192.168.2.2:6666        TIME_WAIT

On server machine

The server collaborates the story, showing two subflows originating from 192.168.4.2 (the original failed one and the newly created one), but now flows from 192.168.5.2 (the newly added address).

netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN     
tcp        0      0 192.168.2.2:6666        192.168.4.2:49517       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.4.2:46775       ESTABLISHED
tcp        0      0 192.168.2.2:6666        192.168.3.2:49424       ESTABLISHED

I believe this behaviour originates from the select_local_address function. When triggered as part of a call chain from the netlink endpoint add handler selects the first available address with no subflow. In this case 192.168.4.2 rather than the added 192.168.5.2. Upon successful re-reconnection using the old address, the event handler mptcp_pm_nl_subflow_established looks to be called, which will attempt to trigger the next subflow creation. However this is prevented by the condition local_addr_used < local_addr_max, as local_addr_used will not have been decremented following the forced failure of the original subflow.

Thanks,

Phil

The text was updated successfully, but these errors were encountered:

matttbe · 2021-12-16T15:15:07Z

This has been fixed in our export branch thanks to the modifications done by @pabeni:

7dff646: mptcp: fix per socket endpoint accounting
b3b3136: mptcp: clean-up MPJ option writing
93ddc71: mptcp: keep track of local endpoint still available foreach msk
b3adca8: mptcp: do not block subflows creation on errors
0f4945b: selftests: mptcp: add tests for subflow creation failure
- 07368dd: selftests: mptcp: apply Mat's comment
1d0e64a: mptcp: cleanup MPJ subflow list handling
Results: 8c25dcd..c94ca03
bcccc44: mptcp: avoid atomic bit manipulation when possible
Results: c85bd12..1fdc9db

@pgreenland do you mind checking this new version? Please re-open this ticket if you still have issues.

Add a big batch of test coverage to assert all aspects of the tcx opts attach, detach and query API: # ./vmtest.sh -- ./test_progs -t tc_opts [...] #238 tc_opts_after:OK #239 tc_opts_append:OK #240 tc_opts_basic:OK #241 tc_opts_before:OK #242 tc_opts_chain_classic:OK #243 tc_opts_demixed:OK #244 tc_opts_detach:OK #245 tc_opts_detach_after:OK #246 tc_opts_detach_before:OK #247 tc_opts_dev_cleanup:OK #248 tc_opts_invalid:OK #249 tc_opts_mixed:OK #250 tc_opts_prepend:OK #251 tc_opts_replace:OK #252 tc_opts_revision:OK Summary: 15/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Add several new tcx test cases to improve test coverage. This also includes a few new tests with ingress instead of clsact qdisc, to cover the fix from commit dc644b5 ("tcx: Fix splat in ingress_destroy upon tcx_entry_free"). # ./test_progs -t tc [...] #234 tc_links_after:OK #235 tc_links_append:OK #236 tc_links_basic:OK #237 tc_links_before:OK #238 tc_links_chain_classic:OK #239 tc_links_chain_mixed:OK #240 tc_links_dev_cleanup:OK #241 tc_links_dev_mixed:OK #242 tc_links_ingress:OK #243 tc_links_invalid:OK #244 tc_links_prepend:OK #245 tc_links_replace:OK #246 tc_links_revision:OK #247 tc_opts_after:OK #248 tc_opts_append:OK #249 tc_opts_basic:OK #250 tc_opts_before:OK #251 tc_opts_chain_classic:OK #252 tc_opts_chain_mixed:OK #253 tc_opts_delete_empty:OK #254 tc_opts_demixed:OK #255 tc_opts_detach:OK #256 tc_opts_detach_after:OK #257 tc_opts_detach_before:OK #258 tc_opts_dev_cleanup:OK #259 tc_opts_invalid:OK #260 tc_opts_mixed:OK #261 tc_opts_prepend:OK #262 tc_opts_replace:OK #263 tc_opts_revision:OK [...] Summary: 44/38 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/8699efc284b75ccdc51ddf7062fa2370330dc6c0.1692029283.git.daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <[email protected]>

matttbe added the bug label Nov 22, 2021

matttbe assigned pabeni Nov 25, 2021

matttbe closed this as completed Dec 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unexpected subflow management behaviour, following endpoint add after subflow failure #242

Unexpected subflow management behaviour, following endpoint add after subflow failure #242

pgreenland commented Nov 20, 2021

matttbe commented Dec 16, 2021

Unexpected subflow management behaviour, following endpoint add after subflow failure #242

Unexpected subflow management behaviour, following endpoint add after subflow failure #242

Comments

pgreenland commented Nov 20, 2021

matttbe commented Dec 16, 2021