mptcp: adjust mptcp receive buffer limit if subflow has larger one
In addition to tcp autotuning during read, TCP may also increase the
receive buffer in tcp_clamp_window().
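
For context, a paraphrased sketch of the relevant tcp_clamp_window()
path (not a verbatim quote of the kernel source):

	/* Paraphrased sketch of tcp_clamp_window(): if the application has
	 * not pinned the buffer via SO_RCVBUF and TCP is not under memory
	 * pressure, sk_rcvbuf may be raised up to tcp_rmem[2] so that it
	 * covers what is already queued on the socket.
	 */
	if (sk->sk_rcvbuf < sock_net(sk)->ipv4.sysctl_tcp_rmem[2] &&
	    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK) &&
	    !tcp_under_memory_pressure(sk))
		sk->sk_rcvbuf = min(atomic_read(&sk->sk_rmem_alloc),
				    sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);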

In this case, mptcp should adjust its receive buffer size as well so
it can move all pending skbs from the subflow socket to the mptcp socket.

When this happens, TCP can have more skbs ready for processing than the
mptcp receive buffer size allows.

In the mptcp case, the announced receive window is based on the free
space of the mptcp parent socket rather than that of the individual subflows.

Following the subflow's buffer size allows mptcp to grow its receive buffer.
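
The dependency can be sketched as follows (illustrative only;
msk_free_space is a hypothetical helper, not a kernel function):

	/* Hypothetical helper, for illustration: the window advertised on
	 * every subflow is derived from the mptcp-level (msk) socket, so a
	 * small msk buffer caps the window no matter how large the subflow
	 * buffer has grown.
	 */
	static int msk_free_space(const struct sock *msk_sk)
	{
		return READ_ONCE(msk_sk->sk_rcvbuf) -
		       atomic_read(&msk_sk->sk_rmem_alloc);
	}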

This is especially noticeable for loopback traffic where two skbs are
enough to fill the initial receive window.
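
As a rough worked example (assuming common defaults): the loopback MTU
is 64 KB, so one skb can carry about 64 KB of payload with an even
larger truesize, while the default initial receive buffer (tcp_rmem[1])
is 131072 bytes; two such skbs therefore already exceed the limit.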

In mptcp_data_ready() we do not hold the mptcp socket lock, so modifying
mptcp_sk->sk_rcvbuf is racy.  Do it when moving skbs from the subflow
socket to the mptcp socket instead; both sockets are locked in that case.
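
Paraphrased as a standalone rule (mptcp_sync_rcvbuf is a hypothetical
name; the real logic is inlined in __mptcp_move_skbs_from_subflow() in
the diff below):

	/* Caller holds both the msk (sk) and subflow (ssk) socket locks. */
	static void mptcp_sync_rcvbuf(struct sock *sk, struct sock *ssk)
	{
		/* honour an application-set SO_RCVBUF on the mptcp socket */
		if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
			return;

		/* let the msk buffer follow the autotuned subflow buffer */
		if (READ_ONCE(ssk->sk_rcvbuf) > READ_ONCE(sk->sk_rcvbuf))
			WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(ssk->sk_rcvbuf));
	}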

Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Florian Westphal authored and jenkins-tessares committed Sep 17, 2020
1 parent 70d3123 commit 67d6a19
Showing 1 changed file with 22 additions and 5 deletions.
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -454,6 +454,18 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 	bool more_data_avail;
 	struct tcp_sock *tp;
 	bool done = false;
+	int sk_rbuf;
+
+	sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
+
+	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
+		int ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
+
+		if (unlikely(ssk_rbuf > sk_rbuf)) {
+			WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf);
+			sk_rbuf = ssk_rbuf;
+		}
+	}
 
 	pr_debug("msk=%p ssk=%p", msk, ssk);
 	tp = tcp_sk(ssk);
@@ -508,7 +520,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 		WRITE_ONCE(tp->copied_seq, seq);
 		more_data_avail = mptcp_subflow_data_available(ssk);
 
-		if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf)) {
+		if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) {
 			done = true;
 			break;
 		}
@@ -602,6 +614,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct mptcp_sock *msk = mptcp_sk(sk);
+	int sk_rbuf, ssk_rbuf;
 	bool wake;
 
 	/* move_skbs_to_msk below can legitly clear the data_avail flag,
@@ -612,12 +625,16 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 	if (wake)
 		set_bit(MPTCP_DATA_READY, &msk->flags);
 
-	if (atomic_read(&sk->sk_rmem_alloc) < READ_ONCE(sk->sk_rcvbuf) &&
-	    move_skbs_to_msk(msk, ssk))
+	ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
+	sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
+	if (unlikely(ssk_rbuf > sk_rbuf))
+		sk_rbuf = ssk_rbuf;
+
+	/* over limit? can't append more skbs to msk */
+	if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf)
 		goto wake;
 
-	/* don't schedule if mptcp sk is (still) over limit */
-	if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf))
+	if (move_skbs_to_msk(msk, ssk))
 		goto wake;
 
 	/* mptcp socket is owned, release_cb should retry */
