[Bloat] tcp: smoother receiver autotuning

Dave Taht dave.taht at gmail.com
Sun Dec 10 21:52:28 EST 2017


One of the hacks in the Android world has been to limit the receive
window. The patches Eric just submitted might have some impact on the
results reported here, years ago:

https://pdfs.semanticscholar.org/b293/ec57821a27bfb96d15cd11d8141e04610153.pdf


---------- Forwarded message ----------
From: Eric Dumazet <edumazet at google.com>
Date: Sun, Dec 10, 2017 at 5:55 PM
Subject: [PATCH net-next 3/3] tcp: smoother receiver autotuning
To: "David S . Miller" <davem at davemloft.net>, Neal Cardwell
<ncardwell at google.com>, Yuchung Cheng <ycheng at google.com>, Soheil
Hassas Yeganeh <soheil at google.com>, Wei Wang <weiwan at google.com>,
Priyaranjan Jha <priyarjha at google.com>
Cc: netdev <netdev at vger.kernel.org>, Eric Dumazet
<edumazet at google.com>, Eric Dumazet <eric.dumazet at gmail.com>


Back in linux-3.13 (commit b0983d3c9b13 ("tcp: fix dynamic right sizing"))
I addressed the pressing issues we had with receiver autotuning.

But DRS (Dynamic Right Sizing) suffers from extra latencies caused by
rcv_rtt_est.rtt_us drift. One common problem happens during slow
start, since the apparent RTT measured by the receiver can be inflated
by ~50% at the end of a packet train.

Also, a single drop can delay read() calls by one RTT, meaning
tcp_rcv_space_adjust() can be called one RTT too late.

By replacing the tri-modal heuristic with a continuous function,
we can offset the effects of not growing 'at the optimal time'.

The curve of the function matches prior behavior at the points where
the space increased by exactly 25% or 50%.

The cost of the added multiply/divide is small, considering a TCP flow
typically runs this part of the code only a few times in its life.

I tested this patch with 100 ms RTT / 1% loss link, 100 runs
of (netperf -l 5), and got an average throughput of 4600 Mbit
instead of 1700 Mbit.

Signed-off-by: Eric Dumazet <edumazet at google.com>
Acked-by: Soheil Hassas Yeganeh <soheil at google.com>
Acked-by: Wei Wang <weiwan at google.com>
---
 net/ipv4/tcp_input.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2900e58738cde0ad1ab4a034b6300876ac276edb..fefb46c16de7b1da76443f714a3f42faacca708d
100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -601,26 +601,17 @@ void tcp_rcv_space_adjust(struct sock *sk)
        if (sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf &&
            !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
                int rcvmem, rcvbuf;
-               u64 rcvwin;
+               u64 rcvwin, grow;

                /* minimal window to cope with packet losses, assuming
                 * steady state. Add some cushion because of small variations.
                 */
                rcvwin = ((u64)copied << 1) + 16 * tp->advmss;

-               /* If rate increased by 25%,
-                *      assume slow start, rcvwin = 3 * copied
-                * If rate increased by 50%,
-                *      assume sender can use 2x growth, rcvwin = 4 * copied
-                */
-               if (copied >=
-                   tp->rcvq_space.space + (tp->rcvq_space.space >> 2)) {
-                       if (copied >=
-                           tp->rcvq_space.space + (tp->rcvq_space.space >> 1))
-                               rcvwin <<= 1;
-                       else
-                               rcvwin += (rcvwin >> 1);
-               }
+               /* Accommodate for sender rate increase (eg. slow start) */
+               grow = rcvwin * (copied - tp->rcvq_space.space);
+               do_div(grow, tp->rcvq_space.space);
+               rcvwin += (grow << 1);

                rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
                while (tcp_win_from_space(sk, rcvmem) < tp->advmss)
--
2.15.1.424.g9478a66081-goog



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

