[Bloat] debloating TCP further in linux 3.14

Dave Taht dave.taht at gmail.com
Wed Apr 2 15:12:16 EDT 2014


In addition to the ongoing SQM work...

people keep improving the deployed TCPs... a multitude of improvements landed in
Linux 3.14, and these two in particular caught my eye. And usec resolution
for TCP rtt estimation is coming in 3.15, oy vey!

I am planning on stabilizing my summer's testbed on 3.14 (and having a
3.4 box lying around for comparison)... unless any other cool patches arrive!

I've seen a very similar problem (I think) on OSX, where the TCP cwnd stays
stuck at 2*TSO on some kinds of paths where it shouldn't.

I am under the impression that TCP Small Queue (TSQ) limits are not fully
enabled by default, because some devices, such as wifi and non-BQL ethernet,
do not work well with them. While the subsystems exist, the driver writers
are lagging...

commit d10473d4e3f9d1b81b50a60c8465d6f59a095c46
Author: Eric Dumazet <edumazet at google.com>
Date:   Sat Feb 22 22:25:57 2014 -0800

    tcp: reduce the bloat caused by tcp_is_cwnd_limited()

    tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase
    their cwnd to allow a full size (64KB) TSO packet to be sent.

    Non GSO flows only allow an extra room of 3 MSS.

    For most flows with a BDP below 10 MSS, this results in a bloat
    of cwnd reaching 90, and an inflate of RTT.

    Thanks to TSO auto sizing, we can restrict the bloat to the number
    of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep
    original intent without performance impact.

    Because we keep cwnd small, it helps to keep TSO packet size to their
    optimal value.

    Example for a 10Mbit flow, with low TCP Small queue limits (no more than
    2 skb in qdisc/device tx ring)

    Before patch :

    lpk51:~# ./ss -i dst lpk52:44862 | grep cwnd
             cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96 ssthresh:96 send 70.1Mbps unacked:14 rcv_space:29200

    After patch :

    lpk51:~# ./ss -i dst lpk52:52916 | grep cwnd
             cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15 ssthresh:14 send 33.4Mbps unacked:4 rcv_space:29200
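To make the change concrete, here is a rough C sketch of the headroom logic the
commit message describes. This is NOT the real tcp_is_cwnd_limited() from the
kernel; the names (cwnd_limited_before/after, goal_segs) and constants are
illustrative, simplified from the commit text alone:

```c
/* Illustrative sketch only -- not the actual kernel source. */

#define MSS     1448
#define TSO_MAX (65536 / MSS)  /* ~45 segments in a full 64KB TSO packet */

/* Before the patch: a GSO/TSO flow was considered cwnd-limited (and so
 * allowed to keep growing cwnd) until it had a full 64KB TSO packet of
 * headroom, so cwnd could balloon far past the path's real BDP. */
static int cwnd_limited_before(int cwnd, int packets_out, int is_gso)
{
    int headroom = is_gso ? TSO_MAX : 3;  /* non-GSO: 3 MSS of extra room */
    return packets_out + headroom > cwnd;
}

/* After the patch: headroom is bounded by the autosized TSO packet size
 * (tp->xmit_size_goal_segs), e.g. only a few segments on a 10 Mbit flow. */
static int cwnd_limited_after(int cwnd, int packets_out, int goal_segs)
{
    return packets_out + goal_segs > cwnd;
}
```

In this simplified model, cwnd growth stops once headroom reaches one autosized
TSO packet instead of a full 64KB one, which is the cwnd:96 -> cwnd:15 change
visible in the ss output above.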


commit 4a5ab4e224288403b0b4b6b8c4d339323150c312
Author: Eric Dumazet <edumazet at google.com>
Date:   Thu Feb 6 15:57:10 2014 -0800

    tcp: remove 1ms offset in srtt computation

    TCP pacing depends on an accurate srtt estimation.

    Current srtt estimation is using jiffie resolution,
    and has an artificial offset of at least 1 ms, which can produce
    slowdowns when FQ/pacing is used, especially in DC world,
    where typical rtt is below 1 ms.

    We are planning a switch to usec resolution for linux-3.15,
    but in the meantime, this patch removes the 1 ms offset.

    All we need is to have tp->srtt minimal value of 1 to differentiate
    the case of srtt being initialized or not, not 8.

    The problematic behavior was observed on a 40Gbit testbed,
    where 32 concurrent netperf were reaching 12Gbps of aggregate
    speed, instead of line speed.

    This patch also has the effect of reporting more accurate srtt and send
    rates to iproute2 ss command as in :

    $ ss -i dst cca2
    Netid  State      Recv-Q Send-Q     Local Address:Port     Peer Address:Port
    tcp    ESTAB      0      0           10.244.129.1:56984    10.244.129.2:12865
         cubic wscale:6,6 rto:200 rtt:0.25/0.25 ato:40 mss:1448 cwnd:10 send 463.4Mbps rcv_rtt:1 rcv_space:29200
    tcp    ESTAB      0      390960      10.244.129.1:60247    10.244.129.2:50204
         cubic wscale:6,6 rto:200 rtt:0.875/0.75 mss:1448 cwnd:73 ssthresh:51 send 966.4Mbps unacked:73 retrans:0/121 rcv_space:29200
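For context, here is a hedged C sketch of the classic scaled-by-8 srtt EWMA
(RFC 6298 style) that this commit is tweaking. It is simplified from the
commit text, not the verbatim kernel code; the function name and the
pre_patch flag are mine, for illustration:

```c
/* Illustrative srtt estimator sketch -- not the verbatim kernel code.
 * srtt is stored scaled by 8 (tp->srtt = 8 * smoothed_rtt), and m is
 * the new rtt sample, in jiffies. */
static long rtt_estimator(long srtt, long m, int pre_patch)
{
    if (pre_patch && m == 0)
        m = 1;                 /* old code: every sample padded to >= 1 jiffie (~1 ms) */
    if (srtt != 0) {
        m -= (srtt >> 3);      /* m becomes the error in the estimate */
        srtt += m;             /* srtt = 7/8 * old srtt + 1/8 * sample */
    } else {
        srtt = m << 3;         /* first sample seeds the estimate */
    }
    /* new code: a floor of 1 (not 8) is enough to mark srtt as initialized */
    return srtt > 1 ? srtt : 1;
}
```

On a path where the true rtt is tens of microseconds, the old floor inflated
every sample to a full millisecond, which is where the FQ/pacing slowdowns
described above came from; the new floor only keeps srtt nonzero.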

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


