[Bloat] debloating TCP further in linux 3.14
Dave Taht
dave.taht at gmail.com
Wed Apr 2 15:12:16 EDT 2014
In addition to the ongoing SQM work, people keep improving the deployed
TCPs. A multitude of improvements landed in Linux 3.14; the two commits
quoted below caught my eye. And usec resolution for TCP is planned for
3.15, oy vey!
I am planning on stabilizing my summer's testbed on 3.14 (and having a
3.4 box lying around for comparison)... unless any other cool patches
arrive!
I've seen a very similar problem (I think) on OSX, where the TCP cwnd stays
stuck at 2*TSO on some kinds of paths where it shouldn't.
I am under the impression that TCP small queue limits are not fully
enabled by default because some devices, like wifi and non-BQL'd
ethernet, do not work well with them. The subsystems exist, but the
driver writers are lagging...
commit d10473d4e3f9d1b81b50a60c8465d6f59a095c46
Author: Eric Dumazet <edumazet at google.com>
Date: Sat Feb 22 22:25:57 2014 -0800
tcp: reduce the bloat caused by tcp_is_cwnd_limited()
tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase
their cwnd to allow a full size (64KB) TSO packet to be sent.
Non GSO flows only allow an extra room of 3 MSS.
For most flows with a BDP below 10 MSS, this results in a bloat
of cwnd reaching 90, and an inflate of RTT.
Thanks to TSO auto sizing, we can restrict the bloat to the number
of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep
original intent without performance impact.
Because we keep cwnd small, it helps to keep TSO packet size to their
optimal value.
Example for a 10Mbit flow, with low TCP Small queue limits (no more than
2 skb in qdisc/device tx ring)
Before patch :
lpk51:~# ./ss -i dst lpk52:44862 | grep cwnd
cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96 ssthresh:96 send 70.1Mbps unacked:14 rcv_space:29200
After patch :
lpk51:~# ./ss -i dst lpk52:52916 | grep cwnd
cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15 ssthresh:14 send 33.4Mbps unacked:4 rcv_space:29200
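
The "send" figures in the two snapshots appear to be consistent with cwnd * mss * 8 / rtt, that is, one full congestion window delivered per RTT. The sketch below is only a cross-check under that assumption; the function names are made up for illustration and come from neither ss nor the kernel. It also expresses the bandwidth-delay product of a 10Mbit path in MSS units, for comparison with the cwnd values.

```python
def send_rate_bps(cwnd, mss, rtt_ms):
    """Bits per second if one full cwnd is delivered every RTT (assumed model)."""
    return cwnd * mss * 8 / (rtt_ms / 1000.0)


def bdp_segments(link_bps, rtt_ms, mss):
    """Bandwidth-delay product expressed in MSS-sized segments."""
    return link_bps * (rtt_ms / 1000.0) / 8 / mss


# Values copied from the "Before patch" and "After patch" ss lines.
before = send_rate_bps(cwnd=96, mss=1448, rtt_ms=15.875)
after = send_rate_bps(cwnd=15, mss=1448, rtt_ms=5.206)
bdp = bdp_segments(link_bps=10e6, rtt_ms=5.206, mss=1448)

print(f"before: {before / 1e6:.1f} Mbps, after: {after / 1e6:.1f} Mbps")
print(f"BDP of a 10Mbit path at 5.206 ms: {bdp:.1f} MSS")
```

Under this model the two rates come out at about 70.1 and 33.4 Mbps, matching the ss output, and the BDP is roughly 4.5 MSS. That puts the pre-patch cwnd of 96 at around twenty times what the path can hold, with the excess showing up as the extra 10 ms of RTT.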
commit 4a5ab4e224288403b0b4b6b8c4d339323150c312
Author: Eric Dumazet <edumazet at google.com>
Date: Thu Feb 6 15:57:10 2014 -0800
tcp: remove 1ms offset in srtt computation
TCP pacing depends on an accurate srtt estimation.
Current srtt estimation is using jiffie resolution,
and has an artificial offset of at least 1 ms, which can produce
slowdowns when FQ/pacing is used, especially in DC world,
where typical rtt is below 1 ms.
We are planning a switch to usec resolution for linux-3.15,
but in the meantime, this patch removes the 1 ms offset.
All we need is to have tp->srtt minimal value of 1 to differentiate
the case of srtt being initialized or not, not 8.
The problematic behavior was observed on a 40Gbit testbed,
where 32 concurrent netperf were reaching 12Gbps of aggregate
speed, instead of line speed.
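
To get a feel for why a 1 ms floor hurts when the real RTT is far below 1 ms, here is a rough sketch. It assumes the pacing rate scales as a small multiple of cwnd * mss / srtt; the factor of 2, the 0.1 ms RTT and the cwnd are illustrative assumptions, not figures from the commit or the testbed.

```python
def pacing_rate_bps(cwnd, mss, srtt_ms, factor=2.0):
    """Hypothetical pacing rate: factor * cwnd * mss / srtt, in bits per second."""
    return factor * cwnd * mss * 8 / (srtt_ms / 1000.0)


true_rtt_ms = 0.1    # assumed datacenter RTT, well below 1 ms
cwnd, mss = 10, 1448  # illustrative values only

accurate = pacing_rate_bps(cwnd, mss, true_rtt_ms)
inflated = pacing_rate_bps(cwnd, mss, true_rtt_ms + 1.0)  # estimate carrying a 1 ms offset

print(f"pacing with accurate srtt: {accurate / 1e6:.0f} Mbps")
print(f"pacing with 1 ms offset:   {inflated / 1e6:.0f} Mbps")
print(f"slowdown factor: {accurate / inflated:.1f}x")
```

With these made-up numbers the offset cuts the pacing rate by about 11x for the same cwnd, which is the kind of effect that could keep a flow well short of line rate until cwnd grows to compensate.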
This patch also has the effect of reporting more accurate srtt and send
rates to iproute2 ss command as in :
$ ss -i dst cca2
Netid State Recv-Q Send-Q  Local Address:Port   Peer Address:Port
tcp   ESTAB 0      0       10.244.129.1:56984   10.244.129.2:12865
      cubic wscale:6,6 rto:200 rtt:0.25/0.25 ato:40 mss:1448 cwnd:10 send 463.4Mbps rcv_rtt:1 rcv_space:29200
tcp   ESTAB 0      390960  10.244.129.1:60247   10.244.129.2:50204
      cubic wscale:6,6 rto:200 rtt:0.875/0.75 mss:1448 cwnd:73 ssthresh:51 send 966.4Mbps unacked:73 retrans:0/121 rcv_space:29200
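
For anyone wanting to track these values across a test run, the info line of `ss -i` is mostly `key:value` tokens and is easy to pick apart. This is a minimal sketch, assuming the wrapped output has been joined back into a single line; `parse_ss_info` is a made-up helper, not part of iproute2, and it deliberately skips tokens without a colon such as `send 966.4Mbps`.

```python
import re


def parse_ss_info(line):
    """Collect the 'key:value' tokens of an `ss -i` info line into a dict of strings."""
    fields = {}
    for key, value in re.findall(r"(\w+):(\S+)", line):
        fields[key] = value
    return fields


# Second connection from the output above, joined into one line.
line = ("cubic wscale:6,6 rto:200 rtt:0.875/0.75 mss:1448 cwnd:73 "
        "ssthresh:51 send 966.4Mbps unacked:73 retrans:0/121 rcv_space:29200")

info = parse_ss_info(line)
# The rtt field carries two numbers, rtt and rttvar, both in milliseconds.
rtt_ms, rttvar_ms = (float(x) for x in info["rtt"].split("/"))

print(f"cwnd={info['cwnd']} mss={info['mss']} rtt={rtt_ms}ms rttvar={rttvar_ms}ms")
```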
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html