[Bloat] FQ_Codel LWN draft article review

Dave Taht dave.taht at gmail.com
Fri Nov 23 03:57:34 EST 2012


David Woodhouse and I fiddled a lot with ADSL and OpenWrt and a
variety of drivers and network layers in a typical bonded ADSL stack
yesterday. The complexity of it all makes my head hurt. I'm happy that
a newly BQL'd ethernet driver (for the Geos and QEMU) emerged from it,
which he submitted to netdev...

I made a recording of us last night discussing the layers, which I
will produce and distribute later...

Anyway, along the way, we fiddled a lot with trying to analyze where
the 350ms or so of added latency was coming from in the Traverse
Geos's ADSL implementation and the overlying stack...

Plots: http://david.woodhou.se/dwmw2-netperf-plots.tar.gz

Note 1:

The netperf sampling interval on the rrul test needs to be longer
than 100ms in order to get a decent result at sub-10Mbit speeds.
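
Something like this (a sketch, assuming netperf-wrapper's
-s/--step-size option, which sets the data-point interval in seconds):

netperf-wrapper -H <server> -l 60 -s 0.2 rrul   # 200ms samples rather than 100ms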

Note 2:

The two nicest graphs here are nofq.svg vs fq.svg, which were taken
over a gigE link from a Mac running Linux to another gigE-connected
box. (In other words, NOT on the friggin' ADSL link.) (Firefox can
display SVG; I don't know what else can.) I find the T+10 delay
before stream start in the fq.svg graph suspicious, and think the
"throw out the outlier" code in netperf-wrapper is at fault. Prior to
that, codel is merely buffering things up madly, which can also be
seen in the pfifo_fast behavior, with its default of 1000 packets.

(Arguably, the default queue length in fq_codel could be reduced from
10k packets to something more reasonable at GigE speeds; see the
sketch below.)

(The indicator that it's the graph, not the reality, is that the
fq.svg pings and UDP streams start at T+5 and grow minimally, as is
usual with fq_codel.)
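
For anyone who wants to poke at that, a minimal sketch (fq_codel's
limit parameter is in packets; 10240 is the compiled-in default):

tc qdisc replace dev eth0 root fq_codel limit 1000   # cap the queue at 1000 packets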

As for the *.ps graphs, well, they would take David's network
topology to explain, and were conducted under a variety of
circumstances, including wifi, with more variables in play than I
care to think about.

We didn't really get anywhere on digging deeper. Even as we got to
purer tests - a minimal number of boxes, running pure ethernet,
switched over a couple of switches, down to the simplest two-box case
- my HTB-based "ceroshaper" implementation had multiple problems
cutting median latencies below 100ms on this very slow ADSL link.
David suspects problems on the path along the carrier backbone as a
potential issue, and the only way to measure that is with two one-way
trip-time measurements (rather than RTT), time-synced via NTP... I
keep hoping to find an RTP test, but I'm open to just about any
option at this point. Anyone?
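
One option might be owping from the owamp package (untested here, and
it assumes owampd is running on the far end), which measures one-way
delay in each direction using NTP-synced clocks:

owping remote.example.com   # one-way delays to and from the remote owampd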

We also found a probable bug in mtr, in that multiple mtr instances
on the same box don't coexist.

Moving back to more scientific clarity and simpler tests...

The two graphs, taken a few weeks back, on pages 5 and 6 of this:

http://www.teklibre.com/~d/bloat/Not_every_packet_is_sacred-Battling_Bufferbloat_on_wifi.pdf

appear to show the advantage of fq_codel (fq + codel + head drop)
over tail drop during the slow-start period on a 10Mbit link (see how
squiggly slow start is on pfifo_fast?), as well as the marvelous
interstream latency that can be achieved with BQL=3000 (on a 10Mbit
link). Even that latency can be halved by reducing BQL to 1500, which
is just fine at 10Mbit. Below those rates I'd like to be rid of BQL
entirely, and just have a single packet outstanding... in everything
from ADSL to cable...
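
For reference, BQL's limit can be clamped via sysfs; a minimal sketch
for a single-queue NIC (the tx-0 path is an assumption):

echo 3000 > /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit_max
# ...and halved again for the 10Mbit case:
echo 1500 > /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit_max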

That said, I'd welcome other explanations of the squiggly slow-start
pfifo_fast behavior before I put that explanation on the slide... ECN
was in play here, too. I can redo this test easily: it's basically
running a netperf TCP_RR for 70 seconds, and starting up a TCP_MAERTS
and a TCP_STREAM for 60 seconds at T+5, after hammering down on BQL's
limit and the link speeds on the two sides of a directly connected
laptop connection.

ethtool -s eth0 advertise 0x002 # 10 Mbit
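
The rest of the recipe is roughly this (a sketch, not the exact
script; $PEER stands in for the other laptop):

netperf -H $PEER -t TCP_RR -l 70 &       # latency probe for the whole run
sleep 5                                  # streams start at T+5
netperf -H $PEER -t TCP_STREAM -l 60 &   # upload
netperf -H $PEER -t TCP_MAERTS -l 60 &   # download
wait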


