[Bloat] A little nearly GigE testing of various AQM and packet scheduling and fair queuing technologies
Dave Taht
dave.taht at gmail.com
Thu Nov 20 11:14:45 EST 2014
I finally got my rangeley off of openwrt and onto ubuntu so as to do some
testing of variants of codel, pie, sfq, etc, for the first time at
gigE speeds, using the tip of net-next throughout (linux 3.18+, which
has the new very exciting bulk_xmit support, and some fixes for GSO
handling, and a few other tcp fixes I like a lot).
All tests were done with ecn enabled, on a short path (nuc1 -> rangeley
-> nuc2). I am well aware that longer paths are harder and more
interesting; it was just that finding hardware on my budget that could
run fast enough to also measure at GigE speeds has been a problem.
(It does appear that I can make netem in this linux release insert
delay sanely, finally, so I hope to have time to try to duplicate
toke's results at slower speeds using different tools)
See:
http://snapon.lab.bufferbloat.net/~d/cakewins.png
Highlights:
* cake and cake2, both still being prototyped, win on latency and
throughput across the board, rate limited or not rate limited (way to
go jon!) (note: flowblind mode is with a single queue, so it tests the
htb-like-internal scheduler + codel only)
* At these rates, BQL is *needed* to keep the hardware busy. (I know,
"duh") With insufficient BQL the various algos starve. cake* starves
the least, however, compared to htb + algo. Most of the tests ran with
BQL = 8000, which is not explicitly expressed in the data set. The bql
32000 results were much better. tail drop or head drop, you still need
to have enough stuff in the driver to deal with linux scheduler and
x86 context switch latency.
* Offloads are needed for best results (GRO in particular). The
rangeley still appears to be "loafing" at these speeds but has high
context switch overhead...
* Limited numbers of hardware queues (the rangeley has 8) hit the
birthday problem in quite obvious ways from the data set...
* variants of codel beat pie across the board on latency and are an
even match on throughput.
* The new simplified version of codel compares almost equally with the
older one. (whether that holds up at longer RTTs is still a question)
* A packet limit of 1000 results in some tail drop behavior on pie and
codel with offloads off. Upped it to 10000. (this still points to a
byte limit being preferable to a packet limit)
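
To put a rough number on the birthday problem with 8 hardware queues, here is a small sketch. It assumes each flow is hashed uniformly and independently onto a queue, which real hardware hashing only approximates, and the flow counts are illustrative rather than taken from the data set.

```python
def collision_probability(flows: int, queues: int = 8) -> float:
    """Chance that at least two flows land on the same hardware queue.

    Assumes uniform, independent hashing of flows onto queues
    (an idealization, not a measured property of the rangeley).
    """
    if flows > queues:
        return 1.0  # pigeonhole: more flows than queues must collide
    p_all_distinct = 1.0
    for i in range(flows):
        # The i-th flow must avoid the i queues already occupied.
        p_all_distinct *= (queues - i) / queues
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} flows over 8 queues: P(collision) = "
              f"{collision_probability(n):.3f}")
```

Under those assumptions, even 4 concurrent flows share a queue more often than not, so collisions should be expected in most multi-flow runs.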
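
For anyone wanting to repeat the BQL comparison, this is roughly how the limit can be set on every tx queue of an interface through sysfs. It is a sketch only: it assumes the "BQL = 8000" and "bql 32000" figures above refer to the byte_queue_limits/limit_max knob, and the interface name in the usage comment is hypothetical. Writing to the real sysfs tree needs root.

```python
from pathlib import Path


def set_bql_limit_max(dev: str, limit_bytes: int,
                      sysfs_root: str = "/sys/class/net") -> list:
    """Write limit_bytes to limit_max for every tx queue of dev.

    Returns the list of sysfs paths that were written.
    The sysfs_root parameter exists only so the function can be
    exercised against a scratch directory instead of the live system.
    """
    queues_dir = Path(sysfs_root) / dev / "queues"
    written = []
    for txq in sorted(queues_dir.glob("tx-*")):
        target = txq / "byte_queue_limits" / "limit_max"
        target.write_text(f"{limit_bytes}\n")
        written.append(str(target))
    return written


# Hypothetical usage (interface name is an assumption):
#   set_bql_limit_max("eth0", 32000)
```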
rather than do more plots, the relevant netperf-wrapper data set is
at: http://snapon.lab.bufferbloat.net/~d/rangeley_routing.tgz
useful subsets are *900mbit* and *bql_32000*
I would have liked to have got a better grip on xmit_more, but
although the rangeley has support for it in its ethernet card, the
e1000es in the nucs don't.
--
Dave Täht
http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks