[Bloat] BQL, txqueue lengths and the internet of things
Dave Taht
dave.taht at gmail.com
Wed Jun 11 15:49:43 PDT 2014
The bloat problem and its solutions are not limited to routers;
hosts need fixing too.
Nearly every low end board I've seen out there forgoes a gigE ethernet
interface in favor of a lower power, lower cost 100mbit interface.
No distro I've seen modifies the default pfifo txqueuelen from the
current 1000 packet default down to a more reasonable 100 packet
default in that case. And, while many ethernet devices in this
category are hooked up via usb (and currently hard to add BQL support
to), some are not, and byte queue limit support can be easily added to
those.
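
To put numbers on why 1000 packets is too many at 100mbit, here is a
rough back-of-the-envelope sketch. It assumes every queued packet is a
full size 1500 byte packet and ignores ethernet framing overhead; both
are simplifying assumptions of mine, not measurements.

```python
# Worst-case time to drain a full transmit queue at line rate.
# Assumption: every queued packet is 1500 bytes; framing overhead ignored.

def drain_time_ms(packets, link_bits_per_s, packet_bytes=1500):
    """Milliseconds needed to send `packets` back-to-back packets."""
    return packets * packet_bytes * 8 / link_bits_per_s * 1000.0

for qlen in (1000, 100):
    ms = drain_time_ms(qlen, 100e6)
    print(f"txqueuelen {qlen:4d} at 100mbit: {ms:.0f} ms")
```

Under those assumptions a full 1000 packet queue is about 120 ms of
delay at 100mbit, and a 100 packet queue is about 12 ms, which is what
the 1000 packet default costs at gigE.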
Sadly, byte queue limits (BQL) are only implemented in a handful of top
end ethernet drivers (about 10, last I looked).
I needed a break from big problems, so a couple late nights later, I
have a very small patch adding support for BQL to the beaglebone
black:
http://snapon.lab.bufferbloat.net/~d/0001-Add-BQL-support-to-cpsw-beaglebone-driver.patch
And the results were quite pleasing at 100mbit. BQL holds things down
to two full size packets in the tx ring and we see an enormous
improvement in bidirectional throughput, jitter, and latency.
http://snapon.lab.bufferbloat.net/~d/beagle_bql/bql_makes_a_difference.png
http://snapon.lab.bufferbloat.net/~d/beagle_bql/beaglebonewins.png
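
For anyone wanting to check what BQL is doing on their own board, the
kernel exposes per-queue BQL state under sysfs. The following is a
minimal sketch that reads it; the interface name "eth0" and the helper
name read_bql are placeholders of mine, and the directory is only
present on kernels built with BQL support.

```python
from pathlib import Path

def read_bql(iface, queue=0, sysfs_root="/sys/class/net"):
    """Return the BQL counters for one tx queue as a dict of ints.

    Returns an empty dict when the directory is missing (for example
    a nonexistent interface or a kernel built without BQL).
    """
    bql_dir = (Path(sysfs_root) / iface / "queues"
               / f"tx-{queue}" / "byte_queue_limits")
    values = {}
    for name in ("limit", "limit_min", "limit_max", "inflight", "hold_time"):
        try:
            values[name] = int((bql_dir / name).read_text().strip())
        except (OSError, ValueError):
            continue  # file absent or unreadable: skip it
    return values

if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the board's interface name.
    print(read_bql("eth0"))
```

A "limit" that settles at roughly two full size frames' worth of bytes
under load would be consistent with the behavior described above. A
limit and inflight count that stay at zero while traffic is flowing
suggest the driver is not doing BQL accounting.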
The default linux behavior (pfifo_fast, txqueuelen 1000) prior to this
patch looked pretty awful:
http://snapon.lab.bufferbloat.net/~d/beagle_nobql/pfifo_nobql_tsq3028txqueue1000.svg
and with the patch applied it looked like this:
http://snapon.lab.bufferbloat.net/~d/beagle_bql/pfifo_bql_tsq3028txqueue1000.svg
And adding the new fq scheduler looked like this:
http://snapon.lab.bufferbloat.net/~d/beagle_bql/fq_bql_tsq3028.svg
(fq_codel was similar)
The fact that we don't achieve full upload throughput on this last
test is probably due to having a tail dropping switch in the way,
and/or some dma dequeuing cleanup conflicts between the low level
transmit and receive queues on this device (they share an interrupt
AND use napi, which seems puzzling).
But any day I can get a 4-10x improvement in latency and throughput is
a good day. One IoT device down, thousands to go. It would be nice if
the chipmakers were incorporating BQL into boxes destined for the
internet of things.
--
Dave Täht