[Cerowrt-devel] SQM in mainline openwrt, fq_codel considered for fedora default

Dave Taht dave.taht at gmail.com
Tue Oct 21 13:59:41 PDT 2014


On Tue, Oct 21, 2014 at 12:51 PM, Dave Taht <dave.taht at gmail.com> wrote:
>> http://snapon.lab.bufferbloat.net/~d/beagle_bql/bql_makes_a_difference.png
>>
>> You can see that BQL makes the most difference in the latency.
>
> And ALSO that these fixes improved system throughput enormously.

Meant to include that plot.

http://snapon.lab.bufferbloat.net/~d/beagle_bql/beaglebonewithbql.png

You can disregard the decline in download bandwidth: we are also
sending 5x as many acks plus measurement data, which are not counted
in that part of the plot.

> This is partially due to the improvement in ack clocking you get from
> reduced RTTs, partially due to improved cache behavior (shorter
> queues), and partially due to continual improvements elsewhere in
> the TCP portions of the stack.
>
> With more recent kernels...
>
> I now get full throughput from the beagles in both directions with the
> 3.16 kernel, the still-out-of-tree BQL patch, and either fq or
> fq_codel. I haven't gotten around to plotting all those results
> (they are from Kathie's new lab), but they are here:
> http://snapon.lab.bufferbloat.net/~d/pollere/

The latency spikes here are most likely due to not having BQL:
http://snapon.lab.bufferbloat.net/~d/pollere/beagle/beagle-3.8-nobql-fq-fq_codel.png

This is using the new fq scheduler on both sides, with BQL enabled.

http://snapon.lab.bufferbloat.net/~d/pollere/beagle/beagle_3.16-fq-fq-no-offloads.png

The switch is most likely prioritizing EF-marked packets (as is
sch_fq). Most of the buffering is now in the switch, not the host.
(The prior results I showed had no switch in the way.)

> There is a small-buffered, tail-dropping switch in the way on these
> later data sets. There was some puzzling behavior on the e1000e that
> I need to poke into in a more controlled setting.
>
> As for other tunables on hosts, TCP small queues might be amenable
> to some tuning, but that too may well evolve further in kernelspace.
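
(For anyone who wants to poke at that: TSQ's per-socket write budget
is exposed as a sysctl. A minimal sketch, assuming the 3.16-era knob;
the 64 KB value below is just an illustration, not a recommendation:)

  # Inspect the TSQ budget (bytes a single socket may have sitting
  # in the qdisc + driver queues at once; 128 KB was the default
  # around 3.16):
  sysctl net.ipv4.tcp_limit_output_bytes
  # Halve it to queue less per socket, possibly at some throughput cost:
  sysctl -w net.ipv4.tcp_limit_output_bytes=65536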

So I have now drowned you in data on one architecture and setup. The
most thoroughly publicly analyzed devices and drivers are the ar71xx,
e1000e, and beaglebone at this point.

The use of fq_codel in a QoS system (artificially rate limited using
htb, hfsc, or tbf) is pretty well proven to be a huge win at this
point.
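
For anyone wanting to try that by hand, outside the SQM scripts, the
core of the idea boils down to three tc commands (a sketch; eth0 and
the 10mbit rate are placeholders for your WAN interface and roughly
95% of your measured uplink):

  # HTB imposes the artificial rate limit, set below the bottleneck
  # rate so the queue forms here rather than in the modem:
  tc qdisc add dev eth0 root handle 1: htb default 10
  tc class add dev eth0 parent 1: classid 1:10 htb rate 10mbit
  # fq_codel then manages the queue that HTB creates:
  tc qdisc add dev eth0 parent 1:10 handle 110: fq_codel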

At line rates fq and fq_codel still help quite a bit without a
BQL-enabled driver, but BQL is needed for best results. I don't know
to what extent the BQL-enabled drivers already cover the marketplace;
it was generally my assumption that the e1000e counted for a lot...

http://www.bufferbloat.net/projects/bloat/wiki/BQL_enabled_drivers
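
A quick way to check whether a given driver actually has BQL wired
up, without reading the source: BQL-enabled drivers grow a
byte_queue_limits directory in sysfs (eth0 and tx-0 here are just
examples):

  # Only present when the driver implements BQL:
  ls /sys/class/net/eth0/queues/tx-0/byte_queue_limits/
  # The current self-tuned limit, in bytes:
  cat /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit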

And thus, with all the positive results so far, the new stuff deserves
wider distribution on more devices, beyond the sample set of those on
the bloat, cerowrt-devel, and codel lists (about 500 people all told).

And all ya gotta do is turn it on.

This gets me to the stopping point we hit a while back: reliably
determining whether a good clocksource is present in the system.
Somehow. clock_getres(), perhaps?
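
clock_getres() would report the advertised resolution of, say,
CLOCK_MONOTONIC, though a coarse advertised resolution doesn't always
mean a bad clock. A cruder proxy (an assumption on my part, not a
settled answer) is to just ask the kernel which clocksource it
settled on:

  # "tsc" (x86) or "arch_sys_counter" (ARM) is good news; falling
  # back to "jiffies" is not:
  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource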

> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks



-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
