[Bloat] some (very good) preliminary results from fiddling with byte queue limits on 100Mbit ethernet

Dave Taht dave.taht at gmail.com
Sat Nov 19 17:47:38 EST 2011


On Sat, Nov 19, 2011 at 10:53 PM, Tom Herbert <therbert at google.com> wrote:
> Thanks for trying this out Dave!

I note that there was MAJOR churn in the 3.2 directory layouts; if you
could rebase that patchset on 3.2, it would be good.

>
>> With byte queue limits at mtu*3 + the SFQ qdisc, latency under load
>> can be hammered
>>  down below 6ms when running at a 100Mbit line rate. No CBQ needed.
>>
> I'm hoping that we didn't have to set the BQL max_limit.  I would
> guess that this might indicate some periodic spikes in interrupt
> latency (BQL will increase limit aggressively in that case).  You
> might want to try adjusting the hold_time to a lower value.  Also,
> disabling TSO might lower the limit.

You will find it helpful in debugging (and the results more pleasing)
to artificially lower your line rate to 100Mbit, as per the ethtool trick
noted in my earlier email.

This also disables TSO at least on the e1000e.
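For reference, the trick in question is roughly this (assuming the interface is eth0; adjust for your hardware):

```shell
# Renegotiate the link down to 100Mbit full duplex
ethtool -s eth0 speed 100 duplex full

# On e1000e this alone disabled TSO for me; to be explicit:
ethtool -K eth0 tso off

# Verify the new link speed and offload state
ethtool eth0 | grep -i speed
ethtool -k eth0 | grep -i segmentation
```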

> Without lowering the max_limit, what values do you see for limit and
> inflight?  If you set min_limit to a really big number (effectively
> turning off BQL), what does inflight grow to?

It is very late in Paris right now. I'll apply your suggestions in the morning.
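For anyone wanting to reproduce this, the knobs in question all live in sysfs; roughly the following (assuming eth0, tx queue 0, and a 1500-byte MTU - exact paths and units may vary by kernel):

```shell
bql=/sys/class/net/eth0/queues/tx-0/byte_queue_limits

# Cap the limit at ~3x a 1500-byte MTU, as in my test
echo 4500 > "$bql/limit_max"

# Lower the hold time (milliseconds, I believe), per Tom's suggestion
echo 200 > "$bql/hold_time"

# Watch what BQL actually settles on under load
cat "$bql/limit" "$bql/inflight"

# Or effectively disable BQL by making the minimum limit huge
echo 100000000 > "$bql/limit_min"

# With the limit capped, hang SFQ off the root
tc qdisc add dev eth0 root sfq perturb 10
```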

>> Anyway, script could use improvement, and I'm busily patching BQL into
>> the ag71xx driver as I write.

I wish I could make QFQ work without a CBQ; so far, no luck. It should be
better than SFQ with the right classifier, and SFQ might be better with a
different classifier... finally, we have options higher in the stack!
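For the record, what I've been fiddling with looks roughly like this (a sketch, not working for me yet; the class layout and the filter keys are the open question):

```shell
# QFQ is classful: it needs at least one class plus a classifier
tc qdisc add dev eth0 root handle 1: qfq
tc class add dev eth0 parent 1: classid 1:1 qfq weight 1 maxpkt 1514

# Steer flows into classes with the flow classifier (keys are a guess)
tc filter add dev eth0 parent 1: protocol ip handle 1 \
    flow hash keys src,dst,proto,proto-src,proto-dst divisor 1024
```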

>>
> Cool, I look forward to those results!
>
>> Sorry it's taken me so long to get to this since your bufferbloat
>> talks at linux plumbers. APPLAUSE.
>> It's looking like BQL + SFQ is an effective means of improving
>> fairness and reducing latency on drivers
>> that can support it. Even if they have large tx rings that the
>> hardware demands.
>>
> Great.  I actually got back to looking at this a little last week.
> AFAICT the overhead of BQL is < 1% CPU and throughput (still need more
> testing to verify that).

Seeing it work well at 100Mbit *really* made my night - much of the world
still runs at that speed; notably, most ADSL and cable modems run at
100Mbit or less, as do all three of my laptops. I've been fighting a
losing battle with the wireless stack architecture of late...

You don't get a factor of ~50 improvement in something every day at
nearly zero cost!

I mean, with BQL, a saturated 100Mbit system will start new TCP
connections ~50x faster, do local DNS lookups in roughly 22 ms rather
than 140 ms, and so on, and so on.

At the moment I don't care if it eats 10% of CPU, so long as it saves time
for the most important component of the network - the user. :) (Thus my
interest in QFQ now.)

(Particularly as I assume your < 1% of CPU is at GigE speeds?)

And being self-clocked, BQL can handle scary hardware things like
pause frames better, too.

Win all the way across the board.

Effectively tying the driver to the line rate, as BQL seems to do, moves the
need for more intelligence in queue management back up into the qdisc layer.

I recently learned that with multiple cores it's actually possible to have
more than one packet in the qdisc even at GigE speeds, so a better qdisc up
there may help even at that speed, assuming BQL scales up right.
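(You can watch this happening from userspace, by the way:)

```shell
# -s shows per-qdisc stats: backlog, drops, requeues
tc -s qdisc show dev eth0
```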

>There are some (very) minor performance
> improvements that might be possible, but I don't have any major
> modifications pending at this point.

My major thought is that bytes on the wire are a proxy for 'time'. If you
did a smoothed EWMA based on bytes per time interval, you might be able to
hold latencies down even better, and still use a lightweight timesource
like jiffies for the calculation.

All the same, the BQL API is wonderfully clean: you can fiddle as much as
you want with the core algorithm without exposing the actual scheme
elsewhere in the stack.

My hat is off to you.  I HATED the artificially low tx queue rings I
was using in cerowrt...

>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net
