[Bloat] setting queue depth on tail drop configurations of pfifo_fast

Fri Mar 27 18:14:11 EDT 2015

Dave Lang

Thanks for the quick response.

For this very specific test, I am doing one-way netperf-wrapper packet tests that will (almost) always be sending 1500 byte packets. I am then running some ABR traffic against that cross traffic to see how it responds to FQ_AQM and AQM (where AQM == Codel and PIE). I am using pfifo_fast as a baseline. The Codel, FQ_codel, PIE and FQ_PIE stuff is working fine. I need to tweak the pfifo_fast queue length to do some comparisons.

One of the test scenarios is a 3 Mbps ABR video flow on a 4 Mbps link, with and without cross traffic. I have already done what you suggested, and the ABR traffic drives the pfifo_fast code into severe congestion (even with no cross traffic), with a 3 second bloat. This is a bit surprising until you think about how the ABR code fills its video buffer at startup and then during steady state playout. I will send a detailed note once I get a chance to write it up properly. 

I would like to reduce the tail drop queue size to 100 packets (down from the default of 1000) and see how that impacts the test. 3 seconds of bloat is pretty bad, and I would like to compare how ABR works at 1 second and at 200-300 ms.
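
For reference, here is a rough sketch of how a target tail drop delay maps to a queue length in packets. It assumes every packet is 1500 bytes and the link drains at exactly the shaped rate; the function name is made up for illustration and is not part of any tool.

```python
def queue_packets_for_delay(delay_ms, rate_bps, packet_bytes=1500):
    """Largest whole number of packets that drains within delay_ms.

    Assumes fixed-size packets and a link that drains at exactly rate_bps.
    Integer arithmetic is used so the result is not affected by float rounding.
    """
    bits_per_packet = 8 * packet_bytes
    # delay_ms * rate_bps is in millibits, so scale the divisor by 1000.
    return (delay_ms * rate_bps) // (1000 * bits_per_packet)


if __name__ == "__main__":
    link_bps = 4_000_000  # the 4 Mbps test link
    for target_ms in (3000, 1000, 300, 200):
        packets = queue_packets_for_delay(target_ms, link_bps)
        print(f"{target_ms} ms at 4 Mbps -> {packets} packets")
```

Under those assumptions, 100 packets corresponds to roughly 300 ms at 4 Mbps, and a 1 second buffer needs about 333 packets.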

Bill Ver Steeg
DISTINGUISHED ENGINEER 
versteb at cisco.com

-----Original Message-----
From: David Lang [mailto:david at lang.hm] 
Sent: Friday, March 27, 2015 6:02 PM
To: Bill Ver Steeg (versteb)
Cc: bloat at lists.bufferbloat.net
Subject: Re: [Bloat] setting queue depth on tail drop configurations of pfifo_fast

BQL and HTB are not really comparable things.

All BQL does is change the definition of the length of a buffer from X packets to X bytes.

Using your example, 1000 packets of 1500 bytes is 1.5 MB, or 120 ms at 100 Mbps. But if you aren't transmitting 1500 byte packets, and are transmitting 75 byte packets instead, it's only 6 ms worth of buffering.
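
A minimal sketch of that arithmetic, in case it helps to plug in other numbers. It assumes a full queue of equal-sized packets draining at the line rate; the function name is hypothetical and only used here.

```python
def queue_delay_ms(packets, packet_bytes, rate_bps):
    """Time in milliseconds to drain a full queue of equal-sized packets.

    Assumes the link transmits at exactly rate_bps with no other overhead.
    """
    queued_bits = packets * packet_bytes * 8
    return queued_bits * 1000 / rate_bps


if __name__ == "__main__":
    # Same 1000-packet queue, different packet sizes and link rates.
    for packet_bytes, rate_bps in (
        (1500, 100_000_000),
        (75, 100_000_000),
        (1500, 10_000_000),
        (1500, 4_000_000),
    ):
        delay = queue_delay_ms(1000, packet_bytes, rate_bps)
        print(f"{packet_bytes} B packets at {rate_bps} bps: {delay} ms")
```

The same packet count gives anywhere from 6 ms to 3 seconds of buffering depending on packet size and link rate.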

The bottom line is that sizing buffers by packets doesn't work.

HTB creates virtual network interfaces that chop up the available bandwidth of the underlying device. I believe that if the underlying device supports BQL, HTB is working on byte length allocations, not packet counts.

fq_codel doesn't have fixed buffer sizes; it takes a completely different approach that works much better in practice.

The document that you found is actually out of date. Rather than trying to tune each thing for optimum performance and then measuring things, just benchmark the stock, untuned setup that you have and the simple fq_codel version without any tweaks and see if that does what you want. You can then work on tweaking things from there, but the improvements will be minor compared to doing the switch in the first place.

A good tool for seeing the performance (throughput and latency) is netperf-wrapper. Set it up and just test the two configs. The RRUL test is especially good at showing the effects of the switch.

David Lang

On Fri, 27 Mar 2015, Bill Ver Steeg (versteb) wrote:

> Date: Fri, 27 Mar 2015 21:45:11 +0000
> From: "Bill Ver Steeg (versteb)" <versteb at cisco.com>
> To: "bloat at lists.bufferbloat.net" <bloat at lists.bufferbloat.net>
> Subject: [Bloat] setting queue depth on tail drop configurations of pfifo_fast
> 
> Bloaters-
>
> I am looking into how Adaptive Bitrate video algorithms interact with 
> the various queue management schemes. I have been using the netperf 
> and netperf wrapper tools, along with the macros to set the links 
> states (thanks Toke and Dave T). I am using HTB rather than BQL, which 
> may have something to do with the issues below. I am getting some 
> interesting ABR results, which I will share in detail with the group once I write them up.
>
> I need to set the transmit queue length of my Ubuntu ethernet path 
> while running tests against the legacy pfifo_fast (tail drop) 
> algorithm.  The default value is 1000 packets, which boils down to 1.5 
> MBytes. At 100 Mbps, this gives me a 120ms tail drop buffer, which is big, but somewhat reasonable.
> When I then run tests at 10 Mbps, the buffer becomes a 1.2 second 
> bloaty buffer. When I run tests at 4 Mbps, the buffer becomes a 3 
> second extra-bloaty buffer. This gives me some very distinct ABR 
> results, which I am looking into in some detail. I do want to try a 
> few more delay values for tail drop at 4 Mbps.
>
> https://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel
> says to set txqueuelen to the desired 
> size, which makes sense. I have tried several ways to do this on 
> Ubuntu, with no glory. The way that seems it should have worked was 
> "ifconfig eth8 txqueuelen 100". When I then check the txqueuelen using 
> ifconfig, it looks correct. However, the delay measurements still stay 
> up near 3 seconds under load. When I check the queue depth using "tc 
> -s -d qdisc ls dev ifb_eth8", it shows the very large backlog in 
> pfifo_fast under load.
>
> So, has anybody recently changed the ethernet/HTB transmit packet 
> queue size for pfifo_fast in Ubuntu? If so, any pointers? I will also 
> try to move over to BQL and see if that works better than HTB...... I 
> am not sure that my ethernet drivers have BQL support though, as they 
> complain when I try to load it as the queue discipline.
>
> Thanks in advance
> Bill VerSteeg
>
>