I just spent the last 6 months not talking about the debloat-testing kernel.
The initial test results for eBDP and A* in it sucked.
I didn't trust any layer of the stack at that point. Definitely not the iwl driver. (I still don't trust the iwl driver!)
(I tend to use eBDP and A* interchangeably below.)
I kept testing debloat-testing as we fixed driver problems, and the results kept being bad at every step of the way. I figured eBDP was either wrong, or broken as implemented - and it turns out to be both. More on that below.
All this is sort of what begat the cerowrt project - getting to where we had a driver layer we could trust, which didn't happen until early August... We also needed TCP behaving as expected under various forms of test, between various bits of hardware, over local and continental distances - and with Paris and bloatlab #1 at ISC, we now have that.... YEA!
I knew fixing bufferbloat was going to take *years* to do, but I'd figured I could fix wireless-n, at least theoretically, in a couple of months. Hah. I wish I could stop destroying my retirement savings and losing sleep worrying about congestion collapse.
ANYWAY:
I've spent all my spare time during this period trying to come up with an algorithm for wireless buffer management (specifically for 802.11n) that achieves much lower latency than what we have today, and that incorporates what we know about how TCP and wireless actually behave under lightly loaded, heavily loaded, good, and bad conditions.
(I'm leaving the RED replacement to VJ and Kathie, as it mostly appears to apply to wired streaming behavior.)
I spent much of the last month starting to finally code up what I thought might work better.
(I can barely describe the algorithms to myself right now, forgive me for not publishing them. And for all I know, I'm wrong)
The net result was one bug per new line and a kernel that I'm actually running right now, with most of the new stuff oopsing.
Frustrated with my own attempt, and with my be-all-end-all solution firmly cemented in my own head and written down well enough that nothing can dislodge it, I decided to sneak a peek at eBDP and the A* algorithm again.
There is a great deal that Tianji Li's eBDP paper gets right - measuring time in queue, using an EWMA, etc. - but a few things make it unusable for wireless-n, and dubious for wireless-g.
I'm going to refer to the paper several times. http://www.hamilton.ie/tianji_li/buffersizing.pdf
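To make the core idea concrete before I pick at it, here's a rough userspace sketch of eBDP as I read the paper: time each packet from enqueue to tx completion, smooth that with an EWMA, and cap the queue so the backlog drains within a target delay. The constants (target delay, EWMA weight, min/max limits) are placeholders I picked for illustration, not the paper's values.

/*
 * Rough userspace sketch of the eBDP idea as I read the paper:
 * time each packet from enqueue to tx completion, smooth that with
 * an EWMA, and cap the queue so the backlog drains within a target
 * delay.  The constants here are placeholders, not the paper's values.
 */
#include <stdio.h>

#define TARGET_MS    50.0   /* target worst-case queueing delay (placeholder) */
#define EWMA_WEIGHT  0.05   /* smoothing weight for service time (placeholder) */
#define MIN_LIMIT    2U     /* never shrink below a couple of packets */
#define MAX_LIMIT    400U   /* driver/ring ceiling (placeholder) */

static double tserv_ewma_ms = 1.0;          /* smoothed per-packet service time */
static unsigned int queue_limit = MAX_LIMIT;

/* Called on tx completion; sample_ms is how long the packet spent
 * between enqueue and completion. */
static void ebdp_tx_complete(double sample_ms)
{
    tserv_ewma_ms = (1.0 - EWMA_WEIGHT) * tserv_ewma_ms
                  + EWMA_WEIGHT * sample_ms;

    unsigned int limit = (unsigned int)(TARGET_MS / tserv_ewma_ms);
    if (limit < MIN_LIMIT) limit = MIN_LIMIT;
    if (limit > MAX_LIMIT) limit = MAX_LIMIT;
    queue_limit = limit;
}

/* The enqueue path then refuses (or drops) anything beyond the limit. */
static int ebdp_can_enqueue(unsigned int backlog)
{
    return backlog < queue_limit;
}

int main(void)
{
    /* Fake a link that slows down: per-packet service time goes 1ms -> 20ms. */
    for (int i = 0; i < 100; i++)
        ebdp_tx_complete(i < 50 ? 1.0 : 20.0);
    printf("limit after slowdown: %u packets (can enqueue at backlog 10: %d)\n",
           queue_limit, ebdp_can_enqueue(10));
    return 0;
}

That's the part of the paper I like; the problems start with how it uses the 802.11e queues.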
0) A* and eBDP (mis)use the VO and VI queues in several ways, using VO for ACKs and VI for other purposes (there's a small sketch of the scheme after the quotes below):
"Namely, while the data packets
for the
n flows have an aggregate n/(n + 1) share of the
transmission
opportunities the TCP ACKs for the n flows have
only a 1/(n+1)
share. Issues of this sort are known to lead to
significant
unfairness amongst TCP flows but can be readily
resolved using
802.11e functionality by treating TCP ACKs as
a separate traffic
class which is assigned higher priority [15].
”
And:
"We put TCP ACK
packets into a high priority queue (we use the WME AC VO
queue of MadWifi as an example) which is assigned
parameters of CWmin = 3, CWmax = 7 and AIF S = 2. TCP data
packets are collected into a lower priority queue (we use the
WME AC VI queue) 2) "
"This version runs separate instances of the algorithm for each WMM queue -- I reckon that makes sense. "Messes it up completely.
and it doesn't.
"I still ignore HT aggregation, as I'm pretty sure I don't understand how it really effects the issue."