[Bloat] Some updates on hacking on AQMS

Dave Taht dave.taht at gmail.com
Thu Jun 9 10:01:26 EDT 2011

I'm going to make some notes against my original posting 'Some notes
on hacking on AQM's' here, because in less than a day, much progress
has been made.

On Wed, Jun 8, 2011 at 6:12 AM, Dave Taht <dave.taht at gmail.com> wrote:
> So in addition to hacking on the switch, I've been poking into the behavior of multiple AQM systems in the kernel


> Some notes:
> There are as many philosophies to AQM as there are shapers and classifiers.
> None of the Linux shaper scripts in the field handle ipv6 traffic.

I've got some fixes for this in the 3 (soon to be 4) shapers I'm playing with.

According to the netfilter guys, DRR scheduling + HFSC is the new
'hotness', and I was supplied a script that uses it... (which I think
was noted in the previous thread, if it was private email, I'll

I have not (as yet) patched DRR support into the openwrt testbed.

> HTB is the most commonly used qdisc, handles its bandwidth limits by packet
> drop
> and doesn't do ECN. It's usually used in conjunction with other qdiscs, too.
> An explanation of how diffserv (dsmark) and GRED are supposed to play ball
> together
> (starting here: http://www.opalsoft.net/qos/DS-27.htm) is so amazingly not
> opaque.
> SFB remains promising, but until I get a ported tc for it,
> I can't play with it much.

Next week iproute2-2.6.39 may be released, and I'll slam it into
openwrt, assuming it can be made to work with

> SFQ is the second most commonly used qdisc, but doesn't balance in ways ESFQ
> could.
> ESFQ really looked like a winner and I'm sorry it never made the mainline
> kernel.

ESFQ features were added to SFQ 4 years ago. Few use them, but they
are there. It's actually more flexible than ESFQ was. Core among them
is the ability to not match against flows but against distinct IP
addresses, which makes sense in this bittorrent-ed age.
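For the curious, here is a rough sketch of what per-address hashing on SFQ might look like. I'm assuming the mechanism is the tc 'flow' classifier attached to an sfq qdisc; the helper name, the device and the divisor are just illustrative, and the function only prints the commands so nothing is changed until you pipe it to sh as root.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the tc commands that would put SFQ on
# a device and hash on a single key (dst address by default) instead of on
# the full flow tuple. Device name, perturb and divisor are example values.
sfq_per_host_cmds() {
    dev=$1
    key=${2:-dst}
    echo "tc qdisc add dev $dev root handle 1: sfq perturb 10"
    echo "tc filter add dev $dev parent 1: protocol ip handle 1 flow hash keys $key divisor 1024"
}

sfq_per_host_cmds eth0
```

Hashing on dst keeps one busy downloader with many connections from crowding out everyone else; swap in src for the upload direction.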

> HFSC is mind-bending as to what it tries to do.

It really seems that ECN support could be added generically for all
qdiscs that currently do packet drop. Creating a generic mark_or_drop
function is easy. The difficulties lie in adding 'marking' counters to
every qdisc and userspace.
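To make the mark_or_drop idea concrete, here is a toy model of the decision in shell. This is not kernel code and the function is purely illustrative; it only looks at the two ECN bits of the TOS byte (per RFC 3168: 0 = Not-ECT, 1 = ECT(1), 2 = ECT(0), 3 = CE) and treats an already-CE packet as marked.

```shell
#!/bin/sh
# Toy model of a generic mark-or-drop decision for a congested qdisc.
# Input: the IP TOS / traffic class byte as a decimal number.
# Output: "drop" for non-ECN traffic, or "mark <new tos byte>" otherwise.
mark_or_drop() {
    ecn=$(( $1 & 3 ))              # low two bits are the ECN field
    if [ "$ecn" -eq 0 ]; then
        echo "drop"                # Not-ECT: sender cannot see marks
    else
        echo "mark $(( $1 | 3 ))"  # ECT(0), ECT(1) or CE: set/keep CE
    fi
}

mark_or_drop 0      # plain best-effort packet
mark_or_drop 2      # ECT(0)
mark_or_drop 185    # DSCP EF (184) with ECT(1)
```

The logic really is this small; the counters and the userspace plumbing are where the work is.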

The overall efficacy of ECN when more widely deployed
remains an unknown, but given the heroic efforts devoted to
packet delivery today across 3g/4g/wireless, providing
a positive congestion signaling mechanism across those links seems
ever more important.

I've added an iPod, an iPad, a Windows 7 box, and a Windows Vista box to
the list of machines I can test ECN with.

> Any form of fair queuing is useful for ethernet, but actually knowing the
> link rate and port on the switch per dest macaddr would help in load
> balancing streams.
> Fair queueing is very bad on wireless when packet aggregation is used.

I currently find the mismatch between wired and wireless to be so
extreme that I'm no longer bridging the two wherever possible, thus
reducing the effects of multicast.

Naturally this introduces problems of its own. As one example, Windows
needs a WINS proxy supplied in order to 'see' machines on the other
side of a router. mDNS seems to be another issue.

OpenWrt seems to generally come with PIMv1 and PIMv2 disabled, and I
don't know to what extent these protocols are actually used.

IGMPproxy seems to work, but there are probably many more gotchas with
short haul multicast left to be exposed.

Another problem is that the wndr3700 comes without a distinct mac for
one ethernet device, and either the switch is not assigned one or my
code for finding it is wrong.

Still, I'm loving not having wired and wireless bridged:
a busy wired link (lots of arp requests, for example) seemingly does
terrible things to a wireless one when bridged.

> PFIFO_FAST is tied to TOS bits, not diffserv bits.
> RED is, well, RED.
> GRED is far less opaque than RED, as noted earlier.

> 802.11e does its prioritization at the vlan layer, not at the TOS or
> diffserv bits. Getting from tos or diffserv to mq* seems painful but I
> haven't looked into it too hard.

Here's a really obscure bit of info gleaned from nbd (and I have not
tried this yet, but it seems really important to do so)

In the mac80211 code....

If on entry you set a packet's skb->priority field to an 802.1d
priority value (0..7) plus 256, the mac80211 layer interprets that
and wedges the result into the 4 bands available in 802.11e.
(Regrettably it's not linear: both 0 and 3 = BE.)

Those are BK (background), BE, Video, and Voice.

(without classification, everything ends up in BE)
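Since I haven't tried this yet, take the following as an untested sketch of the arithmetic. The 802.1d-to-band table is my understanding of what mac80211 does (check it against your kernel), the function names are made up, and I'm assuming the iptables CLASSIFY target, whose --set-class MAJOR:MINOR takes hex and sets skb->priority to (MAJOR << 16) | MINOR. The rule is only printed, not applied.

```shell
#!/bin/sh
# Map an 802.1d priority (0..7) to the 802.11e band it should land in.
ac_for_8021d() {
    case "$1" in
        1|2) echo BK ;;   # background
        0|3) echo BE ;;   # best effort (the non-linear part)
        4|5) echo VI ;;   # video
        6|7) echo VO ;;   # voice
        *)   echo "invalid 802.1d priority: $1" >&2; return 1 ;;
    esac
}

# skb->priority = 256 + 802.1d priority, written as a CLASSIFY class (hex).
classify_class_for() {
    printf '0:%x\n' "$(( 256 + $1 ))"
}

# Print an iptables rule: $1 = chain, $2 = match expression, $3 = priority.
classify_rule() {
    echo "iptables -t mangle -A $1 $2 -j CLASSIFY --set-class $(classify_class_for "$3")"
}

# Example: NTP into the voice band (802.1d priority 6 -> skb->priority 262).
ac_for_8021d 6
classify_rule POSTROUTING "-p udp --dport 123" 6
```

So priority 6 becomes class 0:106, and plain unclassified traffic would be 0:100 if you wanted to set it explicitly.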

The interesting thing about the (rather underused) video and voice
categories is that they actually work at the
physical layer in separate bands across the Cwin.

So, in addition to voice, I'd like to try and wedge all the
non-multicast, low volume 'mice' packets like NTP, ND, etc into the
Voice category, treating 'voice' more like a control plane, than just
voice. This will probably tickle more bugs, but the idea is kind of

I'll write more about this later.

It's really not clear to me how to properly use 'BK' vs 'BE' given how
it 'sounds on the air'. It seems like a lose from a game theory
perspective, where repurposing voice and video categories seems more
like a win.

> MQ and MQPrio are horribly underdocumented. I still don't 'get' how to use
> them
> properly (I'm more focused on writing a good classifier at the moment)

I still have no idea what MQ and MQprio are really good for. Perhaps the above?

> iptables seems to think ecn can only be looked at in TCP streams, where (for
> example),
> ecn bits can be copied to the outer header of a udp vpn stream, and marked
> when needed.

This was incorrect. There is a specific --ecn-ip-ect (0x0..0x3) match that
works against the IP header, independent of TCP.
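As a sketch of how that match could be used, the helper below prints one counting rule (no target, so it only bumps counters) per ECN codepoint for a given chain. The helper name is made up and the 'Wireless' chain is the one from my earlier rules; nothing is applied until you feed the output to sh as root.

```shell
#!/bin/sh
# Print one iptables counting rule per IP-level ECN codepoint.
# Codepoints per RFC 3168: 0 = Not-ECT, 1 = ECT(1), 2 = ECT(0), 3 = CE.
ecn_count_rules() {
    chain=$1
    for pair in "0:Not-ECT" "1:ECT(1)" "2:ECT(0)" "3:CE"; do
        num=${pair%%:*}     # part before the first colon
        label=${pair#*:}    # part after the first colon
        echo "iptables -t mangle -A $chain -m ecn --ecn-ip-ect $num -m comment --comment '$label'"
    done
}

ecn_count_rules Wireless
```

Once loaded, iptables -t mangle -L Wireless -v -n shows how many packets of each kind went by, UDP and VPN traffic included.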

> ip6tables has no support for looking at ecn except through a u32 match.

Patrick McHardy added ipv6 support for iptables matches a few minutes
ago, and after review I figure they will end up in net-next...

> iptables -t mangle -A Wireless -p tcp -m tcp --tcp-flags ALL SYN,ACK -m ecn
> --ecn-tcp-ece -m recent --name ecn_enabled --set -m comment --comment 'ECN
> enabled streams'
> iptables -t mangle -A Wireless -p tcp -m tcp --tcp-flags ALL SYN,ACK -m ecn
> ! --ecn-tcp-ece -m recent --name ecn_disabled --set -m comment --comment
> 'ECN disabled streams'
> iptables -t mangle -F POSTROUTING
> iptables -t mangle -A POSTROUTING -j Wireless
> You can see what ips managed to do ECN or not via
> cat /proc/net/xt_recent/ecn_*

The patches just submitted to the netfilter list also fix an inversion
match problem.

> But that's just a distraction from trying to converge on a
> decent set of solutions for AQM. I AM happy to report that after getting
> buffer sizes down (via ethtool, a switch patch, txqueuelen) I am finally
> able to reliably see sub 10ms latencies on the wndr3700... but I wake up
> these days, feeling doomed.

I feel less doomed now. Thanks everybody!

Sometimes crying out in frustration (and thoroughly documenting your
problems) really, really works.

My thanks to everybody who jumped in to help, and who helped find the
right places to get help, etc.

Amusingly, while I was testing a new build of cerowrt, my link to
comcast went down and I spent 20 minutes blaming my stuff before
figuring out the real cause of the

Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
