From: Dave Taht
To: bloat
Date: Thu, 9 Jun 2011 08:01:26 -0600
Subject: [Bloat] Some updates on hacking on AQMS

I'm going to make some notes against my original posting 'Some notes on hacking on AQM's' here, because in less than a day,
much progress has been made.

On Wed, Jun 8, 2011 at 6:12 AM, Dave Taht wrote:

> So in addition to hacking on the switch, I've been poking into the
> behavior of multiple AQM systems in the kernel

...elided...

> Some notes:
>
> There are as many philosophies to AQM as there are shapers and
> classifiers.
>
> None of the Linux shaper scripts in the field handle ipv6 traffic.

I've got some fixes for this in the 3 (soon to be 4) shapers I'm playing with.

According to the netfilter guys, DRR scheduling + HFSC is the new 'hotness', and I was supplied a script that uses it. (I think it was noted in the previous thread; if it was private email, I'll repost.)

I have not (as yet) patched DRR support into the openwrt testbed.

> HTB is the most commonly used qdisc, handles its bandwidth limits by
> packet drop and doesn't do ECN. It's usually used in conjunction with
> other qdiscs, too.
>
> An explanation of how diffserv (dsmark) and GRED are supposed to play
> ball together (starting here: http://www.opalsoft.net/qos/DS-27.htm)
> is so amazingly not opaque.
>
> SFB remains promising, but until I get a ported tc for it,
> I can't play with it much.

Next week iproute2-2.6.39 may be released, and I'll slam it into openwrt, assuming it can be made to work with 2.6.37.6.

> SFQ is the second most commonly used qdisc, but doesn't balance in
> ways ESFQ could.
>
> ESFQ really looked like a winner and I'm sorry it never made the
> mainline kernel.

ESFQ features were added to SFQ 4 years ago. Few use them, but they are there. It's actually more flexible than ESFQ was. Core among them is the ability to match not against flows but against distinct IP addresses, which makes sense in this bittorrent-ed age.

> HFSC is mind-bending as to what it tries to do.

It really seems that ECN support could be added generically for all qdiscs that currently do packet drop. Creating a generic mark_or_drop function is easy.
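
As a rough illustration of the mark_or_drop idea, here is a userspace shell model. It is a sketch under stated assumptions, not kernel code: the names verdict, new_tos, marked and dropped are hypothetical, and the only thing taken as given is the RFC 3168 layout of the ECN field (the low two bits of the TOS / traffic-class byte: 0 = Not-ECT, 1 = ECT(1), 2 = ECT(0), 3 = CE).

```shell
#!/bin/sh
# Userspace model of a generic mark-or-drop decision (NOT kernel code).
# Input:  the TOS / traffic-class byte as a decimal number (0..255).
# Output: sets $verdict to "mark" or "drop"; on "mark", $new_tos holds
#         the byte with the ECN field set to CE (binary 11).
# All variable and function names here are hypothetical.

marked=0     # hypothetical per-qdisc counter: packets CE-marked
dropped=0    # hypothetical per-qdisc counter: packets dropped

mark_or_drop() {
    tos=$1
    ecn=$(( tos & 3 ))            # low two bits are the ECN field
    if [ "$ecn" -eq 0 ]; then
        # Not-ECT: the sender did not negotiate ECN, so dropping is
        # the only congestion signal available.
        verdict=drop
        new_tos=
        dropped=$(( dropped + 1 ))
    else
        # ECT(0), ECT(1) or already CE: signal congestion by setting CE.
        verdict=mark
        new_tos=$(( tos | 3 ))
        marked=$(( marked + 1 ))
    fi
}
```

Calling `mark_or_drop 2` (ECT(0), no DSCP) leaves verdict=mark and new_tos=3, while `mark_or_drop 0` leaves verdict=drop. The two counters stand in for the per-qdisc statistics that would have to be exported to userspace.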
The difficulties lie in adding 'marking' counters to every qdisc and to userspace.

The overall efficacy of ECN when more widely deployed remains an unknown, but given the heroic efforts that go into packet delivery today across 3g/4g/wireless, providing a positive congestion signaling mechanism across those links seems ever more important.

I've added an ipod, an ipad, a windows 7 box and a windows vista box to the list of machines I can test ECN with.

> Any form of fair queuing is useful for ethernet, but actually knowing the
> link rate and port on the switch per dest macaddr would help in load
> balancing streams.
>
> Fair queueing is very bad on wireless when packet aggregation is used.

I currently find the mismatch between wired and wireless to be so extreme that I'm no longer bridging the two wherever possible, thus reducing the effects of multicast.

Naturally this introduces problems of its own. As one example, windows needs a wins proxy supplied in order to 'see' machines on the other side of a router. MDNS seems to be another issue. Openwrt seems to generally come with PIMv1 and PIMv2 disabled, and I don't know to what extent these protocols are actually used. IGMPproxy seems to work, but there are probably many more gotchas with short haul multicast left to be exposed.

Another problem is that the wndr3700 comes without a distinct mac for one ethernet device, and the switch is either unassigned one or my code for finding it is wrong.

Still, I'm loving not having wired and wireless bridged; a busy wired link (lots of arp requests, for example) seemingly does terrible things to a wireless one when bridged.

> PFIFO_FAST is tied to TOS bits, not diffserv bits.
>
> RED is, well, RED.
>
> GRED is far less opaque than RED, as noted earlier.
>
> 802.11e does its prioritization at the vlan layer, not at the TOS or
> diffserv bits. Getting from tos or diffserv to mq* seems painful but I
> haven't looked into it too hard.

Here's a really obscure bit of info gleaned from nbd. (I have not tried this yet, but it seems really important to do so.)

In the mac80211 code, if on entry you mark a packet's skb->priority field with a value that matches the 802.1d priority fields (0..7) + 256, the mac80211 layer interprets that and wedges the result into the 4 bands available in 802.11e. (Regrettably it's not linear: both 0 and 3 = BE.)

Those are BK (background), BE (best effort), Video, and Voice. (Without classification, everything ends up in BE.)

The interesting thing about the (rather underused) video and voice categories is that they actually work at the physical layer in separate bands across the Cwin.

So, in addition to voice, I'd like to try and wedge all the non-multicast, low volume 'mice' packets like NTP, ND, etc. into the Voice category, treating 'voice' more like a control plane than just voice. This will probably tickle more bugs, but the idea is kind of fun. I'll write more about this later.

It's really not clear to me how to properly use 'BK' vs 'BE' given how it 'sounds on the air'. It seems like a lose from a game theory perspective, where repurposing the voice and video categories seems more like a win.

> MQ and MQPrio are horribly underdocumented. I still don't 'get' how to
> use them properly (I'm more focused on writing a good classifier at
> the moment)

I still have no idea what MQ and MQprio are really good for. Perhaps the above?

> iptables seems to think ecn can only be looked at in TCP streams,
> where (for example), ecn bits can be copied to the outer header of a
> udp vpn stream, and marked when needed.

This was incorrect. There is a specific --ecn-ip-ect (0x0..0x3) match that works against IP.

> ip6tables has no support for looking at ecn except through a u32 match.

Patrick McHardy added ipv6 support for the iptables matches a few minutes ago, and after review I figure they will end up in net-next...
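
Returning to the skb->priority trick described earlier in this message, a small shell model of the mapping may make the non-linearity easier to see. This is a sketch, not the mac80211 code: the function name ac_for_skb_priority is hypothetical, and the full table (1,2 = BK; 0,3 = BE; 4,5 = Video; 6,7 = Voice) is the conventional 802.1d-to-access-category assignment, assumed here. Only "0 and 3 = BE" is stated above. What happens to values outside 256..263 is deliberately not modelled.

```shell
#!/bin/sh
# Model of the skb->priority -> 802.11e access category mapping.
# Assumption: priorities 256..263 encode an 802.1d value 0..7, which is
# assigned to a band using the conventional 802.1d-to-AC table.
# ac_for_skb_priority is a hypothetical name for this sketch.

ac_for_skb_priority() {
    prio=$1
    # Anything outside 256..263 is not an explicit 802.1d request;
    # how the stack classifies it is outside the scope of this model.
    if [ "$prio" -lt 256 ] || [ "$prio" -gt 263 ]; then
        echo "unclassified"
        return 0
    fi
    case $(( prio - 256 )) in
        1|2) echo "BK" ;;   # background
        0|3) echo "BE" ;;   # best effort (note: 0 and 3 share a band)
        4|5) echo "VI" ;;   # video
        6|7) echo "VO" ;;   # voice
    esac
}

# Print the whole table when run directly.
for p in 256 257 258 259 260 261 262 263; do
    echo "$p -> $(ac_for_skb_priority "$p")"
done
```

Under these assumptions, steering 'mice' packets such as NTP into the Voice band would mean marking them with priority 262 or 263.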

> iptables -t mangle -A Wireless -p tcp -m tcp --tcp-flags ALL SYN,ACK \
>     -m ecn --ecn-tcp-ece -m recent --name ecn_enabled --set \
>     -m comment --comment 'ECN enabled streams'
> iptables -t mangle -A Wireless -p tcp -m tcp --tcp-flags ALL SYN,ACK \
>     -m ecn ! --ecn-tcp-ece -m recent --name ecn_disabled --set \
>     -m comment --comment 'ECN disabled streams'
>
> iptables -t mangle -F POSTROUTING
> iptables -t mangle -A POSTROUTING -j Wireless
>
> You can see what ips managed to do ECN or not via
>
> cat /proc/net/xt_recent/ecn_*

The patches just submitted to the netfilter list also fix an inversion match problem.

> But that's just a distraction from trying to converge on a
> decent set of solutions for AQM. I AM happy to report that after getting
> buffer sizes down (via ethtool, a switch patch, txqueuelen) I am finally
> able to reliably see sub 10ms latencies on the wndr3700... but I wake up
> these days, feeling doomed.

I feel less doomed now. Thanks everybody!

Sometimes crying out in frustration (and thoroughly documenting your problems) really, really works. My thanks to everybody that jumped in to help, and to find the right places to find the help, etc.

Amusingly, while I was testing a new build of cerowrt, my link to comcast went down and I spent 20 minutes blaming my stuff before figuring out the real cause of the problem.

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com