From: Dave Taht
To: cerowrt-devel@lists.bufferbloat.net, bloat-devel
Date: Wed, 12 Dec 2012 04:08:11 -0500
Subject: [Cerowrt-devel] More sanely debloating wifi aggregates

It is my intent to start working harder on wifi issues next year.

Regardless of how much I care about fixing APs, the biggest user of Linux-based wifi is Android, so it makes sense to be hacking on that rather than on the crappy iwl chip in most of my machines. (I have ath9k cards, though.)

I'm getting an Android tablet for Christmas. (Any recommendations? Obviously Cyanogenmod is going to have to be supported... What wifi chipsets are in use?)

Anyway, besides smashing all the extra wifi tx buffering in the stack, instituting sane drop policies and something fq_codel-like there, and paying attention to station IDs, classification, and a few other things that matter on an AP but not on a client, I came up with an idea for dealing with rx de-aggregation which seems simpler to implement initially and will lead towards the tx goal eventually.

Basically, it's adding SFQ to the de-aggregation step in the rx path.

What happens currently is that an entire rx aggregate (up to 42 packets) is decoded, then dequeued in strict FIFO order, and then shipped "elsewhere", usually at a speed far higher than the arrival rate of the wifi link. No queue forms at the egress link, as Linux is a strict pull-through stack, so you can't do any useful work on the egress side. However, that pesky aggregate exists...

To explain the possible advantage of SFQ'ing the aggregate before it is delivered elsewhere, I'll use an example: 1 big flow, 1 small flow, 1 ping and 1 DNS packet arrive via an aggregate, in that order.

The 30 packets of the big flow are dequeued first and shipped to the local TCP server, which responds immediately with a ton of large packets, scaling up according to slow start or whatever phase of the TCP algorithm it's in. The small flow gets 10 packets out and a ton of packets back. The ping then arrives, and then the DNS packet.

Now the behavior on the receiving side is that a fairly large queue builds up long before the small flow, ping and DNS packet arrive, so they are starved for their share of the link, and multiple aggregates have to be scheduled and shipped long before the ping arrives. And we're already familiar with the overbuffering in the tx path.

An alternative is SFQ dequeuing the aggregate. Now 1 packet each from the 4 flows departs in round-robin order. The ping, small flow, big flow, and DNS packet (with a little lookup latency, but hopefully pretty fast) all manage to get packets out and back, so they can be scheduled in the next string of tx aggregates.

(Thanks to the rrul test and a zillion benchmarks of wifi under various scenarios, I have a good mental picture of what's happening today in aggregates, and bidirectional throughput is generally quite compromised by their being dequeued in FIFO order.)

Temporarily "sorting" packets in the de-aggregation step will certainly incur a CPU cost and a bit of delay, but I think the above behavior will smooth out client application behavior somewhat, and it will certainly help on APs.

Thoughts?

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
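
For anyone who wants to poke at the example, here is a rough userspace sketch of the two dequeue orders, in Python rather than anything resembling driver code. Everything in it is an assumption made for illustration: the function names (make_aggregate, fifo_dequeue, rr_dequeue, first_position) are made up, the flow label stands in for the hashed flow tuple that a real SFQ would use, and it ignores the hash perturbation and bucket limits of the actual qdisc. It only shows where the ping and DNS packets land in each ordering.

```python
from collections import OrderedDict, deque


def make_aggregate():
    """Build the example aggregate: 30 big-flow packets, 10 small-flow
    packets, 1 ping and 1 DNS packet, in arrival order (42 total)."""
    agg = [("big", seq) for seq in range(30)]
    agg += [("small", seq) for seq in range(10)]
    agg += [("ping", 0), ("dns", 0)]
    return agg


def fifo_dequeue(aggregate):
    """Current behavior: hand packets up in strict arrival order."""
    return list(aggregate)


def rr_dequeue(aggregate):
    """Round-robin across flows, one packet per flow per round.

    Simplification: the flow label is used directly as the queue key,
    where a real SFQ would hash the flow tuple into a bucket.
    """
    queues = OrderedDict()
    for flow, seq in aggregate:
        queues.setdefault(flow, deque()).append((flow, seq))

    out = []
    while queues:
        # Iterate over a snapshot so emptied queues can be removed safely.
        for flow in list(queues):
            out.append(queues[flow].popleft())
            if not queues[flow]:
                del queues[flow]
    return out


def first_position(order, flow):
    """Index of the first packet belonging to `flow` in a dequeue order."""
    return next(i for i, (f, _) in enumerate(order) if f == flow)


if __name__ == "__main__":
    agg = make_aggregate()
    for name, order in (("fifo", fifo_dequeue(agg)), ("rr", rr_dequeue(agg))):
        print(name, {f: first_position(order, f)
                     for f in ("big", "small", "ping", "dns")})
```

In this toy model the ping and DNS packets sit behind 40 other packets in FIFO order, while with round-robin they go out in the first round. Packets within each flow keep their original order, so the sketch introduces no reordering inside a flow.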