From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Andrew McGregor
Cc: Paolo Valente, Toke Høiland-Jørgensen, codel@lists.bufferbloat.net,
 cerowrt-devel@lists.bufferbloat.net, bloat, John Crispin
Subject: Re: [Bloat] [Codel] [Cerowrt-devel] FQ_Codel lwn draft article review
Date: Tue, 27 Nov 2012 16:51:15 -0800
Message-ID: <20121128005115.GT2474@linux.vnet.ibm.com>
In-Reply-To: <3E331029-4BC7-4935-8727-286A2EF8A0D6@gmail.com>
References: <20121123221842.GD2829@linux.vnet.ibm.com>
 <20121127225406.GN2474@linux.vnet.ibm.com>
 <3E331029-4BC7-4935-8727-286A2EF8A0D6@gmail.com>

On Wed, Nov 28, 2012 at 12:15:35PM +1300, Andrew McGregor wrote:
> 
> On 28/11/2012, at 11:54 AM, "Paul E. McKenney" wrote:
> 
> > On Tue, Nov 27, 2012 at 02:31:53PM -0800, David Lang wrote:
> >> On Tue, 27 Nov 2012, Jim Gettys wrote:
> >> 
> >>> 2) "fairness" is not necessarily what we ultimately want at all;
> >>> you'd really like to penalize those who induce congestion the most.
> >>> But we don't currently have a solution (though Bob Briscoe at BT
> >>> thinks he does, and is seeing if he can get it out from under a BT
> >>> patent), so the current fq_codel simply round-robins until/unless
> >>> we can do something like Bob's idea.  This is a local-information-only
> >>> subset of the ideas he's been working on in the congestion exposure
> >>> (conex) group at the IETF.
> >> 
> >> Even more than this, we _know_ that we don't want to be fair in
> >> terms of the raw packet priority.
> >> 
> >> For example, we know that we want to prioritize DNS traffic over TCP
> >> streams (due to the fact that the TCP traffic usually can't even
> >> start until DNS resolution finishes).
> >> 
> >> We strongly suspect that we want to prioritize short-lived
> >> connections over long-lived connections.  We don't know a good way
> >> to do this, but one good starting point would be to prioritize SYN
> >> packets so that the initialization of the connection happens as fast
> >> as possible.
> >> 
> >> Ideally we'd probably like to prioritize the first couple of packets
> >> of a connection so that very short-lived connections finish quickly.
> 
> fq_codel does all of this, although it isn't explicit about it, so it
> is hard to see how it happens.
> 
> >> It may make sense to prioritize FIN packets so that connection
> >> teardown (and the resulting release of resources and connection
> >> tracking) happens as fast as possible.
> >> 
> >> All of these are horribly unfair when you are looking at the raw
> >> packet flow, but they significantly help the user's perceived
> >> response time without making much difference in the large download
> >> cases.
> > 
> > In all cases, to Jim's point, as long as we avoid starvation.  And
> > there will likely be more corner cases that show up under extreme
> > overload.
> > 
> > 							Thanx, Paul
> 
> So, fq_codel exhibits a new kind of fairness: it is jitter-fair.  In
> other words, each flow gets the same bound on how much jitter it can
> induce in the whole ensemble of flows; exceed that bound, and the flow
> gets deprioritised.  This achieves thin-flow and DNS prioritisation,
> while allowing TCP flows to build more buffer if required.  The
> per-flow CoDel queues then allow short flows to use a reasonably large
> buffer, while draining standing buffers for long TCP flows.
> 
> The really interesting part of the jitter-fair behaviour is that
> jitter-sensitive traffic is protected as much as it can be, provided
> its own sending-rate control does something sensible.  Good news for
> interactive video, in other words.
> 
> The actual jitter bound is the transmission time of
> max(mtu, quantum) * n_thin_flows bytes, where a thin flow is one that
> has not exceeded its own jitter allowance since the last time its
> queue drained.  While it is possible that there might instantaneously
> be a fairly large number of thin flows, in practice on a home network
> link there are normally only a very few of these at any one moment,
> and so the jitter experienced is pretty good.

I will have to think about this, but at first glance I kinda like the
idea of describing FQ-CoDel as jitter-fair.  ;-)

							Thanx, Paul
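
To make the "fq_codel does all of this" point concrete, here is a
drastically simplified sketch of the two-list DRR scheduler at the
heart of fq_codel.  This is not the kernel code: hashing, the CoDel
dropper, and byte limits are all omitted, and every name and number
below is illustrative only.

/*
 * toy_fq.c: simplified two-list DRR in the style of fq_codel.
 * A flow that was idle when a packet arrives sits on a "new" list
 * that is always served before the backlogged "old" flows, which is
 * what implicitly favours DNS replies, SYNs, and the first packets
 * of short connections.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define QUANTUM 1514			/* roughly one MTU of byte credit */
#define MAXQ	64

struct flow {
	const char *name;
	int pkt_len[MAXQ];		/* FIFO of queued packet sizes */
	int head, tail;
	int deficit;			/* DRR byte credit */
	bool queued;			/* already on one of the lists? */
};

static struct flow flows[] = {
	{ .name = "bulk TCP" },
	{ .name = "DNS reply" },
};

/* FIFO lists of flow pointers; new_list is always served first. */
static struct flow *new_list[8], *old_list[8];
static int n_new, n_old;

static void push(struct flow **list, int *n, struct flow *f)
{
	list[(*n)++] = f;
}

static struct flow *pop(struct flow **list, int *n)
{
	struct flow *f = list[0];

	memmove(&list[0], &list[1], --(*n) * sizeof(*list));
	return f;
}

static void enqueue(struct flow *f, int len)
{
	f->pkt_len[f->tail++ % MAXQ] = len;
	if (!f->queued) {		/* flow just became active: "new" */
		f->queued = true;
		f->deficit = QUANTUM;
		push(new_list, &n_new, f);
	}
}

static int dequeue(const char **who)
{
	for (;;) {
		bool from_new = n_new > 0;
		struct flow **list = from_new ? new_list : old_list;
		int *n = from_new ? &n_new : &n_old;

		if (*n == 0)
			return -1;	/* everything is drained */
		struct flow *f = list[0];
		if (f->deficit <= 0) {	/* credit spent: demote behind old flows */
			f->deficit += QUANTUM;
			push(old_list, &n_old, pop(list, n));
			continue;
		}
		if (f->head == f->tail) {
			/* Queue drained; drop the flow from its list.  (The
			 * kernel moves it to the old list instead; removing
			 * it keeps the toy short.) */
			pop(list, n);
			f->queued = false;
			continue;
		}
		*who = f->name;
		int len = f->pkt_len[f->head++ % MAXQ];
		f->deficit -= len;
		return len;
	}
}

int main(void)
{
	const char *who = NULL;

	/* A bulk flow with a standing queue of full-sized packets... */
	for (int i = 0; i < 5; i++)
		enqueue(&flows[0], 1514);
	/* ...and then one small DNS reply arrives behind it. */
	enqueue(&flows[1], 100);

	for (int len; (len = dequeue(&who)) >= 0; )
		printf("sent %4d bytes of %s\n", len, who);
	return 0;
}

In this toy run, the 100-byte DNS reply that arrives behind five
queued bulk packets goes out after at most one full-sized bulk packet:
a newly active flow is always served before the backlogged ones, and a
flow is only demoted once it has spent its quantum of credit.  That
one-quantum wait is also where the max(mtu, quantum) term in Andrew's
jitter bound comes from.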
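
Putting rough numbers on that bound (the link rate and thin-flow count
here are invented examples, not measurements):

/* Back-of-the-envelope form of the bound stated above: the worst-case
 * scheduling jitter is the time to transmit max(mtu, quantum) bytes
 * for each currently thin flow. */
#include <stdio.h>

static double jitter_bound_ms(int mtu, int quantum, int n_thin_flows,
			      double link_bps)
{
	int bytes = (mtu > quantum ? mtu : quantum) * n_thin_flows;

	return bytes * 8 / link_bps * 1000.0;
}

int main(void)
{
	/* Four thin flows on a 10 Mbit/s uplink with a 1514-byte MTU: */
	printf("%.1f ms\n", jitter_bound_ms(1514, 1514, 4, 10e6)); /* ~4.8 */
	return 0;
}

At 1 Mbit/s the same four thin flows give about 48 ms, which is why it
matters that a home link normally carries only a very few thin flows
at any one moment.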