From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <andrewmcgr@gmail.com>
Received: from mail-pb0-f43.google.com (mail-pb0-f43.google.com
	[209.85.160.43]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
	(Client CN "smtp.gmail.com",
	Issuer "Google Internet Authority" (verified OK))
	by huchra.bufferbloat.net (Postfix) with ESMTPS id 08DF42003F4;
	Tue, 27 Nov 2012 15:15:45 -0800 (PST)
Received: by mail-pb0-f43.google.com with SMTP id wz17so11267167pbc.16
	for <multiple recipients>; Tue, 27 Nov 2012 15:15:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=content-type:mime-version:subject:from:in-reply-to:date:cc
	:content-transfer-encoding:message-id:references:to:x-mailer;
	bh=hC1+qTr+RGKRiMyeRYM7IfkE6fenCk0DyggZJOOuulM=;
	b=m88mtEGIeZahow8E897VJE6cWcsOfQTk/0742dfaNr+ACMQhFCDc/tt8nbMLPQa2eD
	CsKNC7RSvRSeTIAs7cgw9+4jll1ziMsUfNbsweFuZAOFl3OSw7tKDma+oSbKl1NNKw+M
	b82v2FBzc+YYWt0IDDmlXFN6Lwy2ayiqQkiXB2qQehMUzequrr4KbHmnyKYaW77QrDvz
	ltkNqcbVrlRBIRtq5DgUdIMA8iOgRVAp3yjK8Qp7bVoXvfVkq6397IBHuL9pM3bTlMgH
	mHz3zYcNroIPL2SmbMhb5qJjOAxCGQv8DT+UTHU6KkwfbT5CAdLCbaA7BJy63rX/vmui
	cz7Q==
Received: by 10.68.197.68 with SMTP id is4mr19301310pbc.30.1354058145291;
	Tue, 27 Nov 2012 15:15:45 -0800 (PST)
Received: from ?IPv6:2406:e000:316:24:ccd8:eb25:f984:a130?
	([2406:e000:316:24:ccd8:eb25:f984:a130])
	by mx.google.com with ESMTPS id mn5sm11328301pbc.12.2012.11.27.15.15.41
	(version=TLSv1/SSLv3 cipher=OTHER);
	Tue, 27 Nov 2012 15:15:44 -0800 (PST)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
From: Andrew McGregor <andrewmcgr@gmail.com>
In-Reply-To: <20121127225406.GN2474@linux.vnet.ibm.com>
Date: Wed, 28 Nov 2012 12:15:35 +1300
Content-Transfer-Encoding: quoted-printable
Message-Id: <3E331029-4BC7-4935-8727-286A2EF8A0D6@gmail.com>
References: <CAA93jw5yFvrOyXu2s2DY3oK_0v3OaNfnL+1zTteJodfxtAAzcQ@mail.gmail.com>
	<CAA93jw5DcnWbE9Zb-JCeimT+YckUcM7AA3iiwXJx95Np2F8vmw@mail.gmail.com>
	<20121123221842.GD2829@linux.vnet.ibm.com>
	<CAGhGL2BoyZ+p+sD5kCq3n-8eQUkMa3gRj77m1x8ga262uY513g@mail.gmail.com>
	<alpine.DEB.2.02.1211271425050.16794@nftneq.ynat.uz>
	<20121127225406.GN2474@linux.vnet.ibm.com>
To: paulmck@linux.vnet.ibm.com
X-Mailer: Apple Mail (2.1499)
Cc: Paolo Valente <paolo.valente@unimore.it>,
	=?iso-8859-1?Q?Toke_H=F8iland-J=F8rgensen?= <toke@toke.dk>,
	"codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>,
	"cerowrt-devel@lists.bufferbloat.net"
	<cerowrt-devel@lists.bufferbloat.net>, bloat <bloat@lists.bufferbloat.net>,
	John Crispin <blogic@openwrt.org>
Subject: Re: [Bloat] [Codel] [Cerowrt-devel] FQ_Codel lwn draft article
	review
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
	<mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
	<mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 27 Nov 2012 23:15:46 -0000


On 28/11/2012, at 11:54 AM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Nov 27, 2012 at 02:31:53PM -0800, David Lang wrote:
>> On Tue, 27 Nov 2012, Jim Gettys wrote:
>>
>>> 2) "fairness" is not necessarily what we ultimately want at all; you'd
>>> really like to penalize those who induce congestion the most.  But we don't
>>> currently have a solution (though Bob Briscoe at BT thinks he does, and is
>>> seeing if he can get it out from under a BT patent), so the current
>>> fq_codel round robins ultimately until/unless we can do something like
>>> Bob's idea.  This is a local information only subset of the ideas he's been
>>> working on in the congestion exposure (conex) group at the IETF.
>>
>> Even more than this, we _know_ that we don't want to be fair in
>> terms of the raw packet priority.
>>
>> For example, we know that we want to prioritize DNS traffic over TCP
>> streams (due to the fact that the TCP traffic usually can't even
>> start until DNS resolution finishes)
>>
>> We strongly suspect that we want to prioritize short-lived
>> connections over long lived connections. We don't know a good way to
>> do this, but one good starting point would be to prioritize syn
>> packets so that the initialization of the connection happens as fast
>> as possible.
>>
>> Ideally we'd probably like to prioritize the first couple of packets
>> of a connection so that very short lived connections finish quickly

fq_codel does all of this, although not explicitly, so it is hard to
see how it happens.

>> it may make sense to prioritize fin packets so that connection
>> teardown (and the resulting release of resources and connection
>> tracking) happens as fast as possible
>>
>> all of these are horribly unfair when you are looking at the raw
>> packet flow, but they significantly help the user's perceived
>> response time without making much difference on the large download
>> cases.
>
> In all cases, to Jim's point, as long as we avoid starvation.  And there
> will likely be more corner cases that show up under extreme overload.
>
> 							Thanx, Paul
>

So, fq_codel exhibits a new kind of fairness: it is jitter fair. In
other words, each flow gets the same bound on how much jitter it can
induce in the whole ensemble of flows. A flow that exceeds that bound
gets deprioritised. This achieves thin-flow and DNS prioritisation,
while allowing TCP flows to build more buffer if required. The per-flow
CoDel queues then allow short flows to use a reasonably large buffer,
while draining standing buffers for long TCP flows.
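
As a rough illustration of the deprioritisation described above, here
is a simplified sketch of deficit round robin with separate lists for
new and old flows. It leaves out the CoDel dropping logic and the flow
hashing entirely, and the class names and the QUANTUM value are
assumptions made for the example, not taken from the kernel source.

```python
from collections import deque

QUANTUM = 1514  # bytes per round; assumed value for illustration


class Flow:
    def __init__(self, name):
        self.name = name
        self.packets = deque()  # queued packet sizes, in bytes
        self.deficit = 0        # byte credit left in the current round
        self.listed = False     # True while on the new or old list


class Scheduler:
    """Hypothetical two-list DRR scheduler (no CoDel, no hashing)."""

    def __init__(self):
        self.flows = {}
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, name, size):
        flow = self.flows.setdefault(name, Flow(name))
        flow.packets.append(size)
        if not flow.listed:
            # A flow whose queue had drained starts again as a "new" flow.
            flow.deficit = QUANTUM
            flow.listed = True
            self.new_flows.append(flow)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            from_new = bool(self.new_flows)
            active = self.new_flows if from_new else self.old_flows
            flow = active[0]
            if flow.deficit <= 0:
                # Allowance used up: refill and demote to the old list.
                flow.deficit += QUANTUM
                active.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.packets:
                active.popleft()
                if from_new and self.old_flows:
                    # Give old flows a turn before forgetting this flow.
                    self.old_flows.append(flow)
                else:
                    flow.listed = False
                continue
            size = flow.packets.popleft()
            flow.deficit -= size
            return flow.name, size
        return None


sched = Scheduler()
for _ in range(4):
    sched.enqueue("bulk", 1500)
served = [sched.dequeue(), sched.dequeue()]  # bulk uses up its quantum
sched.enqueue("dns", 80)                     # a thin flow arrives
served += [sched.dequeue() for _ in range(3)]
print([name for name, _ in served])
# → ['bulk', 'bulk', 'dns', 'bulk', 'bulk']
```

In this toy run the single small "dns" packet is served ahead of the
remaining bulk packets, because the bulk flow has already spent its
allowance and been moved to the old list.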

The really interesting part of the jitter-fair behaviour is that
jitter-sensitive traffic is protected as much as it can be, provided its
own sending rate control does something sensible.  Good news for
interactive video, in other words.

The actual jitter bound is the transmission time of
max(mtu, quantum) * n_thin_flows bytes, where a thin flow is one that
has not exceeded its own jitter allowance since the last time its queue
drained. While there might momentarily be a fairly large number of thin
flows, in practice a home network link normally has only a few of them
at any one moment, so the jitter experienced is pretty good.
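
To put a number on that bound, here is a small worked example. The
function name, the 1 Mbit/s link rate, the 1514-byte quantum and the
count of three thin flows are all assumptions picked for illustration.

```python
def jitter_bound_seconds(mtu_bytes, quantum_bytes, n_thin_flows, link_rate_bps):
    """Transmission time of max(mtu, quantum) * n_thin_flows bytes."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    burst_bytes = max(mtu_bytes, quantum_bytes) * n_thin_flows
    return burst_bytes * 8 / link_rate_bps


# Hypothetical home uplink: 1 Mbit/s, 1500-byte MTU, 1514-byte quantum,
# three thin flows active at the same moment.
bound = jitter_bound_seconds(1500, 1514, 3, 1_000_000)
print(f"{bound * 1000:.1f} ms")  # → 36.3 ms
```

The bound scales linearly with the number of thin flows and inversely
with the link rate, so the same three flows on a 10 Mbit/s link would
see a tenth of that figure.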

Andrew