From mboxrd@z Thu Jan  1 00:00:00 1970
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 9.1 \(3096.5\))
From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <56408DAB.6080106@darbyshire-bryant.me.uk>
Date: Mon, 9 Nov 2015 17:07:40 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <0B75A2D1-80FC-4499-8DCD-FC2C4E150448@gmail.com>
References: <5638A4F4.2010701@darbyshire-bryant.me.uk> <87si4nntcg.fsf@toke.dk>
	<0196FDEC-50A7-4ECA-9973-1FD23FF2945A@gmx.de>
	<877flznq3f.fsf@toke.dk>
	<848953E5-8571-4B81-B67F-D4A7BA4A1F96@darbyshire-bryant.me.uk>
	<8737wnnpco.fsf@toke.dk> <563B86D4.6030704@darbyshire-bryant.me.uk>
	<563F21AE.5040506@darbyshire-bryant.me.uk>
	<F04244A3-7A74-4AC8-8D96-9A2B0AF0B412@gmail.com>
	<56408DAB.6080106@darbyshire-bryant.me.uk>
To: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
X-Mailer: Apple Mail (2.3096.5)
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] More on 'target' corner cases - rate/target/interval
	confusion?
X-BeenThere: cake@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Cake - FQ_codel the next generation <cake.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/cake>,
	<mailto:cake-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/cake>
List-Post: <mailto:cake@lists.bufferbloat.net>
List-Help: <mailto:cake-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/cake>,
	<mailto:cake-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Mon, 09 Nov 2015 15:08:09 -0000


> On 9 Nov, 2015, at 14:12, Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
>
> In the presence of a full link, that link having competing 'full'
> flows in all 'tins', then how should cake split the link in terms of
> bandwidth?

That's a good question, and one that I think becomes more critical at
low bandwidths.  I've tended towards generous allocations for that
reason, so as to avoid causing trouble to low-latency applications.

The main requirement I keep in mind is that an application should not be
able to guarantee itself an excessive bandwidth share simply by
selecting a particular DSCP.  At the same time, there are applications
for which a relatively large, consistent bandwidth is a requirement for
satisfactory performance (consider streaming video), such that
best-effort traffic should defer to them.  These are conflicting
requirements, so a compromise has to be established somehow.

The current thresholds are at 100%, 15/16, ¾ and ¼.  Under saturated
conditions, this gives throughputs of 1/16, 3/16, ½ and ¼.  The "video"
class (Tin 2) can usurp ¾ of the bandwidth when competing against any
mixture of best-effort and bulk traffic.  This, admittedly, might turn
out to be too much, so I could consider setting Tin 2's threshold at ½
instead of ¾.

And yes, I have long noticed that Flent's standard RRUL test doesn't
use Tin 2 at all.

> With each increasing priority of tini[0-3], we decrease the 'expected'
> bandwidth (good) but also as a result increase the target and
> interval.

Yes, there are a number of counter-intuitive things happening here.

Most of Cake's latency-reducing power comes from the liberal
application of flow isolation, *not* from AQM itself.  Diffserv
prioritisation plays a lesser role, and mainly has to do with restoring
the desired allocations of bandwidth, replacing the reliance on
measuring queue fill level that some protocols presently use to stay out
from underfoot.  Both these mechanisms primarily control the effects
that one flow can have on another, and say little about the latency that
a flow causes to itself.

This latter is the domain of AQM, specifically Codel, which is what
we're talking about when we mention "interval" and "target".  As I
mentioned elsewhere recently, Codel is designed specifically to give
congestion signals to TCP-like flows, and deals rather less efficiently
with unresponsive and anti-responsive flows, which as a result tend to
spend some time bouncing off the queue's hard limit until Codel finds
the correct operating point to control them.  There are other AQMs which
are designed with unresponsive flows more in mind, but which somehow
perform less well with TCP-like flows.

A key design principle of Codel is that no packet whose sojourn time is
below target will be signalled.  However, if the sojourn time is
consistently above target, signalling begins and increases steadily in
frequency.  It is also a fundamental truth that if it takes longer than
target to transmit the previous packet, the following packet can have a
"congested" sojourn time even if there is consistently precisely one
packet in the queue (which is the ideal state).  This is why I constrain
"target" to be at least 1.5 packet times at MTU; the difference can
be substantial at low bandwidths.
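
A quick way to see how large this floor gets: the sketch below computes
the time to serialise one MTU-sized packet and applies the
1.5-packet-time minimum.  The 1500-byte MTU, the 5 ms requested target
and the function names are assumptions chosen for illustration, not
values read from Cake's code.

```python
def packet_time_ms(rate_bps, mtu_bytes=1500):
    """Time in milliseconds to transmit one MTU-sized packet."""
    return mtu_bytes * 8 * 1000 / rate_bps

def floored_target_ms(requested_ms, rate_bps, mtu_bytes=1500):
    """Requested target, raised to at least 1.5 packet times at MTU."""
    return max(requested_ms, 1.5 * packet_time_ms(rate_bps, mtu_bytes))

# Assumed example rates; 5 ms is an assumed requested target.
for rate in (100_000_000, 10_000_000, 1_000_000, 250_000):
    print(rate, packet_time_ms(rate), floored_target_ms(5, rate))
```

At 1 Mbit/s one such packet takes 12 ms to send, so the floor is 18 ms;
at 250 kbit/s it is 72 ms.  At 10 Mbit/s and above the floor sits below
5 ms and the requested value stands.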

But there are subtleties here too.

If there are multiple flows, isolated into multiple queues, then the
effective packet-to-packet time for each queue will be increased
proportionately.  Early versions of Codel refused to signal on the last
packet in the queue, to account for this.  However, if there were a
large number of occupied queues, this meant that the minimum queue fill
could be rather high, and this seemed to lead to high induced latencies
in fq_codel under heavy load.  Due to the statistical multiplexing
effect, it turned out to be sufficient to tune "target" as above for
the final output bandwidth (even though this is unknown to fq_codel) and
to remove the special status of the last remaining packet.
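
The proportional increase in the first sentence can be put into numbers
with a small sketch.  It assumes equal-sized packets and strict
round-robin service between the backlogged queues, which is a
simplification of what fq_codel actually does; the function name and the
10 Mbit/s rate are likewise only illustrative.

```python
def per_queue_gap_ms(rate_bps, active_queues, mtu_bytes=1500):
    """Time between successive departures from one queue, in ms.

    Assumption: every backlogged queue sends one MTU-sized packet per
    round, so a given queue waits for all the others before sending again.
    """
    one_packet_ms = mtu_bytes * 8 * 1000 / rate_bps
    return active_queues * one_packet_ms

# Assumed 10 Mbit/s link with 1, 4 and 16 backlogged queues.
for n in (1, 4, 16):
    print(n, per_queue_gap_ms(10_000_000, n))
```

With 16 backlogged queues the gap is roughly 19 ms, so a packet can wait
well beyond a 5 ms target even when each queue holds only one packet.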

The same logic could naively be applied to traffic in separate tins.
However, unlike queues for flow isolation, bandwidth is not shared
evenly between tins.  More subtly, traffic characteristics also differ -
low-latency traffic tends to be unresponsive to TCP-style congestion
signalling, and dropping any of it tends to reduce service quality in
some way.  Note that network-control traffic (most relevantly NTP) falls
into the "voice" category.  Since unresponsive flows aren't what
Codel is meant to deal with, the mere fact that "target" is higher
is not meaningful - and in any case this has no effect on the primary
flow-isolation mechanism.

The tin-specific tuning of target and interval was introduced when Cake
had a separate hard shaper per tin.  It was the obvious design at the
time.  Now that Cake uses soft shaping between tins (allowing any tin to
use the full link bandwidth if uncontended), it's possible that
choosing identical target and interval for all tins might suffice.
Alternatively, we might choose an even more conservative strategy.

But which - and how do we decide?

As a final point, I haven't even mentioned "rtt" as the
user-specified input to this mayhem.  That parameter must be understood
to be *separate* from both "target" and "interval", even though
Codel specifies the latter to be related to expected RTT.  Simply put,
the user tells us what the expected RTT is (on the understanding that an
order of magnitude variation either way is typical), and we calculate
"target" and "interval" to be as consistent with that estimate
as is practical, given the link bandwidth and other constraints that he
has also specified.  So there is a firm conceptual distinction between
the user's intent and the implementation details.
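
One way to picture that separation is a hypothetical derivation in
which "rtt" is the only user input and "target" and "interval" fall out
of it.  The sketch below is not Cake's actual calculation.  It assumes
Codel's conventional 5% ratio of target to interval, the
1.5-packet-time floor described earlier, and an interval stretched to
stay at least twice the target.  The last rule and all names are
illustrative assumptions.

```python
def codel_params_ms(rtt_ms, rate_bps, mtu_bytes=1500):
    """Derive (target, interval) in ms from the user's RTT estimate.

    Assumptions, none taken from Cake's source:
      * target starts at 5% of rtt (Codel's conventional ratio);
      * target is floored at 1.5 MTU-sized packet times;
      * interval is rtt, stretched to stay at least 2 * target.
    """
    mtu_time_ms = mtu_bytes * 8 * 1000 / rate_bps
    target = max(rtt_ms / 20, 1.5 * mtu_time_ms)
    interval = max(rtt_ms, 2 * target)
    return target, interval

print(codel_params_ms(100, 100_000_000))  # fast link: rtt dominates
print(codel_params_ms(100, 250_000))      # slow link: the floor dominates
```

The user's intent (a 100 ms RTT) is the same in both calls; only the
derived implementation values change with the link bandwidth.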

 - Jonathan Morton