From: Jonathan Morton
To: Kevin Darbyshire-Bryant
Cc: cake@lists.bufferbloat.net
Date: Mon, 9 Nov 2015 17:07:40 +0200
Subject: Re: [Cake] More on 'target' corner cases - rate/target/interval confusion?

> On 9 Nov, 2015, at 14:12, Kevin Darbyshire-Bryant wrote:
>
> In the presence of a full link, that link having competing 'full' flows in all 'tins', then how should cake split the link in terms of bandwidth?

That’s a good question, and one that I think becomes more critical at low bandwidths. I’ve tended towards generous allocations for that reason, so as to avoid causing trouble to low-latency applications.

The main requirement I keep in mind is that an application should not be able to guarantee itself an excessive bandwidth share simply by selecting a particular DSCP.
At the same time, there are applications for which a relatively large, consistent bandwidth is a requirement for satisfactory performance (consider streaming video), such that best-effort traffic should defer to them. These are conflicting requirements, so a compromise has to be established somehow.

The current thresholds are at 100%, 15/16, ¾ and ¼. Under saturated conditions, this gives throughputs of 1/16, 3/16, ½ and ¼. The “video” class (Tin 2) can usurp ¾ of the bandwidth when competing against any mixture of best-effort and bulk traffic. This, admittedly, might turn out to be too much, so I could consider setting Tin 2’s threshold at ½ instead of ¾.

And yes, I have long noticed that Flent’s standard RRUL test doesn’t use Tin 2 at all.

> With each increasing priority of tini[0-3], we decrease the 'expected' bandwidth (good) but also as a result increase the target and interval.

Yes, there are a number of counter-intuitive things happening here.

Most of Cake’s latency-reducing power comes from the liberal application of flow isolation, *not* from AQM itself. Diffserv prioritisation plays a lesser role, and mainly has to do with restoring the desired allocations of bandwidth, replacing the reliance on measured queue fill level that some protocols presently use to stay out from underfoot. Both these mechanisms primarily control the effects that one flow can have on another, and say little about the latency that a flow causes to itself.

This latter is the domain of AQM, specifically Codel, which is what we’re talking about when we mention “interval” and “target”.
As I mentioned elsewhere recently, Codel is designed specifically to give congestion signals to TCP-like flows, and deals rather less efficiently with unresponsive and anti-responsive flows, which as a result tend to spend some time bouncing off the queue’s hard limit until Codel finds the correct operating point to control them. There are other AQMs which are designed with unresponsive flows more in mind, but which somehow perform less well with TCP-like flows.

A key design principle of Codel is that no packet whose sojourn time is below target will be signalled. However, if the sojourn time is consistently above target, signalling begins and increases steadily in frequency. It is also a fundamental truth that if it takes longer than target to transmit the previous packet, the following packet can have a “congested” sojourn time even if there is consistently precisely one packet in the queue (which is the ideal state). This is why I constrain “target” to be at least 1.5 packet times at MTU; the difference can be substantial at low bandwidths.

But there are subtleties here too.

If there are multiple flows, isolated into multiple queues, then the effective packet-to-packet time for each queue will be increased proportionately. Early versions of Codel refused to signal on the last packet in the queue, to account for this. However, if there were a large number of occupied queues, this meant that the minimum queue fill could be rather high, and this seemed to lead to high induced latencies in fq_codel under heavy load. Due to the statistical multiplexing effect, it turned out to be sufficient to tune “target” as above for the final output bandwidth (even though this is unknown to fq_codel) and to remove the special status of the last remaining packet.

The same logic could naively be applied to traffic in separate tins.
However, unlike queues for flow isolation, bandwidth is not shared evenly between tins. More subtly, traffic characteristics also differ - low-latency traffic tends to be unresponsive to TCP-style congestion signalling, and dropping any of it tends to reduce service quality in some way. Note that network-control traffic (most relevantly NTP) falls into the “voice” category. Since unresponsive flows aren’t what Codel is meant to deal with, the mere fact that “target” is higher is not meaningful - and in any case this has no effect on the primary flow-isolation mechanism.

The tin-specific tuning of target and interval was introduced when Cake had a separate hard shaper per tin. It was the obvious design at the time. Now that Cake uses soft shaping between tins (allowing any tin to use the full link bandwidth if uncontended), it’s possible that choosing identical target and interval for all tins might suffice. Alternatively, we might choose an even more conservative strategy.

But which - and how do we decide?

As a final point, I haven’t even mentioned “rtt” as the user-specified input to this mayhem. That parameter must be understood to be *separate* from both “target” and “interval”, even though Codel specifies the latter to be related to expected RTT. Simply put, the user tells us what the expected RTT is (on the understanding that an order of magnitude variation either way is typical), and we calculate “target” and “interval” to be as consistent with that estimate as is practical, given the link bandwidth and other constraints that he has also specified. So there is a firm conceptual distinction between the user’s intent and the implementation details.

 - Jonathan Morton
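
For readers who want to check the arithmetic in the message, here is a minimal Python sketch of two of its calculations: the per-tin throughput under saturation, and the floor on Codel's target of 1.5 packet times at MTU. It is an illustration only, not Cake's actual code. The function names, the 1500-byte MTU and the 5 ms base target are assumptions that do not appear in the message. The share calculation also assumes that each tin keeps its own threshold minus the threshold of the next tin up, which is one reading that reproduces the quoted figures.

```python
from fractions import Fraction


def saturated_shares(thresholds):
    """Per-tin share of the link when every tin is saturated.

    `thresholds` runs from Tin 0 up to the highest tin. Assumption: each tin
    is left with its own threshold minus the threshold of the next tin up.
    """
    shares = []
    for i, threshold in enumerate(thresholds):
        next_up = thresholds[i + 1] if i + 1 < len(thresholds) else Fraction(0)
        shares.append(threshold - next_up)
    return shares


def codel_target_floor(rate_bps, base_target_s, mtu_bytes=1500):
    """Codel target in seconds, raised to at least 1.5 MTU-sized packet times.

    `base_target_s` and `mtu_bytes` are illustrative inputs, not values
    taken from the message.
    """
    packet_time_s = mtu_bytes * 8 / rate_bps
    return max(base_target_s, 1.5 * packet_time_s)


# Thresholds quoted in the message: 100%, 15/16, 3/4 and 1/4.
thresholds = [Fraction(1), Fraction(15, 16), Fraction(3, 4), Fraction(1, 4)]
shares = saturated_shares(thresholds)
print([str(s) for s in shares])  # ['1/16', '3/16', '1/2', '1/4']

# The floor only matters once an MTU-sized packet takes a while to send.
for rate_bps in (100_000_000, 1_000_000, 250_000):
    target_ms = codel_target_floor(rate_bps, base_target_s=0.005) * 1000
    print(f"{rate_bps} bit/s -> target {target_ms:.1f} ms")
```

With the assumed 5 ms base target, the floor has no effect at 100 Mbit/s but raises the target to 18 ms at 1 Mbit/s and 72 ms at 250 kbit/s. That is the "substantial at low bandwidths" difference the message describes. Under the same reading of the thresholds, moving Tin 2's threshold from ¾ to ½ would give shares of 1/16, 7/16, ¼ and ¼.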