From: Toke Høiland-Jørgensen
To: Kevin Darbyshire-Bryant
Cc: Jonathan Morton, cake@lists.bufferbloat.net
Subject: Re: [Cake] A few puzzling Cake results
Date: Wed, 18 Apr 2018 14:57:29 +0200
Message-ID: <8736zs3c5i.fsf@toke.dk>
In-Reply-To: <1B7176CA-41BC-4CF0-838D-871F0C858CF3@darbyshire-bryant.me.uk>
X-Clacks-Overhead: GNU Terry Pratchett

Kevin Darbyshire-Bryant writes:

>> On 18 Apr 2018, at 12:25, Toke Høiland-Jørgensen wrote:
>>
>> Toke Høiland-Jørgensen writes:
>>
>>> Jonathan Morton writes:
>>>
>>>>> On 17 Apr, 2018, at 12:42 pm, Toke Høiland-Jørgensen wrote:
>>>>>
>>>>> - The TCP RTT of the 32 flows is *way* higher for Cake. FQ-CoDel
>>>>> controls TCP flow latency to around 65 ms, while for Cake it is all
>>>>> the way up around the 180 ms mark. Is the CoDel version in Cake too
>>>>> lenient, or what is going on here?
>>>>
>>>> A recent change was to increase the target dynamically so that at
>>>> least 4 MTUs per flow could fit in each queue without AQM activity.
>>>> That should improve throughput in high-contention scenarios, but it
>>>> does come at the expense of intra-flow latency when it's relevant.
>>>
>>> Ah, right, that might explain it. In the 128-flow case each flow has
>>> less than 100 Kbps available to it, so four MTUs are going to take a
>>> while to dequeue...
>>
>> OK, so I went and looked at the code and found this:
>>
>> bool over_target = sojourn > p->target &&
>>                    sojourn > p->mtu_time * bulk_flows * 4;
>>
>> Which means that we scale the allowed sojourn time for each flow by the
>> time of four packets *times the number of bulk flows*.
>>
>> So if there is one active bulk flow, we allow each flow to queue four
>> packets. But if there are ten active bulk flows, we allow *each* flow to
>> queue *40* packets.
>>
>> This completely breaks the isolation of different flows, and makes the
>> scaling of Cake *worse* than plain CoDel.
>>
>> So why on earth would we do that?
>
> The thread that led to that change:
>
> https://lists.bufferbloat.net/pipermail/cake/2017-December/003159.html
>
> Commits: 0d8f30faa3d4bb2bc87a382f18d8e0f3e4e56eac, and the change to
> 4 * bulk_flows in 49776da5b93f03c8548e26f2d7982d553d1d226c

Ah, thanks for digging that up! I must not have been paying attention
during that discussion ;)

Well, from reading the thread, this is an optimisation for severe
overload in ingress mode at very low bandwidths.
And the change basically amounts to throwing up our hands and saying
"screw it, we don't care about the intra-flow latency improvements of an
AQM". Which is, I guess, technically a valid choice when weighing the
tradeoffs, but I maintain that it is the wrong one.

Incidentally, removing the multiplication by the number of bulk flows
restores TCP intra-flow latency to be on par with (or even a bit better
than) FQ-CoDel, and it no longer scales with the number of active flows.

-Toke