From: Sebastian Moeller
Date: Sun, 25 Aug 2019 00:36:46 +0200
To: tsvwg IETF list, ECN-Sane
Subject: [Ecn-sane] draft-white-tsvwg-nqb-02 comments

"Flow queueing (FQ) approaches (such as fq_codel [RFC8290]), on the other hand, achieve latency improvements by associating packets into "flow" queues and then prioritizing "sparse flows", i.e. packets that arrive to an empty flow queue.
Flow queueing does not attempt to differentiate between flows on the basis of value (importance or latency-sensitivity), it simply gives preference to sparse flows, and tries to guarantee that the non-sparse flows all get an equal share of the remaining channel capacity and are interleaved with one another. As a result, FQ mechanisms could be considered more appropriate for unmanaged environments and general Internet traffic."

	[SM] An intermediate hop does not have any real handle on the "value" of a packet and hence cannot "differentiate between flows on the basis of value"; this is true for FQ and non-FQ approaches alike. What this section calls "preference to sparse flows" is basically the same as what NQB with queue protection does to NQB-marked packets: give them the benefit of the doubt until they exceed a variable sojourn/queueing threshold, after which they are treated less preferentially, either by not being treated as "sparse" anymore, or by being redirected into the QB-flow queue. The exception is that FQ-based solutions will immediately move all packets of such a non-sparse flow out of the way of qualifying sparse flows, while queue protection as described in the DOCSIS document will only redirect newly incoming packets to the QB-flow queue; all falsely classified packets already enqueued will remain in and "pollute" the NQB-flow queue.

"Downsides to this approach include loss of low latency performance due to the possibility of hash collisions (where a sparse flow shares a queue with a bulk data flow), complexity in managing a large number of queues in certain implementations, and some undesirable effects of the Deficit Round Robin (DRR) scheduling."

	[SM] As described in DOCSIS-MULPIv3.1, the queue protection method only has a limited number of buckets to use and will, to my understanding, account all flows above that number in the default bucket. This looks pretty close to the consequence of a hash collision to me, as I interpret this as the same fate-sharing of nominally independent flows observed in FQ when hash collisions occur. While it is fair criticism that this failure mode exists, only mentioning it in the context of FQ seems sub-optimal, especially since the DOCSIS queue protection is defined with only 32 non-default buckets... versus a default of 1024 flow queues for fq_codel.

"The DRR scheduler enforces that each non-sparse flow gets an equal fraction of link bandwidth,"

	[SM] This is actually a feature, not a bug. It will only trigger under load conditions and will give behavior that end-points can actually predict reasonably well. Any other kind of bandwidth sharing between flows is bound to have better best-case behavior, but also much worse worst-case behavior (like almost complete starvation of some flows). In short, equal bandwidth under load seems far superior for forward progress than "anything goes", as it will deliver something good enough without requiring an oracle and without regressing into starvation territory.

	Tangent: I have read Bob's justification for wanting inequality here, but will just mention that an intermediate hop simply cannot know or reasonably balance the importance of traversing flows (unless in a very controlled environment where all endpoints can be trusted to a) do the right thing and b) rank their bandwidth use by overall importance).
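	To make the comparison concrete, here is a heavily simplified, purely illustrative Python sketch (names, constants and structure are my own, not the draft's and not the actual fq_codel code from RFC 8290): queues of "sparse"/new flows are serviced before the backlogged ones, and the backlogged ones share the link via deficit round robin with an equal quantum, which is where the equal-share-under-load behaviour comes from.

# Illustrative-only sketch of FQ + DRR scheduling (my simplification).
from collections import deque

QUANTUM = 1514  # bytes of DRR credit per visit (an assumed MTU-sized value)

class Flow:
    def __init__(self, name, packet_sizes):
        self.name = name
        self.q = deque(packet_sizes)  # per-flow queue of packet sizes in bytes
        self.deficit = 0              # DRR byte credit

def drain(new_flows, old_flows, budget_bytes):
    """Transmit up to roughly budget_bytes; return list of (flow, size) sent."""
    sent = []
    while budget_bytes > 0 and (new_flows or old_flows):
        # Sparse ("new") flows get strict preference over backlogged ("old") ones.
        src = new_flows if new_flows else old_flows
        flow = src.popleft()
        flow.deficit += QUANTUM
        while flow.q and flow.deficit >= flow.q[0] and budget_bytes > 0:
            size = flow.q.popleft()
            flow.deficit -= size
            budget_bytes -= size
            sent.append((flow.name, size))
        if flow.q:
            old_flows.append(flow)  # still backlogged -> treated as non-sparse
        else:
            flow.deficit = 0        # queue drained -> flow leaves the scheduler
    return sent

bulk1 = Flow("bulk1", [1514] * 4)
bulk2 = Flow("bulk2", [1514] * 4)
ping  = Flow("ping",  [100])
print(drain(deque([ping]), deque([bulk1, bulk2]), budget_bytes=6000))
# -> [('ping', 100), ('bulk1', 1514), ('bulk2', 1514), ('bulk1', 1514), ('bulk2', 1514)]

	In this toy example the single small "ping" packet is sent first, and the two bulk flows then alternate packet for packet, i.e. they get equal shares of the remaining capacity under load.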
"In effect, the network element is making a decision as to what = constitutes a flow, and then forcing all such flows to take equal = bandwidth at every instant." [SM] This seems to hold only under saturating conditions, and as = argued above seems to be a reasonable compromise that will be good = enough. The intermediate hop has reliable way of objectively ranking the = relative importance of the concurrently active flows; and without such a = ranking, treating flows all equal seems to be more cautious and = conservative than basically allowing anything.=20 The network element in front of the saturated link needs to make a = decision (otherwise no AQM would be active) and the network element = needs to "force" its view on the flows (which by the way is exactly the = rationale for recommending queue protection). Also the equal bandwidth = for all flows at every instant is simply wrong, as long as the link is = not saturated this does not trigger, also no flow is "forced" to take = more bandwidth than it requires... Let me try to give a description of = how FQ behavior looks from the outside (this is a simplification and = hence wrong, but hopefully less wrong than the simplification in the = draft: Under saturating conditions with N flows, all flows with rates = less than egress_rate/N will send at full blast, just like without = saturation, then the remaining bandwidth is equally shared among those = flows that are sending at higher rates. This does hence not result in = equal rates for all flows at every instance. "The Dual-queue approach defined in this document achieves the main = benefit of fq_codel: latency improvement without value judgements, = without the downsides." [SM] Well, that seems a rather subjective judgement, also wrong = given that queue protection conceptually suffers from similar downsides = as fq "hash collisions" and lacks the clear and justify-able = middle-of-road equal bandwidth to all (that can make use of it) approach = that might not be as optimal as the best possible bandwidth allotment, = but has the advantage of not requiring an oracle to be actually = guaranteed to work. The point is unequal sharing is a "value judgement" = just as equal sharing, so claiming dualQ to be policy free is simply = wrong. "The distinction between NQB flows and QB flows is similar to the = distinction made between "sparse flow queues" and "non-sparse flow = queues" in fq_codel. In fq_codel, a flow queue is considered sparse if = it is drained completely by each packet transmission, and remains empty = for at least one cycle of the round robin over the active flows (this is = approximately equivalent to saying that it utilizes less than its fair = share of capacity). While this definition is convenient to implement in = fq_codel, it isn't the only useful definition of sparse flows." [SM] Have the fq_codel authors been asked whether the choice of = this sparseness measure was by convenience (only)? Best Regards