From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 355953B29E for ; Tue, 27 Nov 2018 05:40:51 -0500 (EST) Received: by mail-qk1-x72b.google.com with SMTP id 189so14203382qkj.8 for ; Tue, 27 Nov 2018 02:40:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=RNOH2KwKf8ohYxXskUbMjISMSmI1OmDnpv8e4AgXReQ=; b=qw9QiWJCelyoc/Xkgz41pWrBCRldw7z7Vy95wLdiULJgpUk+ghVUn7fqr0HfhCqjOT sf88K8MCwHUhMrMvkyvO4yyGIuFPItU03AW5JQ6F0KBXnFIgvqqtJxkkmpjo0xnFngzY lVqtmSvutYpHRUg7R1AuhsSq8gVrXvRyLvREW+V8SUlx/Yq191J4xCQuwPqlbP5Pn7j5 LRC+i/j2aweuBaXaEug8Hb0yPSdKAKo7mvoAjC0jK1Rdg+yuaqcbWTVpORo17dcX5m7f ggTJZziYiD3gYKOYQ12AIvU0xs763ZwUNAGJuedfpylaIQL5MqkD8cSy3OL2BeXHS9ci /6gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=RNOH2KwKf8ohYxXskUbMjISMSmI1OmDnpv8e4AgXReQ=; b=YbzpEixwfTp3hNT49Cr9Cd+wvOcai9dYMIn05aZF6s4Q1YXpWDfn5QGanrlxWcocP/ KJByj4krIOCE+0oSc+p29YS8XfFedGeUYdc1jaSXfE5CDYgmiLm+Pa/DytW5jpWNCWsw 1lkE5YhVlnUguz0I3/C+Qidn96eYWMYEzS5D+df3VVOYtUnqycDpZnF2pRvsH69wTsTz XWJBNuofW797eEuZx0M6vBdl7805wJ7sfAr3vZrEPxc6h6/V//+mzc2f0k3Cws404rY7 MMmxON2Ff0nt80m3YVeLsUbNQ0zrKK0iavtQr4xleH4DhA8nIdruasCnNzR4x1gkPJTJ 7VdA== X-Gm-Message-State: AA+aEWYfC3afwe7+s1EzSjdwodMJmYrIDEpwO0mYqZHTDvyxB+lM3Jwa HPtNBkbPSEtf1CSMgsdSkRibLdPdwjLsqtvhs+0= X-Google-Smtp-Source: AFSGD/VqMkCgl0dxfb14GnOq1IX7R5hnnXjoSbXYVVLzT7gAU7VmXnAokY3dHWk60KQof1xDK4SFFNCYZPP9jXFZBEg= X-Received: by 2002:a37:d994:: with SMTP id q20mr28981978qkl.116.1543315250722; Tue, 27 Nov 2018 02:40:50 -0800 (PST) MIME-Version: 1.0 References: 
<65EAC6C1-4688-46B6-A575-A6C7F2C066C5@heistp.net> <86b16a95-e47d-896b-9d43-69c65c52afc7@kit.edu>
In-Reply-To: <86b16a95-e47d-896b-9d43-69c65c52afc7@kit.edu>
From: Luca Muscariello
Date: Tue, 27 Nov 2018 11:40:39 +0100
Message-ID:
To: "Bless, Roland (TM)"
Cc: Jonathan Morton , bloat
Content-Type: multipart/alternative; boundary="0000000000004bbc84057ba315c8"
Subject: Re: [Bloat] when does the CoDel part of fq_codel help in the real world?
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 27 Nov 2018 10:40:51 -0000

--0000000000004bbc84057ba315c8
Content-Type: text/plain; charset="UTF-8"

OK. We agree.
That's correct, you need *at least* the BDP in flight so that the bottleneck queue never empties out.

This is easy to prove with fluid models for any congestion-controlled source, whether it is
loss-based, delay-based, rate-based, formula-based, etc.

A highly paced source lets you get as close to BDP+epsilon as is theoretically possible.

Link fully utilized is defined as Q>0, unless you exclude the packet currently being transmitted from Q.
I include it, so Q>0 means the transmitter is never idle. But that's a detail.

On Tue, Nov 27, 2018 at 11:35 AM Bless, Roland (TM) wrote:

> Hi,
>
> On 27.11.18 at 11:29, Luca Muscariello wrote:
> > I have never said that you need to fill the buffer to the max size to
> > get full capacity, which is an absurdity.
>
> Yes, it's absurd, but that's what today's loss-based CC algorithms do.
>
> > I said you need at least the BDP so that the queue never empties out.
> > The link is fully utilized IFF the queue is never emptied.
>
> I was also a bit imprecise: you'll need a BDP in flight, but
> you don't need to fill the buffer at all. The latter sentence
> is valid only in the direction: queue not empty -> link fully utilized.
>
> Regards,
>  Roland
>
> >
> >
> >
> > On Tue 27 Nov 2018 at 11:26, Bless, Roland (TM) wrote:
> >
> >     Hi Luca,
> >
> >     On 27.11.18 at 10:24, Luca Muscariello wrote:
> >     > A congestion controlled protocol such as TCP or others, including QUIC,
> >     > LEDBAT and so on
> >     > needs at least the BDP in the transmission queue to get full link
> >     > efficiency, i.e. the queue never empties out.
> >
> >     This is not true. There are congestion control algorithms
> >     (e.g., TCP LoLa [1] or BBRv2) that can fully utilize the bottleneck link
> >     capacity without filling the buffer to its maximum capacity. The BDP
> >     rule of thumb basically stems from the older loss-based congestion
> >     control variants that profit from the standing queue that they built
> >     over time when they detect a loss:
> >     while they back off and stop sending, the queue keeps the bottleneck
> >     output busy and you'll not see underutilization of the link. Moreover,
> >     once you get good loss de-synchronization, the buffer size requirement
> >     for multiple long-lived flows decreases.
> >
> >     > This gives a rule of thumb for sizing buffers, which is also very practical
> >     > and, thanks to flow isolation, becomes very accurate.
> >
> >     The positive effect of buffers is merely their role to absorb
> >     short-term bursts (i.e., mismatch in arrival and departure rates)
> >     instead of dropping packets. One does not need a big buffer to
> >     fully utilize a link (with perfect knowledge you can keep the link
> >     saturated even without a single packet waiting in the buffer).
> >     Furthermore, large buffers (e.g., using the BDP rule of thumb)
> >     are not useful/practical anymore at very high speeds such as 100 Gbit/s:
> >     memory is also quite costly at such high speeds...
> >
> >     Regards,
> >      Roland
> >
> >     [1] M. Hock, F. Neumeister, M. Zitterbart, R. Bless.
> >     TCP LoLa: Congestion Control for Low Latencies and High Throughput.
> >     Local Computer Networks (LCN), 2017 IEEE 42nd Conference on, pp.
> >     215-218, Singapore, Singapore, October 2017
> >     http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf
> >
> >     > Which is:
> >     >
> >     > 1) find a way to keep the number of backlogged flows at a reasonable value.
> >     > This largely depends on the minimum fair rate an application may need in
> >     > the long term.
> >     > We briefly discussed the mechanisms available in the literature
> >     > to achieve that.
> >     >
> >     > 2) fix the largest RTT you want to serve at full utilization and size
> >     > the buffer using BDP * N_backlogged.
> >     > Or the other way round: check how much memory you can use
> >     > in the router/line card/device and, for a fixed N, compute the largest
> >     > RTT you can serve at full utilization.
> >     >
> >     > 3) there is still some memory to dimension for sparse flows in addition
> >     > to that, but this is not based on BDP.
> >     > It is enough to compute the total utilization of sparse flows and
> >     > use the same simple model Toke has used
> >     > to compute the (de)prioritization probability.
> >     >
> >     > This procedure would allow sizing FQ_codel, but also SFQ.
> >     > It would be interesting to compare the two under this buffer sizing.
> >     > It would also be interesting to compare another mechanism that we
> >     > mentioned during the defense,
> >     > which is AFD + a sparse flow queue. It is, BTW, already available in
> >     > Cisco Nexus switches for data centres.
> >     >
> >     > I think that the CoDel part would still provide the ECN feature,
> >     > which none of the others can have.
> >     > However, the others, the last one especially, can be implemented in
> >     > silicon at reasonable cost.
> >
>
>

--0000000000004bbc84057ba315c8
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
OK. We agree.
That's correct, you need *at least* the BDP in flight so that the bottleneck queue never empties out.

This is easy to prove with fluid models for any congestion-controlled source, whether it is
loss-based, delay-based, rate-based, formula-based, etc.

A highly paced source lets you get as close to BDP+epsilon as is theoretically possible.

Link fully utilized is defined as Q>0, unless you exclude the packet currently being transmitted from Q.
I include it, so Q>0 means the transmitter is never idle. But that's a detail.
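
The fluid-model argument can be sketched for the simplest case: a single flow with a fixed amount of data in flight over one bottleneck. This is only an illustration under stated assumptions. The function name, its parameters and the example numbers (100 Mbit/s, 20 ms base RTT) are hypothetical and do not come from any specific model in the literature.

```python
def fluid_steady_state(inflight_bytes, capacity_bps, base_rtt_ms):
    """Steady state of one fixed-window flow over one bottleneck (fluid approximation)."""
    # BDP in bytes: capacity [bit/s] * RTT [ms] / (8 bit/byte * 1000 ms/s)
    bdp_bytes = capacity_bps * base_rtt_ms / 8000
    # With less than one BDP in flight the link idles part of the time.
    utilization = min(1.0, inflight_bytes / bdp_bytes)
    # Whatever is in flight beyond the BDP sits in the bottleneck queue.
    queue_bytes = max(0.0, inflight_bytes - bdp_bytes)
    return bdp_bytes, utilization, queue_bytes


if __name__ == "__main__":
    # Assumed example link: 100 Mbit/s with a 20 ms base RTT.
    for inflight in (125_000, 250_000, 300_000):
        bdp, util, queue = fluid_steady_state(inflight, 100_000_000, 20)
        print(f"inflight={inflight} B  BDP={bdp:.0f} B  "
              f"utilization={util:.2f}  queue={queue:.0f} B")
```

In this idealized model, exactly one BDP in flight already gives full utilization with an empty waiting queue. Any real source needs the "+epsilon" because its sending is never perfectly smooth.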



On Tue, Nov 27, 2018 at 11:35 AM Bless, Roland (TM) <roland.bless@kit.edu> wrote:
Hi,

On 27.11.18 at 11:29, Luca Muscariello wrote:
> I have never said that you need to fill the buffer to the max size to
> get full capacity, which is an absurdity.

Yes, it's absurd, but that's what today's loss-based CC algorithms do.

> I said you need at least the BDP so that the queue never empties out.
> The link is fully utilized IFF the queue is never emptied.

I was also a bit imprecise: you'll need a BDP in flight, but
you don't need to fill the buffer at all. The latter sentence
is valid only in the direction: queue not empty -> link fully utilized.

Regards,
 Roland

>
>
>
> On Tue 27 Nov 2018 at 11:26, Bless, Roland (TM) <roland.bless@kit.edu
> <mailto:roland.bless@kit.edu>> wrote:
>
>     Hi Luca,
>
>     On 27.11.18 at 10:24, Luca Muscariello wrote:
>     > A congestion controlled protocol such as TCP or others, including QUIC,
>     > LEDBAT and so on
>     > needs at least the BDP in the transmission queue to get full link
>     > efficiency, i.e. the queue never empties out.
>
>     This is not true. There are congestion control algorithms
>     (e.g., TCP LoLa [1] or BBRv2) that can fully utilize the bottleneck link
>     capacity without filling the buffer to its maximum capacity. The BDP
>     rule of thumb basically stems from the older loss-based congestion
>     control variants that profit from the standing queue that they built
>     over time when they detect a loss:
>     while they back off and stop sending, the queue keeps the bottleneck
>     output busy and you'll not see underutilization of the link. Moreover,
>     once you get good loss de-synchronization, the buffer size requirement
>     for multiple long-lived flows decreases.
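
The origin of the BDP rule of thumb for loss-based senders can be illustrated with a small calculation. This is a rough sketch, assuming a single flow whose window peaks at BDP plus buffer when the loss happens and is then multiplied by a factor `beta` (0.5 for classic halving). The function names and the single-flow simplification are hypothetical, chosen only for illustration.

```python
def utilization_after_backoff(bdp_pkts, buffer_pkts, beta=0.5):
    """Fraction of the bottleneck rate sustained right after a loss-triggered backoff."""
    # The loss happens when both the pipe (BDP) and the buffer are full.
    w_max = bdp_pkts + buffer_pkts
    # Multiplicative decrease; beta = 0.5 is the classic window halving.
    w_after = beta * w_max
    # The link stays busy only if at least one BDP remains in flight.
    return min(1.0, w_after / bdp_pkts)


def min_buffer_for_full_utilization(bdp_pkts, beta=0.5):
    """Smallest buffer B that satisfies beta * (BDP + B) >= BDP."""
    return bdp_pkts * (1 - beta) / beta


if __name__ == "__main__":
    for buf in (0, 50, 100):
        print(buf, utilization_after_backoff(100, buf))
    print(min_buffer_for_full_utilization(100))
```

With `beta = 0.5` the minimum buffer equals one BDP, which is the rule of thumb discussed here. A sender that backs off less needs a smaller standing queue to keep the link busy.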
>
>     > This gives a rule of thumb for sizing buffers, which is also very practical
>     > and, thanks to flow isolation, becomes very accurate.
>
>     The positive effect of buffers is merely their role to absorb
>     short-term bursts (i.e., mismatch in arrival and departure rates)
>     instead of dropping packets. One does not need a big buffer to
>     fully utilize a link (with perfect knowledge you can keep the link
>     saturated even without a single packet waiting in the buffer).
>     Furthermore, large buffers (e.g., using the BDP rule of thumb)
>     are not useful/practical anymore at very high speeds such as 100 Gbit/s:
>     memory is also quite costly at such high speeds...
>
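
To put a number on the high-speed case, a BDP-sized buffer can be computed directly. This is a minimal sketch. The 100 ms RTT is an assumed example value, not one given in the message, and `bdp_bytes` is a hypothetical helper name.

```python
def bdp_bytes(capacity_bps, rtt_ms):
    """Bandwidth-delay product in bytes."""
    # capacity [bit/s] * RTT [ms] / (8 bit/byte * 1000 ms/s)
    return capacity_bps * rtt_ms / 8000


if __name__ == "__main__":
    assumed_rtt_ms = 100  # assumption for illustration only
    for gbps in (1, 10, 100):
        size = bdp_bytes(gbps * 1_000_000_000, assumed_rtt_ms)
        print(f"{gbps:>3} Gbit/s x {assumed_rtt_ms} ms -> {size / 1e6:.1f} MB")
```

Under that assumption, one BDP at 100 Gbit/s is 1.25 GB of packet memory that must run at line rate, which is the cost concern raised here.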
>     Regards,
>      Roland
>
>     [1] M. Hock, F. Neumeister, M. Zitterbart, R. Bless.
>     TCP LoLa: Congestion Control for Low Latencies and High Throughput.
>     Local Computer Networks (LCN), 2017 IEEE 42nd Conference on, pp.
>     215-218, Singapore, Singapore, October 2017
>     http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf
>
>     > Which is:
>     >
>     > 1) find a way to keep the number of backlogged flows at a reasonable value.
>     > This largely depends on the minimum fair rate an application may need in
>     > the long term.
>     > We briefly discussed the mechanisms available in the literature
>     > to achieve that.
>     >
>     > 2) fix the largest RTT you want to serve at full utilization and size
>     > the buffer using BDP * N_backlogged.
>     > Or the other way round: check how much memory you can use
>     > in the router/line card/device and, for a fixed N, compute the largest
>     > RTT you can serve at full utilization.
>     >
>     > 3) there is still some memory to dimension for sparse flows in addition
>     > to that, but this is not based on BDP.
>     > It is enough to compute the total utilization of sparse flows and
>     > use the same simple model Toke has used
>     > to compute the (de)prioritization probability.
>     >
>     > This procedure would allow sizing FQ_codel, but also SFQ.
>     > It would be interesting to compare the two under this buffer sizing.
>     > It would also be interesting to compare another mechanism that we
>     > mentioned during the defense,
>     > which is AFD + a sparse flow queue. It is, BTW, already available in
>     > Cisco Nexus switches for data centres.
>     >
>     > I think that the CoDel part would still provide the ECN feature,
>     > which none of the others can have.
>     > However, the others, the last one especially, can be implemented in
>     > silicon at reasonable cost.
>
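
Step 2 of the sizing procedure quoted above, in both directions, can be sketched as follows. This is a hedged illustration only. The function names are hypothetical, and the rate used to define the BDP is left as an explicit parameter because the message does not say whether it is the link rate or the per-flow fair rate. The extra memory for sparse flows (step 3) is not modelled.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes for the given rate and RTT."""
    return rate_bps * rtt_ms / 8000


def buffer_for_rtt(rate_bps, max_rtt_ms, n_backlogged):
    """Buffer size in bytes: BDP * N_backlogged for the largest RTT to serve."""
    return bdp_bytes(rate_bps, max_rtt_ms) * n_backlogged


def max_rtt_for_memory(memory_bytes, rate_bps, n_backlogged):
    """Inverse direction: largest RTT in ms served at full utilization for a fixed N."""
    return memory_bytes * 8000 / (rate_bps * n_backlogged)


if __name__ == "__main__":
    # Assumed example: 10 Mbit/s rate, 100 ms largest RTT, 10 backlogged flows.
    size = buffer_for_rtt(10_000_000, 100, 10)
    print(f"buffer: {size:.0f} bytes")
    print(f"largest RTT for that memory: {max_rtt_for_memory(size, 10_000_000, 10):.0f} ms")
```

The two functions are exact inverses of each other, so either the RTT target or the memory budget can be taken as the fixed input.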

--0000000000004bbc84057ba315c8--