From: Dave Taht
To: "Bless, Roland (TM)"
Cc: Luca Muscariello, Jonathan Morton, bloat
Subject: Re: [Bloat] when does the CoDel part of fq_codel help in the real world?
Date: Wed, 28 Nov 2018 23:39:56 -0800
Message-ID: <878t1cwcvn.fsf@taht.net>
In-Reply-To: <7e6cc6f4-bd2f-49b5-0f64-292f56a0592c@kit.edu> (Roland Bless's message of "Tue, 27 Nov 2018 12:40:45 +0100")
References: <65EAC6C1-4688-46B6-A575-A6C7F2C066C5@heistp.net> <86b16a95-e47d-896b-9d43-69c65c52afc7@kit.edu> <7e6cc6f4-bd2f-49b5-0f64-292f56a0592c@kit.edu>

"Bless, Roland (TM)" writes:

> Hi Luca,
>
> On 27.11.18 at 11:40, Luca Muscariello wrote:
>> OK. We agree.
>> That's correct, you need *at least* the BDP in flight so that the
>> bottleneck queue never empties out.
>
> No, that's not what I meant, but it's quite simple.
> You need: data min_inflight=2 * RTTmin * bottleneck_rate to fully
> utilize the bottleneck link.
> If this is true, the bottleneck queue will be empty.
> If your amount
> of inflight data is larger, the bottleneck queue buffer will store
> the excess packets. With just min_inflight there will be no
> bottleneck queue; the packets are "on the wire".
>
>> This can be easily proven using fluid models for any congestion
>> controlled source no matter if it is
>> loss-based, delay-based, rate-based, formula-based etc.
>>
>> A highly paced source gives you the ability to get as close as
>> theoretically possible to the BDP+epsilon.
>
> Yep, but that BDP is "on the wire" and epsilon will be in the bottleneck
> buffer.

I'm hoping I made my point effectively earlier, that

" data min_inflight=2 * RTTmin * bottleneck_rate "

when it is nearly certain that more than one flow exists, means aiming
for the BDP in a single flow is generally foolish. I liked the Stanford
result; I think it's pretty general. I see hundreds of flows active
every minute. There was another paper that looked into some magic
200-ish number of simultaneously active flows, normally

>
>> link fully utilized is defined as Q>0 unless you don't include the
>> packet currently being transmitted. I do,
>> so the transmitter is never idle. But that's a detail.
>
> I wouldn't define link fully utilized as Q>0, but if Q>0 then
> the link is fully utilized (that's what I meant by the direction
> of implication).
>
> Regards,
> Roland
>
>>
>>
>> On Tue, Nov 27, 2018 at 11:35 AM Bless, Roland (TM) wrote:
>>
>> Hi,
>>
>> On 27.11.18 at 11:29, Luca Muscariello wrote:
>> > I have never said that you need to fill the buffer to the max size to
>> > get full capacity, which is an absurdity.
>>
>> Yes, it's absurd, but that's what today's loss-based CC algorithms do.
>>
>> > I said you need at least the BDP so that the queue never empties out.
>> > The link is fully utilized IFF the queue is never emptied.
>>
>> I was also a bit imprecise: you'll need a BDP in flight, but
>> you don't need to fill the buffer at all.
>> The latter sentence
>> is valid only in the direction: queue not empty -> link fully utilized.
>>
>> Regards,
>>  Roland
>>
>> >
>> > On Tue 27 Nov 2018 at 11:26, Bless, Roland (TM) wrote:
>> >
>> >     Hi Luca,
>> >
>> >     On 27.11.18 at 10:24, Luca Muscariello wrote:
>> >     > A congestion controlled protocol such as TCP or others,
>> >     > including QUIC, LEDBAT and so on,
>> >     > needs at least the BDP in the transmission queue to get full
>> >     > link efficiency, i.e. the queue never empties out.
>> >
>> >     This is not true. There are congestion control algorithms
>> >     (e.g., TCP LoLa [1] or BBRv2) that can fully utilize the
>> >     bottleneck link capacity without filling the buffer to its
>> >     maximum capacity. The BDP rule of thumb basically stems from
>> >     the older loss-based congestion control variants that profit
>> >     from the standing queue that they built over time when they
>> >     detect a loss: while they back off and stop sending, the queue
>> >     keeps the bottleneck output busy and you'll not see
>> >     underutilization of the link. Moreover, once you get good loss
>> >     de-synchronization, the buffer size requirement for multiple
>> >     long-lived flows decreases.
>> >
>> >     > This gives rules of thumb to size buffers, which is also very
>> >     > practical and thanks to flow isolation becomes very accurate.
>> >
>> >     The positive effect of buffers is merely their role to absorb
>> >     short-term bursts (i.e., mismatch in arrival and departure rates)
>> >     instead of dropping packets.
>> >     One does not need a big buffer to
>> >     fully utilize a link (with perfect knowledge you can keep the link
>> >     saturated even without a single packet waiting in the buffer).
>> >     Furthermore, large buffers (e.g., using the BDP rule of thumb)
>> >     are not useful/practical anymore at very high speed such as
>> >     100 Gbit/s: memory is also quite costly at such high speeds...
>> >
>> >     Regards,
>> >      Roland
>> >
>> >     [1] M. Hock, F. Neumeister, M. Zitterbart, R. Bless.
>> >     TCP LoLa: Congestion Control for Low Latencies and High
>> >     Throughput. Local Computer Networks (LCN), 2017 IEEE 42nd
>> >     Conference on, pp. 215-218, Singapore, Singapore, October 2017
>> >     http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf
>> >
>> >     > Which is:
>> >     >
>> >     > 1) find a way to keep the number of backlogged flows at a
>> >     > reasonable value.
>> >     > This largely depends on the minimum fair rate an application
>> >     > may need in the long term.
>> >     > We discussed a little bit of available mechanisms to achieve
>> >     > that in the literature.
>> >     >
>> >     > 2) fix the largest RTT you want to serve at full utilization
>> >     > and size the buffer using BDP * N_backlogged.
>> >     > Or the other way round: check how much memory you can use
>> >     > in the router/line card/device and for a fixed N, compute the
>> >     > largest RTT you can serve at full utilization.
>> >     >
>> >     > 3) there is still some memory to dimension for sparse flows in
>> >     > addition to that, but this is not based on BDP.
>> >     > It is just enough to compute the total utilization of sparse
>> >     > flows and use the same simple model Toke has used
>> >     > to compute the (de)prioritization probability.
>> >     >
>> >     > This procedure would allow sizing FQ_codel but also SFQ.
>> >     > It would be interesting to compare the two under this buffer
>> >     > sizing.
>> >     > It would also be interesting to compare another mechanism that
>> >     > we have mentioned during the defense,
>> >     > which is AFD + a sparse flow queue. Which is, BTW, already
>> >     > available in Cisco Nexus switches for data centres.
>> >     >
>> >     > I think that the codel part would still provide the ECN
>> >     > feature, that all the others cannot have.
>> >     > However the others, the last one especially, can be
>> >     > implemented in silicon with reasonable cost.
>> >
>>
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
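
The relation debated in this thread (a BDP of data "on the wire", anything beyond it sitting in the bottleneck buffer) and the quoted sizing rule BDP * N_backlogged can be illustrated with a small fluid-model sketch. It assumes a single bottleneck, a constant amount of in-flight data, and the usual definition BDP = rate * RTT; all function and parameter names are hypothetical, and the numbers are illustrative only.

```python
def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes, assuming BDP = rate * RTT."""
    return rate_bps * rtt_s / 8.0


def standing_queue_bytes(inflight_bytes: float, bdp: float) -> float:
    """Fluid model: in-flight data beyond the BDP waits in the buffer."""
    return max(0.0, inflight_bytes - bdp)


def link_utilization(inflight_bytes: float, bdp: float) -> float:
    """Fluid model: the link is fully used once in-flight data reaches the BDP."""
    return min(1.0, inflight_bytes / bdp)


def buffer_size_bytes(rate_bps: float, rtt_s: float, n_backlogged: int) -> float:
    """Quoted sizing rule: BDP * N_backlogged for the largest RTT to serve."""
    return bdp_bytes(rate_bps, rtt_s) * n_backlogged


def largest_rtt_s(rate_bps: float, memory_bytes: float, n_backlogged: int) -> float:
    """The other way round: largest RTT served for a fixed memory and N."""
    return memory_bytes * 8.0 / (rate_bps * n_backlogged)


if __name__ == "__main__":
    rate = 100e6  # 100 Mbit/s bottleneck (illustrative assumption)
    rtt = 0.020   # 20 ms minimum RTT (illustrative assumption)
    bdp = bdp_bytes(rate, rtt)
    for inflight in (0.5 * bdp, bdp, 1.5 * bdp):
        print(
            f"inflight={inflight:.0f} B "
            f"utilization={link_utilization(inflight, bdp):.2f} "
            f"queue={standing_queue_bytes(inflight, bdp):.0f} B"
        )
```

In this simplified model the queue is exactly empty when the in-flight data equals the BDP, and the implication only runs one way: a non-empty queue means a fully utilized link, while a fully utilized link does not require a queue.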