From: "Bless, Roland (TM)" <roland.bless@kit.edu>
To: Neal Cardwell <ncardwell@google.com>
Cc: Matt Mathis <mattmathis@google.com>, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Abandoning Window-based CC Considered Harmful (was Re: Bechtolschiem)
Date: Thu, 8 Jul 2021 16:28:19 +0200
Message-ID: <932111a2-8099-0351-caff-f18e0821f9cf@kit.edu>
In-Reply-To: <CADVnQy=SyxdOXCrUnE45x_r3vZi7mM0OyeVo6btJcyZ+qnT_1Q@mail.gmail.com>
Hi Neal,
On 08.07.21 at 15:29 Neal Cardwell wrote:
> On Thu, Jul 8, 2021 at 7:25 AM Bless, Roland (TM)
> <roland.bless@kit.edu> wrote:
>
> It seems that in BBRv2 there are many more mechanisms present
> that try to control the amount of inflight data more tightly and
> the new "cap" is at 1.25 BDP.
>
> To clarify, the BBRv2 cwnd cap is not 1.25*BDP. If there is no packet
> loss or ECN, the BBRv2 cwnd cap is the same as BBRv1. But if there has
> been packet loss then conceptually the cwnd cap is the maximum amount
> of data delivered in a single round trip since the last packet loss
> (with a floor to ensure that the cwnd does not decrease by more than
> 30% per round trip with packet loss, similar to CUBIC's 30% reduction
> in a round trip with packet loss). (And upon RTO the BBR (v1 or v2)
> cwnd is reset to 1, and slow-starts upward from there.)
Thanks for the clarification. I'm patiently waiting to see the BBRv2
mechanisms coherently written up in the new version of the BBR
Internet-Draft ;-) Piecing this together from the "diffs" on the IETF
slides or from the source code is somewhat tedious, so I'll be very
grateful for that single write-up.
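As a minimal sketch of the cwnd cap as described in the quoted text above: this is only a reading of that description, not the BBRv2 source. The function names, parameter names, and the use of 0.7 as the floor factor (derived from the "not more than 30% per round trip" wording) are assumptions for illustration.

```python
def cwnd_cap_after_lossy_round(prev_cap, max_delivered_since_loss, beta=0.7):
    """Cwnd cap after a round trip in which packet loss was seen.

    prev_cap: cap in effect before this round (packets).
    max_delivered_since_loss: largest amount of data delivered in a single
        round trip since the last packet loss (packets).
    beta: hypothetical floor factor; 0.7 means the cap shrinks by at most
        30% per lossy round, as the description states.
    """
    floor = beta * prev_cap
    return max(max_delivered_since_loss, floor)


def cwnd_after_rto():
    """Per the description, an RTO resets cwnd to 1 (v1 and v2)."""
    return 1
```

With a previous cap of 100 packets and only 50 packets delivered in the best round since the last loss, the floor applies and the cap becomes 70, not 50.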
> There is an overview of the BBRv2 response to packet loss here:
> https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00#page=18
My assumption came from slide 25 of this slide set:
the probing is terminated if inflight > 1.25 * estimated_bdp (or a
"hard ceiling" is seen).
So without experiencing more than 2% packet loss this may end up beyond
1.25 * estimated_bdp,
but would it often end at 2 * estimated_bdp?
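To make the condition I am reading off the slide explicit, here is a minimal sketch. The function and parameter names are hypothetical, and treating the "hard ceiling" as a loss rate above 2% is my interpretation of the slide, not something taken from the BBRv2 code.

```python
def probe_should_stop(inflight, estimated_bdp, loss_rate,
                      inflight_gain=1.25, loss_thresh=0.02):
    """Return True if bandwidth probing would terminate.

    inflight, estimated_bdp: same unit (e.g. packets).
    loss_rate: fraction of packets lost in the current round (0.0 to 1.0).
    inflight_gain, loss_thresh: the 1.25 and 2% values mentioned above.
    """
    exceeded_inflight_target = inflight > inflight_gain * estimated_bdp
    saw_hard_ceiling = loss_rate > loss_thresh  # assumed meaning of "hard ceiling"
    return exceeded_inflight_target or saw_hard_ceiling
```

Because the inflight test is a strict "greater than", the value at which probing stops lies somewhere above 1.25 * estimated_bdp; how far above depends on how fast inflight grows per round, which this sketch does not model.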
Best regards,
Roland
>
>> This is too large for short queue routers in the Internet core,
>> but it helps a lot with cross traffic on large queue edge routers.
>
> Best regards,
> Roland
>
> [1] https://ieeexplore.ieee.org/document/8117540
>
>>
>> On Wed, Jul 7, 2021 at 3:19 PM Bless, Roland (TM)
>> <roland.bless@kit.edu> wrote:
>>
>> Hi Matt,
>>
>> [sorry for the late reply, overlooked this one]
>>
>> please, see comments inline.
>>
>> On 02.07.21 at 21:46 Matt Mathis via Bloat wrote:
>>> The argument is absolutely correct for Reno, CUBIC and all
>>> other self-clocked protocols. One of the core assumptions
>>> in Jacobson88, was that the clock for the entire system
>>> comes from packets draining through the bottleneck queue.
>>> In this world, the clock is intrinsically brittle if the
>>> buffers are too small. The drain time needs to be a
>>> substantial fraction of the RTT.
>> I'd like to separate the functions here a bit:
>>
>> 1) "automatic pacing" by ACK clocking
>>
>> 2) congestion-window-based operation
>>
>> I agree that the automatic pacing generated by the ACK clock
>> (function 1) is increasingly
>> distorted these days and may consequently cause micro bursts.
>> This can be mitigated by using paced sending, which I
>> consider very useful.
>> However, I consider abandoning the (congestion) window-based
>> approaches
>> with ACK feedback (function 2) as harmful:
>> a congestion window has an automatic self-stabilizing
>> property since the ACK feedback also reflects
>> the queuing delay and the congestion window limits the
>> amount of inflight data.
>> In contrast, rate-based senders risk instability: two senders
>> in an M/D/1 setting, each sending at 50% of the
>> bottleneck rate on average, both using paced sending at 120%
>> of the average rate, suffice to cause
>> instability (the queue grows without bound).
>>
>> IMHO, two approaches seem to be useful:
>> a) congestion-window-based operation with paced sending
>> b) rate-based/paced sending with limiting the amount of
>> inflight data
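The instability argument above can be illustrated with the standard M/D/1 result: the mean number of packets waiting in the queue is rho^2 / (2 * (1 - rho)), which diverges as the offered load rho approaches 1. The sketch below assumes Poisson aggregate arrivals and no feedback to the senders; it does not model the 120% pacing detail, and the function names are hypothetical. The second function shows the contrasting hard bound that a window limit provides.

```python
import math


def md1_mean_queue_length(rho):
    """Mean number of packets waiting in an M/D/1 queue at load rho.

    rho: offered load as a fraction of the bottleneck rate.
    Returns math.inf for rho >= 1, where no steady state exists.
    """
    if rho < 0:
        raise ValueError("load must be non-negative")
    if rho >= 1.0:
        return math.inf
    return rho * rho / (2.0 * (1.0 - rho))


def window_limited_backlog_bound(cwnds):
    """Upper bound on queued data when every sender is window-limited.

    A sender never has more than cwnd in flight, so the bottleneck queue
    can never hold more than the sum of all congestion windows.
    """
    return sum(cwnds)
```

Two open-loop senders at 50% each give rho = 1.0, for which `md1_mean_queue_length` returns infinity, whereas two window-limited senders with cwnd 40 and 60 can never queue more than 100 packets, whatever their rates.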
>>
>>>
>>> However, we have reached the point where we need to discard
>>> that requirement. One of the side points of BBR is that in
>>> many environments it is cheaper to burn serving CPU to pace
>>> into short queue networks than it is to "right size" the
>>> network queues.
>>>
>>> The fundamental problem with the old way is that in some
>>> contexts the buffer memory has to beat Moore's law, because
>>> to maintain constant drain time the memory size and BW both
>>> have to scale with the link (laser) BW.
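The scaling argument in the quoted paragraph above is plain arithmetic: for a fixed drain time, the buffer size is the link rate times that drain time. A minimal sketch follows; the function name and the example figures are illustrative assumptions, not values from the slides.

```python
def buffer_bytes_for_drain_time(link_bps, drain_time_s):
    """Buffer size in bytes needed to hold drain_time_s of traffic.

    link_bps: link rate in bits per second.
    drain_time_s: target time to drain a full buffer, in seconds.
    """
    return link_bps * drain_time_s / 8.0
```

For example, holding 100 ms of traffic needs about 125 MB at 10 Gbit/s and about 1.25 GB at 100 Gbit/s, so the memory grows in direct proportion to the link rate.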
>>>
>>> See the slides I gave at the Stanford Buffer Sizing workshop
>>> in December 2019: Buffer Sizing: Position Paper
>>> <https://docs.google.com/presentation/d/1VyBlYQJqWvPuGnQpxW4S46asHMmiA-OeMbewxo_r3Cc/edit#slide=id.g791555f04c_0_5>
>>>
>>>
>> Thanks for the pointer. I don't quite get the point that the
>> buffer must have a certain size to keep the ACK clock stable:
>> in the case of a non-application-limited sender, a very small
>> buffer suffices to let the ACK clock
>> run steadily. The large buffers were mainly required for
>> loss-based CCs to let the standing queue
>> build up that keeps the bottleneck busy during CWnd reduction
>> after packet loss, thereby
>> keeping the (bottleneck link) utilization high.
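The buffer requirement for loss-based CCs mentioned above can be quantified with the usual single-flow approximation: loss occurs when cwnd reaches BDP + B, the sender then reduces cwnd to beta * (BDP + B), and the link stays busy only if that is still at least one BDP. This gives B >= BDP * (1 - beta) / beta. The sketch is a simplification (one flow, fixed RTT), and the function name is hypothetical.

```python
def min_buffer_for_full_utilization(bdp, beta):
    """Smallest buffer that keeps the bottleneck busy after a loss.

    bdp: bandwidth-delay product (any unit, e.g. packets).
    beta: multiplicative decrease factor, cwnd -> beta * cwnd on loss
          (0.5 for Reno-style halving, 0.7 for CUBIC).
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    return bdp * (1.0 - beta) / beta
```

With halving (beta = 0.5) this yields the classic one-BDP buffer; with beta = 0.7 it drops to roughly 0.43 BDP. Neither figure has anything to do with keeping the ACK clock running.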
>>
>> Regards,
>>
>> Roland
>>
>>
>>> Note that we are talking about DC and Internet core. At the
>>> edge, BW is low enough where memory is relatively cheap. In
>>> some sense BB came about because memory is too cheap in
>>> these environments.
>>>
>>> Thanks,
>>> --MM--
>>> The best way to predict the future is to create it. - Alan Kay
>>>
>>> We must not tolerate intolerance;
>>> however our response must be carefully measured:
>>> too strong would be hypocritical and risks
>>> spiraling out of control;
>>> too weak risks being mistaken for tacit approval.
>>>
>>>
>>> On Fri, Jul 2, 2021 at 9:59 AM Stephen Hemminger
>>> <stephen@networkplumber.org> wrote:
>>>
>>> On Fri, 2 Jul 2021 09:42:24 -0700
>>> Dave Taht <dave.taht@gmail.com> wrote:
>>>
>>> > "Debunking Bechtolsheim credibly would get a lot of attention to the
>>> > bufferbloat cause, I suspect." - dpreed
>>> >
>>> > "Why Big Data Needs Big Buffer Switches" -
>>> > http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
>>> >
>>>
>>> Also, a lot depends on the TCP congestion control
>>> algorithm being used.
>>> They are using NewReno which only researchers use in
>>> real life.
>>>
>>> Even TCP Cubic has gone through several revisions. In my
>>> experience, the
>>> NS-2 models don't correlate well to real world behavior.
>>>
>>> In real world tests, TCP Cubic will consume any buffer
>>> it sees at a
>>> congested link. Maybe that is what they mean by capture
>>> effect.
>>>
>>> There is also a weird oscillation effect with multiple
>>> streams, where one
>>> flow will take the buffer, then see a packet loss and
>>> back off, the
>>> other flow will take over the buffer until it sees loss.
>>>
>>