General list for discussing Bufferbloat
From: Neal Cardwell <ncardwell@google.com>
To: "Bless, Roland (TM)" <roland.bless@kit.edu>
Cc: Matt Mathis <mattmathis@google.com>, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Abandoning Window-based CC Considered Harmful (was Re: Bechtolschiem)
Date: Thu, 8 Jul 2021 11:47:40 -0400
Message-ID: <CADVnQymmiT3Xj9hN1ev64RAAr5154VANi1M=3=f0AgaSLKJkuQ@mail.gmail.com>
In-Reply-To: <932111a2-8099-0351-caff-f18e0821f9cf@kit.edu>

On Thu, Jul 8, 2021 at 10:28 AM Bless, Roland (TM) <roland.bless@kit.edu>
wrote:

> Hi Neal,
>
> On 08.07.21 at 15:29 Neal Cardwell wrote:
>
> On Thu, Jul 8, 2021 at 7:25 AM Bless, Roland (TM) <roland.bless@kit.edu>
> wrote:
>
>> It seems that in BBRv2 there are many more mechanisms present
>> that try to control the amount of inflight data more tightly and the new
>> "cap" is at 1.25 BDP.
>>
> To clarify, the BBRv2 cwnd cap is not 1.25*BDP. If there is no packet loss
> or ECN, the BBRv2 cwnd cap is the same as BBRv1. But if there has been
> packet loss then conceptually the cwnd cap is the maximum amount of data
> delivered in a single round trip since the last packet loss (with a floor
> to ensure that the cwnd does not decrease by more than 30% per round trip
> with packet loss, similar to CUBIC's 30% reduction in a round trip with
> packet loss). (And upon RTO the BBR (v1 or v2) cwnd is reset to 1, and
> slow-starts upward from there.)
>
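As a rough illustration of the loss response described in the paragraph above, here is a minimal sketch. It is not taken from the BBRv2 source: the function name, the parameter names, and the use of packets as the unit are all assumptions made for illustration. The 0.7 floor factor is just the "no more than 30% decrease per round trip" statement turned into a multiplier.

```python
def cwnd_cap_after_loss(prev_cap, max_delivered_in_round, rto=False,
                        floor_fraction=0.7):
    """Conceptual cwnd cap (in packets) after a round trip with packet loss.

    prev_cap:               cap in effect before this round (hypothetical name)
    max_delivered_in_round: most data delivered in a single round trip
                            since the last packet loss
    rto:                    True if a retransmission timeout fired
    floor_fraction:         0.7 means "never shrink by more than 30% per round"
    """
    if rto:
        # Upon RTO the cwnd is reset to 1 and slow-starts upward from there.
        return 1.0
    # The cap follows what was actually delivered, but the floor limits how
    # far it can drop in one round trip with loss.
    return max(float(max_delivered_in_round), floor_fraction * prev_cap)
```

For example, with a previous cap of 100 packets and only 50 packets delivered in the best round since the last loss, the floor applies and the sketch returns 70 rather than 50.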
> Thanks for the clarification. I'm patiently waiting to see the BBRv2
> mechanisms coherently written up
> in that new BBR Internet-Draft version ;-) Getting this together from the
> "diffs" on the IETF slides or the source code
> is somewhat tedious, so I'll be very grateful for having that single write
> up.
>
> There is an overview of the BBRv2 response to packet loss here:
>
> https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00#page=18
>
> My assumption came from slide 25 of this slide set:
> the probing is terminated if inflight > 1.25 estimated_bdp (or "hard
> ceiling" seen).
> So without experiencing more than 2% packet loss this may end up beyond
> 1.25 estimated_bdp,
>

Yes, that can be the behavior when BBRv2 is probing for bandwidth, but is
not the average or steady-state behavior.
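
For readers following along, the probing exit condition quoted above from slide 25 can be sketched roughly as follows. This is only an illustration of the condition as worded in this thread, not the actual BBRv2 code. The function and parameter names are hypothetical, and treating "more than 2% packet loss" as the "hard ceiling" signal is an assumption based on the quoted text.

```python
def probe_should_stop(inflight, estimated_bdp, loss_rate,
                      probe_gain=1.25, loss_thresh=0.02):
    """Return True if a bandwidth probe should end (illustrative only).

    inflight, estimated_bdp: same unit, e.g. packets
    loss_rate:               fraction of packets lost in the current round
    probe_gain:              the 1.25 factor from the quoted slide
    loss_thresh:             the 2% figure from the quoted text (assumed to
                             stand in for the "hard ceiling" signal)
    """
    reached_target = inflight > probe_gain * estimated_bdp
    hard_ceiling_seen = loss_rate > loss_thresh
    return reached_target or hard_ceiling_seen
```

With an estimated BDP of 100 packets and no loss, the check stays False at 120 packets in flight and becomes True at 130. Again, this applies only while probing for bandwidth.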


> but would it often end at 2 * estimated_bdp?
>

That depends on the details of the bottleneck buffer depth, number of
competing flows and what congestion control algorithm they are using, etc.

neal



> Best regards,
>
>  Roland
>
>
>
>
>> This is too large for short queue routers in the Internet core, but it
>> helps a lot with cross traffic on large queue edge routers.
>>
>> Best regards,
>>  Roland
>>
>> [1] https://ieeexplore.ieee.org/document/8117540
>>
>>
>> On Wed, Jul 7, 2021 at 3:19 PM Bless, Roland (TM) <roland.bless@kit.edu>
>> wrote:
>>
>>> Hi Matt,
>>>
>>> [sorry for the late reply, overlooked this one]
>>>
>>> please, see comments inline.
>>>
>>> On 02.07.21 at 21:46 Matt Mathis via Bloat wrote:
>>>
>>> The argument is absolutely correct for Reno, CUBIC and all
>>> other self-clocked protocols.  One of the core assumptions in Jacobson88
>>> was that the clock for the entire system comes from packets draining
>>> through the bottleneck queue.  In this world, the clock is intrinsically
>>> brittle if the buffers are too small.  The drain time needs to be a
>>> substantial fraction of the RTT.
>>>
>>> I'd like to separate the functions here a bit:
>>>
>>> 1) "automatic pacing" by ACK clocking
>>>
>>> 2) congestion-window-based operation
>>>
>>> I agree that the automatic pacing generated by the ACK clock (function
>>> 1) is increasingly
>>> distorted these days and may consequently cause micro bursts.
>>> This can be mitigated by using paced sending, which I consider very
>>> useful.
>>> However, I consider abandoning the (congestion) window-based approaches
>>> with ACK feedback (function 2) as harmful:
>>> a congestion window has an automatic self-stabilizing property since the
>>> ACK feedback also reflects
>>> the queuing delay and the congestion window limits the amount of
>>> inflight data.
>>> In contrast, rate-based senders risk instability: two senders in an
>>> M/D/1 setting, each sender sending at 50% of the
>>> bottleneck rate on average, both using paced sending at 120% of the
>>> average rate, suffice to cause
>>> instability (the queue grows without bound).
>>>
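To make the M/D/1 remark above a little more concrete, here is a small sketch using the textbook Pollaczek-Khinchine result for the mean number of packets waiting in an M/D/1 queue, Lq = rho^2 / (2 * (1 - rho)). It does not model the 120% pacing detail. It only shows that when the aggregate offered load rho reaches the bottleneck rate (two senders at 50% each), the model has no finite steady-state queue. The function name is made up for this example.

```python
import math


def md1_mean_queue(rho):
    """Mean number of packets waiting in an M/D/1 queue.

    rho is the offered load as a fraction of the bottleneck rate.
    For rho >= 1 there is no steady state, so infinity is returned.
    """
    if rho < 0:
        raise ValueError("offered load must be non-negative")
    if rho >= 1.0:
        return math.inf
    return rho * rho / (2.0 * (1.0 - rho))


# Two senders at 50% each give an aggregate load of 1.0.
aggregate_load = 0.5 + 0.5
unbounded = md1_mean_queue(aggregate_load)
```

At half load the mean backlog is 0.25 packets, at 90% load it is about 4 packets, and at 100% load the formula diverges. A window-based sender is not captured by this open-loop model because its inflight data is bounded by the window.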
>>> IMHO, two approaches seem to be useful:
>>> a) congestion-window-based operation with paced sending
>>> b) rate-based/paced sending with limiting the amount of inflight data
>>>
>>>
>>> However, we have reached the point where we need to discard that
>>> requirement.  One of the side points of BBR is that in many environments it
>>> is cheaper to burn serving CPU to pace into short queue networks than it is
>>> to "right size" the network queues.
>>>
>>> The fundamental problem with the old way is that in some contexts the
>>> buffer memory has to beat Moore's law, because to maintain constant drain
>>> time the memory size and BW both have to scale with the link (laser) BW.
>>>
>>> See the slides I gave at the Stanford Buffer Sizing workshop December
>>> 2019: Buffer Sizing: Position Paper
>>> <https://docs.google.com/presentation/d/1VyBlYQJqWvPuGnQpxW4S46asHMmiA-OeMbewxo_r3Cc/edit#slide=id.g791555f04c_0_5>
>>>
>>>
>>> Thanks for the pointer. I don't quite get the point that the buffer must
>>> have a certain size to keep the ACK clock stable:
>>> in the case of a non-application-limited sender, a very small buffer
>>> suffices to let the ACK clock
>>> run steady. The large buffers were mainly required for loss-based CCs to
>>> let the standing queue
>>> build up that keeps the bottleneck busy during CWnd reduction after
>>> packet loss, thereby
>>> keeping the (bottleneck link) utilization high.
>>>
>>> Regards,
>>>
>>>  Roland
>>>
>>>
>>> Note that we are talking about DC and Internet core.  At the edge, BW is
>>> low enough that memory is relatively cheap.   In some sense BB came about
>>> because memory is too cheap in these environments.
>>>
>>> Thanks,
>>> --MM--
>>> The best way to predict the future is to create it.  - Alan Kay
>>>
>>> We must not tolerate intolerance;
>>>        however our response must be carefully measured:
>>>             too strong would be hypocritical and risks spiraling out of
>>> control;
>>>             too weak risks being mistaken for tacit approval.
>>>
>>>
>>> On Fri, Jul 2, 2021 at 9:59 AM Stephen Hemminger <
>>> stephen@networkplumber.org> wrote:
>>>
>>>> On Fri, 2 Jul 2021 09:42:24 -0700
>>>> Dave Taht <dave.taht@gmail.com> wrote:
>>>>
>>>> > "Debunking Bechtolsheim credibly would get a lot of attention to the
>>>> > bufferbloat cause, I suspect." - dpreed
>>>> >
>>>> > "Why Big Data Needs Big Buffer Switches" -
>>>> >
>>>> http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
>>>> >
>>>>
>>>> Also, a lot depends on the TCP congestion control algorithm being used.
>>>> They are using NewReno which only researchers use in real life.
>>>>
>>>> Even TCP Cubic has gone through several revisions. In my experience, the
>>>> NS-2 models don't correlate well to real world behavior.
>>>>
>>>> In real world tests, TCP Cubic will consume any buffer it sees at a
>>>> congested link. Maybe that is what they mean by capture effect.
>>>>
>>>> There is also a weird oscillation effect with multiple streams, where
>>>> one
>>>> flow will take the buffer, then see a packet loss and back off, the
>>>> other flow will take over the buffer until it sees loss.
>>>>
>

