Abandoning Window-based CC Considered Harmful (was Re: [Bloat] Bechtolschiem)

Matt Mathis mattmathis at google.com
Wed Jul 7 18:38:48 EDT 2021


Actually BBR does have a window-based backup, which normally only comes
into play during load spikes and at very short RTTs.   It defaults to
2*minRTT*maxBW, which is twice the steady-state window in its normal paced
mode.

This is too large for short-queue routers in the Internet core, but it
helps a lot with cross traffic on large-queue edge routers.
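
For concreteness, a minimal sketch of that backup cap (my own illustration,
not the actual BBR code; names and example numbers are mine):

    # Illustrative sketch of BBR's window-based backstop: data in flight is
    # capped at roughly 2 * maxBW * minRTT, i.e. twice the steady-state BDP.
    def inflight_cap_bytes(max_bw_bps, min_rtt_s, gain=2.0):
        """Upper bound on unacknowledged data, in bytes."""
        bdp_bytes = (max_bw_bps / 8.0) * min_rtt_s   # bandwidth-delay product
        return gain * bdp_bytes

    # Example: 100 Mbit/s bottleneck, 10 ms min RTT -> 125 kB BDP, 250 kB cap
    print(inflight_cap_bytes(100e6, 0.010))          # 250000.0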

Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured:
            too strong would be hypocritical and risks spiraling out of
control;
            too weak risks being mistaken for tacit approval.


On Wed, Jul 7, 2021 at 3:19 PM Bless, Roland (TM) <roland.bless at kit.edu>
wrote:

> Hi Matt,
>
> [sorry for the late reply, overlooked this one]
>
> please, see comments inline.
>
> On 02.07.21 at 21:46 Matt Mathis via Bloat wrote:
>
> The argument is absolutely correct for Reno, CUBIC and all
> other self-clocked protocols.  One of the core assumptions in Jacobson88
> was that the clock for the entire system comes from packets draining
> through the bottleneck queue.  In this world, the clock is intrinsically
> brittle if the buffers are too small.  The drain time needs to be a
> substantial fraction of the RTT.
>
> I'd like to separate the functions here a bit:
>
> 1) "automatic pacing" by ACK clocking
>
> 2) congestion-window-based operation
>
> I agree that the automatic pacing generated by the ACK clock (function 1)
> is increasingly distorted these days and may consequently cause micro-bursts.
> This can be mitigated by using paced sending, which I consider very useful.
> However, I consider abandoning the (congestion) window-based approaches
> with ACK feedback (function 2) harmful:
> a congestion window has an automatic self-stabilizing property, since the
> ACK feedback also reflects the queueing delay and the congestion window
> limits the amount of in-flight data.
> In contrast, rate-based senders risk instability: two senders in an M/D/1
> setting, each sending at 50% of the bottleneck rate on average, both using
> paced sending at 120% of that average rate, suffice to cause instability
> (the queue grows without bound).
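> As a toy illustration of that instability argument (my own sketch, not from
> any paper; the parameters are made up): with the offered load averaging
> exactly the bottleneck rate and nothing limiting the data in flight, the
> queue is a reflected zero-drift random walk and grows without bound:
>
>     import random
>
>     C = 1.0                   # bottleneck service rate, data units per slot
>     peak = 0.6 * C            # each sender paces at 120% of its 0.5*C average
>     p_on = (0.5 * C) / peak   # on-probability that yields a 0.5*C average
>
>     queue, horizon = 0.0, 1_000_000
>     for _ in range(horizon):
>         # two senders, each transmitting at `peak` in a fraction p_on of slots
>         arrivals = sum(peak for _ in range(2) if random.random() < p_on)
>         queue = max(0.0, queue + arrivals - C)
>     print(f"queue after {horizon} slots: {queue:.1f}")  # keeps growing with the horizon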
>
> IMHO, two approaches seem to be useful:
> a) congestion-window-based operation with paced sending
> b) rate-based/paced sending with a limit on the amount of in-flight data
>
>
> However, we have reached the point where we need to discard that
> requirement.  One of the side points of BBR is that in many environments it
> is cheaper to burn serving CPU to pace into short-queue networks than it is
> to "right-size" the network queues.
>
> The fundamental problem with the old way is that in some contexts the
> buffer memory has to beat Moore's law, because to maintain a constant drain
> time, both the memory size and the memory bandwidth have to scale with the
> link (laser) bandwidth.
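>
> Spelled out with rough numbers (my arithmetic, not from the slides): a
> fixed drain time means the buffer must hold drain_time * link_rate bytes,
> and that memory must also be read and written at line rate.
>
>     def buffer_bytes(link_rate_bps, drain_time_s):
>         """Buffer needed to sustain a fixed drain time at a given link rate."""
>         return (link_rate_bps / 8.0) * drain_time_s
>
>     # Example: a 10 ms drain time
>     for rate_gbps in (10, 100, 400):
>         mb = buffer_bytes(rate_gbps * 1e9, 0.010) / 1e6
>         print(f"{rate_gbps:>3} Gbit/s link -> {mb:.1f} MB of buffer")
>     # 10 -> 12.5 MB, 100 -> 125.0 MB, 400 -> 500.0 MB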
>
> See the slides I gave at the Stanford Buffer Sizing workshop in December
> 2019: Buffer Sizing: Position Paper
> <https://docs.google.com/presentation/d/1VyBlYQJqWvPuGnQpxW4S46asHMmiA-OeMbewxo_r3Cc/edit#slide=id.g791555f04c_0_5>
>
>
> Thanks for the pointer. I don't quite get the point that the buffer must
> have a certain size to keep the ACK clock stable:
> in the case of a non-application-limited sender, a very small buffer
> suffices to keep the ACK clock running steadily. The large buffers were
> mainly required by loss-based CCs to let the standing queue build up that
> keeps the bottleneck busy during the cwnd reduction after packet loss,
> thereby keeping the (bottleneck link) utilization high.
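>
> For concreteness, the usual sizing argument behind that (my summary of the
> standard derivation): before a loss, cwnd is roughly BDP + buffer; after
> the loss it is cut to beta * (BDP + buffer), and the bottleneck stays busy
> only if that is still at least one BDP, i.e. buffer >= BDP * (1 - beta) / beta.
>
>     def min_buffer_for_full_utilization(bdp_bytes, beta):
>         """Smallest standing queue that keeps the link busy across a cwnd cut."""
>         return bdp_bytes * (1.0 - beta) / beta
>
>     bdp = 1_250_000                                    # e.g. 1 Gbit/s * 10 ms
>     print(min_buffer_for_full_utilization(bdp, 0.5))   # Reno (beta=0.5): 1.0 * BDP
>     print(min_buffer_for_full_utilization(bdp, 0.7))   # CUBIC (beta=0.7): ~0.43 * BDP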
>
> Regards,
>
>  Roland
>
>
> Note that we are talking about the DC and Internet core.  At the edge, BW is
> low enough that memory is relatively cheap.   In some sense BB came about
> because memory is too cheap in these environments.
>
> Thanks,
> --MM--
>
>
> On Fri, Jul 2, 2021 at 9:59 AM Stephen Hemminger <
> stephen at networkplumber.org> wrote:
>
>> On Fri, 2 Jul 2021 09:42:24 -0700
>> Dave Taht <dave.taht at gmail.com> wrote:
>>
>> > "Debunking Bechtolsheim credibly would get a lot of attention to the
>> > bufferbloat cause, I suspect." - dpreed
>> >
>> > "Why Big Data Needs Big Buffer Switches" -
>> >
>> http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
>> >
>>
>> Also, a lot depends on the TCP congestion control algorithm being used.
>> They are using NewReno, which only researchers use in real life.
>>
>> Even TCP Cubic has gone through several revisions. In my experience, the
>> NS-2 models don't correlate well with real-world behavior.
>>
>> In real-world tests, TCP Cubic will consume any buffer it sees at a
>> congested link. Maybe that is what they mean by the capture effect.
>>
>> There is also a weird oscillation effect with multiple streams, where one
>> flow will take the buffer, then see a packet loss and back off, and the
>> other flow will take over the buffer until it sees a loss.
>>

