On Thu, Jul 8, 2021 at 10:28 AM Bless, Roland (TM) wrote:

> Hi Neal,
>
> On 08.07.21 at 15:29 Neal Cardwell wrote:
> On Thu, Jul 8, 2021 at 7:25 AM Bless, Roland (TM) wrote:
>
>> It seems that in BBRv2 there are many more mechanisms present that try
>> to control the amount of inflight data more tightly, and the new "cap"
>> is at 1.25 BDP.
>
> To clarify, the BBRv2 cwnd cap is not 1.25*BDP. If there is no packet loss
> or ECN, the BBRv2 cwnd cap is the same as in BBRv1. But if there has been
> packet loss, then conceptually the cwnd cap is the maximum amount of data
> delivered in a single round trip since the last packet loss (with a floor
> to ensure that the cwnd does not decrease by more than 30% per round trip
> with packet loss, similar to CUBIC's 30% reduction in a round trip with
> packet loss). (And upon RTO the BBR (v1 or v2) cwnd is reset to 1 and
> slow-starts upward from there.)
>
> Thanks for the clarification. I'm patiently waiting to see the BBRv2
> mechanisms coherently written up in that new BBR Internet-Draft version ;-)
> Getting this together from the "diffs" on the IETF slides or the source
> code is somewhat tedious, so I'll be very grateful for having that single
> write-up.
>
> There is an overview of the BBRv2 response to packet loss here:
>
> https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00#page=18
>
> My assumption came from slide 25 of this slide set: the probing is
> terminated if inflight > 1.25 * estimated_bdp (or a "hard ceiling" is
> seen). So without experiencing more than 2% packet loss this may end up
> beyond 1.25 * estimated_bdp,

Yes, that can be the behavior when BBRv2 is probing for bandwidth, but it is
not the average or steady-state behavior.

> but would it often end at 2 * estimated_bdp?

That depends on the details of the bottleneck buffer depth, the number of
competing flows, what congestion control algorithm they are using, etc.

neal

> Best regards,
>
> Roland
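To make the loss response Neal describes above a bit more concrete, here is a
minimal Python sketch of one way to read that prose. All names, constants and
units (packets), such as Bbr2CwndCapSketch, BETA and CWND_GAIN, are invented
for illustration; this is not the actual BBRv2 implementation.

BETA = 0.7        # assumed floor: cwnd shrinks by at most 30% per round trip with loss
CWND_GAIN = 2.0   # assumed BBRv1-style cap of roughly cwnd_gain * estimated BDP

class Bbr2CwndCapSketch:
    def __init__(self):
        self.max_delivered_since_loss = None   # None: no packet loss seen yet

    def on_packet_loss(self, delivered_this_round):
        # Restart the "max delivered in one round since the last loss" tracker.
        self.max_delivered_since_loss = delivered_this_round

    def on_round_trip_end(self, delivered_this_round):
        # Track the largest amount delivered in a single round since the last loss.
        if self.max_delivered_since_loss is not None:
            self.max_delivered_since_loss = max(self.max_delivered_since_loss,
                                                delivered_this_round)

    def cwnd_cap(self, estimated_bdp, prev_cwnd, loss_in_this_round):
        no_loss_cap = int(CWND_GAIN * estimated_bdp)   # BBRv1-like behaviour
        if self.max_delivered_since_loss is None:
            return no_loss_cap                         # no loss (or ECN) seen so far
        cap = min(no_loss_cap, self.max_delivered_since_loss)
        if loss_in_this_round:
            # Floor so that a lossy round does not cut cwnd by more than 30%,
            # similar in spirit to CUBIC's 30% reduction.
            cap = max(cap, int(BETA * prev_cwnd))
        return cap

    def on_rto(self):
        # Upon RTO the cwnd is reset to 1 packet and slow-starts upward from there.
        return 1

The round-by-round bookkeeping (what exactly counts as "delivered in a round",
how ECN marks are folded in) is precisely the kind of detail Roland is asking
to see spelled out in the updated Internet-Draft.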
>> This is too large for short-queue routers in the Internet core, but it
>> helps a lot with cross traffic on large-queue edge routers.
>>
>> Best regards,
>> Roland
>>
>> [1] https://ieeexplore.ieee.org/document/8117540
>>
>> On Wed, Jul 7, 2021 at 3:19 PM Bless, Roland (TM) wrote:
>>
>>> Hi Matt,
>>>
>>> [sorry for the late reply, overlooked this one]
>>>
>>> please, see comments inline.
>>>
>>> On 02.07.21 at 21:46 Matt Mathis via Bloat wrote:
>>>
>>> The argument is absolutely correct for Reno, CUBIC and all other
>>> self-clocked protocols. One of the core assumptions in Jacobson88 was
>>> that the clock for the entire system comes from packets draining through
>>> the bottleneck queue. In this world, the clock is intrinsically brittle
>>> if the buffers are too small. The drain time needs to be a substantial
>>> fraction of the RTT.
>>>
>>> I'd like to separate the functions here a bit:
>>>
>>> 1) "automatic pacing" by ACK clocking
>>>
>>> 2) congestion-window-based operation
>>>
>>> I agree that the automatic pacing generated by the ACK clock (function 1)
>>> is increasingly distorted these days and may consequently cause
>>> micro-bursts. This can be mitigated by using paced sending, which I
>>> consider very useful.
>>>
>>> However, I consider abandoning the (congestion) window-based approaches
>>> with ACK feedback (function 2) as harmful: a congestion window has an
>>> automatic self-stabilizing property, since the ACK feedback also reflects
>>> the queuing delay and the congestion window limits the amount of inflight
>>> data. In contrast, rate-based senders risk instability: two senders in an
>>> M/D/1 setting, each sending at 50% of the bottleneck rate on average and
>>> both using paced sending at 120% of that average rate, suffice to cause
>>> instability (the queue grows without bound).
>>>
>>> IMHO, two approaches seem to be useful:
>>> a) congestion-window-based operation with paced sending
>>> b) rate-based/paced sending with a limit on the amount of inflight data
>>>
>>> However, we have reached the point where we need to discard that
>>> requirement. One of the side points of BBR is that in many environments
>>> it is cheaper to burn serving CPU to pace into short-queue networks than
>>> it is to "right-size" the network queues.
>>>
>>> The fundamental problem with the old way is that in some contexts the
>>> buffer memory has to beat Moore's law, because to maintain a constant
>>> drain time the memory size and BW both have to scale with the link
>>> (laser) BW.
>>>
>>> See the slides I gave at the Stanford Buffer Sizing workshop in December
>>> 2019: Buffer Sizing: Position Paper
>>>
>>> Thanks for the pointer. I don't quite get the point that the buffer must
>>> have a certain size to keep the ACK clock stable: in the case of a
>>> non-application-limited sender, a very small buffer suffices to keep the
>>> ACK clock running steadily. The large buffers were mainly required for
>>> loss-based CCs to let the standing queue build up that keeps the
>>> bottleneck busy during the CWnd reduction after packet loss, thereby
>>> keeping the (bottleneck link) utilization high.
>>>
>>> Regards,
>>>
>>> Roland
>>>
>>> Note that we are talking about the DC and Internet core. At the edge, BW
>>> is low enough that memory is relatively cheap. In some sense BB came
>>> about because memory is too cheap in these environments.
>>>
>>> Thanks,
>>> --MM--
>>> The best way to predict the future is to create it. - Alan Kay
>>>
>>> We must not tolerate intolerance;
>>> however our response must be carefully measured:
>>> too strong would be hypocritical and risks spiraling out of control;
>>> too weak risks being mistaken for tacit approval.
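A quick back-of-the-envelope illustration of Matt's point that holding the
drain time constant forces the buffer memory to scale linearly with the link
rate (the "buffer memory has to beat Moore's law" problem). The 5 ms drain
time and the link rates are my own illustrative assumptions, not numbers from
the thread.

DRAIN_TIME_S = 0.005   # assumed target drain time, a "substantial fraction" of a ~25 ms RTT

for gbps in (1, 10, 100, 400):
    bytes_per_second = gbps * 1e9 / 8
    buffer_bytes = bytes_per_second * DRAIN_TIME_S
    print(f"{gbps:4d} Gbit/s link -> {buffer_bytes / 1e6:7.1f} MB of buffer for a 5 ms drain")

So a 400 Gbit/s port would need on the order of 250 MB of very fast buffer
memory just to preserve a 5 ms drain time; that scaling pressure is what
pacing into short queues tries to sidestep.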
>>> On Fri, Jul 2, 2021 at 9:59 AM Stephen Hemminger
>>> <stephen@networkplumber.org> wrote:
>>>
>>>> On Fri, 2 Jul 2021 09:42:24 -0700 Dave Taht wrote:
>>>>
>>>> > "Debunking Bechtolsheim credibly would get a lot of attention to the
>>>> > bufferbloat cause, I suspect." - dpreed
>>>> >
>>>> > "Why Big Data Needs Big Buffer Switches" -
>>>> > http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
>>>>
>>>> Also, a lot depends on the TCP congestion control algorithm being used.
>>>> They are using NewReno, which in real life only researchers use.
>>>>
>>>> Even TCP Cubic has gone through several revisions. In my experience, the
>>>> NS-2 models don't correlate well with real-world behavior.
>>>>
>>>> In real-world tests, TCP Cubic will consume any buffer it sees at a
>>>> congested link. Maybe that is what they mean by the capture effect.
>>>>
>>>> There is also a weird oscillation effect with multiple streams, where
>>>> one flow will take the buffer, then see a packet loss and back off, and
>>>> the other flow will take over the buffer until it sees loss.
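Finally, a crude slotted-time toy (not a faithful M/D/1 or TCP model) of
Roland's point further up the thread: purely rate-based, paced senders whose
combined average rate equals the bottleneck rate can let the queue wander
without bound, while adding an inflight limit (his option b) keeps it bounded.
Every parameter here (the 120% pacing factor, the 5/6 duty cycle, the cap of
20 packets) is an assumption chosen for illustration, and "inflight" is
crudely equated with a sender's packets still sitting in the bottleneck queue.

import random

TICKS = 1_000_000      # the bottleneck serves exactly one packet per tick
BURST_PROB = 5 / 6     # fraction of time each sender has data to pace out
PACE_PROB = 0.6        # while bursting, send at 60% of the bottleneck rate
                       # (= 120% of the sender's 50% long-run share)
INFLIGHT_CAP = 20      # per-sender cap (packets), used only in the second run

def run(inflight_cap=None, seed=1):
    random.seed(seed)
    queue = [0, 0]     # packets currently in the bottleneck queue, per sender
    max_total = 0
    for _ in range(TICKS):
        for i in (0, 1):
            bursting = random.random() < BURST_PROB
            wants_to_send = bursting and random.random() < PACE_PROB
            capped = inflight_cap is not None and queue[i] >= inflight_cap
            if wants_to_send and not capped:
                queue[i] += 1          # packet arrives at the bottleneck
        if queue[0] or queue[1]:       # serve one packet this tick
            i = 0 if queue[0] >= queue[1] else 1
            queue[i] -= 1
        max_total = max(max_total, queue[0] + queue[1])
    return max_total, queue[0] + queue[1]

print("pure rate pacing, no inflight limit (max, final):", run())
print("same pacing plus a 20-packet inflight cap        :", run(INFLIGHT_CAP))

With no inflight limit the total offered load averages exactly the service
rate, so the queue has zero average drift and keeps wandering to larger and
larger values over long runs; with the cap, the total queue can never exceed
twice the cap, regardless of how the pacing bursts line up.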