Hi Neal,

On 08.07.21 at 15:29 Neal Cardwell wrote:
> On Thu, Jul 8, 2021 at 7:25 AM Bless, Roland (TM) wrote:
>> It seems that in BBRv2 there are many more mechanisms present
>> that try to control the amount of inflight data more tightly,
>> and the new "cap" is at 1.25 BDP.
>
> To clarify, the BBRv2 cwnd cap is not 1.25*BDP. If there is no packet
> loss or ECN, the BBRv2 cwnd cap is the same as in BBRv1. But if there
> has been packet loss, then conceptually the cwnd cap is the maximum
> amount of data delivered in a single round trip since the last packet
> loss (with a floor to ensure that the cwnd does not decrease by more
> than 30% per round trip with packet loss, similar to CUBIC's 30%
> reduction in a round trip with packet loss). (And upon RTO the BBR
> (v1 or v2) cwnd is reset to 1 and slow-starts upward from there.)

Thanks for the clarification. I'm patiently waiting to see the BBRv2
mechanisms coherently written up in the new BBR Internet-Draft
version ;-) Getting this together from the "diffs" on the IETF slides
or from the source code is somewhat tedious, so I'll be very grateful
to have that single write-up.

> There is an overview of the BBRv2 response to packet loss here:
> https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00#page=18

My assumption came from slide 25 of that slide set: the probing is
terminated if inflight > 1.25 * estimated_bdp (or a "hard ceiling" is
seen). So without experiencing more than 2% packet loss, this may end
up beyond 1.25 * estimated_bdp, but would it often end at
2 * estimated_bdp?

Best regards,
 Roland

>> This is too large for short-queue routers in the Internet core,
>> but it helps a lot with cross traffic on large-queue edge routers.
>
> Best regards,
>  Roland
>
> [1] https://ieeexplore.ieee.org/document/8117540
>
>> On Wed, Jul 7, 2021 at 3:19 PM Bless, Roland (TM) wrote:
>>
>> Hi Matt,
>>
>> [sorry for the late reply, overlooked this one]
>>
>> please, see comments inline.
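For concreteness, Neal's description of the post-loss cwnd cap can be sketched as follows. This is a toy illustration of the rule as stated above (maximum delivered in a single round since the last loss, floored so cwnd never drops more than 30% per round); the names are made up and this is not the actual Linux TCP BBRv2 code.

```python
def cwnd_cap_after_loss(prior_cwnd, max_delivered_in_round):
    """Toy sketch of the BBRv2 post-loss cwnd cap described above.

    max_delivered_in_round: the most data delivered in any single
    round trip since the last packet loss.  The floor ensures cwnd
    is never cut by more than 30% in one round (cf. CUBIC's 0.7
    factor).  Illustrative only, not the real implementation.
    """
    floor = 0.7 * prior_cwnd
    return max(max_delivered_in_round, floor)
```

For example, with prior_cwnd = 100 packets and only 50 packets delivered in the best round since the loss, the floor wins and the cap is 70; with 90 packets delivered, the cap is 90.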
>> On 02.07.21 at 21:46 Matt Mathis via Bloat wrote:
>>> The argument is absolutely correct for Reno, CUBIC and all
>>> other self-clocked protocols.  One of the core assumptions
>>> in Jacobson88 was that the clock for the entire system
>>> comes from packets draining through the bottleneck queue.
>>> In this world, the clock is intrinsically brittle if the
>>> buffers are too small.  The drain time needs to be a
>>> substantial fraction of the RTT.
>>
>> I'd like to separate the functions here a bit:
>>
>> 1) "automatic pacing" by ACK clocking
>> 2) congestion-window-based operation
>>
>> I agree that the automatic pacing generated by the ACK clock
>> (function 1) is increasingly distorted these days and may
>> consequently cause micro-bursts.  This can be mitigated by
>> paced sending, which I consider very useful.
>> However, I consider abandoning the (congestion) window-based
>> approaches with ACK feedback (function 2) harmful:
>> a congestion window has an automatic self-stabilizing property,
>> since the ACK feedback also reflects the queuing delay, and the
>> congestion window limits the amount of inflight data.
>> In contrast, rate-based senders risk instability: two senders
>> in an M/D/1 setting, each sending at 50% of the bottleneck rate
>> on average, both using paced sending at 120% of the average
>> rate, suffice to cause instability (the queue grows without
>> bound).
>>
>> IMHO, two approaches seem to be useful:
>> a) congestion-window-based operation with paced sending
>> b) rate-based/paced sending with a limit on the amount of
>>    inflight data
>>
>>> However, we have reached the point where we need to discard
>>> that requirement.  One of the side points of BBR is that in
>>> many environments it is cheaper to burn serving CPU to pace
>>> into short-queue networks than it is to "right-size" the
>>> network queues.
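The instability claim above can be illustrated with a small Monte Carlo sketch (my own toy on/off caricature, not the M/D/1 setting itself): two senders each average 0.5C but pace their bursts at 120% of that average (0.6C), so their combined offered load averages exactly the bottleneck capacity C. The queue then performs an unbiased random walk, and its peak keeps drifting upward instead of settling.

```python
import random

def peak_queue(steps, seed=1):
    """Two on/off senders into a unit-capacity bottleneck.

    Each sender averages 0.5C but paces at 0.6C (120% of its
    average), so it must be active a fraction 0.5/0.6 of the time.
    Offered load then averages exactly C: zero drift, so the
    backlog never stabilizes.  Returns the peak backlog seen."""
    rng = random.Random(seed)
    capacity, burst_rate = 1.0, 0.6
    p_active = 0.5 / 0.6   # duty cycle preserving the 0.5C average
    queue = peak = 0.0
    for _ in range(steps):
        arrivals = sum(burst_rate for _ in range(2)
                       if rng.random() < p_active)
        queue = max(0.0, queue + arrivals - capacity)
        peak = max(peak, queue)
    return peak
```

With a fixed seed, the peak backlog keeps growing as the horizon lengthens (roughly like the square root of time for an unbiased reflected walk); capping the inflight data, as in option b), would bound it instead.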
>>> The fundamental problem with the old way is that in some
>>> contexts the buffer memory has to beat Moore's law, because
>>> to maintain constant drain time, the memory size and BW both
>>> have to scale with the link (laser) BW.
>>>
>>> See the slides I gave at the Stanford Buffer Sizing workshop
>>> in December 2019: Buffer Sizing: Position Paper
>>
>> Thanks for the pointer.  I don't quite get the point that the
>> buffer must have a certain size to keep the ACK clock stable:
>> in the case of a non-application-limited sender, a very small
>> buffer suffices to let the ACK clock run steadily.  The large
>> buffers were mainly required for loss-based CCs to let the
>> standing queue build up that keeps the bottleneck busy during
>> CWnd reduction after packet loss, thereby keeping the
>> (bottleneck link) utilization high.
>>
>> Regards,
>>  Roland
>>
>>> Note that we are talking about DC and Internet core.  At the
>>> edge, BW is low enough that memory is relatively cheap.  In
>>> some sense BB came about because memory is too cheap in
>>> these environments.
>>>
>>> Thanks,
>>> --MM--
>>> The best way to predict the future is to create it.  - Alan Kay
>>>
>>> We must not tolerate intolerance;
>>>        however our response must be carefully measured:
>>>             too strong would be hypocritical and risks
>>>             spiraling out of control;
>>>             too weak risks being mistaken for tacit approval.
>>>
>>> On Fri, Jul 2, 2021 at 9:59 AM Stephen Hemminger wrote:
>>>
>>> On Fri, 2 Jul 2021 09:42:24 -0700 Dave Taht wrote:
>>>
>>> > "Debunking Bechtolsheim credibly would get a lot of
>>> > attention to the bufferbloat cause, I suspect." - dpreed
>>> >
>>> > "Why Big Data Needs Big Buffer Switches" -
>>> > http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
>>>
>>> Also, a lot depends on the TCP congestion control
>>> algorithm being used.
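Matt's scaling point is easy to quantify with back-of-the-envelope arithmetic: holding the drain time constant as the link speeds up means the buffer memory must grow linearly with bandwidth. A sketch, assuming a 100 ms drain time as the illustrative constant (the specific numbers are mine, not from the thread):

```python
def buffer_bytes(link_bps, drain_time_s):
    """Buffer memory needed so the queue takes drain_time_s to
    drain at line rate: size = bandwidth * drain_time / 8."""
    return link_bps * drain_time_s / 8

# Keeping a 100 ms drain time as the link scales:
# 10 Gb/s needs 125 MB; 100 Gb/s needs 1.25 GB of buffer memory.
print(buffer_bytes(10e9, 0.1))    # 125 MB
print(buffer_bytes(100e9, 0.1))   # 1.25 GB
```

A 10x faster link needs 10x the buffer memory, and the memory must also be 10x faster to keep up with line rate, which is the sense in which it "has to beat Moore's law."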
>>> They are using NewReno, which only researchers use in
>>> real life.
>>>
>>> Even TCP Cubic has gone through several revisions.  In my
>>> experience, the ns-2 models don't correlate well with
>>> real-world behavior.
>>>
>>> In real-world tests, TCP Cubic will consume any buffer
>>> it sees at a congested link.  Maybe that is what they mean
>>> by capture effect.
>>>
>>> There is also a weird oscillation effect with multiple
>>> streams, where one flow will take the buffer, then see a
>>> packet loss and back off, and the other flow will take over
>>> the buffer until it sees loss.
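The see-saw Stephen describes can be reproduced even with a crude AIMD caricature (my own toy model with made-up numbers, ignoring CUBIC's actual window-growth curve): two flows grow additively until their combined windows overflow a shared tail-drop buffer, and the loss hits whichever flow currently occupies most of it, so the flows take turns holding the buffer.

```python
def seesaw(rounds, limit=24):
    """Two AIMD flows sharing one tail-drop bottleneck (toy model).

    Each round both windows grow by one; when the sum exceeds
    `limit`, the larger flow (the one holding the buffer) sees the
    loss and halves.  Returns which flow was cut each time, to show
    the alternation."""
    a, b = 10, 2
    losers = []
    for _ in range(rounds):
        a, b = a + 1, b + 1          # additive increase per RTT
        if a + b > limit:            # shared buffer overflows
            if a >= b:
                a //= 2              # occupant of the buffer backs off
                losers.append("A")
            else:
                b //= 2
                losers.append("B")
    return losers
```

Even from an asymmetric start, the loss events strictly alternate between the two flows: each back-off hands the buffer to the other flow until its next loss, matching the oscillation observed in real tests.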