* [Bloat] Questions for Bufferbloat Wikipedia article @ 2021-04-05 12:46 Rich Brown 2021-04-05 15:13 ` Stephen Hemminger ` (3 more replies) 0 siblings, 4 replies; 26+ messages in thread From: Rich Brown @ 2021-04-05 12:46 UTC (permalink / raw) To: bloat, Richard E. Brown Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) Before I get into it, I want to ask real experts for some guidance... Here goes: 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) c) Or some other definition? 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... Many thanks! Rich ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown @ 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 15:24 ` David Lang 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown ` (2 subsequent siblings) 3 siblings, 1 reply; 26+ messages in thread From: Stephen Hemminger @ 2021-04-05 15:13 UTC (permalink / raw) To: Rich Brown; +Cc: bloat On Mon, 5 Apr 2021 08:46:15 -0400 Rich Brown <richb.hanover@gmail.com> wrote: > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > > c) Or some other definition? > > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > > Many thanks! > > Rich I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:13 ` Stephen Hemminger @ 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 0 siblings, 2 replies; 26+ messages in thread From: David Lang @ 2021-04-05 15:24 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Rich Brown, bloat [-- Attachment #1: Type: text/plain, Size: 3084 bytes --] On Mon, 5 Apr 2021, Stephen Hemminger wrote: > On Mon, 5 Apr 2021 08:46:15 -0400 > Rich Brown <richb.hanover@gmail.com> wrote: > >> Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >> >> Before I get into it, I want to ask real experts for some guidance... Here goes: >> >> 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) >> >> a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) >> >> b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) >> >> c) Or some other definition? >> >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? >> >> 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? >> >> 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... >> >> Many thanks! >> >> Rich > > I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. I think there are two things 1. what bufferbloat is bufferbloat is the result of memory getting cheaper faster than bandwidth increased, combined with throughput benchmarking that drastically penalized end-to-end retries. I think this definition is pretty academic and not something to worry about using. 2. why it's a problem the problems show up when the buffer represents too much time worth of data to transmit (the time between when the last byte in the buffer gets inserted into the buffer and when it gets transmitted) So in a high bandwidth environment (like a datacenter) you can use much larger buffers than when you are on a low bandwidth line David Lang ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:24 ` David Lang @ 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 1 sibling, 0 replies; 26+ messages in thread From: Dave Collier-Brown @ 2021-04-05 15:57 UTC (permalink / raw) To: bloat To speak to the original question, I'd say bufferbloat * is undesirable latency * was discovered when adding buffers counter-intuitively /slowed/ packet flow. That's so as to catch the reader's attention and immediately cast light on the (memorable but mysterious) name. --dave On 2021-04-05 11:24 a.m., David Lang wrote: > On Mon, 5 Apr 2021, Stephen Hemminger wrote: >> On Mon, 5 Apr 2021 08:46:15 -0400 >> Rich Brown <richb.hanover@gmail.com> wrote: >>> Dave Täht has put me up to revising the current Bufferbloat article >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>> >>> Before I get into it, I want to ask real experts for some >>> guidance... Here goes: >>> >>> 1) What is *our* definition of Bufferbloat? (We invented the term, >>> so I think we get to define it.) >>> a) Are we content with the definition from the bufferbloat.net site, >>> "Bufferbloat is the undesirable latency that comes from a router or >>> other network equipment buffering too much data." (This suggests >>> bufferbloat is latency, and could be measured in seconds/msec.) >>> >>> b) Or should we use something like Jim Gettys' definition from the >>> Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), >>> "Bufferbloat is the existence of excessively large (bloated) buffers >>> in systems, particularly network communication systems." (This >>> suggests bufferbloat is an unfortunate state of nature, measured in >>> units of "unhappiness" :-) >>> c) Or some other definition? >>> >>> 2) All network equipment can be bloated. I have seen (but not really >>> followed) controversy regarding the amount of buffering needed in >>> the Data Center. Is it worth having the Wikipedia article >>> distinguish between Data Center equipment and CPE/home/last mile >>> equipment? Similarly, is the "bloat condition" and its mitigation >>> qualitatively different between those applications? Finally, do any >>> of us know how frequently data centers/backbone ISPs experience >>> buffer-induced latencies? What's the magnitude of the impact? >>> >>> 3) The Wikipedia article mentions guidance that network gear should >>> accommodate buffering 250 msec of traffic(!) Is this a real "rule of >>> thumb" or just an often-repeated but unscientific suggestion? Can >>> someone give pointers to best practices? >>> >>> 4) Meta question: Can anyone offer any advice on making a wholesale >>> change to a Wikipedia article? Before I offer a fork-lift >>> replacement I would a) solicit advice on the new text from this >>> list, and b) try to make contact with some of the reviewers and >>> editors who've been maintaining the page to establish some bona >>> fides and rapport... >>> >>> Many thanks! >>> >>> Rich >> >> I like to think of Bufferbloat as a combination of large buffers and >> how algorithms react to those buffers. > > I think there are two things > > 1. what bufferbloat is > > bufferbloat is the result of memory getting cheaper faster than > bandwidth increased, combined with throughput benchmarking that > drastically penalized end-to-end retries. > > I think this definition is pretty academic and not something to worry > about using. > > 2. why it's a problem > > the problems show up when the buffer represents too much time worth of > data to transmit (the time between when the last byte in the buffer > gets inserted into the buffer and when it gets transmitted) > > So in a high bandwidth environment (like a datacenter) you can use > much larger buffers than when you are on a low bandwidth line > > David Lang > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- David Collier-Brown, | Always do right. This will gratify System Programmer and Author | some people and astonish the rest dave.collier-brown@indexexchange.com | -- Mark Twain ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown @ 2021-04-05 16:25 ` Kelvin Edmison 1 sibling, 0 replies; 26+ messages in thread From: Kelvin Edmison @ 2021-04-05 16:25 UTC (permalink / raw) To: David Lang; +Cc: Stephen Hemminger, Rich Brown, bloat I've been lurking on the bufferbloat mailing list for a while now, without volunteering in the same fashion as the core contributors. But I do have some thoughts as someone who is not quite at the level of writing kernel drivers; maybe this is helpful when updating the definition. I think we need to define what it is (in terms of user-perceivable experience) before we get to the causes and why it's a problem. In essence, link it to what average people know already, and draw them in to the next level of detail. To that end, I would propose the following for discussion: Bufferbloat is the difference in latency for a connection when it is lightly loaded vs when it is fully loaded. (Here, I am trying to provide terms that are somewhat clear and simple to an average user, that will connect them to things they do already, i.e. fully use their internet connection for an upload or download.) Then, I think it is useful to move into some examples of how it can be perceived (audio call stutter, video call stutter) especially in the presence of multiple competing users with different priorities (gaming vs. uploading documents or presentations). And then we can dig into the causes (e.g. over-provisioned buffers, poor inter-flow management, etc.), means of explicitly measuring it, approaches for mitigating or fixing it, etc. I hope this is useful, Kelvin On Mon, Apr 5, 2021 at 11:24 AM David Lang <david@lang.hm> wrote: > On Mon, 5 Apr 2021, Stephen Hemminger wrote: > > On Mon, 5 Apr 2021 08:46:15 -0400 > > Rich Brown <richb.hanover@gmail.com> wrote: > >> Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >> > >> Before I get into it, I want to ask real experts for some guidance... Here goes: > >> > >> 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > >> > >> a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > >> > >> b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > >> > >> c) Or some other definition? > >> > >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > >> > >> 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > >> > >> 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > >> > >> Many thanks! > >> > >> Rich > > > > I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. > > I think there are two things > > 1. what bufferbloat is > > bufferbloat is the result of memory getting cheaper faster than bandwidth increased, combined with throughput benchmarking that drastically penalized end-to-end retries. > > I think this definition is pretty academic and not something to worry about using. > > 2. why it's a problem > > the problems show up when the buffer represents too much time worth of data to transmit (the time between when the last byte in the buffer gets inserted into the buffer and when it gets transmitted) > > So in a high bandwidth environment (like a datacenter) you can use much larger buffers than when you are on a low bandwidth line > > David Lang > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
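Kelvin's proposed definition has the nice property that it is directly measurable: sample RTTs while the connection is idle, sample again while it is fully loaded, and report the difference. A minimal sketch in Python (the probe host and the use of the system ping utility are illustrative assumptions, not something from this thread):

  import re
  import statistics
  import subprocess

  def sample_rtts(host, count=20):
      """Collect RTT samples in ms via the system ping utility."""
      out = subprocess.run(["ping", "-c", str(count), host],
                           capture_output=True, text=True).stdout
      return [float(m) for m in re.findall(r"time=([\d.]+)", out)]

  idle = sample_rtts("example.net")
  input("Now start a saturating upload/download, then press Enter...")
  loaded = sample_rtts("example.net")

  delta = statistics.median(loaded) - statistics.median(idle)
  print(f"idle median:   {statistics.median(idle):.1f} ms")
  print(f"loaded median: {statistics.median(loaded):.1f} ms")
  print(f"bufferbloat:   {delta:.1f} ms of added latency under load")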
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger @ 2021-04-05 18:00 ` Rich Brown 2021-04-05 18:08 ` David Lang 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-06 18:54 ` Neil Davies 3 siblings, 1 reply; 26+ messages in thread From: Rich Brown @ 2021-04-05 18:00 UTC (permalink / raw) To: bloat, Richard E. Brown Thanks, all, for the responses re: Bufferbloat definition. I can work with that information. Next question... > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? Many thanks! Rich ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown @ 2021-04-05 18:08 ` David Lang 2021-04-05 20:30 ` Erik Auerswald 0 siblings, 1 reply; 26+ messages in thread From: David Lang @ 2021-04-05 18:08 UTC (permalink / raw) To: Rich Brown; +Cc: bloat On Mon, 5 Apr 2021, Rich Brown wrote: > Next question... > >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? the bandwidth available in datacenters is high enough that it's much harder to run into grief there (recognizing that not every piece of datacenter equipment is hooked to 100G circuits) I think it's best to talk about excessive buffers in terms of time rather than bytes, and you can then show the difference between two buffers of the same size, one connected to a 10Mb (or 1Mb) DSL upload vs 100G datacenter circuit. After that one example, the rest of the article can talk about time and it will be globally applicable. David Lang ^ permalink raw reply [flat|nested] 26+ messages in thread
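Putting numbers on David's example (the buffer size and link rates are illustrative):

  def drain_time_ms(buffer_bytes, link_bits_per_s):
      """Time for a full buffer to drain onto the wire, in ms."""
      return buffer_bytes * 8 / link_bits_per_s * 1000

  buf = 256 * 1024  # the same 256 KiB buffer in three places
  for name, rate in [("1 Mb/s DSL upload", 1e6),
                     ("10 Mb/s DSL", 10e6),
                     ("100 Gb/s datacenter link", 100e9)]:
      print(f"{name}: {drain_time_ms(buf, rate):.3f} ms")

The identical buffer holds roughly two seconds of traffic on the 1 Mb/s uplink and about 20 microseconds on the 100 Gb/s link, which is the whole argument for describing buffers in units of time.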
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 18:08 ` David Lang @ 2021-04-05 20:30 ` Erik Auerswald 2021-04-05 20:36 ` Dave Taht 0 siblings, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-05 20:30 UTC (permalink / raw) To: bloat Hi, On Mon, Apr 05, 2021 at 11:08:07AM -0700, David Lang wrote: > On Mon, 5 Apr 2021, Rich Brown wrote: > > >Next question... > > > >>2) All network equipment can be bloated. I have seen (but not > >>really followed) controversy regarding the amount of buffering > >>needed in the Data Center. Is it worth having the Wikipedia article > >>distinguish between Data Center equipment and CPE/home/last mile > >>equipment? Similarly, is the "bloat condition" and its mitigation > >>qualitatively different between those applications? Finally, do > >>any of us know how frequently data centers/backbone ISPs experience > >>buffer-induced latencies? What's the magnitude of the impact? I do not have experience with "web scale" data centers or "backbone" ISPs, but I think I can add related information. From my work experience with (mostly) enterprise and service provider networks I would say that bufferbloat effects are relatively rarely observed there. Many network engineers do not know about bufferbloat and do not believe in its existence after being told about bufferbloat. I have seen a latency consideration for a country-wide network that explicitly excluded queuing delays as irrelevant and cited just propagation and serialization delay as relevant for the end-to-end latency. Demonstrating bufferbloat effects with a test setup with prolonged congestion is usually labeled unrealistic and ignored. Campus networks and ("small") data centers are usually overprovisioned with bandwidth and thus do not exhibit prolonged congestion. Additionally, a lot of enterprise networking gear, specifically "switches," do not have oversized buffers. Campus networks more often show problems with too small buffers for a given application (e.g., cameras streaming data via RTP with large "key frames" sent at line rate), such that "microbursts" result in packet drops and thus observable problems even with low bandwidth utilization over longer time frames (minutes). The idea that buffers could be too large does not seem realistic there. "Routers" for the ISP market (not "home routers", but network devices used inside the ISP's core and aggregation networks and similar) often do have unreasonably ("bloated") buffer capacity, but they are usually operated without persistent congestion. When persistent congestion does happen on a customer connection, and bufferbloat does result in unusably high latency, the customer is often told to send at a lower rate, but "bufferbloat" is usually not recognized as the root cause, and thus not addressed. It seems to me as if "bufferbloat" is most noticable on the consumer end of mass market network connections. I.e., low margin markets with non-technical customers. If CAKE behind the access circuit of an end customer can mitigate bufferbloat, then bufferbloat effects are only visible there and do not show up in other parts of the network. > the bandwidth available in datacenters is high enough that it's much > harder to run into grief there (recognizing that not every piece of > datacenter equipment is hooked to 100G circuits) That is my impression as well. 
> I think it's best to talk about excessive buffers in terms of time > rather than bytes, and you can then show the difference between two > buffers of the same size, one connected to a 10Mb (or 1Mb) DSL upload > vs 100G datacenter circuit. After that one example, the rest of the > article can talk about time and it will be globally applicable. I too think that _time_ is the important unit regarding buffers, even though they are mostly described in units of data (bytes or packets). Thanks, Erik -- To have our best advice ignored is the common fate of all who take on the role of consultant, ever since Cassandra pointed out the dangers of bringing a wooden horse within the walls of Troy. -- C.A.R. Hoare ^ permalink raw reply [flat|nested] 26+ messages in thread
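Erik's microburst case also reduces to arithmetic: a burst arriving at line rate can overflow a small buffer even when average utilization is low. A sketch with made-up but plausible numbers (the key-frame size, port speeds, and buffer size are assumptions):

  def dropped_bytes(burst_bytes, buffer_bytes, in_bps, out_bps):
      """Bytes lost when a line-rate burst hits a tail-drop buffer."""
      arrive_s = burst_bytes * 8 / in_bps          # burst duration
      backlog = (in_bps - out_bps) * arrive_s / 8  # queue growth
      return max(0.0, backlog - buffer_bytes)

  # 2 MB key frame sent at 10 Gb/s into a 1 Gb/s port with 512 KB buffer
  print(f"{dropped_bytes(2e6, 512e3, 10e9, 1e9) / 1e3:.0f} kB dropped")

Here roughly 1.3 MB of a single frame is lost in under two milliseconds, while a per-minute utilization graph shows nothing wrong.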
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 20:30 ` Erik Auerswald @ 2021-04-05 20:36 ` Dave Taht 0 siblings, 0 replies; 26+ messages in thread From: Dave Taht @ 2021-04-05 20:36 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat My own fervent wish is that new switches suffering from microbursts did better 5-tuple fq, in addition to per-port fq. ^ permalink raw reply [flat|nested] 26+ messages in thread
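For readers who have not met the term: "5-tuple fq" means hashing source address, destination address, protocol, and the two port numbers into separate queues, so that flows share a congested port fairly instead of one burst monopolizing it. A toy sketch of the classification step (the queue count and the use of Python's hash() are illustrative; real schedulers such as fq_codel use a deterministic hash over the same tuple):

  from collections import deque

  N_QUEUES = 1024
  queues = [deque() for _ in range(N_QUEUES)]

  def flow_queue(src_ip, dst_ip, proto, sport, dport):
      """Map a packet's 5-tuple to one of N_QUEUES flow queues."""
      return hash((src_ip, dst_ip, proto, sport, dport)) % N_QUEUES

  # Two flows between the same pair of hosts usually land in
  # different queues; per-port fq would have merged them.
  print(flow_queue("10.0.0.1", "10.0.0.2", "tcp", 50000, 443))
  print(flow_queue("10.0.0.1", "10.0.0.2", "tcp", 50001, 443))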
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown @ 2021-04-05 21:49 ` Sebastian Moeller 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 3 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-05 21:49 UTC (permalink / raw) To: Rich Brown; +Cc: bloat Hi Rich, all good questions, and interesting responses so far. > On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) I do not even think these are mutually exclusive; "over-sized but under-managed buffers" cause avoidable variable latency, aka jitter, which is the bane of all interactive use-cases. The lower the jitter the better, and jitter can be measured in units of time, but also acts as "currency" in the unhappiness domain ;). The challenge is that we know that no/too small buffers cause undesirable loss of throughput (but small latency under load), while too large buffers cause undesirable increase in latency under load (but decent throughput), so the challenge is to get buffering right to keep throughput acceptably high, while at the same time keeping latency under load acceptably low... The solution basically is large buffers with adaptive management that works hard to keep latency under load increase and throughput inside an acceptable "corridor". > c) Or some other definition? > > 2) All network equipment can be bloated. +1; depending on condition. Corollary: static buffer sizing is unlikely to be the right answer unless the load is constant... > I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Conceptually the same as everywhere else, just enough to keep throughput up ;) But e.g. for traditional TCPs the amount of expected buffer needs increases with RTT of a flow, so intra-datacenter flows with low RTTs will only require relatively small buffers to cope. > Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? That depends on our audience, but realistically over-sized but under-managed buffers can and do occur everywhere, so maybe better include all? > Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? IMHO, not really, we have two places to twiddle, the buffer (and how it is managed) and the two endpoints transferring data. Our go-to solution deals with buffer management, but protocols can also help, e.g. by using pacing (spreading out packets based on the estimated throughput) instead of sending in bursts. Or using different protocols that are more adaptive to the perceived buffering along a path, like BBR (which, as you surely know, tries to actively measure a path's capacity by regularly sending closely spaced probe packets and measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded). Methods at both places are not guaranteed to work hand in hand though (naive BBR fails to recognize an AQM on the path that keeps latency under load well-bounded, which was noted and fixed in later BBR incarnations); making the whole problem space "a mess". > Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? I have to pass, -ENODATA ;) > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? I am sure that any fixed number will be wrong ;) there might be numbers worse than others though. > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Maybe don't? Instead of doing this in one go, evolve the existing article piece-wise, avoiding the wrong impression of a hostile take-over? And allowing for a nicer history of targeted commits? > Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... I guess, if you get the buy-in from the current maintainers a fork-lift upgrade might work... Best Regards Sebastian > > Many thanks! > > Rich > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
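Pacing, as Sebastian describes it, is just spacing transmissions at the estimated path rate instead of emitting a whole window back to back. A sketch (the rate, packet size, and no-op send callback are placeholders):

  import time

  def paced_send(packets, est_rate_bps, send):
      """Emit packets with gaps sized to the estimated path rate."""
      for pkt in packets:
          send(pkt)
          time.sleep(len(pkt) * 8 / est_rate_bps)  # inter-packet gap

  # 1500-byte packets at an estimated 10 Mb/s: a 1.2 ms gap each,
  # so a 10-packet burst becomes a 12 ms trickle the queue can absorb.
  paced_send([b"x" * 1500] * 10, 10e6, lambda pkt: None)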
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller @ 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 1 sibling, 0 replies; 26+ messages in thread From: Dave Taht @ 2021-04-05 21:55 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Rich Brown, bloat The biggest internet spike I ever saw was this one. https://web.archive.org/web/20171113211640/http://blog.cerowrt.org/post/bufferbloat_on_the_backbone/ (the image has expired elsewhere. Someone tell Orwell!) Ironically it occurred during a videoconference with the shuttleworth folk. In improving the bufferbloat definition, I think some pretty graphs with circles and arrows on a paragraph of each one certifying the evidence for it, would be a case of american blind justice... https://www.youtube.com/watch?v=W5_8U4j51lI ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-05 21:55 ` Dave Taht @ 2021-04-06 0:47 ` Erik Auerswald 2021-04-06 6:31 ` Sebastian Moeller 1 sibling, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 0:47 UTC (permalink / raw) To: bloat Hi, On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > > all good questions, and interesting responses so far. I'll add some details below, I mostly concur with your responses. > > On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > > > Dave Täht has put me up to revising the current Bufferbloat article > > on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > [...] > [...] while too large buffers cause undesirable increase in latency > under load (but decent throughput), [...] With too large buffers, even throughput degrades when TCP considers a delayed segment lost (or DNS gives up because the answers arrive too late). I do think there is _too_ large for buffers, period. > The solution basically is large buffers with adaptive management that I would prefer the word "sufficient" instead of "large." > works hard to keep latency under load increase and throughput inside > an acceptable "corridor". I concur that there is quite some usable range of buffer capacity when considering the latency/throughput trade-off, and AQM seems like a good solution to managing that. My preference is to sacrifice throughput for better latency, but then I have been bitten by too much latency quite often, but never by too little throughput caused by small buffers. YMMV. > [...] > But e.g. for traditional TCPs the amount of expected buffer needs > increases with RTT of a flow Does it? Does the propagation delay provide automatic "buffering" in the network? Does the receiver need to advertise sufficient buffer capacity (receive window) to allow the sender to fill the pipe? Does the sender need to provide sufficient buffer capacity to retransmit lost segments? Where are buffers actually needed? I am not convinced that large buffers in the network are needed for high throughput of high RTT TCP flows. See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for some information and links to a few papers. > [...] Thanks, Erik -- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- Edsger W. Dijkstra ^ permalink raw reply [flat|nested] 26+ messages in thread
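Two standard answers from the buffer-sizing literature put numbers on these questions: the classic rule of thumb of one bandwidth-delay product, and the "Sizing Router Buffers" result that a link carrying N desynchronized flows needs only about BDP/sqrt(N). A back-of-the-envelope with illustrative link parameters:

  import math

  def bdp_bytes(rate_bps, rtt_s):
      """Bandwidth-delay product: the classic one-flow rule of thumb."""
      return rate_bps * rtt_s / 8

  def small_buffer_bytes(rate_bps, rtt_s, n_flows):
      """BDP / sqrt(N) sizing for many desynchronized flows."""
      return bdp_bytes(rate_bps, rtt_s) / math.sqrt(n_flows)

  rate, rtt = 10e9, 0.1  # 10 Gb/s link, 100 ms RTT
  print(f"rule of thumb: {bdp_bytes(rate, rtt) / 1e6:.0f} MB")
  print(f"10,000 flows:  {small_buffer_bytes(rate, rtt, 10_000) / 1e6:.2f} MB")

The two orders of magnitude between 125 MB and 1.25 MB is precisely the controversy over buffer requirements.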
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 0:47 ` Erik Auerswald @ 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:01 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-06 6:31 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat Hi Erik, thanks for your thoughts. > On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > Hi, > > On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >> >> all good questions, and interesting responses so far. > > I'll add some details below, I mostly concur with your responses. > >>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>> >>> Dave Täht has put me up to revising the current Bufferbloat article >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>> [...] >> [...] while too large buffers cause undesirable increase in latency >> under load (but decent throughput), [...] > > With too large buffers, even throughput degrades when TCP considers > a delayed segment lost (or DNS gives up because the answers arrive > too late). I do think there is _too_ large for buffers, period. Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) > >> The solution basically is large buffers with adaptive management that > > I would prefer the word "sufficient" instead of "large." If properly managed there is no upper end for the size, it might not be used though, no? > >> works hard to keep latency under load increase and throughput inside >> an acceptable "corridor". > > I concur that there is quite some usable range of buffer capacity when > considering the latency/throughput trade-off, and AQM seems like a good > solution to managing that. I fear it is the only network-side mitigation technique? > > My preference is to sacrifice throughput for better latency, but then > I have been bitten by too much latency quite often, but never by too > little throughput caused by small buffers. YMMV. Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too-large buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). > >> [...] >> But e.g. for traditional TCPs the amount of expected buffer needs >> increases with RTT of a flow > > Does it? Does the propagation delay provide automatic "buffering" in the > network? Does the receiver need to advertise sufficient buffer capacity > (receive window) to allow the sender to fill the pipe? Does the sender > need to provide sufficient buffer capacity to retransmit lost segments? > Where are buffers actually needed? At all those places ;) in the extreme a single-packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting).
And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues, that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. > > I am not convinced that large buffers in the network are needed for high > throughput of high RTT TCP flows. > > See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for > some information and links to a few papers. Thanks, I think the bandwidth delay product is still the worst case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a back bone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) Best Regards Sebastian > >> [...] > > Thanks, > Erik > -- > The computing scientist’s main challenge is not to get confused by > the complexities of his own making. > -- Edsger W. Dijkstra > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
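The reasoning behind the BDP-as-worst-case claim: a Reno-style sender halves its window on loss, and the bottleneck queue must hold enough to keep the wire busy while the window climbs back. A coarse model (linear sawtooth, ignoring slow start, timeouts, and the RTT inflation caused by the queue itself):

  def reno_utilization(buffer_bytes, bdp_bytes, steps=1000):
      """One Reno-like flow at a tail-drop bottleneck: cwnd sawtooths
      between (BDP+B)/2 and BDP+B; the link is only fully used while
      cwnd >= BDP."""
      w_max = bdp_bytes + buffer_bytes
      w_min = w_max / 2
      total = 0.0
      for i in range(steps):  # sweep one sawtooth cycle
          w = w_min + (w_max - w_min) * i / steps
          total += min(w, bdp_bytes) / bdp_bytes
      return total / steps

  bdp = 1.25e6  # e.g. 100 Mb/s * 100 ms
  for frac in (0.0, 0.25, 0.5, 1.0):
      print(f"buffer = {frac:.2f} x BDP -> "
            f"{reno_utilization(frac * bdp, bdp):.0%} utilization")

With zero buffer the model gives about 75% utilization; at one full BDP it reaches 100%, and anything beyond that only adds delay.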
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 6:31 ` Sebastian Moeller @ 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 20:01 ` Bless, Roland (TM) 1 sibling, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 18:50 UTC (permalink / raw) To: bloat Hi, On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > > On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>> > >>> Dave Täht has put me up to revising the current Bufferbloat article > >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>> [...] > >> [...] while too large buffers cause undesirable increase in latency > >> under load (but decent throughput), [...] > > > > With too large buffers, even throughput degrades when TCP considers > > a delayed segment lost (or DNS gives up because the answers arrive > > too late). I do think there is _too_ large for buffers, period. > > Fair enough, timeouts could be changed though if required ;) but I fully > concur that laergeish buffers require management to become useful ;) Yes, large unmanaged buffers are at the core of the bufferbloat problem. One can make buffers small again, or manage them appropriately. The latter promises better results, the former is much simpler. Thanks, Erik -- Am I secure? I don't know. Does that mean I should just disable all security functionality and have an open root shell bound to a well known port? No. Obviously. -- Matthew Garret ^ permalink raw reply [flat|nested] 26+ messages in thread
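"Manage them appropriately" in practice means something like CoDel: judge the queue by how long packets sit in it rather than by how many bytes it holds, and start dropping when the sojourn time stays above a small target. A heavily simplified sketch of that idea (the 5 ms / 100 ms constants are CoDel's published defaults; the real algorithm also ramps its drop rate with the square root of the drop count, which is omitted here):

  import time
  from collections import deque

  TARGET = 0.005    # 5 ms of standing queue is tolerated
  INTERVAL = 0.100  # sojourn must exceed TARGET this long first

  class TimeManagedQueue:
      def __init__(self):
          self.q = deque()          # entries: (enqueue_time, packet)
          self.above_since = None   # when sojourn first exceeded TARGET

      def enqueue(self, pkt):
          self.q.append((time.monotonic(), pkt))

      def dequeue(self):
          while self.q:
              t_in, pkt = self.q.popleft()
              now = time.monotonic()
              if now - t_in <= TARGET:
                  self.above_since = None  # queue is healthy again
                  return pkt
              if self.above_since is None:
                  self.above_since = now
              if now - self.above_since < INTERVAL:
                  return pkt               # tolerate short bursts
              # persistent standing queue: drop pkt, try the next
          return None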
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 18:50 ` Erik Auerswald @ 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 21:59 ` Erik Auerswald 0 siblings, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-06 20:02 UTC (permalink / raw) To: Erik Auerswald, bloat Hi, On 06.04.21 at 20:50 Erik Auerswald wrote: > Hi, > > On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>> >>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>> [...] >>>> [...] while too large buffers cause undesirable increase in latency >>>> under load (but decent throughput), [...] >>> >>> With too large buffers, even throughput degrades when TCP considers >>> a delayed segment lost (or DNS gives up because the answers arrive >>> too late). I do think there is _too_ large for buffers, period. >> >> Fair enough, timeouts could be changed though if required ;) but I fully >> concur that laergeish buffers require management to become useful ;) > > Yes, large unmanaged buffers are at the core of the bufferbloat problem. I disagree here: it is basically the combination of loss-based congestion control with unmanaged tail-drop buffers. There are at least two solutions to the bufferbloat problem 1) better congestion control algorithms 2) active queue management (+fq maybe) You can achieve high throughput and low delay with a corresponding congestion control (e.g., see this study of how to achieve a common limit on queuing delay for multiple flows: https://ieeexplore.ieee.org/document/8109356) even in large buffers. > One can make buffers small again, or manage them appropriately. > The latter promises better results, the former is much simpler. Small buffers definitely limit the queuing delay as well as jitter. However, how much performance is potentially lost due to the small buffer depends a lot on the arrival distribution. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
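A congestion control of the kind Roland describes keys its window off measured queuing delay instead of loss. A toy update rule in that spirit (this is emphatically not LoLa itself, just the flavor; the target and gain values are invented):

  def update_cwnd(cwnd, rtt, base_rtt, target_qdelay=0.005, gain=0.1):
      """Grow while measured queuing delay (rtt - base_rtt) stays
      below the target; back off multiplicatively once it doesn't."""
      if rtt - base_rtt < target_qdelay:
          return cwnd + 1.0                 # room left: additive increase
      return max(2.0, cwnd * (1.0 - gain))  # shrink to drain the queue

  cwnd = 10.0
  for rtt in (0.050, 0.052, 0.054, 0.058, 0.060):  # base RTT is 50 ms
      cwnd = update_cwnd(cwnd, rtt, 0.050)
      print(f"rtt = {rtt * 1000:.0f} ms -> cwnd = {cwnd:.1f}")

No packet ever has to be lost for the sender to find the bound, which is why such flows can keep even a large unmanaged buffer empty, and also why they lose out when competing against loss-based flows in the same queue.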
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 20:02 ` Bless, Roland (TM) @ 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger 2021-04-07 11:06 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 21:59 UTC (permalink / raw) To: bloat Hi, On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > On 06.04.21 at 20:50 Erik Auerswald wrote: > >On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > >>>On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > >>>On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>>>>On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>>>> > >>>>>Dave Täht has put me up to revising the current Bufferbloat article > >>>>>on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>>>>[...] > >Yes, large unmanaged buffers are at the core of the bufferbloat problem. > > I disagree here: it is basically the combination > of loss-based congestion control with unmanaged > tail-drop buffers. That worked for decades, then stopped working as well as before. What changed? Yes, there are complex interactions with how packet switched networks are used. Otherwise we would probably not find ourselves in the current situation. To me, the potential of having to wait minutes (yes, minutes!) for the result of a key stroke over an SSH session is not worth the potential throughput performance gain of buffers that cannot be called small. > There are at least two solutions > to the bufferbloat problem > 1) better congestion control algorithms > 2) active queue management (+fq maybe) Both approaches aim to not use all of the available buffer space, if there are unreasonably large buffers, i.e., they aim to not build a large standing queue. > [...] > Small buffers definitely limit the queuing delay as well as > jitter. However, how much performance is potentially lost due to > the small buffer depends a lot on the arrival distribution. Could the better congestion control algorithms avoid the potential performance loss by not requiring large buffers for high throughput? Might small buffers incentivise to not send huge bursts of data and hope for the best? FQ with AQM aims to allow the absorption of large traffic bursts (i.e., use of large buffers) without affecting _other_ flows too much. I would consider the combination of FQ+AQM, better congestion control algorithms, and large buffers as an optimization, but using just large buffers without any of the other two approaches as a mistake currently called bufferbloat. As such I see large unmanaged buffers at the core of the bufferbloat problem. FQ+AQM for every large buffer may solve the bufferbloat problem by attacking the "unmanaged" part of the problem. Small buffers may solve it by attacking the "large" part of the problem. Small buffers may bring their own share of problems, but IMHO those are much less than those of bufferbloat. I do not see TCP congestion control improvements, even combining sender-side improvements with receiver-side methods as in rLEDBAT[0], as a solution to bufferbloat, but rather as a mitigation. [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ Anyway, I think it is obvious that I am willing to sacrifice more throughput for better latency than others. Thanks, Erik -- Simplicity is prerequisite for reliability. -- Edsger W. Dijkstra ^ permalink raw reply [flat|nested] 26+ messages in thread
[0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ Anyway, I think it is obvious that I am willing to sacrifice more throughput for better latency than others. Thanks, Erik ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:59 ` Erik Auerswald @ 2021-04-06 23:32 ` Stephen Hemminger 2021-04-06 23:54 ` David Lang 2021-04-07 11:06 ` Bless, Roland (TM) 1 sibling, 1 reply; 26+ messages in thread From: Stephen Hemminger @ 2021-04-06 23:32 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat On Tue, 6 Apr 2021 23:59:53 +0200 Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > Hi, > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > > On 06.04.21 at 20:50 Erik Auerswald wrote: > > >On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > > >>>On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > >>>On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > > >>>>>On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > >>>>> > > >>>>>Dave Täht has put me up to revising the current Bufferbloat article > > >>>>>on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > >>>>>[...] > > >Yes, large unmanaged buffers are at the core of the bufferbloat problem. > > > > I disagree here: it is basically the combination > > of loss-based congestion control with unmanaged > > tail-drop buffers. > > That worked for decades, then stopped working as well as before. > What changed? > > Yes, there are complex interactions with how packet switched networks > are used. Otherwise we would probably not find ourselves in the current > situation. > > To me, the potential of having to wait minutes (yes, minutes!) for > the result of a key stroke over an SSH session is not worth the potential > throughput performance gain of buffers that cannot be called small. > > > There are at least two solutions > > to the bufferbloat problem > > 1) better congestion control algorithms > > 2) active queue management (+fq maybe) > > Both approaches aim to not use all of the available buffer space, if > there are unreasonably large buffers, i.e., they aim to not build a > large standing queue. > > > [...] > > Small buffers definitely limit the queuing delay as well as > > jitter. However, how much performance is potentially lost due to > > the small buffer depends a lot on the arrival distribution. > > Could the better congestion control algorithms avoid the potential > performance loss by not requiring large buffers for high throughput? > Might small buffers incentivise to not send huge bursts of data and hope > for the best? > > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > use of large buffers) without affecting _other_ flows too much. > > I would consider the combination of FQ+AQM, better congestion control > algorithms, and large buffers as an optimization, but using just large > buffers without any of the other two approaches as a mistake currently > called bufferbloat. As such I see large unmanaged buffers at the core > of the bufferbloat problem. > > FQ+AQM for every large buffer may solve the bufferbloat problem by > attacking the "unmanaged" part of the problem. Small buffers may solve > it by attacking the "large" part of the problem. Small buffers may > bring their own share of problems, but IMHO those are much less than > those of bufferbloat. > > I do not see TCP congestion control improvements, even combining > sender-side improvements with receiver-side methods as in rLEDBAT[0], > as a solution to bufferbloat, but rather as a mitigation. 
> > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ > > Anyway, I think it is obvious that I am willing to sacrifice more > throughput for better latency than others. > For Wikipedia it is important to make clear: * the symptoms = large latency * the cause = large buffers and aggressive protocols * the solutions = AQM, smaller buffers, pacing, better congestion control, etc. People can argue over best combination of solutions but the symptoms and causes should be defined, and non-contentious. It is too easy to go off in the weeds and have the solution of the day. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 23:32 ` Stephen Hemminger @ 2021-04-06 23:54 ` David Lang 0 siblings, 0 replies; 26+ messages in thread From: David Lang @ 2021-04-06 23:54 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Erik Auerswald, bloat On Tue, 6 Apr 2021, Stephen Hemminger wrote: > For Wikipedia it is important to make clear: > * the symptoms = large latency more precisely, large latency under load David Lang > * the cause = large buffers and aggressive protocols > * the solutions = AQM, smaller buffers, pacing, better congestion control, etc. > > People can argue over best combination of solutions but the symptoms and > causes should be defined, and non-contentious. > > It is too easy to go off in the weeds and have the solution of the day. > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger @ 2021-04-07 11:06 ` Bless, Roland (TM) 2021-04-27 1:41 ` Dave Taht 1 sibling, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-07 11:06 UTC (permalink / raw) To: Erik Auerswald, bloat Hi Erik, see inline. On 06.04.21 at 23:59 Erik Auerswald wrote: > Hi, > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: >> On 06.04.21 at 20:50 Erik Auerswald wrote: >>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>>> >>>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>>> [...] >>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. >> I disagree here: it is basically the combination >> of loss-based congestion control with unmanaged >> tail-drop buffers. > That worked for decades, then stopped working as well as before. > What changed? Larger buffers in many places and several orders of magnitude higher link speeds as well as higher awareness for latency as an important QoS parameter. > Yes, there are complex interactions with how packet switched networks > are used. Otherwise we would probably not find ourselves in the current > situation. > > To me, the potential of having to wait minutes (yes, minutes!) for > the result of a key stroke over an SSH session is not worth the potential > throughput performance gain of buffers that cannot be called small. > >> There are at least two solutions >> to the bufferbloat problem >> 1) better congestion control algorithms >> 2) active queue management (+fq maybe) > Both approaches aim to not use all of the available buffer space, if > there are unreasonably large buffers, i.e., they aim to not build a > large standing queue. > >> [...] >> Small buffers definitely limit the queuing delay as well as >> jitter. However, how much performance is potentially lost due to >> the small buffer depends a lot on the arrival distribution. > Could the better congestion control algorithms avoid the potential > performance loss by not requiring large buffers for high throughput? Yes, at least our TCP LoLa approach achieves high throughput without loss and a configurable limited queuing delay. So in principle this is possible. > Might small buffers incentivise to not send huge bursts of data and hope > for the best? There are different causes of bursts. You might get a huge burst from many flows that send only a single packet each, or you might get a huge burst from a few connections that transmit lots of back-to-back packets. Therefore, it depends on the location of the bottleneck and on the traffic arrival distribution. > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > use of large buffers) without affecting _other_ flows too much. > > I would consider the combination of FQ+AQM, better congestion control > algorithms, and large buffers as an optimization, but using just large > buffers without any of the other two approaches as a mistake currently > called bufferbloat. As such I see large unmanaged buffers at the core > of the bufferbloat problem. 
My counterexample is that large unmanaged buffers would not necessarily lead to the bufferbloat problem if we had other congestion controls that avoid creating large standing queues. However, in practice, I also see only AQMs and better CCs in combination, because we have to live with legacy CCs for some time. > FQ+AQM for every large buffer may solve the bufferbloat problem by > attacking the "unmanaged" part of the problem. Small buffers may solve > it by attacking the "large" part of the problem. Small buffers may > bring their own share of problems, but IMHO those are much less than > those of bufferbloat. > > I do not see TCP congestion control improvements, even combining > sender-side improvements with receiver-side methods as in rLEDBAT[0], > as a solution to bufferbloat, but rather as a mitigation. > > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ As already said: the TCP LoLa concept shows that it is possible to solve the bufferbloat problem by a different congestion control approach. However, the coexistence of LoLa with loss-based CCs will always be a problem unless you separate both CC types into separate queues. Currently, LoLa is rather an academic study showing what is possible in theory, but it is far from being usable in the wild Internet, as it would require much more work to cope with all the peculiarities out there. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-07 11:06 ` Bless, Roland (TM) @ 2021-04-27 1:41 ` Dave Taht 2021-04-27 7:25 ` Bless, Roland (TM) 0 siblings, 1 reply; 26+ messages in thread From: Dave Taht @ 2021-04-27 1:41 UTC (permalink / raw) To: Bless, Roland (TM); +Cc: Erik Auerswald, bloat roland do you have running code for lola on linux? I'm running some starlink tests... On Wed, Apr 7, 2021 at 4:06 AM Bless, Roland (TM) <roland.bless@kit.edu> wrote: > > Hi Erik, > > see inline. > > On 06.04.21 at 23:59 Erik Auerswald wrote: > > Hi, > > > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > >> On 06.04.21 at 20:50 Erik Auerswald wrote: > >>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > >>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > >>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>>>>>> > >>>>>>> Dave Täht has put me up to revising the current Bufferbloat article > >>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>>>>>> [...] > >>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. > >> I disagree here: it is basically the combination > >> of loss-based congestion control with unmanaged > >> tail-drop buffers. > > That worked for decades, then stopped working as well as before. > > What changed? > Larger buffers in many places and several orders of magnitude higher > link speeds > as well as higher awareness for latency as an important QoS parameter. > > Yes, there are complex interactions with how packet switched networks > > are used. Otherwise we would probably not find ourselves in the current > > situation. > > > > To me, the potential of having to wait minutes (yes, minutes!) for > > the result of a key stroke over an SSH session is not worth the potential > > throughput performance gain of buffers that cannot be called small. > > > >> There are at least two solutions > >> to the bufferbloat problem > >> 1) better congestion control algorithms > >> 2) active queue management (+fq maybe) > > Both approaches aim to not use all of the available buffer space, if > > there are unreasonably large buffers, i.e., they aim to not build a > > large standing queue. > > > >> [...] > >> Small buffers definitely limit the queuing delay as well as > >> jitter. However, how much performance is potentially lost due to > >> the small buffer depends a lot on the arrival distribution. > > Could the better congestion control algorithms avoid the potential > > performance loss by not requiring large buffers for high throughput? > Yes, at least our TCP LoLa approach achieves high throughput without > loss and > a configurable limited queuing delay. So in principle this is possible. > > Might small buffers incentivise to not send huge bursts of data and hope > > for the best? > There are different causes of bursts. You might get a huge burst from > many flows > that send only a single packet each, or you might get a huge burst from > a few connections > that transmit lots of back-to-back packets. Therefore, it depends on the > location > of the bottleneck and on the traffic arrival distribution. > > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > > use of large buffers) without affecting _other_ flows too much. 
> > > > I would consider the combination of FQ+AQM, better congestion control > > algorithms, and large buffers as an optimization, but using just large > > buffers without any of the other two approaches as a mistake currently > > called bufferbloat. As such I see large unmanaged buffers at the core > > of the bufferbloat problem. > My counter example is that large unmanaged buffers would not necessarily > lead to the bufferbloat problem if we had other congestion controls that > avoid > creating large standing queues. However, in practice, I also see only AQMs > and better CCs in combination, because we have to live with legacy CCs > for some time. > > FQ+AQM for every large buffer may solve the bufferbloat problem by > > attacking the "unmanaged" part of the problem. Small buffers may solve > > it by attacking the "large" part of the problem. Small buffers may > > bring their own share of problems, but IMHO those are much less than > > those of bufferbloat. > > > > I do not see TCP congestion control improvements, even combining > > sender-side improvements with receiver-side methods as in rLEDBAT[0], > > as a solution to bufferbloat, but rather as a mitigation. > > > > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ > As already said: the TCP LoLa concept shows that it is possible > to solve the bufferbloat problem by a different congestion control approach. > However, the coexistence of LoLa with loss-based CCs will always be > a problem unless you separate both CC types by separate queues. > Currently, LoLa is rather an academic study showing what is possible > in theory, but it is far from being usable in the wild Internet, > as it would require much more work to cope with all the peculiarities > out there. > > Regards, > Roland > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- "For a successful technology, reality must take precedence over public relations, for Mother Nature cannot be fooled" - Richard Feynman dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-27 1:41 ` Dave Taht @ 2021-04-27 7:25 ` Bless, Roland (TM) 0 siblings, 0 replies; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-27 7:25 UTC (permalink / raw) To: Dave Taht; +Cc: Erik Auerswald, bloat Hi Dave, On 27.04.21 at 03:41 Dave Taht wrote: > roland do you have running code for lola on linux? I'm running some > starlink tests... I think the latest code is here and unfortunately it hasn't been updated for a while: https://git.scc.kit.edu/TCP-LoLa/TCP-LoLa_for_Linux However, if there are loss-based congestion controls present at the bottleneck in addition to LoLa flows, LoLa will not get any reasonable bandwidth, because we have not yet built in a more aggressive mode for these cases, in order not to sacrifice LoLa's low-delay goal. So you can give it a try, but it has not been engineered for real-world usage so far, so some default parameters may not fit your use case. Regards, Roland > On Wed, Apr 7, 2021 at 4:06 AM Bless, Roland (TM) <roland.bless@kit.edu> wrote: >> >> Hi Erik, >> >> see inline. >> >> On 06.04.21 at 23:59 Erik Auerswald wrote: >>> Hi, >>> >>> On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: >>>> On 06.04.21 at 20:50 Erik Auerswald wrote: >>>>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>>>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>>>>> >>>>>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>>>>> [...] >>>>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. >>>> I disagree here: it is basically the combination >>>> of loss-based congestion control with unmanaged >>>> tail-drop buffers. >>> That worked for decades, then stopped working as well as before. >>> What changed? >> Larger buffers in many places and several orders of magnitude higher >> link speeds >> as well as higher awareness for latency as an important QoS parameter. >>> Yes, there are complex interactions with how packet switched networks >>> are used. Otherwise we would probably not find ourselves in the current >>> situation. >>> >>> To me, the potential of having to wait minutes (yes, minutes!) for >>> the result of a key stroke over an SSH session is not worth the potential >>> throughput performance gain of buffers that cannot be called small. >>> >>>> There are at least two solutions >>>> to the bufferbloat problem >>>> 1) better congestion control algorithms >>>> 2) active queue management (+fq maybe) >>> Both approaches aim to not use all of the available buffer space, if >>> there are unreasonably large buffers, i.e., they aim to not build a >>> large standing queue. >>> >>>> [...] >>>> Small buffers definitely limit the queuing delay as well as >>>> jitter. However, how much performance is potentially lost due to >>>> the small buffer depends a lot on the arrival distribution. >>> Could the better congestion control algorithms avoid the potential >>> performance loss by not requiring large buffers for high throughput? >> Yes, at least our TCP LoLa approach achieves high throughput without >> loss and >> a configurable limited queuing delay. So in principle this is possible. 
>>> Might small buffers incentivise to not send huge bursts of data and hope >>> for the best? >> There are different causes of bursts. You might get a huge burst from >> many flows >> that send only a single packet each, or you might get a huge burst from >> a few connections >> that transmit lots of back-to-back packets. Therefore, it depends on the >> location >> of the bottleneck and on the traffic arrival distribution. >>> FQ with AQM aims to allow the absorption of large traffic bursts (i.e., >>> use of large buffers) without affecting _other_ flows too much. >>> >>> I would consider the combination of FQ+AQM, better congestion control >>> algorithms, and large buffers as an optimization, but using just large >>> buffers without any of the other two approaches as a mistake currently >>> called bufferbloat. As such I see large unmanaged buffers at the core >>> of the bufferbloat problem. >> My counter example is that large unmanaged buffers would not necessarily >> lead to the bufferbloat problem if we had other congestion controls that >> avoid >> creating large standing queues. However, in practice, I also see only AQMs >> and better CCs in combination, because we have to live with legacy CCs >> for some time. >>> FQ+AQM for every large buffer may solve the bufferbloat problem by >>> attacking the "unmanaged" part of the problem. Small buffers may solve >>> it by attacking the "large" part of the problem. Small buffers may >>> bring their own share of problems, but IMHO those are much less than >>> those of bufferbloat. >>> >>> I do not see TCP congestion control improvements, even combining >>> sender-side improvements with receiver-side methods as in rLEDBAT[0], >>> as a solution to bufferbloat, but rather as a mitigation. >>> >>> [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ >> As already said: the TCP LoLa concept shows that it is possible >> to solve the bufferbloat problem by a different congestion control approach. >> However, the coexistence of LoLa with loss-based CCs will always be >> a problem unless you separate both CC types by separate queues. >> Currently, LoLa is rather an academic study showing what is possible >> in theory, but it is far from being usable in the wild Internet, >> as it would require much more work to cope with all the peculiarities >> out there. >> >> Regards, >> Roland >> >> >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
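For anyone who wants to repeat such an experiment: Linux congestion controls are pluggable kernel modules, and a socket can opt into one explicitly. A minimal sketch follows; the module name "lola" is an assumption here (check what the build actually registers), and socket.TCP_CONGESTION is Linux-only.

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Read the CC this socket currently uses (the system default from
    # the sysctl net.ipv4.tcp_congestion_control, unless changed):
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
    # Select a CC for this socket only; the module must be loaded, and
    # CCs not listed in net.ipv4.tcp_allowed_congestion_control need
    # CAP_NET_ADMIN:
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"cubic")
    # s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"lola")  # assumed module name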
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald @ 2021-04-06 20:01 ` Bless, Roland (TM) 2021-04-06 21:30 ` Sebastian Moeller 1 sibling, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-06 20:01 UTC (permalink / raw) To: Sebastian Moeller, Erik Auerswald; +Cc: bloat Hi Sebastian, see comments at the end. On 06.04.21 at 08:31 Sebastian Moeller wrote: > Hi Erik, > > thanks for your thoughts. > >> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >> >> Hi, >> >> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>> >>> all good questions, and interesting responses so far. >> >> I'll add some details below, I mostly concur with your responses. >> >>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>> >>>> Dave Täht has put me up to revising the current Bufferbloat article >>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>> [...] >>> [...] while too large buffers cause undesirable increase in latency >>> under load (but decent throughput), [...] >> >> With too large buffers, even throughput degrades when TCP considers >> a delayed segment lost (or DNS gives up because the answers arrive >> too late). I do think there is _too_ large for buffers, period. > > Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) > > >> >>> The solution basically is large buffers with adaptive management that >> >> I would prefer the word "sufficient" instead of "large." > > If properly managed there is no upper end for the size, it might not be used though, no? > > >> >>> works hard to keep latency under load increase and throughput inside >>> an acceptable "corridor". >> >> I concur that there is quite some usable range of buffer capacity when >> considering the latency/throughput trade-off, and AQM seems like a good >> solution to managing that. > > I fear it is the only network side mitigation technique? > > >> >> My preference is to sacrifice throughput for better latency, but then >> I have been bitten by too much latency quite often, but never by too >> little throughput caused by small buffers. YMMV. > > Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). > >> >>> [...] >>> But e.g. for traditional TCPs the amount of expected buffer needs >>> increases with RTT of a flow >> >> Does it? Does the propagation delay provide automatic "buffering" in the >> network? Does the receiver need to advertise sufficient buffer capacity >> (receive window) to allow the sender to fill the pipe? Does the sender >> need to provide sufficient buffer capacity to retransmit lost segments? >> Where are buffers actually needed? 
> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. > > >> >> I am not convinced that large buffers in the network are needed for high >> throughput of high RTT TCP flows. >> >> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >> some information and links to a few papers. Thanks for the link Erik, but BBR is not properly described there: "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use delay or its gradient as a congestion signal. > Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) Nope, a BDP-sized buffer is not required to allow 100% utilization with a single flow, because it depends on the congestion control used. For loss-based congestion control like Reno or Cubic, this may be true, but not necessarily for other congestion controls. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
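To put numbers on the bandwidth-delay-product rule of thumb being debated here, a worked example (editorial arithmetic; the BDP/sqrt(n) refinement for n long-lived flows is from Appenzeller et al., SIGCOMM 2004, and, as Roland notes, the single-flow figure presumes a loss-based congestion control):

    from math import sqrt

    rate_bps = 100e6   # 100 Mbit/s bottleneck
    rtt_s = 0.1        # 100 ms round-trip time
    mtu = 1500         # bytes, roughly one full-size packet

    bdp_bytes = rate_bps / 8 * rtt_s
    print(f"BDP = {bdp_bytes / 1e6:.2f} MB = {bdp_bytes / mtu:.0f} packets")
    for n in (1, 100, 10000):
        print(f"{n:6d} long-lived flows -> {bdp_bytes / sqrt(n) / mtu:6.0f} packets")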
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 20:01 ` Bless, Roland (TM) @ 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-06 21:30 UTC (permalink / raw) To: Bless, Roland (TM); +Cc: Erik Auerswald, bloat Hi Roland, thanks, much appreciated. > On Apr 6, 2021, at 22:01, Bless, Roland (TM) <roland.bless@kit.edu> wrote: > > Hi Sebastian, > > see comments at the end. > > On 06.04.21 at 08:31 Sebastian Moeller wrote: >> Hi Erik, >> thanks for your thoughts. >>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>> >>> Hi, >>> >>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>> >>>> all good questions, and interesting responses so far. >>> >>> I'll add some details below, I mostly concur with your responses. >>> >>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>> >>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>> [...] >>>> [...] while too large buffers cause undesirable increase in latency >>>> under load (but decent throughput), [...] >>> >>> With too large buffers, even throughput degrades when TCP considers >>> a delayed segment lost (or DNS gives up because the answers arrive >>> too late). I do think there is _too_ large for buffers, period. >> Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) >>> >>>> The solution basically is large buffers with adaptive management that >>> >>> I would prefer the word "sufficient" instead of "large." >> If properly managed there is no upper end for the size, it might not be used though, no? >>> >>>> works hard to keep latency under load increase and throughput inside >>>> an acceptable "corridor". >>> >>> I concur that there is quite some usable range of buffer capacity when >>> considering the latency/throughput trade-off, and AQM seems like a good >>> solution to managing that. >> I fear it is the only network side mitigation technique? >>> >>> My preference is to sacrifice throughput for better latency, but then >>> I have been bitten by too much latency quite often, but never by too >>> little throughput caused by small buffers. YMMV. >> Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). >>> >>>> [...] >>>> But e.g. for traditional TCPs the amount of expected buffer needs >>>> increases with RTT of a flow >>> >>> Does it? Does the propagation delay provide automatic "buffering" in the >>> network? Does the receiver need to advertise sufficient buffer capacity >>> (receive window) to allow the sender to fill the pipe? Does the sender >>> need to provide sufficient buffer capacity to retransmit lost segments? >>> Where are buffers actually needed? 
>> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. >>> >>> I am not convinced that large buffers in the network are needed for high >>> throughput of high RTT TCP flows. >>> >>> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >>> some information and links to a few papers. > > Thanks for the link Erik, but BBR is not properly described there: > "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use > delay or its gradient as a congestion signal. Looking at https://queue.acm.org/detail.cfm?id=3022184, I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as a function of the current sending rate to probe the bottleneck's capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... (or both ;)) (Sidenote I keep making: for a protocol believing it knows better than to interpret all packet losses as signs of congestion, it seems rather an oversight not having implemented an RFC 3168-style CE response...) > > >> Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) > > Nope, a BDP-sized buffer is not required to allow 100% utilization with > a single flow, because it depends on the congestion control used. For > loss-based congestion control like Reno or Cubic, this may be true, > but not necessarily for other congestion controls. Yes, I should have hedged that better. For protocols like the ubiquitous TCP CUBIC (seems to be used by most major operating systems nowadays) a single flow might need BDP buffering to get close to 100% utilization. I am not wanting to say CUBIC is more important than other protocols, but it still represents a significant share of internet traffic. And any scheme to counter bufferbloat could do worse than accept that reality and allow for sufficient buffering to allow such protocols acceptable levels of utilization (all the while keeping the latency under load increase under control). Best Regards Sebastian > > Regards, > Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
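Since CUBIC keeps coming up: its window curve is fully specified in RFC 8312, and a few lines make the sawtooth, and the buffer it implies, concrete. This is a sketch for intuition only, not a model of a real stack:

    C, BETA = 0.4, 0.7  # RFC 8312 constants

    def w_cubic(t, w_max):
        """Window (in MSS) t seconds after a loss that occurred at w_max."""
        k = ((w_max * (1 - BETA)) / C) ** (1.0 / 3.0)  # time to regain w_max
        return C * (t - k) ** 3 + w_max

    # After a loss the window restarts at BETA * w_max, curves back to w_max
    # at t = k, then keeps growing until the next overflow. Keeping the link
    # busy through that dip needs roughly (1 - BETA) / BETA of a BDP of
    # buffer: a full BDP for Reno's BETA = 0.5, about 0.43 BDP for CUBIC.
    for t in (0.0, 3.0, 6.0, 9.0, 12.0):
        print(f"t = {t:4.1f} s  cwnd = {w_cubic(t, w_max=1000.0):7.1f} MSS")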
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:30 ` Sebastian Moeller @ 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 1 sibling, 0 replies; 26+ messages in thread From: Jonathan Morton @ 2021-04-06 21:36 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Bless, Roland (TM), bloat > On 7 Apr, 2021, at 12:30 am, Sebastian Moeller <moeller0@gmx.de> wrote: > > I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as function of the current sending rate to probe the bottlenecks capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... It might be more accurate to say that it estimates the delivery rate at the receiver by observing the ack stream, and aims to match that with the send rate. There is some periodic probing upwards to see if a higher delivery rate is possible, followed by a downwards drain cycle which, I think, pays some attention to the observed RTT. And there is also a cwnd mechanism overlaid as a safety valve. Overall, it's very much a hybrid approach. - Jonathan Morton ^ permalink raw reply [flat|nested] 26+ messages in thread
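Jonathan's description of the estimator reduces to a small quotient. A toy version follows; real BBR keeps per-packet delivery-rate samples and windowed max filters, per the published BBR material, so this only shows the core idea:

    def delivery_rate_bps(ack_samples):
        """ack_samples: [(time_s, cumulative_bytes_acked), ...], oldest first."""
        (t0, b0), (t1, b1) = ack_samples[0], ack_samples[-1]
        return 8 * (b1 - b0) / (t1 - t0)

    # 125 kB acknowledged over 100 ms -> the receiver is draining 10 Mbit/s,
    # so that is the rate worth sending at, regardless of how much an
    # oversized bottleneck buffer would happily absorb.
    acks = [(0.00, 0), (0.05, 62_500), (0.10, 125_000)]
    print(delivery_rate_bps(acks) / 1e6, "Mbit/s")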
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton @ 2021-04-07 10:39 ` Bless, Roland (TM) 1 sibling, 0 replies; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-07 10:39 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Erik Auerswald, bloat Hi Sebastian, see inline. On 06.04.21 at 23:30 Sebastian Moeller wrote: >> On Apr 6, 2021, at 22:01, Bless, Roland (TM) <roland.bless@kit.edu> wrote: >> >> Hi Sebastian, >> >> see comments at the end. >> >> On 06.04.21 at 08:31 Sebastian Moeller wrote: >>> Hi Erik, >>> thanks for your thoughts. >>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>> >>>> Hi, >>>> >>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>> all good questions, and interesting responses so far. >>>> I'll add some details below, I mostly concur with your responses. >>>> >>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>> >>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>> [...] >>>>> [...] while too large buffers cause undesirable increase in latency >>>>> under load (but decent throughput), [...] >>>> With too large buffers, even throughput degrades when TCP considers >>>> a delayed segment lost (or DNS gives up because the answers arrive >>>> too late). I do think there is _too_ large for buffers, period. >>> Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) >>>>> The solution basically is large buffers with adaptive management that >>>> I would prefer the word "sufficient" instead of "large." >>> If properly managed there is no upper end for the size, it might not be used though, no? >>>>> works hard to keep latency under load increase and throughput inside >>>>> an acceptable "corridor". >>>> I concur that there is quite some usable range of buffer capacity when >>>> considering the latency/throughput trade-off, and AQM seems like a good >>>> solution to managing that. >>> I fear it is the only network side mitigation technique? >>>> My preference is to sacrifice throughput for better latency, but then >>>> I have been bitten by too much latency quite often, but never by too >>>> little throughput caused by small buffers. YMMV. >>> Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). >>>>> [...] >>>>> But e.g. for traditional TCPs the amount of expected buffer needs >>>>> increases with RTT of a flow >>>> Does it? Does the propagation delay provide automatic "buffering" in the >>>> network? Does the receiver need to advertise sufficient buffer capacity >>>> (receive window) to allow the sender to fill the pipe? Does the sender >>>> need to provide sufficient buffer capacity to retransmit lost segments? >>>> Where are buffers actually needed? 
>>> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. >>>> I am not convinced that large buffers in the network are needed for high >>>> throughput of high RTT TCP flows. >>>> >>>> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >>>> some information and links to a few papers. >> Thanks for the link Erik, but BBR is not properly described there: >> "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use >> delay or its gradient as a congestion signal. > Looking at https://queue.acm.org/detail.cfm?id=3022184, I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as a function of the current sending rate to probe the bottleneck's capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... (or both ;)) (Sidenote I keep making: for a protocol believing it knows better than to interpret all packet losses as signs of congestion, it seems rather an oversight not having implemented an RFC 3168-style CE response...) > I think both, but you are in good company. Several people have misinterpreted how BBR actually works. In BBRv1, the measured RTT is only used for the inflight cap (a CWnd of 2 BDP). The BBR team considers delay to be too noisy a signal (see slide 10 https://www.ietf.org/proceedings/97/slides/slides-97-maprg-traffic-policing-in-the-internet-yuchung-cheng-and-neal-cardwell-00.pdf) and therefore doesn't use it as a congestion signal. Actually, BBRv1 does not react to any congestion signal; there isn't even any backoff reaction. BBRv2, however, reacts to packet loss (>=2%) or ECN signals. >>> Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) >> Nope, a BDP-sized buffer is not required to allow 100% utilization with >> a single flow, because it depends on the congestion control used. For >> loss-based congestion control like Reno or Cubic, this may be true, >> but not necessarily for other congestion controls. > Yes, I should have hedged that better. 
> For protocols like the ubiquitous TCP CUBIC (seems to be used by most major operating systems nowadays) a single flow might need BDP buffering to get close to 100% utilization. I am not wanting to say CUBIC is more important than other protocols, but it still represents a significant share of internet traffic. And any scheme to counter bufferbloat could do worse than accept that reality and allow for sufficient buffering to allow such protocols acceptable levels of utilization (all the while keeping the latency under load increase under control). I didn't get what you were trying to say with your last sentence. My point was that the BDP rule of thumb was tied to a specific type of congestion control and that the buffer sizing rule should probably reflect burst absorption requirements (see "good queue" in the CoDel paper) rather than the specifics of congestion controls. CC schemes that try to counter bufferbloat typically suffer in the presence of loss-based congestion controls, because the loss-based schemes only react to loss, and loss requires a full buffer (unless an AQM is in place), which causes queuing delay. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
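Roland's "good queue" reference is the CoDel paper's distinction: a burst that drains within an interval is fine, while a sojourn time that stays above a small target for a whole interval marks a standing queue. A condensed sketch of just that test (real CoDel adds a control law that tightens the spacing of successive drops):

    TARGET = 0.005    # 5 ms: acceptable standing delay
    INTERVAL = 0.100  # 100 ms: how long a burst may take to drain

    def should_drop(sojourn_s, now_s, state):
        """state is a dict holding 'deadline', the time the queue must drain by."""
        if sojourn_s < TARGET:
            state["deadline"] = None              # burst drained: good queue
            return False
        if state["deadline"] is None:
            state["deadline"] = now_s + INTERVAL  # start the clock
            return False
        return now_s >= state["deadline"]         # still bloated: bad queue

    state = {"deadline": None}
    print(should_drop(0.002, 0.00, state))  # False: below target
    print(should_drop(0.020, 0.01, state))  # False: burst gets one interval
    print(should_drop(0.020, 0.15, state))  # True: standing queue detected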
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown ` (2 preceding siblings ...) 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller @ 2021-04-06 18:54 ` Neil Davies 3 siblings, 0 replies; 26+ messages in thread From: Neil Davies @ 2021-04-06 18:54 UTC (permalink / raw) To: Rich Brown; +Cc: bloat It should be noted that a) and b) are related by the “service rate” - if you want to look at how to measure “unhappiness” in networking you might want to look at the Broadband Forum’s work on “Quality Attenuation” discussed in TR-452.1 [1] where there is a formal definition and calculus for it… Neil [1] https://www.broadband-forum.org/download/TR-452.1.pdf > On 5 Apr 2021, at 13:46, Rich Brown <richb.hanover@gmail.com> wrote: > > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > > c) Or some other definition? > > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > > Many thanks! > > Rich > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
Thread overview: 26+ messages -- 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown 2021-04-05 18:08 ` David Lang 2021-04-05 20:30 ` Erik Auerswald 2021-04-05 20:36 ` Dave Taht 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger 2021-04-06 23:54 ` David Lang 2021-04-07 11:06 ` Bless, Roland (TM) 2021-04-27 1:41 ` Dave Taht 2021-04-27 7:25 ` Bless, Roland (TM) 2021-04-06 20:01 ` Bless, Roland (TM) 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 2021-04-06 18:54 ` Neil Davies