* [Bloat] Questions for Bufferbloat Wikipedia article @ 2021-04-05 12:46 Rich Brown 2021-04-05 15:13 ` Stephen Hemminger ` (3 more replies) 0 siblings, 4 replies; 26+ messages in thread From: Rich Brown @ 2021-04-05 12:46 UTC (permalink / raw) To: bloat, Richard E. Brown Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) Before I get into it, I want to ask real experts for some guidance... Here goes: 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) c) Or some other definition? 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... Many thanks! Rich ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown @ 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 15:24 ` David Lang 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown ` (2 subsequent siblings) 3 siblings, 1 reply; 26+ messages in thread From: Stephen Hemminger @ 2021-04-05 15:13 UTC (permalink / raw) To: Rich Brown; +Cc: bloat On Mon, 5 Apr 2021 08:46:15 -0400 Rich Brown <richb.hanover@gmail.com> wrote: > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > > c) Or some other definition? > > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > > Many thanks! > > Rich I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:13 ` Stephen Hemminger @ 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 0 siblings, 2 replies; 26+ messages in thread From: David Lang @ 2021-04-05 15:24 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Rich Brown, bloat [-- Attachment #1: Type: text/plain, Size: 3084 bytes --] On Mon, 5 Apr 2021, Stephen Hemminger wrote: > On Mon, 5 Apr 2021 08:46:15 -0400 > Rich Brown <richb.hanover@gmail.com> wrote: > >> Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >> >> Before I get into it, I want to ask real experts for some guidance... Here goes: >> >> 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) >> >> a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) >> >> b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) >> >> c) Or some other definition? >> >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? >> >> 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? >> >> 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... >> >> Many thanks! >> >> Rich > > I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. I think there are two things 1. what bufferbloat is bufferbloat is the result of memory getting cheaper faster than bandwidth increased, combined with throughput benchmarking that drastically penalized end-to-end retries. I think this definition is pretty academic and not something to worry about using. 2. why it's a problem the problems show up when the buffer represents too much time worth of data to transmit (the time between when the last byte in the buffer gets inserted into the buffer and when it gets transmitted) So in a high bandwidth environment (like a datacenter) you can use much larger buffers than when you are on a low bandwidth line David Lang ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:24 ` David Lang @ 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 1 sibling, 0 replies; 26+ messages in thread From: Dave Collier-Brown @ 2021-04-05 15:57 UTC (permalink / raw) To: bloat To speak to the original question, I'd say bufferbloat * is undesirable latency * was discovered when adding buffers counter-intuitively /slowed/ packet flow. That's so as to catch the reader's attention and immediately cast light on the (memorable but mysterious) name. --dave On 2021-04-05 11:24 a.m., David Lang wrote: > On Mon, 5 Apr 2021, Stephen Hemminger wrote: >> On Mon, 5 Apr 2021 08:46:15 -0400 >> Rich Brown <richb.hanover@gmail.com> wrote: >>> Dave Täht has put me up to revising the current Bufferbloat article >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>> >>> Before I get into it, I want to ask real experts for some >>> guidance... Here goes: >>> >>> 1) What is *our* definition of Bufferbloat? (We invented the term, >>> so I think we get to define it.) >>> a) Are we content with the definition from the bufferbloat.net site, >>> "Bufferbloat is the undesirable latency that comes from a router or >>> other network equipment buffering too much data." (This suggests >>> bufferbloat is latency, and could be measured in seconds/msec.) >>> >>> b) Or should we use something like Jim Gettys' definition from the >>> Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), >>> "Bufferbloat is the existence of excessively large (bloated) buffers >>> in systems, particularly network communication systems." (This >>> suggests bufferbloat is an unfortunate state of nature, measured in >>> units of "unhappiness" :-) >>> c) Or some other definition? >>> >>> 2) All network equipment can be bloated. I have seen (but not really >>> followed) controversy regarding the amount of buffering needed in >>> the Data Center. Is it worth having the Wikipedia article >>> distinguish between Data Center equipment and CPE/home/last mile >>> equipment? Similarly, is the "bloat condition" and its mitigation >>> qualitatively different between those applications? Finally, do any >>> of us know how frequently data centers/backbone ISPs experience >>> buffer-induced latencies? What's the magnitude of the impact? >>> >>> 3) The Wikipedia article mentions guidance that network gear should >>> accommodate buffering 250 msec of traffic(!) Is this a real "rule of >>> thumb" or just an often-repeated but unscientific suggestion? Can >>> someone give pointers to best practices? >>> >>> 4) Meta question: Can anyone offer any advice on making a wholesale >>> change to a Wikipedia article? Before I offer a fork-lift >>> replacement I would a) solicit advice on the new text from this >>> list, and b) try to make contact with some of the reviewers and >>> editors who've been maintaining the page to establish some bona >>> fides and rapport... >>> >>> Many thanks! >>> >>> Rich >> >> I like to think of Bufferbloat as a combination of large buffers and >> how algorithms react to those buffers. > > I think there are two things > > 1. what bufferbloat is > > bufferbloat is the result of memory getting cheaper faster than > bandwidth increased, combined with throughput benchmarking that > drastically penalized end-to-end retries. > > I think this definition is pretty academic and not something to worry > about using. > > 2. why it's a problem > > the problems show up when the buffer represents too much time worth of > data to transmit (the time between when the last byte in the buffer > gets inserted into the buffer and when it gets transmitted) > > So in a high bandwidth environment (like a datacenter) you can use > much larger buffers than when you are on a low bandwidth line > > David Lang > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- David Collier-Brown, | Always do right. This will gratify System Programmer and Author | some people and astonish the rest dave.collier-brown@indexexchange.com | -- Mark Twain ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown @ 2021-04-05 16:25 ` Kelvin Edmison 1 sibling, 0 replies; 26+ messages in thread From: Kelvin Edmison @ 2021-04-05 16:25 UTC (permalink / raw) To: David Lang; +Cc: Stephen Hemminger, Rich Brown, bloat I've been lurking on the bufferbloat mailing list for a while now, without volunteering in the same fashion as the core contributors. But I do have some thoughts as someone who is not quite at the level of writing kernel drivers; maybe this is helpful when updating the definition. I think we need to define what it is (in terms of user-perceivable experience) before we get to the causes and why it's a problem. In essence, link it to what average people know already, and draw them in to the next level of detail. To that end, I would propose the following for discussion: Bufferbloat is the difference in latency for a connection when it is lightly loaded vs when it is fully loaded. (Here, I am trying to provide terms that are somewhat clear and simple to an average user, that will connect them to things they do already, i.e. fully use their internet connection for an upload or download.) Then, I think it is useful to move into some examples of how it can be perceived (audio call stutter, video call stutter) especially in the presence of multiple competing users with different priorities (gaming vs. uploading documents or presentations). And then we can dig into the causes (e.g. over-provisioned buffers, poor inter-flow management, etc.), means of explicitly measuring it, approaches for mitigating or fixing it, etc. I hope this is useful, Kelvin On Mon, Apr 5, 2021 at 11:24 AM David Lang <david@lang.hm> wrote: > On Mon, 5 Apr 2021, Stephen Hemminger wrote: > > On Mon, 5 Apr 2021 08:46:15 -0400 > > Rich Brown <richb.hanover@gmail.com> wrote: > >> Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >> > >> Before I get into it, I want to ask real experts for some guidance... Here goes: > >> > >> 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > >> > >> a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > >> > >> b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > >> > >> c) Or some other definition? > >> > >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > >> > >> 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > >> > >> 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > >> > >> Many thanks! > >> > >> Rich > > > > I like to think of Bufferbloat as a combination of large buffers and how algorithms react to those buffers. > > I think there are two things > > 1. what bufferbloat is > > bufferbloat is the result of memory getting cheaper faster than bandwidth increased, combined with throughput benchmarking that drastically penalized end-to-end retries. > > I think this definition is pretty academic and not something to worry about using. > > 2. why it's a problem > > the problems show up when the buffer represents too much time worth of data to transmit (the time between when the last byte in the buffer gets inserted into the buffer and when it gets transmitted) > > So in a high bandwidth environment (like a datacenter) you can use much larger buffers than when you are on a low bandwidth line > > David Lang > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
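Kelvin's proposed definition has the nice property that it is directly measurable: sample RTTs while the connection is idle, sample again while it is fully loaded, and report the difference. A minimal sketch in Python (the probe host and the use of the system ping utility are illustrative assumptions, not something from this thread):

  import re
  import statistics
  import subprocess

  def sample_rtts(host, count=20):
      """Collect RTT samples in ms via the system ping utility."""
      out = subprocess.run(["ping", "-c", str(count), host],
                           capture_output=True, text=True).stdout
      return [float(m) for m in re.findall(r"time=([\d.]+)", out)]

  idle = sample_rtts("example.net")
  input("Now start a saturating upload/download, then press Enter...")
  loaded = sample_rtts("example.net")

  delta = statistics.median(loaded) - statistics.median(idle)
  print(f"idle median:   {statistics.median(idle):.1f} ms")
  print(f"loaded median: {statistics.median(loaded):.1f} ms")
  print(f"bufferbloat:   {delta:.1f} ms of added latency under load")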
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger @ 2021-04-05 18:00 ` Rich Brown 2021-04-05 18:08 ` David Lang 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-06 18:54 ` Neil Davies 3 siblings, 1 reply; 26+ messages in thread From: Rich Brown @ 2021-04-05 18:00 UTC (permalink / raw) To: bloat, Richard E. Brown Thanks, all, for the responses re: Bufferbloat definition. I can work with that information. Next question... > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? Many thanks! Rich ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown @ 2021-04-05 18:08 ` David Lang 2021-04-05 20:30 ` Erik Auerswald 0 siblings, 1 reply; 26+ messages in thread From: David Lang @ 2021-04-05 18:08 UTC (permalink / raw) To: Rich Brown; +Cc: bloat On Mon, 5 Apr 2021, Rich Brown wrote: > Next question... > >> 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? the bandwidth available in datacenters is high enough that it's much harder to run into grief there (recognizing that not every piece of datacenter equipment is hooked to 100G circuits) I think it's best to talk about excessive buffers in terms of time rather than bytes, and you can then show the difference between two buffers of the same size, one connected to a 10Mb (or 1Mb) DSL upload vs 100G datacenter circuit. After that one example, the rest of the article can talk about time and it will be globally applicable. David Lang ^ permalink raw reply [flat|nested] 26+ messages in thread
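Putting numbers on David's example (the buffer size and link rates are illustrative):

  def drain_time_ms(buffer_bytes, link_bits_per_s):
      """Time for a full buffer to drain onto the wire, in ms."""
      return buffer_bytes * 8 / link_bits_per_s * 1000

  buf = 256 * 1024  # the same 256 KiB buffer in three places
  for name, rate in [("1 Mb/s DSL upload", 1e6),
                     ("10 Mb/s DSL", 10e6),
                     ("100 Gb/s datacenter link", 100e9)]:
      print(f"{name}: {drain_time_ms(buf, rate):.3f} ms")

The identical buffer holds roughly two seconds of traffic on the 1 Mb/s uplink and about 20 microseconds on the 100 Gb/s link, which is the whole argument for describing buffers in units of time.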
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 18:08 ` David Lang @ 2021-04-05 20:30 ` Erik Auerswald 2021-04-05 20:36 ` Dave Taht 0 siblings, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-05 20:30 UTC (permalink / raw) To: bloat Hi, On Mon, Apr 05, 2021 at 11:08:07AM -0700, David Lang wrote: > On Mon, 5 Apr 2021, Rich Brown wrote: > > >Next question... > > > >>2) All network equipment can be bloated. I have seen (but not > >>really followed) controversy regarding the amount of buffering > >>needed in the Data Center. Is it worth having the Wikipedia article > >>distinguish between Data Center equipment and CPE/home/last mile > >>equipment? Similarly, is the "bloat condition" and its mitigation > >>qualitatively different between those applications? Finally, do > >>any of us know how frequently data centers/backbone ISPs experience > >>buffer-induced latencies? What's the magnitude of the impact? I do not have experience with "web scale" data centers or "backbone" ISPs, but I think I can add related information. From my work experience with (mostly) enterprise and service provider networks I would say that bufferbloat effects are relatively rarely observed there. Many network engineers do not know about bufferbloat and do not believe in its existence after being told about bufferbloat. I have seen a latency consideration for a country-wide network that explicitly excluded queuing delays as irrelevant and cited just propagation and serialization delay as relevant for the end-to-end latency. Demonstrating bufferbloat effects with a test setup with prolonged congestion is usually labeled unrealistic and ignored. Campus networks and ("small") data centers are usually overprovisioned with bandwidth and thus do not exhibit prolonged congestion. Additionally, a lot of enterprise networking gear, specifically "switches," do not have oversized buffers. Campus networks more often show problems with too small buffers for a given application (e.g., cameras streaming data via RTP with large "key frames" sent at line rate), such that "microbursts" result in packet drops and thus observable problems even with low bandwidth utilization over longer time frames (minutes). The idea that buffers could be too large does not seem realistic there. "Routers" for the ISP market (not "home routers", but network devices used inside the ISP's core and aggregation networks and similar) often do have unreasonably ("bloated") buffer capacity, but they are usually operated without persistent congestion. When persistent congestion does happen on a customer connection, and bufferbloat does result in unusably high latency, the customer is often told to send at a lower rate, but "bufferbloat" is usually not recognized as the root cause, and thus not addressed. It seems to me as if "bufferbloat" is most noticable on the consumer end of mass market network connections. I.e., low margin markets with non-technical customers. If CAKE behind the access circuit of an end customer can mitigate bufferbloat, then bufferbloat effects are only visible there and do not show up in other parts of the network. > the bandwidth available in datacenters is high enough that it's much > harder to run into grief there (recognizing that not every piece of > datacenter equipment is hooked to 100G circuits) That is my impression as well. 
> I think it's best to talk about excessive buffers in terms of time > rather than bytes, and you can then show the difference between two > buffers of the same size, one connected to a 10Mb (or 1Mb) DSL upload > vs 100G datacenter circuit. After that one example, the rest of the > article can talk about time and it will be globally applicable. I too think that _time_ is the important unit regarding buffers, even though they are mostly described in units of data (bytes or packets). Thanks, Erik -- To have our best advice ignored is the common fate of all who take on the role of consultant, ever since Cassandra pointed out the dangers of bringing a wooden horse within the walls of Troy. -- C.A.R. Hoare ^ permalink raw reply [flat|nested] 26+ messages in thread
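Erik's microburst case also reduces to arithmetic: a burst arriving at line rate can overflow a small buffer even when average utilization is low. A sketch with made-up but plausible numbers (the key-frame size, port speeds, and buffer size are assumptions):

  def dropped_bytes(burst_bytes, buffer_bytes, in_bps, out_bps):
      """Bytes lost when a line-rate burst hits a tail-drop buffer."""
      arrive_s = burst_bytes * 8 / in_bps          # burst duration
      backlog = (in_bps - out_bps) * arrive_s / 8  # queue growth
      return max(0.0, backlog - buffer_bytes)

  # 2 MB key frame sent at 10 Gb/s into a 1 Gb/s port with 512 KB buffer
  print(f"{dropped_bytes(2e6, 512e3, 10e9, 1e9) / 1e3:.0f} kB dropped")

Here roughly 1.3 MB of a single frame is lost in under two milliseconds, while a per-minute utilization graph shows nothing wrong.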
* Re: [Bloat] Questions for Bufferbloat Wikipedia article - question #2 2021-04-05 20:30 ` Erik Auerswald @ 2021-04-05 20:36 ` Dave Taht 0 siblings, 0 replies; 26+ messages in thread From: Dave Taht @ 2021-04-05 20:36 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat My own fervent wish is that new switches suffering from microbursts did better 5-tuple fq, in addition to per-port fq. ^ permalink raw reply [flat|nested] 26+ messages in thread
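For readers who have not met the term: "5-tuple fq" means hashing source address, destination address, protocol, and the two port numbers into separate queues, so that flows share a congested port fairly instead of one burst monopolizing it. A toy sketch of the classification step (the queue count and the use of Python's hash() are illustrative; real schedulers such as fq_codel use a deterministic hash over the same tuple):

  from collections import deque

  N_QUEUES = 1024
  queues = [deque() for _ in range(N_QUEUES)]

  def flow_queue(src_ip, dst_ip, proto, sport, dport):
      """Map a packet's 5-tuple to one of N_QUEUES flow queues."""
      return hash((src_ip, dst_ip, proto, sport, dport)) % N_QUEUES

  # Two flows between the same pair of hosts usually land in
  # different queues; per-port fq would have merged them.
  print(flow_queue("10.0.0.1", "10.0.0.2", "tcp", 50000, 443))
  print(flow_queue("10.0.0.1", "10.0.0.2", "tcp", 50001, 443))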
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown @ 2021-04-05 21:49 ` Sebastian Moeller 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 3 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-05 21:49 UTC (permalink / raw) To: Rich Brown; +Cc: bloat Hi Rich, all good questions, and interesting responses so far. > On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) I do not even think these are mutually exclusive; "over-sized but under-managed buffers" cause avoidable variable latency, aka jitter, which is the bane of all interactive use-cases. The lower the jitter the better, and jitter can be measured in units of time, but also acts as "currency" in the unhappiness domain ;). The challenge is that we know that no/too small buffers cause undesirable loss of throughput (but small latency under load), while too large buffers cause undesirable increase in latency under load (but decent throughput), so the challenge is to get buffering right to keep throughput acceptably high, while at the same time keeping latency under load acceptably low... The solution basically is large buffers with adaptive management that works hard to keep latency under load increase and throughput inside an acceptable "corridor". > c) Or some other definition? > > 2) All network equipment can be bloated. +1; depending on condition. Corollary: static buffer sizing is unlikely to be the right answer unless the load is constant... > I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Conceptually the same as everywhere else, just enough to keep throughput up ;) But e.g. for traditional TCPs the amount of expected buffer needs increases with RTT of a flow, so intra-datacenter flows with low RTTs will only require relatively small buffers to cope. > Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? That depends on our audience, but realistically over-sized but under-managed buffers can and do occur everywhere, so maybe better include all? > Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? IMHO, not really, we have two places to twiddle, the buffer (and how it is managed) and the two endpoints transferring data. Our go-to solution deals with buffer management, but protocols can also help, e.g. by using pacing (spreading out packets based on the estimated throughput) instead of sending in bursts. Or using different protocols that are more adaptive to the perceived buffering along a path, like BBR (which, as you surely know, tries to actively measure a path's capacity by regularly sending closely spaced probe packets and measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded). Methods at both places are not guaranteed to work hand in hand though (naive BBR fails to recognize an AQM on the path that keeps latency under load well-bounded, which was noted and fixed in later BBR incarnations); making the whole problem space "a mess". > Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? I have to pass, -ENODATA ;) > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? I am sure that any fixed number will be wrong ;) there might be numbers worse than others though. > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Maybe don't? Instead of doing this in one go, evolve the existing article piece-wise, avoiding the wrong impression of a hostile take-over? And allowing for a nicer history of targeted commits? > Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... I guess, if you get the buy-in from the current maintainers a fork-lift upgrade might work... Best Regards Sebastian > > Many thanks! > > Rich > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
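Pacing, as Sebastian describes it, is just spacing transmissions at the estimated path rate instead of emitting a whole window back to back. A sketch (the rate, packet size, and no-op send callback are placeholders):

  import time

  def paced_send(packets, est_rate_bps, send):
      """Emit packets with gaps sized to the estimated path rate."""
      for pkt in packets:
          send(pkt)
          time.sleep(len(pkt) * 8 / est_rate_bps)  # inter-packet gap

  # 1500-byte packets at an estimated 10 Mb/s: a 1.2 ms gap each,
  # so a 10-packet burst becomes a 12 ms trickle the queue can absorb.
  paced_send([b"x" * 1500] * 10, 10e6, lambda pkt: None)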
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller @ 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 1 sibling, 0 replies; 26+ messages in thread From: Dave Taht @ 2021-04-05 21:55 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Rich Brown, bloat The biggest internet spike I ever saw was this one. https://web.archive.org/web/20171113211640/http://blog.cerowrt.org/post/bufferbloat_on_the_backbone/ (the image has expired elsewhere. Someone tell Orwell!) Ironically it occurred during a videoconference with the shuttleworth folk. In improving the bufferbloat definition, I think some pretty graphs with circles and arrows on a paragraph of each one certifying the evidence for it, would be a case of american blind justice... https://www.youtube.com/watch?v=W5_8U4j51lI ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-05 21:55 ` Dave Taht @ 2021-04-06 0:47 ` Erik Auerswald 2021-04-06 6:31 ` Sebastian Moeller 1 sibling, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 0:47 UTC (permalink / raw) To: bloat Hi, On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > > all good questions, and interesting responses so far. I'll add some details below, I mostly concur with your responses. > > On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > > > Dave Täht has put me up to revising the current Bufferbloat article > > on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > [...] > [...] while too large buffers cause undesirable increase in latency > under load (but decent throughput), [...] With too large buffers, even throughput degrades when TCP considers a delayed segment lost (or DNS gives up because the answers arrive too late). I do think there is _too_ large for buffers, period. > The solution basically is large buffers with adaptive management that I would prefer the word "sufficient" instead of "large." > works hard to keep latency under load increase and throughput inside > an acceptable "corridor". I concur that there is quite some usable range of buffer capacity when considering the latency/throughput trade-off, and AQM seems like a good solution to managing that. My preference is to sacrifice throughput for better latency, but then I have been bitten by too much latency quite often, but never by too little throughput caused by small buffers. YMMV. > [...] > But e.g. for traditional TCPs the amount of expected buffer needs > increases with RTT of a flow Does it? Does the propagation delay provide automatic "buffering" in the network? Does the receiver need to advertise sufficient buffer capacity (receive window) to allow the sender to fill the pipe? Does the sender need to provide sufficient buffer capacity to retransmit lost segments? Where are buffers actually needed? I am not convinced that large buffers in the network are needed for high throughput of high RTT TCP flows. See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for some information and links to a few papers. > [...] Thanks, Erik -- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- Edsger W. Dijkstra ^ permalink raw reply [flat|nested] 26+ messages in thread
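Two standard answers from the buffer-sizing literature put numbers on these questions: the classic rule of thumb of one bandwidth-delay product, and the "Sizing Router Buffers" result that a link carrying N desynchronized flows needs only about BDP/sqrt(N). A back-of-the-envelope with illustrative link parameters:

  import math

  def bdp_bytes(rate_bps, rtt_s):
      """Bandwidth-delay product: the classic one-flow rule of thumb."""
      return rate_bps * rtt_s / 8

  def small_buffer_bytes(rate_bps, rtt_s, n_flows):
      """BDP / sqrt(N) sizing for many desynchronized flows."""
      return bdp_bytes(rate_bps, rtt_s) / math.sqrt(n_flows)

  rate, rtt = 10e9, 0.1  # 10 Gb/s link, 100 ms RTT
  print(f"rule of thumb: {bdp_bytes(rate, rtt) / 1e6:.0f} MB")
  print(f"10,000 flows:  {small_buffer_bytes(rate, rtt, 10_000) / 1e6:.2f} MB")

The two orders of magnitude between 125 MB and 1.25 MB is precisely the controversy over buffer requirements.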
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 0:47 ` Erik Auerswald @ 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:01 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-06 6:31 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat Hi Erik, thanks for your thoughts. > On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > Hi, > > On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >> >> all good questions, and interesting responses so far. > > I'll add some details below, I mostly concur with your responses. > >>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>> >>> Dave Täht has put me up to revising the current Bufferbloat article >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>> [...] >> [...] while too large buffers cause undesirable increase in latency >> under load (but decent throughput), [...] > > With too large buffers, even throughput degrades when TCP considers > a delayed segment lost (or DNS gives up because the answers arrive > too late). I do think there is _too_ large for buffers, period. Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) > >> The solution basically is large buffers with adaptive management that > > I would prefer the word "sufficient" instead of "large." If properly managed there is no upper end for the size, it might not be used though, no? > >> works hard to keep latency under load increase and throughput inside >> an acceptable "corridor". > > I concur that there is quite some usable range of buffer capacity when > considering the latency/throughput trade-off, and AQM seems like a good > solution to managing that. I fear it is the only network-side mitigation technique? > > My preference is to sacrifice throughput for better latency, but then > I have been bitten by too much latency quite often, but never by too > little throughput caused by small buffers. YMMV. Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too-large buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). > >> [...] >> But e.g. for traditional TCPs the amount of expected buffer needs >> increases with RTT of a flow > > Does it? Does the propagation delay provide automatic "buffering" in the > network? Does the receiver need to advertise sufficient buffer capacity > (receive window) to allow the sender to fill the pipe? Does the sender > need to provide sufficient buffer capacity to retransmit lost segments? > Where are buffers actually needed? At all those places ;) in the extreme a single-packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting).
And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues, that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. > > I am not convinced that large buffers in the network are needed for high > throughput of high RTT TCP flows. > > See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for > some information and links to a few papers. Thanks, I think the bandwidth delay product is still the worst case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a back bone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) Best Regards Sebastian > >> [...] > > Thanks, > Erik > -- > The computing scientist’s main challenge is not to get confused by > the complexities of his own making. > -- Edsger W. Dijkstra > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
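The reasoning behind the BDP-as-worst-case claim: a Reno-style sender halves its window on loss, and the bottleneck queue must hold enough to keep the wire busy while the window climbs back. A coarse model (linear sawtooth, ignoring slow start, timeouts, and the RTT inflation caused by the queue itself):

  def reno_utilization(buffer_bytes, bdp_bytes, steps=1000):
      """One Reno-like flow at a tail-drop bottleneck: cwnd sawtooths
      between (BDP+B)/2 and BDP+B; the link is only fully used while
      cwnd >= BDP."""
      w_max = bdp_bytes + buffer_bytes
      w_min = w_max / 2
      total = 0.0
      for i in range(steps):  # sweep one sawtooth cycle
          w = w_min + (w_max - w_min) * i / steps
          total += min(w, bdp_bytes) / bdp_bytes
      return total / steps

  bdp = 1.25e6  # e.g. 100 Mb/s * 100 ms
  for frac in (0.0, 0.25, 0.5, 1.0):
      print(f"buffer = {frac:.2f} x BDP -> "
            f"{reno_utilization(frac * bdp, bdp):.0%} utilization")

With zero buffer the model gives about 75% utilization; at one full BDP it reaches 100%, and anything beyond that only adds delay.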
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 6:31 ` Sebastian Moeller @ 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 20:01 ` Bless, Roland (TM) 1 sibling, 1 reply; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 18:50 UTC (permalink / raw) To: bloat Hi, On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > > On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>> > >>> Dave Täht has put me up to revising the current Bufferbloat article > >>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>> [...] > >> [...] while too large buffers cause undesirable increase in latency > >> under load (but decent throughput), [...] > > > > With too large buffers, even throughput degrades when TCP considers > > a delayed segment lost (or DNS gives up because the answers arrive > > too late). I do think there is _too_ large for buffers, period. > > Fair enough, timeouts could be changed though if required ;) but I fully > concur that laergeish buffers require management to become useful ;) Yes, large unmanaged buffers are at the core of the bufferbloat problem. One can make buffers small again, or manage them appropriately. The latter promises better results, the former is much simpler. Thanks, Erik -- Am I secure? I don't know. Does that mean I should just disable all security functionality and have an open root shell bound to a well known port? No. Obviously. -- Matthew Garret ^ permalink raw reply [flat|nested] 26+ messages in thread
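"Manage them appropriately" in practice means something like CoDel: judge the queue by how long packets sit in it rather than by how many bytes it holds, and start dropping when the sojourn time stays above a small target. A heavily simplified sketch of that idea (the 5 ms / 100 ms constants are CoDel's published defaults; the real algorithm also ramps its drop rate with the square root of the drop count, which is omitted here):

  import time
  from collections import deque

  TARGET = 0.005    # 5 ms of standing queue is tolerated
  INTERVAL = 0.100  # sojourn must exceed TARGET this long first

  class TimeManagedQueue:
      def __init__(self):
          self.q = deque()          # entries: (enqueue_time, packet)
          self.above_since = None   # when sojourn first exceeded TARGET

      def enqueue(self, pkt):
          self.q.append((time.monotonic(), pkt))

      def dequeue(self):
          while self.q:
              t_in, pkt = self.q.popleft()
              now = time.monotonic()
              if now - t_in <= TARGET:
                  self.above_since = None  # queue is healthy again
                  return pkt
              if self.above_since is None:
                  self.above_since = now
              if now - self.above_since < INTERVAL:
                  return pkt               # tolerate short bursts
              # persistent standing queue: drop pkt, try the next
          return None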
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 18:50 ` Erik Auerswald @ 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 21:59 ` Erik Auerswald 0 siblings, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-06 20:02 UTC (permalink / raw) To: Erik Auerswald, bloat Hi, On 06.04.21 at 20:50 Erik Auerswald wrote: > Hi, > > On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>> >>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>> [...] >>>> [...] while too large buffers cause undesirable increase in latency >>>> under load (but decent throughput), [...] >>> >>> With too large buffers, even throughput degrades when TCP considers >>> a delayed segment lost (or DNS gives up because the answers arrive >>> too late). I do think there is _too_ large for buffers, period. >> >> Fair enough, timeouts could be changed though if required ;) but I fully >> concur that laergeish buffers require management to become useful ;) > > Yes, large unmanaged buffers are at the core of the bufferbloat problem. I disagree here: it is basically the combination of loss-based congestion control with unmanaged tail-drop buffers. There are at least two solutions to the bufferbloat problem 1) better congestion control algorithms 2) active queue management (+fq maybe) You can achieve high throughput and low delay with a corresponding congestion control (e.g., see this study of how to achieve a common limit on queuing delay for multiple flows: https://ieeexplore.ieee.org/document/8109356) even in large buffers. > One can make buffers small again, or manage them appropriately. > The latter promises better results, the former is much simpler. Small buffers definitely limit the queuing delay as well as jitter. However, how much performance is potentially lost due to the small buffer depends a lot on the arrival distribution. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
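A congestion control of the kind Roland describes keys its window off measured queuing delay instead of loss. A toy update rule in that spirit (this is emphatically not LoLa itself, just the flavor; the target and gain values are invented):

  def update_cwnd(cwnd, rtt, base_rtt, target_qdelay=0.005, gain=0.1):
      """Grow while measured queuing delay (rtt - base_rtt) stays
      below the target; back off multiplicatively once it doesn't."""
      if rtt - base_rtt < target_qdelay:
          return cwnd + 1.0                 # room left: additive increase
      return max(2.0, cwnd * (1.0 - gain))  # shrink to drain the queue

  cwnd = 10.0
  for rtt in (0.050, 0.052, 0.054, 0.058, 0.060):  # base RTT is 50 ms
      cwnd = update_cwnd(cwnd, rtt, 0.050)
      print(f"rtt = {rtt * 1000:.0f} ms -> cwnd = {cwnd:.1f}")

No packet ever has to be lost for the sender to find the bound, which is why such flows can keep even a large unmanaged buffer empty, and also why they lose out when competing against loss-based flows in the same queue.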
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 20:02 ` Bless, Roland (TM) @ 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger 2021-04-07 11:06 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Erik Auerswald @ 2021-04-06 21:59 UTC (permalink / raw) To: bloat Hi, On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > On 06.04.21 at 20:50 Erik Auerswald wrote: > >On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > >>>On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > >>>On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>>>>On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>>>> > >>>>>Dave Täht has put me up to revising the current Bufferbloat article > >>>>>on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>>>>[...] > >Yes, large unmanaged buffers are at the core of the bufferbloat problem. > > I disagree here: it is basically the combination > of loss-based congestion control with unmanaged > tail-drop buffers. That worked for decades, then stopped working as well as before. What changed? Yes, there are complex interactions with how packet switched networks are used. Otherwise we would probably not find ourselves in the current situation. To me, the potential of having to wait minutes (yes, minutes!) for the result of a key stroke over an SSH session is not worth the potential throughput performance gain of buffers that cannot be called small. > There are at least two solutions > to the bufferbloat problem > 1) better congestion control algorithms > 2) active queue management (+fq maybe) Both approaches aim to not use all of the available buffer space, if there are unreasonably large buffers, i.e., they aim to not build a large standing queue. > [...] > Small buffers definitely limit the queuing delay as well as > jitter. However, how much performance is potentially lost due to > the small buffer depends a lot on the arrival distribution. Could the better congestion control algorithms avoid the potential performance loss by not requiring large buffers for high throughput? Might small buffers incentivise to not send huge bursts of data and hope for the best? FQ with AQM aims to allow the absorption of large traffic bursts (i.e., use of large buffers) without affecting _other_ flows too much. I would consider the combination of FQ+AQM, better congestion control algorithms, and large buffers as an optimization, but using just large buffers without any of the other two approaches as a mistake currently called bufferbloat. As such I see large unmanaged buffers at the core of the bufferbloat problem. FQ+AQM for every large buffer may solve the bufferbloat problem by attacking the "unmanaged" part of the problem. Small buffers may solve it by attacking the "large" part of the problem. Small buffers may bring their own share of problems, but IMHO those are much less than those of bufferbloat. I do not see TCP congestion control improvements, even combining sender-side improvements with receiver-side methods as in rLEDBAT[0], as a solution to bufferbloat, but rather as a mitigation. [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ Anyway, I think it is obvious that I am willing to sacrifice more throughput for better latency than others. Thanks, Erik -- Simplicity is prerequisite for reliability. -- Edsger W. Dijkstra ^ permalink raw reply [flat|nested] 26+ messages in thread
[0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ Anyway, I think it is obvious that I am willing to sacrifice more throughput for better latency than others. Thanks, Erik ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:59 ` Erik Auerswald @ 2021-04-06 23:32 ` Stephen Hemminger 2021-04-06 23:54 ` David Lang 2021-04-07 11:06 ` Bless, Roland (TM) 1 sibling, 1 reply; 26+ messages in thread From: Stephen Hemminger @ 2021-04-06 23:32 UTC (permalink / raw) To: Erik Auerswald; +Cc: bloat On Tue, 6 Apr 2021 23:59:53 +0200 Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > Hi, > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > > On 06.04.21 at 20:50 Erik Auerswald wrote: > > >On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > > >>>On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > >>>On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > > >>>>>On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > > >>>>> > > >>>>>Dave Täht has put me up to revising the current Bufferbloat article > > >>>>>on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > >>>>>[...] > > >Yes, large unmanaged buffers are at the core of the bufferbloat problem. > > > > I disagree here: it is basically the combination > > of loss-based congestion control with unmanaged > > tail-drop buffers. > > That worked for decades, then stopped working as well as before. > What changed? > > Yes, there are complex interactions with how packet switched networks > are used. Otherwise we would probably not find ourselves in the current > situation. > > To me, the potential of having to wait minutes (yes, minutes!) for > the result of a key stroke over an SSH session is not worth the potential > throughput performance gain of buffers that cannot be called small. > > > There are at least two solutions > > to the bufferbloat problem > > 1) better congestion control algorithms > > 2) active queue management (+fq maybe) > > Both approaches aim to not use all of the available buffer space, if > there are unreasonably large buffers, i.e., they aim to not build a > large standing queue. > > > [...] > > Small buffers definitely limit the queuing delay as well as > > jitter. However, how much performance is potentially lost due to > > the small buffer depends a lot on the arrival distribution. > > Could the better congestion control algorithms avoid the potential > performance loss by not requiring large buffers for high throughput? > Might small buffers incentivise to not send huge bursts of data and hope > for the best? > > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > use of large buffers) without affecting _other_ flows too much. > > I would consider the combination of FQ+AQM, better congestion control > algorithms, and large buffers as an optimization, but using just large > buffers without any of the other two approaches as a mistake currently > called bufferbloat. As such I see large unmanaged buffers at the core > of the bufferbloat problem. > > FQ+AQM for every large buffer may solve the bufferbloat problem by > attacking the "unmanaged" part of the problem. Small buffers may solve > it by attacking the "large" part of the problem. Small buffers may > bring their own share of problems, but IMHO those are much less than > those of bufferbloat. > > I do not see TCP congestion control improvements, even combining > sender-side improvements with receiver-side methods as in rLEDBAT[0], > as a solution to bufferbloat, but rather as a mitigation. 
> > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ > > Anyway, I think it is obvious that I am willing to sacrifice more > throughput for better latency than others. > For Wikipedia it is important to make clear: * the symptoms = large latency * the cause = large buffers and aggressive protocols * the solutions = AQM, smaller buffers, pacing, better congestion control, etc. People can argue over best combination of solutions but the symptoms and causes should be defined, and non-contentious. It is too easy to go off in the weeds and have the solution of the day. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 23:32 ` Stephen Hemminger @ 2021-04-06 23:54 ` David Lang 0 siblings, 0 replies; 26+ messages in thread From: David Lang @ 2021-04-06 23:54 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Erik Auerswald, bloat On Tue, 6 Apr 2021, Stephen Hemminger wrote: > For Wikipedia it is important to make clear: > * the symptoms = large latency more precisely, large latency under load David Lang > * the cause = large buffers and aggressive protocols > * the solutions = AQM, smaller buffers, pacing, better congestion control, etc. > > People can argue over best combination of solutions but the symptoms and > causes should be defined, and non-contentious. > > It is too easy to go off in the weeds and have the solution of the day. > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger @ 2021-04-07 11:06 ` Bless, Roland (TM) 2021-04-27 1:41 ` Dave Taht 1 sibling, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-07 11:06 UTC (permalink / raw) To: Erik Auerswald, bloat Hi Erik, see inline. On 06.04.21 at 23:59 Erik Auerswald wrote: > Hi, > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: >> On 06.04.21 at 20:50 Erik Auerswald wrote: >>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>>> >>>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>>> [...] >>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. >> I disagree here: it is basically the combination >> of loss-based congestion control with unmanaged >> tail-drop buffers. > That worked for decades, then stopped working as well as before. > What changed? Larger buffers in many places and several orders of magnitude higher link speeds as well as higher awareness for latency as an important QoS parameter. > Yes, there are complex interactions with how packet switched networks > are used. Otherwise we would probably not find ourselves in the current > situation. > > To me, the potential of having to wait minutes (yes, minutes!) for > the result of a key stroke over an SSH session is not worth the potential > throughput performance gain of buffers that cannot be called small. > >> There are at least two solutions >> to the bufferbloat problem >> 1) better congestion control algorithms >> 2) active queue management (+fq maybe) > Both approaches aim to not use all of the available buffer space, if > there are unreasonably large buffers, i.e., they aim to not build a > large standing queue. > >> [...] >> Small buffers definitely limit the queuing delay as well as >> jitter. However, how much performance is potentially lost due to >> the small buffer depends a lot on the arrival distribution. > Could the better congestion control algorithms avoid the potential > performance loss by not requiring large buffers for high throughput? Yes, at least our TCP LoLa approach achieves high throughput without loss and a configurable limited queuing delay. So in principle this is possible. > Might small buffers incentivise to not send huge bursts of data and hope > for the best? There are different causes of bursts. You might get a huge burst from many flows that send only a single packet each, or you might get a huge burst from a few connections that transmit lots of back-to-back packets. Therefore, it depends on the location of the bottleneck and on the traffic arrival distribution. > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > use of large buffers) without affecting _other_ flows too much. > > I would consider the combination of FQ+AQM, better congestion control > algorithms, and large buffers as an optimization, but using just large > buffers without any of the other two approaches as a mistake currently > called bufferbloat. As such I see large unmanaged buffers at the core > of the bufferbloat problem. 
My counterexample is that large unmanaged buffers would not necessarily lead to the bufferbloat problem if we had other congestion controls that avoid creating large standing queues. However, in practice, I also see only AQMs and better CCs in combination, because we have to live with legacy CCs for some time. > FQ+AQM for every large buffer may solve the bufferbloat problem by > attacking the "unmanaged" part of the problem. Small buffers may solve > it by attacking the "large" part of the problem. Small buffers may > bring their own share of problems, but IMHO those are much less than > those of bufferbloat. > > I do not see TCP congestion control improvements, even combining > sender-side improvements with receiver-side methods as in rLEDBAT[0], > as a solution to bufferbloat, but rather as a mitigation. > > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ As already said: the TCP LoLa concept shows that it is possible to solve the bufferbloat problem by a different congestion control approach. However, the coexistence of LoLa with loss-based CCs will always be a problem unless you separate both CC types into separate queues. Currently, LoLa is rather an academic study showing what is possible in theory, but it is far from being usable in the wild Internet, as it would require much more work to cope with all the peculiarities out there. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-07 11:06 ` Bless, Roland (TM) @ 2021-04-27 1:41 ` Dave Taht 2021-04-27 7:25 ` Bless, Roland (TM) 0 siblings, 1 reply; 26+ messages in thread From: Dave Taht @ 2021-04-27 1:41 UTC (permalink / raw) To: Bless, Roland (TM); +Cc: Erik Auerswald, bloat roland do you have running code for lola on linux? I'm running some starlink tests... On Wed, Apr 7, 2021 at 4:06 AM Bless, Roland (TM) <roland.bless@kit.edu> wrote: > > Hi Erik, > > see inline. > > On 06.04.21 at 23:59 Erik Auerswald wrote: > > Hi, > > > > On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: > >> On 06.04.21 at 20:50 Erik Auerswald wrote: > >>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: > >>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > >>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: > >>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: > >>>>>>> > >>>>>>> Dave Täht has put me up to revising the current Bufferbloat article > >>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > >>>>>>> [...] > >>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. > >> I disagree here: it is basically the combination > >> of loss-based congestion control with unmanaged > >> tail-drop buffers. > > That worked for decades, then stopped working as well as before. > > What changed? > Larger buffers in many places and several orders of magnitude higher > link speeds > as well as higher awareness for latency as an important QoS parameter. > > Yes, there are complex interactions with how packet switched networks > > are used. Otherwise we would probably not find ourselves in the current > > situation. > > > > To me, the potential of having to wait minutes (yes, minutes!) for > > the result of a key stroke over an SSH session is not worth the potential > > throughput performance gain of buffers that cannot be called small. > > > >> There are at least two solutions > >> to the bufferbloat problem > >> 1) better congestion control algorithms > >> 2) active queue management (+fq maybe) > > Both approaches aim to not use all of the available buffer space, if > > there are unreasonably large buffers, i.e., they aim to not build a > > large standing queue. > > > >> [...] > >> Small buffers definitely limit the queuing delay as well as > >> jitter. However, how much performance is potentially lost due to > >> the small buffer depends a lot on the arrival distribution. > > Could the better congestion control algorithms avoid the potential > > performance loss by not requiring large buffers for high throughput? > Yes, at least our TCP LoLa approach achieves high throughput without > loss and > a configurable limited queuing delay. So in principle this is possible. > > Might small buffers incentivise to not send huge bursts of data and hope > > for the best? > There are different causes of bursts. You might get a huge burst from > many flows > that send only a single packet each, or you might get a huge burst from > a few connections > that transmit lots of back-to-back packets. Therefore, it depends on the > location > of the bottleneck and on the traffic arrival distribution. > > FQ with AQM aims to allow the absorption of large traffic bursts (i.e., > > use of large buffers) without affecting _other_ flows too much. 
> > > > I would consider the combination of FQ+AQM, better congestion control > > algorithms, and large buffers as an optimization, but using just large > > buffers without any of the other two approaches as a mistake currently > > called bufferbloat. As such I see large unmanaged buffers at the core > > of the bufferbloat problem. > My counter example is that large unmanaged buffers would not necessarily > lead to the bufferbloat problem if we had other congestion controls that > avoid > creating large standing queues. However, in practice, I also see only AQMs > and better CCs in combination, because we have to live with legacy CCs > for some time. > > FQ+AQM for every large buffer may solve the bufferbloat problem by > > attacking the "unmanaged" part of the problem. Small buffers may solve > > it by attacking the "large" part of the problem. Small buffers may > > bring their own share of problems, but IMHO those are much less than > > those of bufferbloat. > > > > I do not see TCP congestion control improvements, even combining > > sender-side improvements with receiver-side methods as in rLEDBAT[0], > > as a solution to bufferbloat, but rather as a mitigation. > > > > [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ > As already said: the TCP LoLa concept shows that it is possible > to solve the bufferbloat problem by a different congestion control approach. > However, the coexistence of LoLa with loss-based CCs will always be > a problem unless you separate both CC types by separate queues. > Currently, LoLa is rather an academic study showing what is possible > in theory, but it is far from being usable in the wild Internet, > as it would require much more work to cope with all the peculiarities > out there. > > Regards, > Roland > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- "For a successful technology, reality must take precedence over public relations, for Mother Nature cannot be fooled" - Richard Feynman dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-27 1:41 ` Dave Taht @ 2021-04-27 7:25 ` Bless, Roland (TM) 0 siblings, 0 replies; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-27 7:25 UTC (permalink / raw) To: Dave Taht; +Cc: Erik Auerswald, bloat Hi Dave, On 27.04.21 at 03:41 Dave Taht wrote: > roland do you have running code for lola on linux? I'm running some > starlink tests... I think the latest code is here and unfortunately it hasn't been updated for a while: https://git.scc.kit.edu/TCP-LoLa/TCP-LoLa_for_Linux However, if there are loss-based congestion controls present at the bottleneck in addition to LoLa flows, LoLa will not get any reasonable bandwidth, because we have not yet built in a more aggressive mode for these cases, in order not to sacrifice LoLa's low-delay goal. So you can give it a try, but it has not been engineered for real-world usage so far, so some default parameters may not fit your use case. Regards, Roland > On Wed, Apr 7, 2021 at 4:06 AM Bless, Roland (TM) <roland.bless@kit.edu> wrote: >> >> Hi Erik, >> >> see inline. >> >> On 06.04.21 at 23:59 Erik Auerswald wrote: >>> Hi, >>> >>> On Tue, Apr 06, 2021 at 10:02:21PM +0200, Bless, Roland (TM) wrote: >>>> On 06.04.21 at 20:50 Erik Auerswald wrote: >>>>> On Tue, Apr 06, 2021 at 08:31:01AM +0200, Sebastian Moeller wrote: >>>>>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>>>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>>>>> >>>>>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>>>>> [...] >>>>> Yes, large unmanaged buffers are at the core of the bufferbloat problem. >>>> I disagree here: it is basically the combination >>>> of loss-based congestion control with unmanaged >>>> tail-drop buffers. >>> That worked for decades, then stopped working as well as before. >>> What changed? >> Larger buffers in many places and several orders of magnitude higher >> link speeds >> as well as higher awareness for latency as an important QoS parameter. >>> Yes, there are complex interactions with how packet switched networks >>> are used. Otherwise we would probably not find ourselves in the current >>> situation. >>> >>> To me, the potential of having to wait minutes (yes, minutes!) for >>> the result of a key stroke over an SSH session is not worth the potential >>> throughput performance gain of buffers that cannot be called small. >>> >>>> There are at least two solutions >>>> to the bufferbloat problem >>>> 1) better congestion control algorithms >>>> 2) active queue management (+fq maybe) >>> Both approaches aim to not use all of the available buffer space, if >>> there are unreasonably large buffers, i.e., they aim to not build a >>> large standing queue. >>> >>>> [...] >>>> Small buffers definitely limit the queuing delay as well as >>>> jitter. However, how much performance is potentially lost due to >>>> the small buffer depends a lot on the arrival distribution. >>> Could the better congestion control algorithms avoid the potential >>> performance loss by not requiring large buffers for high throughput? >> Yes, at least our TCP LoLa approach achieves high throughput without >> loss and >> a configurable limited queuing delay. So in principle this is possible. 
>>> Might small buffers incentivise to not send huge bursts of data and hope >>> for the best? >> There are different causes of bursts. You might get a huge burst from >> many flows >> that send only a single packet each, or you might get a huge burst from >> a few connections >> that transmit lots of back-to-back packets. Therefore, it depends on the >> location >> of the bottleneck and on the traffic arrival distribution. >>> FQ with AQM aims to allow the absorption of large traffic bursts (i.e., >>> use of large buffers) without affecting _other_ flows too much. >>> >>> I would consider the combination of FQ+AQM, better congestion control >>> algorithms, and large buffers as an optimization, but using just large >>> buffers without any of the other two approaches as a mistake currently >>> called bufferbloat. As such I see large unmanaged buffers at the core >>> of the bufferbloat problem. >> My counter example is that large unmanaged buffers would not necessarily >> lead to the bufferbloat problem if we had other congestion controls that >> avoid >> creating large standing queues. However, in practice, I also see only AQMs >> and better CCs in combination, because we have to live with legacy CCs >> for some time. >>> FQ+AQM for every large buffer may solve the bufferbloat problem by >>> attacking the "unmanaged" part of the problem. Small buffers may solve >>> it by attacking the "large" part of the problem. Small buffers may >>> bring their own share of problems, but IMHO those are much less than >>> those of bufferbloat. >>> >>> I do not see TCP congestion control improvements, even combining >>> sender-side improvements with receiver-side methods as in rLEDBAT[0], >>> as a solution to bufferbloat, but rather as a mitigation. >>> >>> [0] https://datatracker.ietf.org/doc/draft-irtf-iccrg-rledbat/ >> As already said: the TCP LoLa concept shows that it is possible >> to solve the bufferbloat problem by a different congestion control approach. >> However, the coexistence of LoLa with loss-based CCs will always be >> a problem unless you separate both CC types by separate queues. >> Currently, LoLa is rather an academic study showing what is possible >> in theory, but it is far from being usable in the wild Internet, >> as it would require much more work to cope with all the peculiarities >> out there. >> >> Regards, >> Roland >> >> >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
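For anyone who wants to repeat such an experiment: Linux congestion controls are pluggable kernel modules, and a socket can opt into one explicitly. A minimal sketch follows; the module name "lola" is an assumption here (check what the build actually registers), and socket.TCP_CONGESTION is Linux-only.

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Read the CC this socket currently uses (the system default from
    # the sysctl net.ipv4.tcp_congestion_control, unless changed):
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
    # Select a CC for this socket only; the module must be loaded, and
    # CCs not listed in net.ipv4.tcp_allowed_congestion_control need
    # CAP_NET_ADMIN:
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"cubic")
    # s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"lola")  # assumed module name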
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald @ 2021-04-06 20:01 ` Bless, Roland (TM) 2021-04-06 21:30 ` Sebastian Moeller 1 sibling, 1 reply; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-06 20:01 UTC (permalink / raw) To: Sebastian Moeller, Erik Auerswald; +Cc: bloat Hi Sebastian, see comments at the end. On 06.04.21 at 08:31 Sebastian Moeller wrote: > Hi Erik, > > thanks for your thoughts. > >> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >> >> Hi, >> >> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>> >>> all good questions, and interesting responses so far. >> >> I'll add some details below, I mostly concur with your responses. >> >>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>> >>>> Dave Täht has put me up to revising the current Bufferbloat article >>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>> [...] >>> [...] while too large buffers cause undesirable increase in latency >>> under load (but decent throughput), [...] >> >> With too large buffers, even throughput degrades when TCP considers >> a delayed segment lost (or DNS gives up because the answers arrive >> too late). I do think there is _too_ large for buffers, period. > > Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) > > >> >>> The solution basically is large buffers with adaptive management that >> >> I would prefer the word "sufficient" instead of "large." > > If properly managed there is no upper end for the size, it might not be used though, no? > > >> >>> works hard to keep latency under load increase and throughput inside >>> an acceptable "corridor". >> >> I concur that there is quite some usable range of buffer capacity when >> considering the latency/throughput trade-off, and AQM seems like a good >> solution to managing that. > > I fear it is the only network side mitigation technique? > > >> >> My preference is to sacrifice throughput for better latency, but then >> I have been bitten by too much latency quite often, but never by too >> little throughput caused by small buffers. YMMV. > > Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). > >> >>> [...] >>> But e.g. for traditional TCPs the amount of expected buffer needs >>> increases with RTT of a flow >> >> Does it? Does the propagation delay provide automatic "buffering" in the >> network? Does the receiver need to advertise sufficient buffer capacity >> (receive window) to allow the sender to fill the pipe? Does the sender >> need to provide sufficient buffer capacity to retransmit lost segments? >> Where are buffers actually needed? 
> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. > > >> >> I am not convinced that large buffers in the network are needed for high >> throughput of high RTT TCP flows. >> >> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >> some information and links to a few papers. Thanks for the link Erik, but BBR is not properly described there: "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use delay or its gradient as a congestion signal. > Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) Nope, a BDP-sized buffer is not required to allow 100% utilization with a single flow, because it depends on the congestion control used. For loss-based congestion control like Reno or Cubic, this may be true, but not necessarily for other congestion controls. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
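To put numbers on the bandwidth-delay-product rule of thumb being debated here, a worked example (editorial arithmetic; the BDP/sqrt(n) refinement for n long-lived flows is from Appenzeller et al., SIGCOMM 2004, and, as Roland notes, the single-flow figure presumes a loss-based congestion control):

    from math import sqrt

    rate_bps = 100e6   # 100 Mbit/s bottleneck
    rtt_s = 0.1        # 100 ms round-trip time
    mtu = 1500         # bytes, roughly one full-size packet

    bdp_bytes = rate_bps / 8 * rtt_s
    print(f"BDP = {bdp_bytes / 1e6:.2f} MB = {bdp_bytes / mtu:.0f} packets")
    for n in (1, 100, 10000):
        print(f"{n:6d} long-lived flows -> {bdp_bytes / sqrt(n) / mtu:6.0f} packets")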
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 20:01 ` Bless, Roland (TM) @ 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 0 siblings, 2 replies; 26+ messages in thread From: Sebastian Moeller @ 2021-04-06 21:30 UTC (permalink / raw) To: Bless, Roland (TM); +Cc: Erik Auerswald, bloat Hi Roland, thanks, much appreciated. > On Apr 6, 2021, at 22:01, Bless, Roland (TM) <roland.bless@kit.edu> wrote: > > Hi Sebastian, > > see comments at the end. > > On 06.04.21 at 08:31 Sebastian Moeller wrote: >> Hi Erik, >> thanks for your thoughts. >>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>> >>> Hi, >>> >>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>> >>>> all good questions, and interesting responses so far. >>> >>> I'll add some details below, I mostly concur with your responses. >>> >>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>> >>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>> [...] >>>> [...] while too large buffers cause undesirable increase in latency >>>> under load (but decent throughput), [...] >>> >>> With too large buffers, even throughput degrades when TCP considers >>> a delayed segment lost (or DNS gives up because the answers arrive >>> too late). I do think there is _too_ large for buffers, period. >> Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) >>> >>>> The solution basically is large buffers with adaptive management that >>> >>> I would prefer the word "sufficient" instead of "large." >> If properly managed there is no upper end for the size, it might not be used though, no? >>> >>>> works hard to keep latency under load increase and throughput inside >>>> an acceptable "corridor". >>> >>> I concur that there is quite some usable range of buffer capacity when >>> considering the latency/throughput trade-off, and AQM seems like a good >>> solution to managing that. >> I fear it is the only network side mitigation technique? >>> >>> My preference is to sacrifice throughput for better latency, but then >>> I have been bitten by too much latency quite often, but never by too >>> little throughput caused by small buffers. YMMV. >> Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). >>> >>>> [...] >>>> But e.g. for traditional TCPs the amount of expected buffer needs >>>> increases with RTT of a flow >>> >>> Does it? Does the propagation delay provide automatic "buffering" in the >>> network? Does the receiver need to advertise sufficient buffer capacity >>> (receive window) to allow the sender to fill the pipe? Does the sender >>> need to provide sufficient buffer capacity to retransmit lost segments? >>> Where are buffers actually needed? 
>> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. >>> >>> I am not convinced that large buffers in the network are needed for high >>> throughput of high RTT TCP flows. >>> >>> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >>> some information and links to a few papers. > > Thanks for the link Erik, but BBR is not properly described there: > "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use > delay or its gradient as a congestion signal. Looking at https://queue.acm.org/detail.cfm?id=3022184, I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as a function of the current sending rate to probe the bottleneck's capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... (or both ;)) (Sidenote I keep making: for a protocol believing it knows better than to interpret all packet losses as signs of congestion, it seems rather an oversight not having implemented an RFC 3168-style CE response...) > > >> Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) > > Nope, a BDP-sized buffer is not required to allow 100% utilization with > a single flow, because it depends on the congestion control used. For > loss-based congestion control like Reno or Cubic, this may be true, > but not necessarily for other congestion controls. Yes, I should have hedged that better. For protocols like the ubiquitous TCP CUBIC (seems to be used by most major operating systems nowadays) a single flow might need BDP buffering to get close to 100% utilization. I am not wanting to say CUBIC is more important than other protocols, but it still represents a significant share of internet traffic. And any scheme to counter bufferbloat could do worse than accept that reality and allow for sufficient buffering to allow such protocols acceptable levels of utilization (all the while keeping the latency under load increase under control). Best Regards Sebastian > > Regards, > Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
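Since CUBIC keeps coming up: its window curve is fully specified in RFC 8312, and a few lines make the sawtooth, and the buffer it implies, concrete. This is a sketch for intuition only, not a model of a real stack:

    C, BETA = 0.4, 0.7  # RFC 8312 constants

    def w_cubic(t, w_max):
        """Window (in MSS) t seconds after a loss that occurred at w_max."""
        k = ((w_max * (1 - BETA)) / C) ** (1.0 / 3.0)  # time to regain w_max
        return C * (t - k) ** 3 + w_max

    # After a loss the window restarts at BETA * w_max, curves back to w_max
    # at t = k, then keeps growing until the next overflow. Keeping the link
    # busy through that dip needs roughly (1 - BETA) / BETA of a BDP of
    # buffer: a full BDP for Reno's BETA = 0.5, about 0.43 BDP for CUBIC.
    for t in (0.0, 3.0, 6.0, 9.0, 12.0):
        print(f"t = {t:4.1f} s  cwnd = {w_cubic(t, w_max=1000.0):7.1f} MSS")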
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:30 ` Sebastian Moeller @ 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 1 sibling, 0 replies; 26+ messages in thread From: Jonathan Morton @ 2021-04-06 21:36 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Bless, Roland (TM), bloat > On 7 Apr, 2021, at 12:30 am, Sebastian Moeller <moeller0@gmx.de> wrote: > > I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as function of the current sending rate to probe the bottlenecks capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... It might be more accurate to say that it estimates the delivery rate at the receiver by observing the ack stream, and aims to match that with the send rate. There is some periodic probing upwards to see if a higher delivery rate is possible, followed by a downwards drain cycle which, I think, pays some attention to the observed RTT. And there is also a cwnd mechanism overlaid as a safety valve. Overall, it's very much a hybrid approach. - Jonathan Morton ^ permalink raw reply [flat|nested] 26+ messages in thread
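Jonathan's description of the estimator reduces to a small quotient. A toy version follows; real BBR keeps per-packet delivery-rate samples and windowed max filters, per the published BBR material, so this only shows the core idea:

    def delivery_rate_bps(ack_samples):
        """ack_samples: [(time_s, cumulative_bytes_acked), ...], oldest first."""
        (t0, b0), (t1, b1) = ack_samples[0], ack_samples[-1]
        return 8 * (b1 - b0) / (t1 - t0)

    # 125 kB acknowledged over 100 ms -> the receiver is draining 10 Mbit/s,
    # so that is the rate worth sending at, regardless of how much an
    # oversized bottleneck buffer would happily absorb.
    acks = [(0.00, 0), (0.05, 62_500), (0.10, 125_000)]
    print(delivery_rate_bps(acks) / 1e6, "Mbit/s")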
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton @ 2021-04-07 10:39 ` Bless, Roland (TM) 1 sibling, 0 replies; 26+ messages in thread From: Bless, Roland (TM) @ 2021-04-07 10:39 UTC (permalink / raw) To: Sebastian Moeller; +Cc: Erik Auerswald, bloat Hi Sebastian, see inline. On 06.04.21 at 23:30 Sebastian Moeller wrote: >> On Apr 6, 2021, at 22:01, Bless, Roland (TM) <roland.bless@kit.edu> wrote: >> >> Hi Sebastian, >> >> see comments at the end. >> >> On 06.04.21 at 08:31 Sebastian Moeller wrote: >>> Hi Erik, >>> thanks for your thoughts. >>>> On Apr 6, 2021, at 02:47, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: >>>> >>>> Hi, >>>> >>>> On Mon, Apr 05, 2021 at 11:49:00PM +0200, Sebastian Moeller wrote: >>>>> all good questions, and interesting responses so far. >>>> I'll add some details below, I mostly concur with your responses. >>>> >>>>>> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover@gmail.com> wrote: >>>>>> >>>>>> Dave Täht has put me up to revising the current Bufferbloat article >>>>>> on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) >>>>>> [...] >>>>> [...] while too large buffers cause undesirable increase in latency >>>>> under load (but decent throughput), [...] >>>> With too large buffers, even throughput degrades when TCP considers >>>> a delayed segment lost (or DNS gives up because the answers arrive >>>> too late). I do think there is _too_ large for buffers, period. >>> Fair enough, timeouts could be changed though if required ;) but I fully concur that largish buffers require management to become useful ;) >>>>> The solution basically is large buffers with adaptive management that >>>> I would prefer the word "sufficient" instead of "large." >>> If properly managed there is no upper end for the size, it might not be used though, no? >>>>> works hard to keep latency under load increase and throughput inside >>>>> an acceptable "corridor". >>>> I concur that there is quite some usable range of buffer capacity when >>>> considering the latency/throughput trade-off, and AQM seems like a good >>>> solution to managing that. >>> I fear it is the only network side mitigation technique? >>>> My preference is to sacrifice throughput for better latency, but then >>>> I have been bitten by too much latency quite often, but never by too >>>> little throughput caused by small buffers. YMMV. >>> Yepp, with speedtests being the killer-application for fast end-user links (still, which is sad in itself), manufacturers and ISPs are incentivized to err on the side of too large for buffers, so the default buffering typically will not cause noticeable under-utilisation, as long as nobody wants to run single-flow speedtests over a geostationary satellite link ;). (I note that many/most speedtests silently default to test with multiple flows nowadays, with single stream tests being at least optional in some, which will reduce the expected buffering need). >>>>> [...] >>>>> But e.g. for traditional TCPs the amount of expected buffer needs >>>>> increases with RTT of a flow >>>> Does it? Does the propagation delay provide automatic "buffering" in the >>>> network? Does the receiver need to advertise sufficient buffer capacity >>>> (receive window) to allow the sender to fill the pipe? Does the sender >>>> need to provide sufficient buffer capacity to retransmit lost segments? >>>> Where are buffers actually needed? 
>>> At all those places ;) in the extreme a single packet buffer should be sufficient, but that places unrealistically high demands on the processing capabilities at all nodes of a network and does not account for anything unexpected (like another flow starting). And in all cases doing things smarter can help, like pacing is better at the sender's side (with better meaning easier in the network), competent AQM is better at the bottleneck link, and at the receiver something like TCP SACK (and the required buffers to make that work) can help; all those cases work better with buffers. The catch is that buffers solve important issues while introducing new issues that need fixing. I am sure you know all this, but spelling it out helps me to clarify my thoughts on the matter, so please just ignore if boring/old news. >>>> I am not convinced that large buffers in the network are needed for high >>>> throughput of high RTT TCP flows. >>>> >>>> See, e.g., https://people.ucsc.edu/~warner/Bufs/buffer-requirements for >>>> some information and links to a few papers. >> Thanks for the link Erik, but BBR is not properly described there: >> "When the RTT creeps upward -- this taken as a signal of buffer occupancy congestion" and Sebastian also mentioned: "measures the induced latency increase from those, interpreting too much latency as a sign that the capacity was reached/exceeded". BBR does not use >> delay or its gradient as a congestion signal. > Looking at https://queue.acm.org/detail.cfm?id=3022184, I still think that it is not completely wrong to abstractly say BBR evaluates RTT changes as a function of the current sending rate to probe the bottleneck's capacity (and adjust its sending rate based on that estimated capacity), but that might either indicate I am looking at the whole thing at too abstract a level, or, as I fear, that I am simply misunderstanding BBR's principle of operation... (or both ;)) (Sidenote I keep making: for a protocol believing it knows better than to interpret all packet losses as signs of congestion, it seems rather an oversight not having implemented an RFC 3168-style CE response...) > I think both, but you are in good company. Several people have misinterpreted how BBR actually works. In BBRv1, the measured RTT is only used for the inflight cap (a CWnd of 2 BDP). The BBR team considers delay to be too noisy a signal (see slide 10 https://www.ietf.org/proceedings/97/slides/slides-97-maprg-traffic-policing-in-the-internet-yuchung-cheng-and-neal-cardwell-00.pdf) and therefore doesn't use it as a congestion signal. Actually, BBRv1 does not react to any congestion signal; there isn't even any backoff reaction. BBRv2, however, reacts to packet loss (>=2%) or ECN signals. >>> Thanks, I think the bandwidth delay product is still the worst-case buffering required to allow 100% utilization with a single flow (a use case that at least for home links seems legit, for a backbone link probably not). But in any case if the buffers are properly managed their maximum size will not really matter, as long as it is larger than the required minimum ;) >> Nope, a BDP-sized buffer is not required to allow 100% utilization with >> a single flow, because it depends on the congestion control used. For >> loss-based congestion control like Reno or Cubic, this may be true, >> but not necessarily for other congestion controls. > Yes, I should have hedged that better. 
> For protocols like the ubiquitous TCP CUBIC (seems to be used by most major operating systems nowadays) a single flow might need BDP buffering to get close to 100% utilization. I am not wanting to say CUBIC is more important than other protocols, but it still represents a significant share of internet traffic. And any scheme to counter bufferbloat could do worse than accept that reality and allow for sufficient buffering to allow such protocols acceptable levels of utilization (all the while keeping the latency under load increase under control). I didn't get what you were trying to say with your last sentence. My point was that the BDP rule of thumb was tied to a specific type of congestion control and that the buffer sizing rule should probably reflect burst absorption requirements (see "good queue" in the CoDel paper) rather than the specifics of congestion controls. CC schemes that try to counter bufferbloat typically suffer in the presence of loss-based congestion controls, because the loss-based schemes only react to loss, and loss requires a full buffer (unless an AQM is in place), which causes queuing delay. Regards, Roland ^ permalink raw reply [flat|nested] 26+ messages in thread
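Roland's "good queue" reference is the CoDel paper's distinction: a burst that drains within an interval is fine, while a sojourn time that stays above a small target for a whole interval marks a standing queue. A condensed sketch of just that test (real CoDel adds a control law that tightens the spacing of successive drops):

    TARGET = 0.005    # 5 ms: acceptable standing delay
    INTERVAL = 0.100  # 100 ms: how long a burst may take to drain

    def should_drop(sojourn_s, now_s, state):
        """state is a dict holding 'deadline', the time the queue must drain by."""
        if sojourn_s < TARGET:
            state["deadline"] = None              # burst drained: good queue
            return False
        if state["deadline"] is None:
            state["deadline"] = now_s + INTERVAL  # start the clock
            return False
        return now_s >= state["deadline"]         # still bloated: bad queue

    state = {"deadline": None}
    print(should_drop(0.002, 0.00, state))  # False: below target
    print(should_drop(0.020, 0.01, state))  # False: burst gets one interval
    print(should_drop(0.020, 0.15, state))  # True: standing queue detected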
* Re: [Bloat] Questions for Bufferbloat Wikipedia article 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown ` (2 preceding siblings ...) 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller @ 2021-04-06 18:54 ` Neil Davies 3 siblings, 0 replies; 26+ messages in thread From: Neil Davies @ 2021-04-06 18:54 UTC (permalink / raw) To: Rich Brown; +Cc: bloat It should be noted that a) and b) are related by the “service rate” - if you want to look at how to measure “unhappiness” in networking you might want to look at the Broadband Forum’s work on “Quality Attenuation” discussed in TR-452.1 [1] where there is a formal definition and calculus for it… Neil [1] https://www.broadband-forum.org/download/TR-452.1.pdf > On 5 Apr 2021, at 13:46, Rich Brown <richb.hanover@gmail.com> wrote: > > Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat) > > Before I get into it, I want to ask real experts for some guidance... Here goes: > > 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) > > a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.) > > b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) > > c) Or some other definition? > > 2) All network equipment can be bloated. I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center. Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment? Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications? Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact? > > 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices? > > 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article? Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport... > > Many thanks! > > Rich > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 26+ messages in thread
Thread overview: 26+ messages -- 2021-04-05 12:46 [Bloat] Questions for Bufferbloat Wikipedia article Rich Brown 2021-04-05 15:13 ` Stephen Hemminger 2021-04-05 15:24 ` David Lang 2021-04-05 15:57 ` Dave Collier-Brown 2021-04-05 16:25 ` Kelvin Edmison 2021-04-05 18:00 ` [Bloat] Questions for Bufferbloat Wikipedia article - question #2 Rich Brown 2021-04-05 18:08 ` David Lang 2021-04-05 20:30 ` Erik Auerswald 2021-04-05 20:36 ` Dave Taht 2021-04-05 21:49 ` [Bloat] Questions for Bufferbloat Wikipedia article Sebastian Moeller 2021-04-05 21:55 ` Dave Taht 2021-04-06 0:47 ` Erik Auerswald 2021-04-06 6:31 ` Sebastian Moeller 2021-04-06 18:50 ` Erik Auerswald 2021-04-06 20:02 ` Bless, Roland (TM) 2021-04-06 21:59 ` Erik Auerswald 2021-04-06 23:32 ` Stephen Hemminger 2021-04-06 23:54 ` David Lang 2021-04-07 11:06 ` Bless, Roland (TM) 2021-04-27 1:41 ` Dave Taht 2021-04-27 7:25 ` Bless, Roland (TM) 2021-04-06 20:01 ` Bless, Roland (TM) 2021-04-06 21:30 ` Sebastian Moeller 2021-04-06 21:36 ` Jonathan Morton 2021-04-07 10:39 ` Bless, Roland (TM) 2021-04-06 18:54 ` Neil Davies