[Bloat] Questions for Bufferbloat Wikipedia article

Sebastian Moeller moeller0 at gmx.de
Mon Apr 5 17:49:00 EDT 2021


Hi Rich,

all good questions, and interesting responses so far.


> On Apr 5, 2021, at 14:46, Rich Brown <richb.hanover at gmail.com> wrote:
> 
> Dave Täht has put me up to revising the current Bufferbloat article on Wikipedia (https://en.wikipedia.org/wiki/Bufferbloat)
> 
> Before I get into it, I want to ask real experts for some guidance... Here goes:
> 
> 1) What is *our* definition of Bufferbloat? (We invented the term, so I think we get to define it.) 
> 
> a) Are we content with the definition from the bufferbloat.net site, "Bufferbloat is the undesirable latency that comes from a router or other network equipment buffering too much data." (This suggests bufferbloat is latency, and could be measured in seconds/msec.)
> 
> b) Or should we use something like Jim Gettys' definition from the Dark Buffers article (https://ieeexplore.ieee.org/document/5755608), "Bufferbloat is the existence of excessively large (bloated) buffers in systems, particularly network communication systems." (This suggests bufferbloat is an unfortunate state of nature, measured in units of "unhappiness" :-) 

	I do not even think these are mutually exclusive: "over-sized but under-managed buffers" cause avoidable variable latency, aka jitter, which is the bane of all interactive use-cases. The lower the jitter, the better; jitter can be measured in units of time, but it also acts as "currency" in the unhappiness domain ;). The challenge is that we know absent or too-small buffers cause an undesirable loss of throughput (but low latency under load), while too-large buffers cause an undesirable increase in latency under load (but decent throughput). So the challenge is to get buffering right: keep throughput acceptably high while at the same time keeping latency under load acceptably low...
	The solution basically is large buffers with adaptive management that works hard to keep both the latency increase under load and the throughput inside an acceptable "corridor".
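
	To make that tradeoff concrete, a quick back-of-the-envelope sketch (all numbers are illustrative assumptions, not measurements): the worst-case latency a plain FIFO buffer can add is simply its size divided by the egress link rate.

```python
def max_queue_delay_ms(buffer_bytes: float, link_rate_bps: float) -> float:
    """Worst-case added queueing delay (ms) when the buffer is completely full."""
    return buffer_bytes * 8 / link_rate_bps * 1000

# The same fixed 1 MiB buffer on two different links:
buf = 1024 * 1024
print(max_queue_delay_ms(buf, 1e9))   # ~8.4 ms on a 1 Gbit/s link: harmless
print(max_queue_delay_ms(buf, 10e6))  # ~839 ms on a 10 Mbit/s link: bloat
```

	Which is exactly why one static buffer size cannot be right across link speeds, let alone across varying loads.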


> c) Or some other definition?
> 
> 2) All network equipment can be bloated.

	+1; depending on conditions. Corollary: static buffer sizing is unlikely to be the right answer unless the load is constant...


> I have seen (but not really followed) controversy regarding the amount of buffering needed in the Data Center.

	Conceptually the same as everywhere else: just enough buffering to keep throughput up ;) But for traditional TCPs, e.g., the expected buffer requirement increases with the RTT of a flow, so intra-datacenter flows with their low RTTs will only require relatively small buffers to cope.
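
	A rough sketch of that RTT scaling, using the classic bandwidth-delay-product rule of thumb (the link rate and RTT values below are illustrative assumptions):

```python
def bdp_bytes(link_rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: the classic rule-of-thumb buffer size in bytes."""
    return link_rate_bps * rtt_s / 8

# The same 10 Gbit/s port, intra-datacenter vs. transcontinental RTT:
print(bdp_bytes(10e9, 100e-6))  # ~125 KB for a 100 microsecond RTT
print(bdp_bytes(10e9, 80e-3))   # ~100 MB for an 80 ms RTT
```

	Three orders of magnitude difference in "needed" buffer for the same port, purely from the RTT of the flows crossing it.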


> Is it worth having the Wikipedia article distinguish between Data Center equipment and CPE/home/last mile equipment?

	That depends on our audience, but realistically over-sized but under-managed buffers can and do occur everywhere, so maybe it is better to include all of it?


> Similarly, is the "bloat condition" and its mitigation qualitatively different between those applications?

	IMHO, not really. We have two places to twiddle: the buffer (and how it is managed) and the two endpoints transferring data. Our go-to solution deals with buffer management, but protocols can also help, e.g. by using pacing (spreading out packets based on the estimated throughput) instead of sending in bursts, or by using protocols that are more adaptive to the perceived buffering along a path, like BBR (which, as you surely know, tries to actively measure a path's capacity by regularly sending closely spaced probe packets and measuring the induced latency increase, interpreting too much latency as a sign that the capacity was reached/exceeded).
	Methods at both places are not guaranteed to work hand in hand, though (naive BBR failed to recognize an AQM on the path that keeps latency under load well-bounded, which was noted and fixed in later BBR incarnations), making the whole problem space "a mess".
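
	A minimal sketch of the pacing idea mentioned above (packet size and estimated rate are made-up example values, and real senders pace inside the kernel, not like this): instead of emitting a window of packets back-to-back as a burst, a pacing sender spaces them by packet_size / estimated_rate.

```python
def pacing_schedule(n_packets: int, packet_bytes: int,
                    est_rate_bps: float, start_s: float = 0.0) -> list[float]:
    """Return the send time (seconds) for each packet, spaced at the estimated rate."""
    gap = packet_bytes * 8 / est_rate_bps  # seconds between packet starts
    return [start_s + i * gap for i in range(n_packets)]

# 10 packets of 1500 bytes paced at an estimated 12 Mbit/s:
times = pacing_schedule(10, 1500, 12e6)
print(times[1] - times[0])  # 1 ms between packets instead of one burst
```

	The bottleneck queue then sees at most one packet per gap rather than a whole window at once, so there is far less standing queue for the buffer to absorb.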



> Finally, do any of us know how frequently data centers/backbone ISPs experience buffer-induced latencies? What's the magnitude of the impact?

	I have to pass, -ENODATA ;)

> 
> 3) The Wikipedia article mentions guidance that network gear should accommodate buffering 250 msec of traffic(!) Is this a real "rule of thumb" or just an often-repeated but unscientific suggestion? Can someone give pointers to best practices?

	I am sure that any fixed number will be wrong ;) Some numbers might be worse than others, though.
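
	For illustration, what a fixed "250 ms of buffering" rule translates into at a few arbitrarily chosen link rates: the byte count spans four orders of magnitude, and whenever such a buffer actually fills, it adds the full 250 ms of delay regardless of rate.

```python
RULE_OF_THUMB_S = 0.25  # the oft-quoted 250 ms of buffering

for rate_bps in (1e6, 100e6, 10e9):
    buf_bytes = rate_bps * RULE_OF_THUMB_S / 8
    print(f"{rate_bps / 1e6:>7.0f} Mbit/s -> {buf_bytes / 1e6:.3f} MB of buffer")
```

	So even as a sizing rule it is rate-relative, and as a latency budget it is simply too large for interactive traffic.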


> 
> 4) Meta question: Can anyone offer any advice on making a wholesale change to a Wikipedia article?

	Maybe don't? Instead of doing this in one go, evolve the existing article piece by piece, avoiding the wrong impression of a hostile take-over and allowing for a nicer history of targeted edits?


> Before I offer a fork-lift replacement I would a) solicit advice on the new text from this list, and b) try to make contact with some of the reviewers and editors who've been maintaining the page to establish some bona fides and rapport...

	I guess that if you get buy-in from the current maintainers, a fork-lift upgrade might work...

Best Regards
	Sebastian


> 
> Many thanks!
> 
> Rich
> _______________________________________________
> Bloat mailing list
> Bloat at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat


