[Bloat] Questions for Bufferbloat Wikipedia article - question #2

Mon Apr 5 16:30:41 EDT 2021

Hi,

On Mon, Apr 05, 2021 at 11:08:07AM -0700, David Lang wrote:
> On Mon, 5 Apr 2021, Rich Brown wrote:
> 
> >Next question...
> >
> >>2) All network equipment can be bloated. I have seen (but not
> >>really followed) controversy regarding the amount of buffering
> >>needed in the Data Center. Is it worth having the Wikipedia article
> >>distinguish between Data Center equipment and CPE/home/last mile
> >>equipment? Similarly, is the "bloat condition" and its mitigation
> >>qualitatively different between those applications? Finally, do
> >>any of us know how frequently data centers/backbone ISPs experience
> >>buffer-induced latencies? What's the magnitude of the impact?

I do not have experience with "web scale" data centers or "backbone"
ISPs, but I think I can add related information.

From my work experience with (mostly) enterprise and service provider
networks I would say that bufferbloat effects are relatively rarely
observed there.  Many network engineers do not know about bufferbloat
and do not believe it exists even after being told about it.
I have seen a latency consideration for a country-wide network that
explicitly excluded queuing delays as irrelevant and cited just
propagation and serialization delay as relevant for the end-to-end
latency.  Demonstrating bufferbloat effects with a test setup with
prolonged congestion is usually labeled unrealistic and ignored.
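To illustrate why leaving out queuing delay can matter, here is a
minimal sketch that adds up the three delay components for a single
hop.  All numbers (1000 km of fiber, roughly 200,000 km/s signal speed,
1500-byte packets, a 10 Mb/s link, 100 packets already queued) are
assumptions chosen for illustration, not measurements from any network
mentioned above.

```python
# Sketch: one-way delay components for a single bottleneck hop.
# Every constant below is an illustrative assumption.

def propagation_delay_s(distance_km: float, speed_km_per_s: float = 200_000.0) -> float:
    """Time for a signal to travel the distance (fiber is roughly 2/3 of c)."""
    return distance_km / speed_km_per_s

def serialization_delay_s(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the link."""
    return packet_bytes * 8 / link_bps

def queuing_delay_s(queued_bytes: int, link_bps: float) -> float:
    """Time a newly arrived packet waits for the bytes ahead of it to drain."""
    return queued_bytes * 8 / link_bps

LINK_BPS = 10e6          # assumed 10 Mb/s bottleneck
PACKET_BYTES = 1500      # assumed full-size Ethernet payload
QUEUED_PACKETS = 100     # assumed standing queue under prolonged congestion

prop = propagation_delay_s(1000)
ser = serialization_delay_s(PACKET_BYTES, LINK_BPS)
queue = queuing_delay_s(QUEUED_PACKETS * PACKET_BYTES, LINK_BPS)

print(f"propagation:   {prop * 1000:.1f} ms")
print(f"serialization: {ser * 1000:.1f} ms")
print(f"queuing:       {queue * 1000:.1f} ms")
```

With these assumed values the queuing term is far larger than the other
two combined, which is the effect a latency budget misses when it
counts only propagation and serialization.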

Campus networks and ("small") data centers are usually overprovisioned
with bandwidth and thus do not exhibit prolonged congestion.
Additionally, a lot of enterprise networking gear, specifically
"switches," does not have oversized buffers.

Campus networks more often show problems with too small buffers for a
given application (e.g., cameras streaming data via RTP with large "key
frames" sent at line rate), such that "microbursts" result in packet
drops and thus observable problems even with low bandwidth utilization
over longer time frames (minutes).  The idea that buffers could be too
large does not seem realistic there.
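The microburst case can be sketched with a simple fluid model: a burst
arrives at a faster ingress rate than the egress port can drain, and
whatever backlog exceeds the buffer is dropped.  The key frame size,
port speeds, buffer size, and frame interval below are hypothetical
values for illustration only, and the model ignores packetization and
competing traffic.

```python
# Sketch: fluid model of one burst hitting a small switch buffer.
# All numeric values are illustrative assumptions.

def burst_backlog_bytes(burst_bytes: float, ingress_bps: float, egress_bps: float) -> float:
    """Peak queue depth if the whole burst arrives at ingress rate
    while the egress port drains continuously."""
    if ingress_bps <= egress_bps:
        return 0.0
    return burst_bytes * (1.0 - egress_bps / ingress_bps)

def dropped_bytes(burst_bytes: float, ingress_bps: float,
                  egress_bps: float, buffer_bytes: float) -> float:
    """Bytes that do not fit into the buffer at the peak of the burst."""
    backlog = burst_backlog_bytes(burst_bytes, ingress_bps, egress_bps)
    return max(0.0, backlog - buffer_bytes)

def offered_load_fraction(burst_bytes: float, interval_s: float, egress_bps: float) -> float:
    """Long-term average offered load as a fraction of egress capacity."""
    return burst_bytes * 8 / interval_s / egress_bps

KEY_FRAME_BYTES = 300_000   # assumed key frame size
INGRESS_BPS = 1e9           # assumed camera uplink, sends at line rate
EGRESS_BPS = 100e6          # assumed slower egress port
BUFFER_BYTES = 128_000      # assumed per-port buffer
INTERVAL_S = 2.0            # assumed time between key frames

drops = dropped_bytes(KEY_FRAME_BYTES, INGRESS_BPS, EGRESS_BPS, BUFFER_BYTES)
load = offered_load_fraction(KEY_FRAME_BYTES, INTERVAL_S, EGRESS_BPS)
print(f"dropped per burst: {drops:.0f} bytes")
print(f"average offered load: {load:.1%}")
```

Under these assumptions each key frame loses a large share of its bytes
even though the average load on the egress port stays in the low single
digit percent range.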

"Routers" for the ISP market (not "home routers", but network devices
used inside the ISP's core and aggregation networks and similar) often
do have unreasonably large ("bloated") buffer capacity, but they are usually
operated without persistent congestion.  When persistent congestion
does happen on a customer connection, and bufferbloat does result in
unusably high latency, the customer is often told to send at a lower
rate, but "bufferbloat" is usually not recognized as the root cause,
and thus not addressed.

It seems to me as if "bufferbloat" is most noticeable on the consumer
end of mass market network connections.  I.e., low margin markets with
non-technical customers.

If CAKE behind the access circuit of an end customer can mitigate
bufferbloat, then bufferbloat effects are only visible there and do not
show up in other parts of the network.

> the bandwidth available in datacenters is high enough that it's much
> harder to run into grief there (recognizing that not every piece of
> datacenter equipment is hooked to 100G circuits)

That is my impression as well.

> I think it's best to talk about excessive buffers in terms of time
> rather than bytes, and you can then show the difference between two
> buffers of the same size, one connected to a 10Mb (or 1Mb) DSL upload
> vs 100G datacenter circuit. After that one example, the rest of the
> article can talk about time and it will be globally applicable.

I too think that _time_ is the important unit regarding buffers, even
though they are mostly described in units of data (bytes or packets).
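
As a rough worked example of the comparison suggested above, the time a
full buffer needs to drain is its size in bits divided by the link
rate.  The 256 KiB buffer size is an assumption picked for
illustration; the link rates are the ones named in the quoted text.

```python
# Sketch: the same buffer expressed as time on different links.
# BUFFER_BYTES is an assumed example size, not a figure from any device.

def drain_time_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Worst-case delay added by a full FIFO buffer of the given size."""
    return buffer_bytes * 8 / link_bits_per_second

BUFFER_BYTES = 256 * 1024

LINKS = [
    ("1 Mb/s DSL upload", 1e6),
    ("10 Mb/s DSL upload", 10e6),
    ("100 Gb/s datacenter circuit", 100e9),
]

for name, rate_bps in LINKS:
    delay_ms = drain_time_seconds(BUFFER_BYTES, rate_bps) * 1000
    print(f"{name}: {delay_ms:.3f} ms")
```

The same number of bytes amounts to roughly two seconds of delay on the
1 Mb/s link and about two hundredths of a millisecond on the 100 Gb/s
circuit.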

Thanks,
Erik
-- 
To have our best advice ignored is the common fate of all who take on
the role of consultant, ever since Cassandra pointed out the dangers of
bringing a wooden horse within the walls of Troy.
                        -- C.A.R. Hoare