[Cerowrt-devel] trying to make sense of what switch vendors say wrt buffer bloat

Jonathan Morton chromatix99 at gmail.com
Mon Jun 6 13:46:29 EDT 2016


> On 6 Jun, 2016, at 19:53, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
> 
>> Buffer bloat was relevant on 10/100M switches, not 10Gb switches. At
>> 10Gb we can empty the queue in ~100ms, which is less than the TCP
>> retransmission timers, therefore no bloat. Buffer bloat can happen at
>> slower speeds, but not an issue at the speeds we have on our switches.
> 
> 100 ms of buffering at 10 Gbps? Holy cow!
> 
> There's no agreed-upon definition of what exactly constitutes 'bloat',
> and it really depends on the application. As such, I'm not surprised
> that this is the kind of answer you get if you ask "do your switches
> suffer from bufferbloat". A better question would be "how much buffer
> latency can your switches add to my traffic" - which they offer here.
> 
> If I read the answer right, anytime you have (say) two ingress ports
> sending traffic at full speed out one egress port, that traffic will be
> queued for 100 ms. I would certainly consider that broken, but well,
> YMMV depending on what you need them for...

In a switch, which I have to assume will be used in a LAN or DC context, I would consider 1ms buffering to be a sane value - regardless of link speed.  At 10Gbps this still works out to roughly 1.25MB of buffer per port.

At 10Mbps this requirement corresponds to roughly a single packet per port; I would tolerate an increase to 10ms (about 8 full-size packets) in that specific case, purely to reduce packet loss from typical packet-pair transmission characteristics.  The same buffer size should therefore suffice for 10Mbps and 100Mbps Ethernet, since 1ms at 100Mbps is the same number of bytes as 10ms at 10Mbps.
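
The arithmetic behind these figures is just link rate multiplied by queueing delay.  A minimal sketch, assuming a 1500-byte full-size frame and ignoring Ethernet framing overhead (the function name and the constant are illustrative, not from any switch vendor):

```python
FRAME_BYTES = 1500  # assumed full-size frame, framing overhead ignored


def buffer_bytes(link_bps, delay_ms):
    """Bytes of queue needed to hold delay_ms of traffic at link_bps."""
    return link_bps * delay_ms / 8000  # bits/s * ms -> bytes


cases = [
    ("10Mbps, 1ms", 10_000_000, 1),
    ("10Mbps, 10ms", 10_000_000, 10),
    ("100Mbps, 1ms", 100_000_000, 1),
    ("10Gbps, 1ms", 10_000_000_000, 1),
]

for label, bps, ms in cases:
    size = buffer_bytes(bps, ms)
    print(f"{label}: {size:,.0f} bytes, {size / FRAME_BYTES:.1f} full-size packets")
```

The 10Mbps/10ms and 100Mbps/1ms rows come out identical, which is why one buffer size can serve both speeds.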

Their reference to TCP retransmission timers betrays both a fundamental misunderstanding of how TCP works and an ignorance of the fact that non-TCP traffic is also important (and is typically more latency sensitive).  Some customers would consider even 1ms to be glacially slow.

At 100ms buffering, their 10Gbps switch is effectively turning any DC it’s installed in into a transcontinental Internet path, as far as peak latency is concerned.  Just because RAM is cheap these days…
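
To put rough numbers on that comparison, here is a small sketch.  It assumes light covers about 200km per millisecond in optical fibre (roughly two thirds of its vacuum speed) and ignores all other sources of delay; the function names are illustrative only:

```python
# Assumption: propagation speed in fibre is about 200 km per millisecond.
FIBRE_KM_PER_MS = 200


def buffer_megabytes(link_bps, delay_ms):
    """Queue size in MB (10^6 bytes) holding delay_ms of traffic at link_bps."""
    return link_bps * delay_ms / 8000 / 1e6


def equivalent_path_km(queue_delay_ms):
    """One-way fibre distance whose round-trip time equals the queue delay."""
    return queue_delay_ms * FIBRE_KM_PER_MS / 2


print(buffer_megabytes(10_000_000_000, 100), "MB of queue per congested port")
print(equivalent_path_km(100), "km of fibre with the same round-trip time")
```

Under those assumptions a full 100ms queue adds as much delay as a round trip over a path several thousand kilometres long, whereas propagation delay inside a DC is measured in microseconds.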

For anything above switch class (i.e. with visibility at Layer 3 rather than 2), I would consider AQM mandatory to support a claim of “unbloated”.  Even if it’s just WRED; it’s not considered a *good* AQM by today’s standards, but it beats a dumb FIFO hands down.
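
For readers unfamiliar with it, the core of RED (which WRED extends with per-class thresholds) can be sketched as below.  This is a simplified illustration, not any vendor's implementation: it omits the inter-drop count adjustment of the original algorithm, and every threshold, weight and class name here is a made-up example value.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability ramps linearly from 0 to max_p between the thresholds."""
    if avg_queue < min_th:
        return 0.0  # short queue: never drop
    if avg_queue >= max_th:
        return 1.0  # queue too long: drop everything
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# WRED: one (min_th, max_th, max_p) profile per traffic class.
# Hypothetical example profiles, in packets.
PROFILES = {
    "best_effort": (20, 40, 0.10),
    "priority": (35, 40, 0.05),
}

avg = 30.0  # example averaged queue length in packets
for traffic_class, (min_th, max_th, max_p) in PROFILES.items():
    p = red_drop_probability(avg, min_th, max_th, max_p)
    print(f"{traffic_class}: drop probability {p:.3f} at average queue {avg}")
```

The point of contrast with a dumb FIFO is that drops begin early and probabilistically, so senders back off before the buffer is full rather than after.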

 - Jonathan Morton