[Cerowrt-devel] trying to make sense of what switch vendors say wrt buffer bloat

Dave Taht dave.taht at gmail.com
Tue Jun 7 10:46:46 EDT 2016


On Tue, Jun 7, 2016 at 3:46 AM, Mikael Abrahamsson <swmike at swm.pp.se> wrote:
> On Mon, 6 Jun 2016, dpreed at reed.com wrote:
>
>> Even better, it would be fun to get access to an Arista switch and some
>> high performance TCP sources and sinks, and demonstrate extreme bufferbloat
>> compared to a small-buffer switch.  Just a demo, not a simulation full of
>> assumptions and guesses.

In terms of doing this at low cost, we can pretty easily set up a Linux
box nowadays that can forward at 10GigE using Mellanox hardware.

In terms of finding a (set of) cheap 10GigE capable switches, the
needed investment looks to be in the 20k range to buy one. (?) That is
essentially more than the entire cerowrt hw budget for the past 5
years....

>
> It can rightfully be argued that we don't need 100ms worth of
> buffering, but here it actually is kind of correct to say "ram is cheap":
> as soon as you go for off-chip RAM, it's cheap.
>
> So these vendors have two choices:
>
> 1. 8-16MB on-chip buffer.
> 2. External RAM
>
> If you choose the external RAM one, you might as well put a lot of RAM
> there, and give the option to the customer to configure the port buffer
> settings any way they want.
>
> For the on-chip small buffer one, take 80 10GE ports all sharing 8
> megabytes of buffer. Say 10 ports are congesting, so each gets 800
> kilobytes of buffer, and each port drains at 1.25 gigabytes/s: that's
> 0.64ms worth of buffer per congested port (I hope I got my math
> right). That is just too little unless you control the TCP stacks of the
> clients, and are just doing low-RTT communication.
>
> So while I'd admit that 100ms worth of FIFO is too much, what needs to
> happen now is to have them configured to do something clever, aiming
> never to have prolonged use of more than a few ms worth of buffer.
>
> It's hard to do AQM with half a millisecond worth of buffer, right?
>
> At least this has been shown by the previous generation of datacenter
> switches, which had minuscule buffers: ISPs tried to use them, and when
> there were microbursts there was uncontrolled packet loss.
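
The per-port arithmetic quoted above can be checked with a short
sketch. It assumes decimal megabytes (8 MB = 8e6 bytes) and an even
split of the shared buffer among the congested ports, as the quoted
example does; the function name is purely illustrative.

```python
def buffer_time_ms(shared_buffer_bytes, congested_ports, line_rate_bits_per_s):
    """Milliseconds needed to drain one congested port's share of a shared buffer.

    Assumes the buffer is split evenly among the congested ports
    (a simplification taken from the quoted example).
    """
    per_port_bytes = shared_buffer_bytes / congested_ports
    drain_rate_bytes_per_s = line_rate_bits_per_s / 8  # 10 Gbit/s -> 1.25 GB/s
    return per_port_bytes / drain_rate_bytes_per_s * 1000


# 8 MB shared buffer, 10 congested ports, 10GE line rate
ms = buffer_time_ms(8e6, 10, 10e9)
print(f"{ms:.2f} ms of buffering per congested port")
```

With those inputs the result matches the 0.64ms figure in the quoted
text. Changing congested_ports shows how quickly the per-port share
shrinks as more ports congest.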

Of possible interest, Measurement Lab encountered and thoroughly
debugged a microburst problem across their backbones from last year.
This is a good read, although I wish the graphs were more directly
comparable.

https://www.measurementlab.net/publications/SwitchDiscardNotice-Final-20160525.pdf
>
> --
> Mikael Abrahamsson    email: swmike at swm.pp.se



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

