[Cerowrt-devel] trying to make sense of what switch vendors say wrt buffer bloat

dpreed at reed.com dpreed at reed.com
Mon Jun 6 22:52:37 EDT 2016


So did anyone write a response debunking their paper?   Their NS-2 simulation is most likely the erroneous part of their analysis - the white paper would not pass a review by qualified referees because there is no way to check their results and some of what they say beggars belief.
 
Bechtolsheim is one of those guys who can write any damn thing and it becomes "truth" - mostly because he co-founded Sun. But that doesn't mean that he can't make huge errors - any of us can.
 
The so-called TCP/IP Bandwidth Capture effect that he refers to doesn't sound like any capture effect I've ever heard of.  There is an "Ethernet Capture Effect" (which is cited), which is due to properties of CSMA/CD binary exponential backoff, not anything to do with TCP's flow/congestion control.  So it has that "truthiness" that makes glib people sound like they know what they are talking about, but I'd like to see a reference that says this is a property of TCP!
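For anyone who has not run into the Ethernet Capture Effect, here is a rough sketch of why binary exponential backoff favors the station that won the last collision. It assumes the standard truncated backoff rule (after the n-th consecutive collision a station waits a uniformly random number of slots in 0 .. 2^min(n,10) - 1) and only two contending stations. The function name and its parameters are made up for illustration. Note that nothing in it involves TCP.

```python
from fractions import Fraction


def contention_odds(winner_collisions, loser_collisions, cap=10):
    """Exact odds for one backoff round between two stations.

    winner_collisions: consecutive-collision count of the station that just
                       sent a frame (its counter was reset, so this is small)
    loser_collisions:  count of the station still retrying its old frame
    Returns (P(winner sends first), P(same slot, collide again), P(loser sends first)).
    """
    winner_window = 2 ** min(winner_collisions, cap)
    loser_window = 2 ** min(loser_collisions, cap)
    total = winner_window * loser_window
    # Count slot pairs where the previous winner picks the strictly smaller slot.
    first = sum(1 for a in range(winner_window)
                for b in range(loser_window) if a < b)
    # Both pick the same slot only inside the smaller window.
    tie = min(winner_window, loser_window)
    return (Fraction(first, total),
            Fraction(tie, total),
            Fraction(total - first - tie, total))


# The previous winner is on its 1st collision; the loser's count keeps growing.
for n in range(1, 6):
    print(n, contention_odds(1, n))
```

Each time the loser loses again its window doubles while the winner's window is reset, so the odds keep tilting the same way. That is a property of the MAC-layer backoff counters, which is why the cited fix changes the backoff algorithm.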
 
What's interesting is that the reference to the Ethernet Capture Effect in that white paper proposes a solution that involves changing the backoff algorithm slightly at the Ethernet level - NOT increasing buffer size!
 
Another thing that would probably improve matters a great deal would be to drop/ECN-mark packets when a contended output port on an Arista switch develops a backlog.  This will throttle TCP sources sharing the path.
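To make the drop/ECN-mark idea concrete, here is a minimal sketch of an output port queue that starts signalling congestion once its backlog crosses a threshold. This is not how any Arista switch is implemented. The class names, the single step threshold and the byte values are assumptions chosen for illustration, and real AQM schemes are more refined.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    size_bytes: int
    ecn_capable: bool = True   # sender negotiated ECN for this flow
    ce_marked: bool = False    # Congestion Experienced, set by the switch


class OutputPortQueue:
    """FIFO output queue that signals congestion once a backlog builds."""

    def __init__(self, mark_threshold_bytes, capacity_bytes):
        self.mark_threshold_bytes = mark_threshold_bytes  # hypothetical knob
        self.capacity_bytes = capacity_bytes
        self.backlog_bytes = 0
        self.packets = deque()

    def enqueue(self, pkt):
        """Return True if the packet was queued, False if it was dropped."""
        if self.backlog_bytes + pkt.size_bytes > self.capacity_bytes:
            return False  # buffer full: tail drop
        if self.backlog_bytes >= self.mark_threshold_bytes:
            if not pkt.ecn_capable:
                return False  # no ECN: a drop is the only signal available
            pkt.ce_marked = True
        self.packets.append(pkt)
        self.backlog_bytes += pkt.size_bytes
        return True

    def dequeue(self):
        """Hand the packet at the head of the queue to the link, if any."""
        if not self.packets:
            return None
        pkt = self.packets.popleft()
        self.backlog_bytes -= pkt.size_bytes
        return pkt


# Usage: a small threshold keeps the standing queue short.
port = OutputPortQueue(mark_threshold_bytes=3000, capacity_bytes=6000)
burst = [Packet(1500) for _ in range(5)]
print([port.enqueue(p) for p in burst], [p.ce_marked for p in burst])
```

The marked packets are echoed back by the receivers, so the TCP senders sharing the port slow down before the buffer fills, and the queue stays short regardless of how much memory sits behind it.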
 
The comments in the white paper that say ACK contention in TCP in the reverse direction is the problem that causes the "so-called TCP/IP Bandwidth Capture effect" invented by the authors appear to be hogwash of the first order.
 
Debunking Bechtolsheim credibly would bring a lot of attention to the bufferbloat cause, I suspect.
 


On Monday, June 6, 2016 5:16pm, "Ketan Kulkarni" <ketkulka at gmail.com> said:



some time back they had this whitepaper -
"Why Big Data Needs Big Buffer Switches"

http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
the type of apps they talk about is big data, hadoop etc


On Mon, Jun 6, 2016 at 11:37 AM, Mikael Abrahamsson <swmike at swm.pp.se> wrote:
On Mon, 6 Jun 2016, Jonathan Morton wrote:

At 100ms buffering, their 10Gbps switch is effectively turning any DC it’s installed in into a transcontinental Internet path, as far as peak latency is concerned.  Just because RAM is cheap these days…

Nono, nononononono. I can tell you they're spending serious money on inserting this kind of buffering memory into these kinds of devices. Buying these devices without deep buffers is a lot lower cost.

 These types of switch chips either have on-die memory (usually 16MB or less), or they have very expensive (a direct cost of lowered port density) off-chip buffering memory.
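 For scale, the figures mentioned in this thread can be put side by side with some back-of-the-envelope arithmetic. This is a sketch that assumes a single 10 Gbps port draining the whole buffer and 1 MB = 10^6 bytes. The function names are made up for illustration.

```python
def buffer_bytes(link_bps, delay_ms):
    """Bytes of buffer needed to hold delay_ms worth of traffic at line rate."""
    return link_bps * delay_ms // 8000  # bits/s * ms -> bytes


def drain_time_ms(buffer_size_bytes, link_bps):
    """Worst-case queueing delay of a full buffer drained at line rate."""
    return buffer_size_bytes * 8 * 1000 / link_bps


LINK_BPS = 10_000_000_000  # 10 Gbps

# Buffer implied by 100 ms of buffering on one 10 Gbps port, in MB.
print(buffer_bytes(LINK_BPS, 100) / 1e6)

# Delay a full 16 MB on-die buffer can add on the same port, in ms.
print(drain_time_ms(16_000_000, LINK_BPS))
```

 Under these assumptions, 100 ms at 10 Gbps works out to 125 MB for one port, several times what fits on-die, while a full 16 MB buffer adds about 12.8 ms.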

 Typically you do this:

 ports ---|-------
 ports ---|      |
 ports ---| chip |
 ports ---|-------

 Or you do this

 ports ---|------|---buffer
 ports ---| chip |---TCAM
          --------

 or if you do a multi-linecard-device

 ports ---|------|---buffer
          | chip |---TCAM
          --------
             |
         switch fabric

 (or any variant of them)

 So basically if you want to buffer and if you want large L2-L4 lookup tables, you have to sacrifice ports. Sacrifice lots of ports.

 So never say these kinds of devices add buffering because RAM is cheap. This is most definitely not why they're doing it. Buffer memory for them is EXTREMELY EXPENSIVE.

 -- 
 Mikael Abrahamsson    email: swmike at swm.pp.se
