From: dpreed@reed.com
To: dpreed@reed.com
Cc: "Ketan Kulkarni" <ketkulka@gmail.com>,
"Jonathan Morton" <chromatix99@gmail.com>,
"cerowrt-devel@lists.bufferbloat.net"
<cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] trying to make sense of what switch vendors say wrt buffer bloat
Date: Mon, 6 Jun 2016 22:58:49 -0400 (EDT)
Message-ID: <1465268329.938313737@apps.rackspace.com>
In-Reply-To: <1465267957.902610235@apps.rackspace.com>
Even better, it would be fun to get access to an Arista switch and some high performance TCP sources and sinks, and demonstrate extreme bufferbloat compared to a small-buffer switch. Just a demo, not a simulation full of assumptions and guesses.
RRUL, basically.
On Monday, June 6, 2016 10:52pm, dpreed@reed.com said:
So did anyone write a response debunking their paper? Their NS-2 simulation is most likely the erroneous part of their analysis - the white paper would not pass a review by qualified referees because there is no way to check their results and some of what they say beggars belief.
Bechtolsheim is one of those guys who can write any damn thing and it becomes "truth" - mostly because he co-founded Sun. But that doesn't mean that he can't make huge errors - any of us can.
The so-called TCP/IP Bandwidth Capture effect that he refers to doesn't sound like any capture effect I've ever heard of. There is an "Ethernet Capture Effect" (which is cited), which is due to properties of CSMA/CD binary exponential backoff, not anything to do with TCP's flow/congestion control. So it has that "truthiness" that makes glib people sound like they know what they are talking about, but I'd like to see a reference that says this is a property of TCP!
What's interesting is that the reference to the Ethernet Capture Effect in that white paper proposes a solution that involves changing the backoff algorithm slightly at the Ethernet level - NOT increasing buffer size!
Another thing that would probably improve matters a great deal would be to drop/ECN-mark packets when a contended output port on an Arista switch develops a backlog. This will throttle TCP sources sharing the path.
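As a rough illustration of that drop/ECN-mark idea, here is a minimal sketch of a single output-port queue that marks ECN-capable packets once the backlog passes a threshold, and drops packets that cannot be marked or that would overflow the buffer. The class names, the fixed byte threshold, and the simple tail-drop policy are all assumptions made for illustration; this does not describe how any Arista switch behaves or is configured.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet record used only for this sketch."""
    size_bytes: int
    ecn_capable: bool = True
    ce_marked: bool = False


class OutputPortQueue:
    """Toy model of one contended output port (illustrative only)."""

    def __init__(self, mark_threshold_bytes: int, capacity_bytes: int):
        self.mark_threshold_bytes = mark_threshold_bytes
        self.capacity_bytes = capacity_bytes
        self.backlog_bytes = 0
        self.queue = deque()

    def enqueue(self, pkt: Packet) -> bool:
        """Return True if the packet was queued, False if it was dropped."""
        # Tail drop when the packet would overflow the buffer.
        if self.backlog_bytes + pkt.size_bytes > self.capacity_bytes:
            return False
        # A backlog has developed: signal congestion to the sender.
        if self.backlog_bytes >= self.mark_threshold_bytes:
            if pkt.ecn_capable:
                pkt.ce_marked = True   # mark instead of dropping
            else:
                return False           # non-ECN traffic is dropped instead
        self.queue.append(pkt)
        self.backlog_bytes += pkt.size_bytes
        return True

    def dequeue(self):
        """Remove and return the packet at the head of the queue, or None."""
        if not self.queue:
            return None
        pkt = self.queue.popleft()
        self.backlog_bytes -= pkt.size_bytes
        return pkt


# Five 1500-byte packets arrive back to back with no departures.
port = OutputPortQueue(mark_threshold_bytes=3000, capacity_bytes=6000)
packets = [Packet(1500) for _ in range(5)]
accepted = [port.enqueue(p) for p in packets]
marked = [p.ce_marked for p in packets]
```

In this toy run the first two packets pass untouched, the next two are queued with a congestion mark, and the last is dropped. A real AQM would typically act on queueing delay rather than a fixed byte count.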
The comments in the white paper claiming that ACK contention in the reverse direction is what causes the so-called "TCP/IP Bandwidth Capture effect" (a term the authors appear to have invented) look like hogwash of the first order.
Debunking Bechtolsheim credibly would get a lot of attention to the bufferbloat cause, I suspect.
On Monday, June 6, 2016 5:16pm, "Ketan Kulkarni" <ketkulka@gmail.com> said:
some time back they had this whitepaper -
"Why Big Data Needs Big Buffer Switches"
http://www.arista.com/assets/data/pdf/Whitepapers/BigDataBigBuffers-WP.pdf
the type of apps they talk about is big data, hadoop etc
On Mon, Jun 6, 2016 at 11:37 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
On Mon, 6 Jun 2016, Jonathan Morton wrote:
At 100ms buffering, their 10Gbps switch is effectively turning any DC it’s installed in into a transcontinental Internet path, as far as peak latency is concerned. Just because RAM is cheap these days…

Nono, nononononono. I can tell you they're spending serious money on inserting this kind of buffering memory into these kinds of devices. Buying these devices without deep buffers is a lot lower cost.
These types of switch chips either have on-die memory (usually 16MB or less), or they have very expensive (a direct cost of lowered port density) off-chip buffering memory.
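To put rough numbers on the figures quoted above, the following sketch computes how much memory 100 ms of buffering implies on a single 10 Gbps port, and how long a 16 MB buffer takes to drain at that rate. It assumes decimal units (1 MB = 10^6 bytes), one port draining at full line rate, and no framing overhead; the function names are made up for this example.

```python
def buffer_bytes_for_delay(link_bps: int, delay_ms: int) -> int:
    """Bytes of buffer needed to hold delay_ms worth of traffic at link_bps."""
    # bits = link_bps * delay_ms / 1000; bytes = bits / 8
    return link_bps * delay_ms // 8000


def drain_time_us(buffer_bytes: int, link_bps: int) -> int:
    """Microseconds needed to drain a full buffer at link_bps."""
    return buffer_bytes * 8 * 1_000_000 // link_bps


LINK_BPS = 10_000_000_000  # 10 Gbps port

deep_buffer = buffer_bytes_for_delay(LINK_BPS, 100)  # 100 ms of buffering
on_die_drain = drain_time_us(16_000_000, LINK_BPS)   # 16 MB on-die buffer

print(f"100 ms at 10 Gbps = {deep_buffer // 1_000_000} MB")  # 125 MB
print(f"16 MB at 10 Gbps drains in {on_die_drain / 1000} ms")  # 12.8 ms
```

Under those assumptions, 100 ms of buffering on one port is about 125 MB, roughly eight times the 16 MB on-die figure, which itself already holds about 12.8 ms of traffic.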
Typically you do this:
ports ---|-------
ports ---|      |
ports ---| chip |
ports ---|-------

Or you do this

ports ---|------|---buffer
ports ---| chip |---TCAM
         --------

or if you do a multi-linecard-device

ports ---|------|---buffer
         | chip |---TCAM
         --------
             |
       switch fabric
(or any variant of them)
So basically if you want to buffer and if you want large L2-L4 lookup tables, you have to sacrifice ports. Sacrifice lots of ports.
So never say these kinds of devices add buffering because RAM is cheap. This is most definitely not why they're doing it. Buffer memory for them is EXTREMELY EXPENSIVE.
--
Mikael Abrahamsson email: swmike@swm.pp.se
_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel