From: Dave Taht
Date: Tue, 7 Jun 2016 07:46:46 -0700
To: Mikael Abrahamsson
Cc: David Reed, Jonathan Morton, "cerowrt-devel@lists.bufferbloat.net"
Subject: Re: [Cerowrt-devel] trying to make sense of what switch vendors say wrt buffer bloat

On Tue, Jun 7, 2016 at 3:46 AM, Mikael Abrahamsson wrote:
> On Mon, 6 Jun 2016, dpreed@reed.com wrote:
>
>> Even better, it would be fun to get access to an Arista switch and some
>> high performance TCP sources and sinks, and demonstrate extreme bufferbloat
>> compared to a small-buffer switch. Just a demo, not a simulation full of
>> assumptions and guesses.

In terms of doing this at low cost, we can pretty easily set up a Linux
box nowadays that can forward at 10GigE using Mellanox hardware.

In terms of finding a (set of) cheap 10GigE-capable switches, the
needed investment looks to be in the 20k range to buy one. (?)

That is essentially more than the entire cerowrt hw budget for the
past 5 years....

>
> So while it can be rightfully argued that we don't need 100ms worth of
> buffering (here it actually is kind of correct to say "ram is cheap" because
> as soon as you go for offchip RAM, it's now cheap).
>
> So these vendors have two choices:
>
> 1. 8-16MB on-chip buffer.
> 2. External RAM
>
> If you choose the external RAM one, you might as well put a lot of RAM
> there, and give the option to the customer to configure the port buffer
> settings any way they want.
>
> For the on-chip small buffer one, having 80 10GE ports,all sharing 8
> megabyte of buffer (let's say 10 ports are congesting, meaning each port
> gets 800kilobytes of buffer) and each port doing 1.25gigabyte/s of data,
> that's 0.64ms worth of buffer per congested port (I hope I got my math
> right). That is just too little unless you control the TCP stacks of the
> clients, and are just doing low-RTT communication.
>
> So while I'd admit that 100ms worth of FIFO is too much, what needs to
> happen now is to have them configured to do something clever and aiming to
> never have prolonged use of more than a few ms worth of buffer.
>
> It's hard to do AQM with half a millisecond worth of buffer, right?
>
> At least this has been shown by previous generation of datacenter switches
> that had miniscule buffers and ISPs tried to use them and when there were
> microbursts there was uncontrolled packet loss.

Of possible interest: Measurement Lab encountered and thoroughly
debugged a microburst problem across their backbones, dating from last
year. This is a good read, although I wish the graphs were more
directly comparable.

https://www.measurementlab.net/publications/SwitchDiscardNotice-Final-20160525.pdf

>
> --
> Mikael Abrahamsson email: swmike@swm.pp.se

-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
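
For anyone wanting to re-check the buffer arithmetic quoted above, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes decimal units (8 megabytes = 8,000,000 bytes, 10GE = 1.25e9 bytes/s) and an even split of the shared buffer across the congested ports. A real switch's buffer-sharing policy may well differ.

```python
# Hypothetical helpers for the shared-buffer arithmetic in the quoted message.
# Assumptions: decimal units, and the shared buffer is divided evenly among
# the ports that are congesting at the same time.

def buffer_time_ms(shared_buffer_bytes, congested_ports, port_rate_bytes_per_s):
    """Milliseconds of buffering available to each congested port."""
    per_port_bytes = shared_buffer_bytes / congested_ports
    return per_port_bytes / port_rate_bytes_per_s * 1000.0


def buffer_bytes_for_ms(target_ms, port_rate_bytes_per_s):
    """Bytes of buffer one port needs to hold target_ms of line-rate traffic."""
    return target_ms / 1000.0 * port_rate_bytes_per_s


TEN_GE_BYTES_PER_S = 10e9 / 8  # 10 Gbit/s expressed in bytes per second

# 8 MB shared buffer, 10 congested 10GE ports (the on-chip case above)
on_chip_ms = buffer_time_ms(8_000_000, 10, TEN_GE_BYTES_PER_S)

# Buffer needed per 10GE port to hold 100 ms at line rate
hundred_ms_bytes = buffer_bytes_for_ms(100, TEN_GE_BYTES_PER_S)

print(f"on-chip: {on_chip_ms:.2f} ms per congested port")
print(f"100 ms at 10GE: {hundred_ms_bytes / 1e6:.0f} MB per port")
```

Under those assumptions the 0.64 ms figure checks out, and 100 ms of buffering at 10GE works out to roughly 125 MB per port, which is why that much buffering only becomes practical with external RAM.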