From: "Richard Scheffenegger"
To: "Jonathan Morton", "Fred Baker"
Cc: bloat@lists.bufferbloat.net
Date: Mon, 16 May 2011 09:51:19 +0200
Subject: Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss)

Jonathan,

> What I'd like to see is a complete absence of need for retransmission on a
> properly built wired network.
> Obviously the capability still needs to be there to cope with the parts
> that aren't properly built or aren't wired, but TCP can do that.
> Throttling (in the form of Ethernet PAUSE) is simply the third possible
> method of signalling congestion in the network, alongside delay and loss -
> and it happens to be quite widely deployed already.

Two comments:

First, TCP currently can NOT deal properly with non-congestion loss. In other
words, any loss leads to a congestion control reaction (a reduction of the
sending rate). TCP can only (mostly) deal with the recovery part, in a
hopefully timely fashion. In this area you'll find a large number of possible
approaches, none of which is quite backwards-compatible with "standard" TCP.

Second, you wouldn't want to deploy basic 802.3x in any network consisting of
more than a single switch. If you do, you can run into an effect called
congestion tree formation, where (simplified) the slowest receiver determines
the global speed of your Ethernet network.

802.1Qbb is also prone to congestion trees, even though the probability is
somewhat reduced provided all priority classes are being used. Unfortunately,
most traffic is in the same 802.1p class...

Adequate solutions (more complex than the FCP buffer-credit based congestion
avoidance) like 802.1Qau / QCN are not available commercially afaik. (They
need new NICs + new switches for the HW support.)

But I agree, an L3 device should be able to distribute L2 congestion
information into the L3 header (even though today, cheap generic Broadcom and
perhaps even Realtek chipsets support ECN marking even when they are running
as an L2 switch; a special firmware (see the DCTCP papers) is required
though).

Best regards,
   Richard
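
The congestion tree effect described above can be illustrated with a small toy
model. This is only a sketch under loudly stated assumptions, not a model of
any real switch: two flows ("fast" and "slow") share one inter-switch link
into a switch with a single shared buffer; the function name `simulate` and
the parameters `slow_period`, `capacity` and `xoff` are hypothetical and
chosen purely for illustration. With link-level PAUSE the whole link stops
once the buffer reaches `xoff`, so the fast flow is throttled along with the
slow one; without PAUSE the excess is dropped instead.

```python
def simulate(ticks, use_pause, slow_period=4, capacity=8, xoff=6):
    """Toy model of 802.3x-style link-level PAUSE (illustrative assumptions only).

    Each tick the upstream link offers one frame per flow. The fast receiver
    drains one frame per tick; the slow receiver drains one frame every
    `slow_period` ticks. Returns (delivered_fast, delivered_slow, dropped).
    """
    q_fast = q_slow = 0
    delivered_fast = delivered_slow = dropped = 0
    for t in range(ticks):
        # PAUSE stops the entire upstream link, regardless of destination.
        paused = use_pause and (q_fast + q_slow) >= xoff
        if not paused:
            for flow in ("fast", "slow"):
                if q_fast + q_slow >= capacity:
                    dropped += 1          # shared buffer full: frame is lost
                elif flow == "fast":
                    q_fast += 1
                else:
                    q_slow += 1
        # Egress towards the two receivers.
        if q_fast > 0:
            q_fast -= 1
            delivered_fast += 1
        if q_slow > 0 and t % slow_period == 0:
            q_slow -= 1
            delivered_slow += 1
    return delivered_fast, delivered_slow, dropped


if __name__ == "__main__":
    for use_pause in (False, True):
        fast, slow, lost = simulate(1000, use_pause)
        print(f"pause={use_pause}: fast={fast} slow={slow} dropped={lost}")
```

In this toy setup, enabling PAUSE removes the drops but pulls the fast flow
down to roughly the slow receiver's rate, which is the "slowest receiver
determines the global speed" behaviour in miniature.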