Message-ID: <201312152056.rBFKuLFq021776@bagheera.jungle.bt.co.uk>
Date: Sun, 15 Dec 2013 20:56:21 +0000
To: Naeem Khademi
From: Bob Briscoe
In-Reply-To: <655C07320163294895BBADA28372AF5D14C5DF@FR712WXCHMBA15.zeu.alcatel-lucent.com>
References: <33D65DCB-B4CC-4D2C-8ED9-E3685AF7D820@gmail.com>
 <655C07320163294895BBADA28372AF5D14C5DF@FR712WXCHMBA15.zeu.alcatel-lucent.com>
Cc: bloat Mainlinglist
Subject: Re: [Bloat] What is a good burst?
 -- AQM evaluation guidelines
List-Id: General list for discussing Bufferbloat

Naeem,

You don't need to go through a BDP calculation if calibrating queue length in time units - you can take the burst size directly as the RTT (otherwise you multiply RTT by link bandwidth to get BDP, then just divide again by link bandwidth to get burst size in time).

It's important to use an RTT at the high end of the expected range, otherwise TCP flows with significantly higher RTT can get very poor performance.

Having to pick a compromise RTT value is not ideal, because for the public Internet you will typically need to assume a worst-case RTT of transcontinental proportions (c. 200ms), which is why it is recommended to set 'interval' to 100ms in 'codel' and 'max_burst' to 100ms in PIE. However, most flows nowadays terminate at a CDN with RTT ~20ms. So, having configured the AQM to allow for 100ms RTT, every time the queue fills from idle, it will be delaying any loss signals for about 5 CDN-RTTs.

It's an even tougher compromise if your AQM is within your host, which has to support the full range of RTTs from 200ms transcontinental to <2ms across your LAN (whether campus, enterprise or home - e.g. with your media server). Then, you have to configure the AQM to absorb 100ms bursts, so it will delay signals to LAN flows for ~50 of their RTTs. In this time, a multi-round-trip LAN TCP flow will have pushed the queue into tail-drop, long before the AQM has responded.

The notion of a 'good burst' is only necessary if using drop as the signal though. For ECN-capable packets, we have been experimenting with shifting absorption of bursts from the network to L4 in the host (which knows its own RTT).
We don't wait at all for a burst to persist before sending ECN signals, and then smooth out RTT-length bursts of ECN signals in the congestion avoidance algorithm in the transport. Also, during slow-start the transport doesn't need to smooth out the bursts at all, so it gets the signal within 1 RTT and it can respond immediately.

See and the thread that has just been discussing this:
"[aqm] Text for aqm-recommendation on independent ECN config"

Bob

At 15:16 15/12/2013, Scharf, Michael (Michael) wrote:
>There are ongoing discussions in the IETF TCPM working group on
>sender-side pacing:
>
>http://www.ietf.org/mail-archive/web/tcpm/current/msg08167.html
>
>http://www.ietf.org/proceedings/88/slides/slides-88-tcpm-9.pdf
>
>Insight or contributions from further implementers would be highly welcome.
>
>Michael
>
>
>________________________________________
>From: bloat-bounces@lists.bufferbloat.net
>[bloat-bounces@lists.bufferbloat.net] on behalf of
>Jonathan Morton [chromatix99@gmail.com]
>Sent: Sunday, 15 December 2013 13:26
>To: Naeem Khademi
>Cc: bloat Mainlinglist
>Subject: Re: [Bloat] What is a good burst? -- AQM evaluation guidelines
>
>On 15 Dec, 2013, at 7:35 am, Naeem Khademi wrote:
>
> > the question remains: "what is a good burst (size) that AQMs should allow?"
>
>The ideal size of a TCP congestion window - which limits the size of
>a burst on a TCP flow - is equal to the natural bandwidth-delay
>product for the flow. That involves the available bandwidth and the
>natural RTT delay - i.e. without added queueing delay.
>
>Codel operates on this basis, making an assumption about typical RTT
>delays, and permitting queue residency to temporarily rise to that
>value without initiating marking operations. A larger burst would
>be evidence of a congestion window that is too large, or an overall
>sending rate that exceeds the bandwidth at the link the codel queue
>controls. A persistent queue is always taken as evidence of the latter.
>
>In a datacentre or on a LAN, natural RTT delays are much shorter
>(microseconds) than on the general Internet (milliseconds) -
>conversely, available bandwidth is typically much higher (Gbps vs.
>Mbps). The two factors approximately cancel out, so the
>bandwidth-delay product remains roughly the same in typical cases -
>although, of course, atypical cases such as satellite links (seconds
>of latency) and major backbones (extreme aggregate bandwidth and
>Internet-scale delays) also exist. However, RTT is more consistent
>between installations than bandwidth is (factor of ten difference in
>typical range of ADSL link speeds, factor of a hundred in WiFi), so
>Codel uses a time basis rather than a byte-count basis for
>regulation, and is by default tuned for typical overland Internet latencies.
>
>Fq_codel, as with other FQ-type qdiscs, tends to improve pacing when
>multiple flows are present, by interleaving packets from different
>queued bursts. Pacing is the general absence of bursts, and can be
>implemented at source by a TCP sender that spreads packets within a
>congestion window through an interval of time corresponding to the
>measured RTT. AFAIK, very few TCP implementations actually do this,
>probably due to a desire to avoid interrupt overheads (the CPU would
>have to be woken up by a timer for each packet). It strikes me as
>feasible for NIC hardware to take on some of this burden.
>
> - Jonathan Morton
>
>_______________________________________________
>Bloat mailing list
>Bloat@lists.bufferbloat.net
>https://lists.bufferbloat.net/listinfo/bloat

________________________________________________________________
Bob Briscoe, BT
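
A rough illustration of the arithmetic in the reply at the top of this message, as a minimal Python sketch. The function names and the 10 Mb/s link rate are hypothetical, chosen only for illustration; the 100ms burst allowance and the 200ms, 20ms and 2ms RTTs are the figures quoted in the message. It only restates the division, and is not a model of how codel or PIE behave.

```python
def burst_time_ms(rtt_ms: float, link_mbps: float) -> float:
    """Burst allowance in time units, computed the long way round via the BDP."""
    bdp_kbit = rtt_ms * link_mbps  # ms * Mb/s = kilobits (the BDP)
    return bdp_kbit / link_mbps    # dividing by the same rate just returns the RTT


def rtts_before_signal(allowance_ms: float, flow_rtt_ms: float) -> float:
    """How many of a flow's own RTTs fit inside the AQM's burst allowance."""
    return allowance_ms / flow_rtt_ms


if __name__ == "__main__":
    allowance_ms = 100.0  # 'interval' / 'max_burst' value quoted in the message

    # The link rate cancels out, so the allowance in time is simply the RTT.
    print("burst allowance via BDP (ms):", burst_time_ms(allowance_ms, 10.0))

    # Flow RTTs quoted in the message: transcontinental, CDN, LAN.
    for label, rtt_ms in [("transcontinental", 200.0), ("CDN", 20.0), ("LAN", 2.0)]:
        n = rtts_before_signal(allowance_ms, rtt_ms)
        print(f"{label} flow ({rtt_ms:g} ms RTT): allowance spans {n:g} of its RTTs")
```

The link rate drops out of the first function entirely, which is the point made about skipping the BDP calculation; the second function gives the "about 5" and "~50" figures for the CDN and LAN cases.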