From: Neil Davies <Neil.Davies@pnsol.com>
Date: Fri, 6 May 2011 12:53:26 +0100
To: Sam Stickland
Cc: Stephen Hemminger, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Burst Loss

On 6 May 2011, at 12:40, Sam Stickland wrote:

> On 5 May 2011, at 17:49, Neil Davies wrote:
>
>> On the issue of loss - we did a study of the UK's ADSL access network back in 2006, over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic.
>>
>> We found that the delay variability (that bit left over after you've taken out the effects of geography and line sync rates) was broadly the same over the half dozen locations we studied - it was there all the time, at the same level of variance - and that what did vary by time of day was the loss rate.
>>
>> We also found out, at the time much to our surprise - but we understand why now - that loss was broadly independent of the offered load; we used a constant data rate (with either fixed or variable packet sizes).
>>
>> We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource).
>>
>> As for burst loss, yes it does occur - but it could be argued that this is more the fault of the sending TCP stack than of the network.
>>
>> This phenomenon was well covered in the academic literature in the '90s (if I remember correctly, folks at INRIA led the way) - it all comes down to the nature of random processes and how you observe them.
>>
>> Back-to-back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec link into a queue being serviced at 34Mbit/sec: the first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue full - the system's state is no longer a random variable, it is known to be full.
>> The second packet (let's assume it is also a full-size one) 'makes an observation' of the state of that queue about 12us later - but that is only about 3% of the time it takes to service such a large packet at 34Mbit/sec. The system has not had any time to 'relax' anywhere near back to its steady state; it is highly likely that it is still full.
>>
>> Fixing this makes a phenomenal difference to the goodput (with the usual delay effects that implies). We've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that mean that end users can sustainably (days on end) achieve effective throughput that is better than 98% of the (transmission-media-imposed) maximum. What we had done is make the network behave closer to the underlying statistical assumptions made in TCP's design.
>
> How did you fix this? What alters the packet spacing? The network or the host?

It is a device in the network; it sits at the 'edge' of the access network (at the ISP / Network Wholesaler boundary) - that resolves the downstream issue.

Neil

> Sam
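
A quick back-of-the-envelope check of the '12us' and '3%' figures above, as a minimal Python sketch. It assumes a 'full' packet is 1500 bytes - the message itself only says 'large'/'full' - so the exact numbers are illustrative rather than definitive:

    # Back-of-the-envelope check of the timing argument in the message above.
    # Assumption (not stated in the message): a "full" packet is 1500 bytes.
    PKT_BITS = 1500 * 8            # one full-size packet, in bits
    LINK_IN = 1e9                  # arrival link: 1 Gbit/s
    LINK_OUT = 34e6                # bottleneck service rate: 34 Mbit/s

    gap = PKT_BITS / LINK_IN       # spacing of two back-to-back arrivals
    service = PKT_BITS / LINK_OUT  # time to drain one such packet at 34 Mbit/s

    print(f"inter-arrival gap: {gap * 1e6:.1f} us")      # ~12.0 us
    print(f"service time:      {service * 1e6:.1f} us")  # ~352.9 us
    print(f"gap / service:     {gap / service:.1%}")     # ~3.4%

    # Spacing successive packets at roughly the service time (~353 us here)
    # gives the 34 Mbit/s queue a chance to drain between 'observations',
    # which is the effect the pacing/wrap approach described above relies on.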