From: Neil Davies
To: Stephen Hemminger
Cc: bloat@lists.bufferbloat.net
Date: Thu, 5 May 2011 17:49:18 +0100
Subject: [Bloat] Burst Loss

On the issue of loss - we did a study of the UK's ADSL access network back in 2006, over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic.

We found that the delay variability (the bit left over after you've taken out the effects of geography and line sync rates) was broadly the same over the half dozen locations we studied - it was there all the time, at the same level of variance - and that what did vary by time of day was the loss rate.

We also found, much to our surprise at the time - though we understand why now - that loss was broadly independent of the offered load; we used a constant data rate (with either fixed or variable packet sizes).

We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource).

As for burst loss: yes, it does occur - but it could be argued that this is more the fault of the sending TCP stack than of the network.

This phenomenon was well covered in the academic literature in the '90s (if I remember correctly, folks at INRIA led the way) - it all comes down to the nature of random processes and how you observe them.

Back-to-back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1 Gbit/sec link into a queue being serviced at 34 Mbit/sec. The first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue to be full - the system's state is no longer a random variable, it is known to be full. The second packet (let's assume it is also a full-sized one) 'makes an observation' of the state of that queue about 12 us later - but that is only about 3% of the time it takes to service such a large packet at 34 Mbit/sec. The system has not had any time to 'relax' back anywhere near its steady state; it is highly likely that it is still full.
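As a back-of-envelope check of that 12 us / 3% figure (assuming a 'full' packet is 1500 bytes - the text above doesn't say - and using the link rates quoted):

    # Back-of-envelope check of the timing argument above.
    # Assumption: a 'full' packet is 1500 bytes (usual Ethernet MTU).

    PKT_BYTES = 1500
    FAST_LINK_BPS = 1_000_000_000   # 1 Gbit/sec arrival link
    BOTTLENECK_BPS = 34_000_000     # 34 Mbit/sec service rate

    pkt_bits = PKT_BYTES * 8

    # Gap between two back-to-back packets = serialisation time on the fast link.
    inter_arrival = pkt_bits / FAST_LINK_BPS    # ~12 microseconds

    # Time the bottleneck needs to drain one such packet from the queue.
    service_time = pkt_bits / BOTTLENECK_BPS    # ~353 microseconds

    print("gap between packets : %6.1f us" % (inter_arrival * 1e6))
    print("one service time    : %6.1f us" % (service_time * 1e6))
    print("gap / service time  : %6.1f %%" % (100 * inter_arrival / service_time))
    # ~3%: the second packet 'observes' the queue long before it can have
    # drained even one packet, so 'still full' is by far the most likely state.

Running it gives roughly 12 us, 353 us and 3.4%, matching the figures above.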
Fixing this makes a phenomenal difference to the goodput (with the usual delay effects that implies). We've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that let end users sustainably (days on end) achieve effective throughput better than 98% of the maximum the transmission medium imposes. What we had done was make the network behave closer to the underlying statistical assumptions made in TCP's design.

Neil

On 5 May 2011, at 17:10, Stephen Hemminger wrote:

> On Thu, 05 May 2011 12:01:22 -0400
> Jim Gettys wrote:
>
>> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote:
>>> I'm curious, has anyone done some simulations to check whether the
>>> following qualitative statement holds true, and if so, what the
>>> quantitative effect is:
>>>
>>> With bufferbloat, the TCP congestion control reaction is unduly
>>> delayed. When it finally happens, the TCP stream is likely facing a
>>> "burst loss" event - multiple consecutive packets get dropped. Worse
>>> yet, the sender with the lowest RTT across the bottleneck will likely
>>> start to retransmit while the (tail-drop) queue is still overflowing.
>>>
>>> And a lost retransmission means a major setback in bandwidth (except
>>> for Linux with bulk transfers and SACK enabled), as the standard
>>> (RFC-documented) behaviour asks for an RTO (1 sec nominally, 200-500 ms
>>> typically) to recover such a lost retransmission...
>>>
>>> The second part (more important as an incentive to the ISPs, actually):
>>> how does the fraction of goodput vs. throughput change when AQM
>>> schemes are deployed and TCP CC reacts in a timely manner? Small ISPs
>>> have to pay for their upstream volume, regardless of whether that is
>>> "real" work (goodput) or unnecessary retransmissions.
>>>
>>> When I was at a small cable ISP in Switzerland last week, sure enough
>>> bufferbloat was readily observable (17 ms -> 220 ms after 30 sec
>>> of a bulk transfer), but at first they had the "not our problem" view,
>>> until I started discussing burst loss / retransmissions / goodput vs.
>>> throughput - the latter point being a real commercial incentive
>>> to them. (They promised to check whether AQM would be available in the
>>> CPE / CMTS, and to put latency bounds in their tenders going forward.)
>>>
>> I wish I had a good answer to your very good questions. Simulation
>> would be interesting, though real data is more convincing.
>>
>> I haven't looked in detail at all that many traces to try to get a feel
>> for how much bandwidth waste there actually is, and more formal studies
>> like Netalyzr, SamKnows, or the BISmark project would be needed to
>> quantify the loss on the network as a whole.
>>
>> I did spend some time last fall with the traces I've taken. In those,
>> I've typically been seeing 1-3% packet loss in the main TCP transfers.
>> On the wireless trace I took, I saw 9% loss, but whether that is
>> bufferbloat-induced loss or not, I don't know (the data is out there for
>> those who might want to dig). And as you note, the losses are
>> concentrated in bursts (probably due to the details of Cubic, so I'm told).
>>
>> I've had anecdotal reports (and some first-hand experience) of much
>> higher loss rates, for example from Nick Weaver at ICSI; but I believe
>> in playing things conservatively with any numbers I quote, and I've not
>> gotten consistent results when I've tried, so I just report what's in
>> the packet captures I did take.
>>
>> A phenomenon that could be occurring is that during congestion avoidance
>> (until TCP loses its cookies entirely and probes for a higher operating
>> point) TCP is carefully timing its packets to keep the buffers
>> almost exactly full, so that competing flows (in my case, simple pings)
>> are likely to arrive just when there is no buffer space to accept them,
>> and therefore you see higher losses on them than you would on the single
>> flow I've been tracing and getting loss statistics from.
>>
>> People who want to look into this further would be a great help.
>> - Jim
>
> I would not put a lot of trust in measuring loss with pings.
> I have heard that some ISPs do different processing on the ICMPs used
> for ping packets. They either prioritize them high to provide
> artificially good response times (better marketing numbers), or
> prioritize them low since they aren't useful traffic.
> There are also filters that only allow N ICMP requests per second,
> which means repeated probes will be dropped.
>
> --
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat