Date: Thu, 05 May 2011 12:01:22 -0400
From: Jim Gettys
Organization: Bell Labs
To: bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat
Message-ID: <4DC2C9D2.8040703@freedesktop.org>
References: <4DB70FDA.6000507@mti-systems.com>

On 04/30/2011 03:18 PM, Richard Scheffenegger wrote:
> I'm curious, has anyone done some simulations to check if the
> following qualitative statement holds true, and if so, what the
> quantitative effect is:
>
> With bufferbloat, the TCP congestion control reaction is unduly
> delayed. When it finally happens, the TCP stream is likely facing a
> "burst loss" event - multiple consecutive packets get dropped. Worse
> yet, the sender with the lowest RTT across the bottleneck will likely
> start to retransmit while the (tail-drop) queue is still overflowing.
>
> And a lost retransmission means a major setback in bandwidth (except
> for Linux with bulk transfers and SACK enabled), as the standard
> (RFC-documented) behaviour calls for an RTO (1 sec nominally, 200-500
> ms typically) to recover such a lost retransmission...
>
> The second part (more important as an incentive to the ISPs,
> actually): how does the fraction of goodput vs. throughput change
> when AQM schemes are deployed and TCP congestion control reacts in a
> timely manner? Small ISPs have to pay for their upstream volume,
> regardless of whether it is "real" work (goodput) or unnecessary
> retransmissions.
>
> When I was at a small cable ISP in Switzerland last week, sure enough
> bufferbloat was readily observable (17 ms -> 220 ms after 30 seconds
> of a bulk transfer), but at first they had the "not our problem"
> view, until I started discussing burst loss / retransmissions /
> goodput vs. throughput - with that last point being a real commercial
> incentive for them. (They promised to check whether AQM would be
> available in the CPE / CMTS, and to put latency bounds in their
> tenders going forward.)

I wish I had a good answer to your very good questions. Simulation
would be interesting, though real data is more convincing.
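For anyone who wants to take a crack at the simulation side, here is a
toy model of the goodput question. It is a sketch only: the AIMD
sender, fixed drain rate, 400-packet buffer, and linear drop ramp are
all assumptions chosen for illustration, nothing like a real ns-2
model, and it reacts to loss instantly rather than an RTT later (the
very thing bufferbloat makes worse). It reports the goodput/throughput
fraction and the mean standing queue under tail drop vs. a crude
early-drop scheme:

import random

def run(buffer_pkts, aqm, ticks=20000, capacity=10, seed=1):
    """One AIMD sender feeding a FIFO bottleneck; returns
    (goodput fraction, mean queue length in packets)."""
    random.seed(seed)
    queue_len = 0          # packets buffered at the bottleneck
    cwnd = 1.0             # sender window, in packets offered per tick
    sent = delivered = qsum = 0
    for _ in range(ticks):
        lost = 0
        for _ in range(int(cwnd)):
            sent += 1      # every transmission costs upstream volume
            if queue_len >= buffer_pkts:
                lost += 1  # tail drop: buffer is full
            elif aqm and random.random() < queue_len / buffer_pkts:
                lost += 1  # RED-like ramp: drop prob. grows with queue
            else:
                queue_len += 1
        served = min(queue_len, capacity)  # drain the bottleneck link
        queue_len -= served
        delivered += served                # only these packets are goodput
        qsum += queue_len
        # Instant AIMD reaction; dropped packets are implicitly resent
        # as part of future windows, so delivered/sent ~ goodput fraction.
        cwnd = max(1.0, cwnd / 2.0) if lost else cwnd + 1.0
    return delivered / sent, qsum / ticks

if __name__ == "__main__":
    for label, use_aqm in (("tail drop ", False), ("early drop", True)):
        frac, q = run(buffer_pkts=400, aqm=use_aqm)
        print(f"{label}: goodput/throughput={frac:.3f} mean queue={q:.0f}")

Note that when the buffer is full at the start of a tick, every packet
offered in that tick is lost, so the burst-loss pattern you describe
falls straight out of tail drop; the early-drop case spreads losses
out and keeps the standing queue (i.e., the latency) down.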
I haven't looked in detail at all that many traces to get a feel for
how much bandwidth is actually being wasted, and more formal studies
like Netalyzr, SamKnows, or the Bismark project would be needed to
quantify the loss on the network as a whole.

I did spend some time last fall with the traces I've taken. In those,
I've typically been seeing 1-3% packet loss in the main TCP transfers.
On the wireless trace I took, I saw 9% loss, but whether that is
bufferbloat-induced loss or not, I don't know (the data is out there
for those who might want to dig). And as you note, the losses are
concentrated in bursts (probably due to the details of Cubic, so I'm
told).

I've had anecdotal reports (and some first-hand experience) of much
higher loss rates, for example from Nick Weaver at ICSI; but I believe
in playing things conservatively with any numbers I quote, and I've
not gotten consistent results when I've tried, so I just report what's
in the packet captures I did take.

A phenomenon that could be occurring is that during congestion
avoidance (until TCP loses its cookies entirely and probes for a
higher operating point), TCP is carefully timing its packets to keep
the buffers almost exactly full. Competing flows (in my case, simple
pings) are then likely to arrive just when there is no buffer space to
accept them, so you see higher losses on them than you would on the
single flow I've been tracing and getting loss statistics from.

People who want to look into this further would be a great help.
                         - Jim
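P.S. For anyone who wants to pull comparable loss numbers out of their
own captures, here is a rough starting point using scapy. It flags a
data packet as a retransmission when its (flow, sequence number) pair
has been seen before, which overcounts under reordering and ignores
SACK; tshark's tcp.analysis.retransmission analysis is more careful.
The capture file name is a placeholder.

from collections import defaultdict
from scapy.all import IP, TCP, rdpcap

def retx_fraction(path):
    """Fraction of payload-bearing TCP packets that repeat a
    previously seen sequence number within their flow."""
    seen = defaultdict(set)       # flow 4-tuple -> seen sequence numbers
    data_pkts = retx = 0
    for pkt in rdpcap(path):
        if IP not in pkt or TCP not in pkt:
            continue
        # Compute payload length from the headers, so Ethernet padding
        # on short frames doesn't make pure ACKs look like data.
        paylen = pkt[IP].len - 4 * pkt[IP].ihl - 4 * pkt[TCP].dataofs
        if paylen <= 0:
            continue
        data_pkts += 1
        flow = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
        if pkt[TCP].seq in seen[flow]:
            retx += 1             # same starting seq: likely a retransmit
        else:
            seen[flow].add(pkt[TCP].seq)
    return retx / data_pkts if data_pkts else 0.0

if __name__ == "__main__":
    print(f"retransmitted fraction: {retx_fraction('capture.pcap'):.2%}")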