From: Jonathan Morton
Date: Sat, 30 Aug 2014 09:45:58 +0300
To: Stephen Hemminger
Cc: bloat@lists.bufferbloat.net
Subject: Re: [Bloat] The Dark Problem with AQM in the Internet?

On 30 Aug, 2014, at 9:28 am, Stephen Hemminger wrote:

> On Sat, 30 Aug 2014 09:05:58 +0300
> Jonathan Morton wrote:
>
>>
>> On 29 Aug, 2014, at 5:37 pm, Jerry Jongerius wrote:
>>
>>>> did you check to see if packets were re-sent even if they weren't lost? one of
>>>> the side effects of excessive buffering is that it's possible for a packet to
>>>> be held in the buffer long enough that the sender thinks that it's been
>>>> lost and retransmits it, so the packet is effectively 'lost' even if it actually
>>>> arrives at its destination.
>>>
>>> Yes. A duplicate packet for the missing packet is not seen.
>>>
>>> The receiver 'misses' a packet; starts sending out tons of dup acks (for all
>>> packets in flight and queued up due to bufferbloat), and then way later, the
>>> packet does come in (after the RTT caused by bufferbloat; indicating it is
>>> the 'resent' packet).
>>
>> I think I've cracked this one - the cause, if not the solution.
>>
>> Let's assume, for the moment, that Jerry is correct and PowerBoost plays no part in this. That implies that the flow is not using the full bandwidth after the loss, *and* that the additive increase of cwnd isn't sufficient to recover to that point within the test period.
>>
>> There *is* a sequence of events that can lead to that happening:
>>
>> 1) Packet is lost, at the tail end of the bottleneck queue.
>>
>> 2) Eventually, receiver sees the loss and starts sending duplicate acks (each triggering CA_EVENT_SLOW_ACK path in the sender). Sender (running Westwood+) assumes that each of these represents a received, full-size packet, for bandwidth estimation purposes.
>>
>> 3) The receiver doesn't send, or the sender doesn't receive, a duplicate ack for every packet actually received. Maybe some firewall sees a large number of identical packets arriving - without SACK or timestamps, they *would* be identical - and filters some of them. The bandwidth estimate therefore becomes significantly lower than the true value, and additionally the RTO fires and causes the sender to reset cwnd to 1 (CA_EVENT_LOSS).
>>
>> 4) The retransmitted packet finally reaches the receiver, and the ack it sends includes all the data received in the meantime (about 3.5MB). This is not sufficient to immediately reset the bandwidth estimate to the true value, because the BWE is sampled at RTT intervals, and also includes low-pass filtering.
>>
>> 5) This ends the recovery phase (CA_EVENT_CWR_COMPLETE), and the sender resets the slow-start threshold to correspond to the estimated delay-bandwidth product (MinRTT * BWE) at that moment.
>>
>> 6) This estimated DBP is lower than the true value, so the subsequent slow-start phase ends with the cwnd inadequately sized. Additive increase would eventually correct that - but the key word is *eventually*.
>>
>> - Jonathan Morton
>
> Bandwidth estimates by ack RTT is fraught with problems. The returning ACK can be
> delayed for any number of reasons such as other traffic or aggregation. This kind
> of delay based congestion control suffers badly from any latency induced in the network.
> So instead of causing bloat, it gets hit by bloat.

In this case, the TCP is actually tracking RTT surprisingly well, but the bandwidth estimate goes wrong because the duplicate ACKs go missing. Note that if the MinRTT was estimated too high (which is the only direction it could go), this would result in the slow-start threshold being *higher* than required, and the symptoms observed would not occur, since the cwnd would grow to the required value after recovery.

This is the opposite effect from what happens to TCP Vegas in a bloated environment. Vegas stops increasing cwnd when the estimated RTT is noticeably higher than MinRTT, but if the true MinRTT changes (or it has to compete with a non-Vegas TCP flow), it has trouble tracking that fact.

There is another possibility: that the assumption of non-queue RTT being constant against varying bandwidth is incorrect. If that is the case, then the observed behaviour can be explained without recourse to lost duplicate ACKs - so Westwood+ is correctly tracking both MinRTT and BWE - but (MinRTT * BWE) turns out to be a poor estimate of the true BDP. I think this still fails to explain why the cwnd is reset (which should occur only on RTO), but everything else potentially fits.
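
To put rough numbers on the first theory (lost duplicate ACKs dragging the BWE down), here is a small Python sketch. It is not the kernel's Westwood+ code. The 7/8 smoothing filter, the segment size, both RTT values, the ACK counts and the "75% of dup acks filtered" figure are all assumptions picked for illustration, and the function names are made up. It only models the BWE and the resulting slow-start threshold, not the RTO.

```python
MSS = 1448  # assumed segment payload size in bytes


def lowpass(old, new):
    """7/8 exponential smoothing (assumed filter shape, for illustration)."""
    return (7 * old + new) / 8


def estimate_bwe(initial_bwe, acks_per_interval, interval, n_intervals):
    """Run the bandwidth estimate through n_intervals sampling periods.

    Every ACK that arrives (duplicate or not) is counted as one
    full-size segment delivered, as described in step 2.
    """
    bwe = initial_bwe
    for _ in range(n_intervals):
        sample = acks_per_interval * MSS / interval  # bytes/sec this interval
        bwe = lowpass(bwe, sample)
    return bwe


def ssthresh_segments(bwe, min_rtt):
    """Slow-start threshold in segments: (MinRTT * BWE) / MSS, floor of 2."""
    return max(int(bwe * min_rtt / MSS), 2)


if __name__ == "__main__":
    MIN_RTT = 0.05      # assumed unloaded RTT, seconds
    BLOATED_RTT = 0.5   # assumed RTT with the queue full, seconds
    TRUE_ACKS = 1000    # assumed segments really delivered per bloated RTT

    true_bw = TRUE_ACKS * MSS / BLOATED_RTT
    for delivered in (1.0, 0.25):  # fraction of dup acks reaching the sender
        acks_seen = int(TRUE_ACKS * delivered)
        bwe = estimate_bwe(true_bw, acks_seen, BLOATED_RTT, 10)
        print(f"{delivered:.0%} of acks seen: BWE={bwe:.0f} B/s, "
              f"ssthresh={ssthresh_segments(bwe, MIN_RTT)} segments")
```

With all ACKs delivered the estimate stays at the true rate; with most of them filtered, the threshold computed at the end of recovery comes out well below the true BDP, and only additive increase is left to close the gap.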
I think we can distinguish the two theories by running tests against a server that supports SACK and timestamps, and where ideally we can capture packet traces at both ends.

 - Jonathan Morton