From: Michael Welzl
Date: Thu, 29 Nov 2018 13:06:03 +0100
To: Jonathan Morton
Cc: Mikael Abrahamsson, bloat
Message-Id: <376C9A94-8EAA-4DCF-BFDC-ADA4E11A9FC7@ifi.uio.no>
In-Reply-To: <963ACC89-890D-4EA6-9E5E-1E7315F07C5A@gmail.com>
Subject: Re: [Bloat] incremental deployment, transport and L4S (Re: when does the CoDel part of fq_codel help in the real
world?)
List-Id: General list for discussing Bufferbloat

> On 29 Nov 2018, at 11:30, Jonathan Morton wrote:
>
>>>> My alternative use of ECT(1) is more in keeping with the other codepoints represented by those two bits, to allow ECN to provide more fine-grained information about congestion than it presently does. The main challenge is communicating the relevant information back to the sender upon receipt, ideally without increasing overhead in the TCP/IP headers.
>>>
>>> You need to go into the IETF process and voice this opinion then, because if nobody opposes in the near time then ECT(1) might go to L4S interpretation of what is going on. They do have ECN feedback mechanisms in their proposal, have you read it? It's a whole suite of documents, architecture, AQM proposal, transport proposal, the entire thing.
>>>
>>> On the other hand, what you want to do and what L4S tries to do might be closely related. It doesn't sound too far off.
>>
>> Indeed I think that the proposal of finer-grain feedback using 2 bits instead of one is not adding anything to, but in fact strictly weaker than L4S, where the granularity is in the order of the number of packets that you sent per RTT, i.e. much higher.
>
> An important facet you may be missing here is that we don't *only* have 2 bits to work with, but a whole sequence of packets carrying these 2-bit codepoints. We can convey fine-grained information by setting codepoints stochastically or in a pattern, rather than by merely choosing one of the three available (ignoring Not-ECT). The receiver can then observe the density of codepoints and report that to the sender.
>
> Which is more-or-less the premise of DCTCP.
> However, DCTCP changes the meaning of CE, instead of making use of ECT(1), which I think is the big mistake that makes it undeployable.
>
> So, from the middlebox perspective, very little changes. ECN-capable packets still carry ECT(0) or ECT(1). You still set CE on ECT packets, or drop Non-ECT packets, to signal when a serious level of persistent queue has developed, so that the sender needs to back off a lot. But if a less serious congestion condition exists, you can now signal *that* by changing some proportion of ECT(0) codepoints to ECT(1), with the intention that senders either reduce their cwnd growth rate, halt growth entirely, or enter a gradual decline. Those are three things that ECN cannot currently signal.
>
> This change is invisible to existing, RFC-compliant, deployed middleboxes and endpoints, so should be completely backwards-compatible and incrementally deployable in the network. (The only thing it breaks is the optional ECN integrity RFC that, according to fairly recent measurements, literally nobody bothered implementing.)
>
> Through TCP Timestamps, both sender and receiver can know fairly precisely when a round-trip has occurred. The receiver can use this information to calculate the ratio of ECT(0) and ECT(1) codepoints received in the most recent RTT. A new TCP Option could replace TCP Timestamps and the two bytes of padding that usually go with it, allowing reporting of this ratio without actually increasing the size of the TCP header. Large cwnds can be accommodated at the receiver by shifting both counters right until they both fit in a byte each; it is the ratio between them that is significant.
>
> It is then incumbent on the sender to do something useful with that information. A reasonable idea would be to aim for a 1:1 ratio via an integrating control loop.
> Receipt of even one ECT(1) signal might be considered grounds for exiting slow-start, while exceeding 1:2 ratio should limit growth rate to "Reno linear" semantics (significant for CUBIC), and exceeding 2:1 ratio should trigger a "Reno linear" *decrease* of cwnd. Through all this, a single CE mark (reported in the usual way via ECE and CWR) still has the usual effect of a multiplicative decrease.
>
> That's my proposal.

- and it's an interesting one. Indeed, I wasn't aware that you're thinking of a DCTCP-style signal from a string of packets.

Of course, this is hard to get right - there are many possible flavours to ideas like this ... but yes, interesting!

Cheers,
Michael
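
For anyone following the thread who wants the quoted receiver and sender rules in concrete form, here is a minimal Python sketch. It is only an illustration under stated assumptions, not Jonathan's implementation and not code from any TCP stack: the names (scale_counters, should_exit_slow_start, choose_action, Action) are hypothetical, and the "1:2" and "2:1" thresholds are read here as ECT(1):ECT(0), which the proposal does not spell out. The integrating control loop aiming for 1:1 is left out; only the threshold rules are shown.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the sender reactions described in the proposal."""
    MULTIPLICATIVE_DECREASE = "multiplicative decrease"
    LINEAR_DECREASE = "Reno-linear decrease"
    LINEAR_GROWTH = "growth capped at Reno-linear"
    NORMAL_GROWTH = "unrestricted growth"


def scale_counters(ect0: int, ect1: int) -> tuple[int, int]:
    """Receiver side: shift both per-RTT counters right until each fits in a byte.

    Only the ratio between the two counters matters, so halving both keeps
    it approximately intact.
    """
    while ect0 > 255 or ect1 > 255:
        ect0 >>= 1
        ect1 >>= 1
    return ect0, ect1


def should_exit_slow_start(ect1: int, ce_seen: bool) -> bool:
    """Sender side: even one ECT(1) (or a CE mark) is grounds to leave slow-start."""
    return ce_seen or ect1 > 0


def choose_action(ect0: int, ect1: int, ce_seen: bool) -> Action:
    """Sender side: pick a reaction from the reported counters.

    Assumption: ratios are ECT(1):ECT(0). Integer comparisons avoid
    dividing by a zero counter.
    """
    if ce_seen:
        # A CE mark keeps its usual meaning.
        return Action.MULTIPLICATIVE_DECREASE
    if ect1 > 2 * ect0:
        # Ratio exceeds 2:1.
        return Action.LINEAR_DECREASE
    if 2 * ect1 > ect0:
        # Ratio exceeds 1:2.
        return Action.LINEAR_GROWTH
    return Action.NORMAL_GROWTH


if __name__ == "__main__":
    ect0, ect1 = scale_counters(1000, 300)
    print(ect0, ect1, choose_action(ect0, ect1, ce_seen=False).value)
```

One thing the sketch makes visible: with very unequal counters, the right-shift can reduce the smaller one to zero (for example 1000 and 1), so a receiver that wants to preserve "at least one ECT(1) was seen" would need to handle that case separately.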