From: Jonathan Morton
Date: Thu, 21 Mar 2019 09:46:39 +0200
To: Bob Briscoe
Cc: "Holland, Jake", tsvwg IETF list, bloat
Subject: Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104
List-Id: General list for discussing Bufferbloat

> On 21 Mar, 2019, at 8:04 am, Bob Briscoe wrote:

> Congestion controls are tricky to get stable in all situations. So it is important to separate ideas and research from engineering of more mature approaches that are ready for more widespread experimentation on the public Internet. Our goal with L4S was to use proven algorithms, and put in place mechanism to allow those algorithms to evolve.

I hope that from my example, you can see how to adapt a more flexible and "mature" version of DCTCP to use SCE.
You should be able to use the same algorithms that you've worked so hard on; only the signalling method changes, and the trigger for falling back to Classic ECN behaviour is explicit (a plain old CE mark).

As for "proven algorithms", it was conclusively proven that DCTCP was *not* compatible with Classic ECN middleboxes, and had only been proven to work in tightly controlled environments. I am told that TCP Prague has a failsafe, but I do not yet understand how that failsafe works, and what I have been told sounds fragile. I am honestly perplexed that no explanation of this is forthcoming.

SCE works transparently with every deployed and proven congestion control algorithm out there, which simply ignores the information SCE provides. Adaptations of some of those algorithms to incorporate SCE information seem to be straightforward to implement, especially since ns-3 now supports AccECN, so initial full-system experiments should be forthcoming quite soon. We should even be able to rehabilitate DCTCP without resorting to failsafe workarounds - which *should* have you guys jumping for joy, in theory.

> As regards the desire to use SCE instead of the L4S approach of using a classifier, please answer all the reasons I gave for why that won't work, which I sent in response to your draft some days ago.

I'm afraid that must have got lost in the noise. There *was* a lot of noise; it gave me a headache.

Regardless, I haven't seen any real claims that SCE won't work, except for some quibbles about RTT-fair convergence with single queues, which I subsequently found an elegant way to address. We do have a bit of a publication bottleneck over here at the moment; limited manpower.

I have mainly seen claims that SCE isn't a one-for-one replacement for L4S using exactly the same mechanisms and infrastructure as L4S does. Which is true, but unhelpful, because that would make SCE literally identical to L4S with no advantages of its own.
I'm willing to point out ways to implement L4S' goals using SCE; see below.

> The main one is incremental deployment: the source does not identify its packets as distinct from others, so the source needs the network to use some other identifier if it wants the network to put it in a queue with latency that is isolated from packets not using the scheme. The only way I can see to do this would be to use per-flow-queuing. I think that is an unstated assumption of SCE.

Strictly minimising latency for the individual flow, in the face of competing non-SCE traffic sharing a single queue, is not a goal of SCE per se; I consider it an orthogonal problem which is better addressed by existing solutions. Coexisting with existing endpoints, existing traffic and existing middleboxes is paramount, and forms our main argument for incremental deployability.

Solutions already available include FQ and Diffserv. I'll grant you that FQ is easier to implement at lowish speeds, where a cheap CPU can be loaded with flexible software to do the job. You appear to be more focused on relatively high link capacities, as that is your main argument against FQ. I'll just note in passing that good FQ can extract a lot of responsiveness from relatively low-capacity links.

Diffserv is widely deployed (in terms of hardware capabilities) and should be a natural fit for distinguishing classes of traffic from each other. It is rarely used by applications because the networks tend to corrupt it in transit, and rarely make good use of the information into the bargain. It strikes me that the cable industry may have more influence over that than I do.

> The SCE way round does not allow the ECN field to be used as a classifier…

The ECN field was never intended to be used as a classifier, except to distinguish Not-ECT flows from ECT flows (which a middlebox does need to know, to choose between mark and drop behaviours).
It was intended to be used to convey congestion information from the network to the receiver. SCE adheres to that ideal.

There is a perfectly good and under-utilised 6-bit field for carrying classifier information, right there in the same byte as the ECN field. You might want to ask the LE PHB guys for advice on making good use of it.

> You also don't get the benefit of being able to relax resequencing in the network, because the network has no classifier to look at.

My position is that the network is already free to relax resequencing semantics, regardless of the traffic carried. IP does not guarantee anything about packet ordering, and protocols built on top of it have always had to cope with that, one way or another.

Wifi's head-of-line blocking while performing link-level retries can induce inter-flow coupled delays of many seconds in extreme cases, destroying reliability completely. Recent work already reduces the effort the Linux wifi stack puts into link-level retries, given that most transports and protocols can survive some level of random loss. This is done without relying on any classifier, because it benefits all traffic.

On high-capacity bonded links, the likelihood that two packets sent near-simultaneously on different component links, and consequently reordered, both belong to the same flow and will trigger a spurious retransmission, seems to be low enough to not care about, even with existing 3-dupack sensitive TCPs. Therefore, relaxing resequencing requirements on these links should already be safe. Perhaps you have hard data showing otherwise?

> …the SCE codepoint would need to be combined with a DSCP, and I assume you don't want to do that.

The SCE codepoint does not need to be combined with a DSCP. Whether or not a DSCP assignment fits a given application is completely orthogonal to SCE.
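
To make the byte layout concrete: the DSCP and the ECN field share the IPv4 TOS / IPv6 Traffic Class byte, with the DSCP in the upper six bits and ECN in the lower two (RFC 2474, RFC 3168). The Python sketch below splits that byte and shows the one distinction the ECN field does need to support in a middlebox, namely whether a congested queue may mark a packet or must drop it. The function names are hypothetical, and reading codepoint 0b01 as SCE is an assumption taken from the SCE proposal; RFC 3168 itself calls it ECT(1).

```python
from typing import Tuple

# ECN codepoints: low two bits of the TOS / Traffic Class byte (RFC 3168).
NOT_ECT = 0b00
ECT_1 = 0b01  # assumption: the SCE proposal reuses this codepoint as "SCE"
ECT_0 = 0b10
CE = 0b11


def split_tos(tos: int) -> Tuple[int, int]:
    """Return (dscp, ecn) from the 8-bit TOS / Traffic Class byte."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS byte must be in 0..255")
    return tos >> 2, tos & 0b11


def congestion_action(tos: int) -> str:
    """Hypothetical helper: what a congested queue may do to this packet.

    Only the Not-ECT versus ECN-capable distinction matters here;
    the DSCP bits play no part in the decision.
    """
    _, ecn = split_tos(tos)
    return "drop" if ecn == NOT_ECT else "mark"
```

For instance, `split_tos(0xB8)` gives DSCP 46 with Not-ECT, so a congested queue would have to drop that packet rather than mark it, whatever the DSCP says.
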
You could quite reasonably implement something that looks very like DualQ using a trivial DSCP classifier instead of an ECN-based classifier, and that would be absolutely fine, and it would work with SCE. It's just not necessary to make SCE work in the first place.

Meanwhile, I still have not seen a detailed answer as to how, precisely, TCP Prague reliably distinguishes a Classic ECN middlebox from an L4S one, in order to activate its failsafe mechanism. Without that, I'm afraid I must assume that TCP Prague is not incrementally deployable.

Indeed, I was under the impression that DualQ and the use of ECT(1) as a classifier stemmed from this incompatibility, rather than being considered features in their own right.

 - Jonathan Morton
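
The DSCP-based alternative to an ECN-based classifier mentioned above could look roughly like the following Python sketch. Everything named here is hypothetical: the class, the queue names, and the choice of DSCP 46 as the low-latency class are illustrative only, and the coupling between the two queues that DualQ performs is not modelled. The point is only that queue selection reads the upper six bits of the TOS byte and never inspects the ECN bits, so ECN and SCE marking stay independent of classification.

```python
from collections import deque

LOW_LATENCY_DSCPS = {46}  # hypothetical choice; any agreed DSCP would do


class TwoQueueClassifier:
    """Steer packets into one of two queues using the DSCP only."""

    def __init__(self):
        self.low_latency = deque()
        self.classic = deque()

    def enqueue(self, tos, payload):
        """Queue a packet and return the name of the queue chosen."""
        dscp = (tos & 0xFF) >> 2  # upper six bits; the ECN bits are ignored
        if dscp in LOW_LATENCY_DSCPS:
            self.low_latency.append((tos, payload))
            return "low_latency"
        self.classic.append((tos, payload))
        return "classic"

    def dequeue(self):
        """Return the next (tos, payload) pair, or None if both are empty.

        Simplification: strict priority for the low-latency queue. A real
        scheduler would need to protect the classic queue from starvation.
        """
        if self.low_latency:
            return self.low_latency.popleft()
        if self.classic:
            return self.classic.popleft()
        return None
```

Two packets with the same DSCP but different ECN codepoints land in the same queue, which is the property the argument relies on.
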