From: Dave Taht
Date: Tue, 9 Mar 2021 10:09:11 -0800
To: "Rodney W. Grimes" <4bone@gndrsh.dnsmgr.net>
Cc: Pete Heist, ECN-Sane <ecn-sane@lists.bufferbloat.net>
Subject: [Ecn-sane] A brick wall threshold for SCE_threshold

While I appreciated the experiments with spreading out the assertion of SCE_threshold rather than making it a brick wall, to me it seems best to coalesce on a brick wall for it, at a fixed period of time the operator considers "queuing". Does the SCE team agree with this approach?

I note, in trying to wrap my head around how to make this work at all for wifi, that I was kind of stuck on it happening after 1-2 txops, as that was the minimum queue depth that can be achieved with today's hw. However, that is a minimum of roughly 250us and a maximum of over 10ms, and that doesn't count at all the arbitration delay potential from many stations, which can stretch out to hundreds of ms. Nowadays I think about it as merely being a fixed amount of time, not txops, in the 2.5ms range.
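To make concrete what I mean by a brick wall, here is a rough sketch of a fixed sojourn-time marker. The names (struct pkt, sce_threshold_ns, maybe_mark_sce) are made up for illustration; this is not code from any existing qdisc.

/* Illustrative only: a "brick wall" SCE marker that sets ECT(1) on any
 * packet whose sojourn time meets or exceeds one fixed threshold.
 * No ramp, no probability, no averaging. */
#include <stdbool.h>
#include <stdint.h>

#define ECT0 2          /* ECT(0): ECN-capable, not yet SCE-marked */
#define ECT1 1          /* ECT(1): the SCE mark */

struct pkt {
        uint64_t enqueue_ns;    /* stamped when the packet was queued */
        uint8_t  ecn;           /* the two ECN bits from the IP header */
};

static const uint64_t sce_threshold_ns = 2500000;       /* 2.5 ms */

/* Run at dequeue time. Returns true if the packet was marked. */
static bool maybe_mark_sce(struct pkt *p, uint64_t now_ns)
{
        if (p->ecn == ECT0 && now_ns - p->enqueue_ns >= sce_threshold_ns) {
                p->ecn = ECT1;
                return true;
        }
        return false;
}

The only knob is the threshold itself, which is the 2.5ms figure above.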
I do not like jitter- and interference-inducing long txops in the first place, and present day 802.11ac can pack so much data into a single aggregate that finding ways to get rid of longer txops in general has long been on my mind, so that wifi could multiplex better over many more stations. As for LTE, gawd knows. Still, that multiplexing over stations takes a LONG time, and perhaps it would be better to apply a delta from the last station service time to the sce_threshold time before considering it too much queuing.

On Tue, Mar 9, 2021 at 7:28 AM Dave Taht wrote:
>
> I have basically hoped to find some sysadmin out there with battle experience with a real dctcp deployment (such as the one at facebook) that could share his or her insights as to this debate.
>
> Most of the public testing with BBRv2 has instead been with a brick wall setting to CE_threshold; the published data on it was at 260us, which was about as low as raw iron linux can get. I tend to favor the brick wall approach over anything more complex for the AQM component of an SCE architecture, and to modify the transports to suit this assumption.
>
> On Tue, Mar 9, 2021 at 7:19 AM Rodney W. Grimes <4bone@gndrsh.dnsmgr.net> wrote:
> >
> > > I am of course much happier interacting here than over on tsvwg.
> > >
> > > I would like very much:
> > >
> > > A large scale bit twiddling test from the larger providers of fq_codel and cake based hw and middleboxes.
> > >
> > > extensive testing on lte and wifi transports. even emulated.
> > > the sce patches polished up and submitted for review to netdev as what we call there, an RFC
> > > the ability to test both SCE and L4S on openwrt (backport thereof)
> > >
> > > I know there's been a lot of work on other qdiscs than cake, but haven't paid much attention. netdev review would be nice.
> > >
> > > A simple internet-wide test of say bittorrent, or using an adserver method like what apnic uses.
> > >
> > > Most of all I'd really like someone to take a stab at RFC3168 support for BBR. And by that, I don't mean a reduction by half, but to institute an RTT probe phase. A fixed rate reduction per RTT is simply not going to work IMHO, period, on lte and wifi. I'm sure dave miller would accept such a patch, and this would also lead towards better ecn support across the internet in general... at least from the perspective of the one provider I have an in with, dropbox.
> > >
> > > SCE support for BBRv2 would be nice also.
> > >
> > > And I know, a pony, all sparkly and stuff. I find it very difficult to summon the cojones to do a drop of work in this area, and I keep cheering you on - especially on the bug fixing and wonderful scripts and data you've provided so far, and I do wish I could find some way to help that didn't cause so much ptsd in me.
> > >
> > > I wish I merely knew more about how dctcp was configured in the data center. So far as *I* know it's on dedicated switches mostly. I would vastly prefer a dscp codepoint for this, also.
> >
> > If you want I can take a discussion on how DCTCP is configured in data center switches to a side channel. It's basically using a drop function that has a starting queue depth and a slope, such that as you go higher on the slope your probability of marking increases.
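(To check my own understanding of the depth/slope function you describe, a toy version might look like the sketch below. The names and thresholds are mine, purely for illustration, not anything from a real switch asic.)

/* Illustrative ramp marker: never mark below min_depth, always mark at
 * max_depth, and in between the marking probability rises linearly
 * along the slope. depth and both thresholds are fractions of the
 * queue limit, 0.0 to 1.0. */
#include <stdbool.h>
#include <stdlib.h>

static double min_depth = 0.0;    /* queue depth where the ramp starts */
static double max_depth = 1.0;    /* queue depth of 100% marking */

static bool should_mark(double depth)
{
        double p;

        if (depth <= min_depth)
                return false;
        if (depth >= max_depth)
                return true;

        /* probability grows linearly with queue depth along the slope */
        p = (depth - min_depth) / (max_depth - min_depth);
        return (double)rand() / RAND_MAX < p;
}

(With min_depth at zero this reduces to a marking probability equal to the depth fraction, i.e. 1% marking at 1% depth, which as I read it is what the HPE experiment described next used.)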
> > The 40 gbit experiments we did with HPE modified that part of the switch asic code to do SCE marks starting at 1% mark probability at 1% queue depth, up to 100% marking at 100% queue depth. Sadly that was a slight mistake on my part; I thought we would get a chance to iterate and retune these, and I just pulled those values out as a WOG and gave them to HPE to let them get on with setting it up.
> >
> > For most practical purposes an SCE mark and a DCTCP CE mark convey very similar, if not the same, information, as does a CE mark in L4S.
> >
> > Usually tuning of these queue depth and slope values involves data collection and several iterations of experiments.
> >
> > ToR vs spine switches are usually configured with different values.
> >
> > I believe there is even some work by someone that does data collection across the whole datacenter and tries to adjust these automagically and dynamically.
> >
> > > On Mon, Mar 8, 2021 at 4:36 PM Pete Heist wrote:
> > > >
> > > > Sorry for reviving an old thread as I haven't been on this list in a while:
> > > >
> > > > > > SCE proposes to use ect(1) as an indicator of some congestion and does not explicitly require a dscp codepoint in a FQ'd implementation.
> > > > >
> > > > > Pretty much. I do think that a demonstration using an additional DSCP to create a similar HOV lane for SCE would have gone miles in convincing people in the WG that L4S might really not be as swell as its proponents argue. IMHO it won the day more with its attractive promise of low latency for all instead of what it delivers.
> > > >
> > > > On that, I don't think any of us knows how things will end up or how long it will take to get there...
> > > >
> > > > I do agree that the interim meeting leading up to the codepoint decision could have gone better. Everything went great until it came to how to deploy SCE in a small number of queues. We had dismissed the idea of using DSCP, because we thought it would be panned for its poor traversal over the Internet. That may still have been the case, but it also may have worked if sold right. We thought that AF alone might be enough to get past that part, but it wasn't.
> > > >
> > > > We already implemented a two-queue design that uses DSCP, but either there wasn't much interest, or we didn't sell it enough. Plus, for those who demand a two queue solution that requires no flow awareness at all, DSCP alone may not get you there, because you still need some reasonably fair way to schedule the two queues. So that might have been the next line of criticism. Scheduling in proportion to the number of flows each queue contains is one effective way to do that, but that requires at least some concept of a flow. Perhaps there's another way that doesn't introduce too much RTT unfairness, but I'm not sure.
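(For what it's worth, the flow-proportional scheduling Pete mentions could be sketched roughly as below. The structure and names are invented for illustration and this is not the actual two-queue implementation; a real one would account bytes, DRR style, rather than packets.)

/* Illustrative only: pick which of two queues to serve next, giving
 * each queue a share of service proportional to the number of active
 * flows it currently holds. */
struct queue {
        unsigned int flows;     /* current count of active flows */
        unsigned int credit;    /* accumulated scheduling credit */
        int          backlog;   /* packets waiting */
};

/* Returns the index of the queue to dequeue from, or -1 if both are empty. */
static int pick_queue(struct queue q[2])
{
        int i, best = -1;

        /* each backlogged queue earns credit in proportion to its flows */
        for (i = 0; i < 2; i++) {
                if (q[i].backlog > 0) {
                        q[i].credit += q[i].flows ? q[i].flows : 1;
                        if (best < 0 || q[i].credit > q[best].credit)
                                best = i;
                }
        }
        if (best >= 0)
                q[best].credit = 0;     /* spend the credit on one dequeue */
        return best;
}

Over time a queue holding three flows gets roughly three dequeues for every one that a single-flow queue gets, without needing per-flow queues, only a per-queue count of flows.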
> > > > In our defense, there was already a lot of support built up for L4S, and stepping in front of that was like stepping in front of a freight train no matter what we did. I think we've made a decent argument in the most recent version of the SCE draft that ECN is a "network feature" which carries higher risks than drop-based signaling, and warrants the requirement for unresponsive flow mitigation, for starters. That of course requires some level of flow awareness, which then makes various queueing designs possible. And, there may still be deployment possibilities with DSCP, as Rodney mentioned.
> > > >
> > > > Anyway, there's progress being made on SCE, with some new ideas and improvements to testing tools coming along.
> > > >
> > > > Pete
> >
> > --
> > Rod Grimes                                                 rgrimes@freebsd.org

--
"For a successful technology, reality must take precedence over public relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net CTO, TekLibre, LLC Tel: 1-831-435-0729