From: Dave Taht
Date: Sat, 16 May 2020 09:32:07 -0700
To: Bob Briscoe
Cc: ECN-Sane, tsvwg IETF list
Subject: Re: [Ecn-sane] Fwd: my backlogged comments on the ECT(1) interim call

On Wed, Apr 29, 2020 at 2:31 AM Bob Briscoe wrote:
>
> Dave,
>
> Please don't tar everything with the same brush. Inline...
>
> On 27/04/2020 20:26, Dave Taht wrote:
> > just because I read this list more often than tsvwg.
> >
> > ---------- Forwarded message ---------
> > From: Dave Taht
> > Date: Mon, Apr 27, 2020 at 12:24 PM
> > Subject: my backlogged comments on the ECT(1) interim call
> > To: tsvwg IETF list
> > Cc: bloat
> >
> > It looks like the majority of what I say below is not related to the
> > fate of the "bit". The push to take the bit was strong with this one,
> > and me... can't we deploy more of what we already got in places where
> > it matters?
> >
> > ...
> > so: A) PLEA: For 10 years now, of me working on bufferbloat, working
> > on real end-user and wifi traffic and real networks....
> >
> > I would like folk here to stop benchmarking two flows that run for a
> > long time and in one direction only... and thus exclusively in tcp
> > congestion avoidance mode.

> [BB] All the results that the L4S team has ever published include short
> flow mixes either with or without long flows.
>   2020: http://folk.uio.no/asadsa/ecn-fbk/results_v2.2/full_heatmap_rrr/
>   2019: http://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf#subsection.4.2
>   2019: https://www.files.netdevconf.info/f/febbe8c6a05b4ceab641/?dl=1
>   2015: http://bobbriscoe.net/projects/latency/dctth_preprint.pdf#subsection.7.2
>
> I think this implies you have never actually looked at our data, which
> would be highly concerning if true.

I have never had access to your *data*. Just papers that cherry-pick
results that support your arguments. No repeatable experiments, no open
source code; the only thing consistent about them has been...
irreproducible results.

Once upon a time I was invited to give a keynote talk at sigcomm (
https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf ),
where I had an opportunity to lay into not just the sad state of
network research today but all of science (they've not invited me
back). So in researching the state of the art since I last checked in,
I did go and read y'all's more recent stuff.

Taking on this one:

http://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf#subsection.4.2

The experimental testbed design is decent. The actual experiment laid
out in that section was a test of everything... but the behaviors of
the traffic types I care about most: voip, videoconferencing, and web.
I found the graphs in the appendix too difficult to compare and
unreadable, and I would have preferred comparison plots.

A) Referring to some page or another of my above paper... It came with
"ludicrous constants". For a 40Mbit link, it had:

  Buffer: 40,000 pkt, ECN enabled
  Pie: Configured to drop at 25% probability
       # We put in 10% as an escape valve in the rfc, why 25%? Did it engage?
  fq_codel: default constants
  dualpi: Target delay: 15 ms, TUpdate: 16 ms, L4S T: 1 ms,
          WRR Cweight: 10%, α: 0.16, β: 3.2, k: 2, Classic ECN drop: 25%

The source code I have for dualpi has a 1000 packet buffer. The dualpi
example code (when last I looked at it) had 0 probability of drop. A
naive user would just use that default. Secondly, your experiment seems
to imply y'all think drop will never happen in the ll queue, even when
ping -Q 1 -s 1000 -f is sufficient to demonstrate that it does.

OK, so this gets me to... Most of the cpe and home router hardware I
work with doesn't have much more than 64MB of memory, into which you
also have to fit a full operating system, routing table, utilities and
so on. GRO is a thing, so the peak amount of memory a 40,000 packet
buffer might use is 40000 * 1500 * 64 = 3,840,000,000 bytes, ~4GB of
memory. Worst case. For each interface in the system. For a 40Mbit
simulation. Despite decades of work on making OSes reliable, running
out of memory in any given component tends to have bad side effects.

OK, had this been a repeatable experiment, I'd have plugged in real
world values, and repeated it.
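Spelling that memory arithmetic out, for anyone who wants to check it
(shell arithmetic; the 64-segment GRO aggregate is the worst-case
assumption):

  # Worst case: each of the 40,000 "packet" slots holds a GRO aggregate
  # of up to 64 MTU-sized (1500 byte) segments.
  echo $((40000 * 1500 * 64))   # => 3840000000 bytes, ~3.84 GB per interface
  # For comparison, the entire RAM of a typical cpe/home router:
  echo $((64 * 1024 * 1024))    # => 67108864 bytes, 64 MB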
I think on some bug report or another I suggested y'all switch to byte,
rather than packet, limits in the code. As you will see, mixed up and
down traffic on the rrul_be test tends to either exhaust a short
fixed-length packet fifo, or clog it up, if it's longer. Byte limits
(and especially bql) are a much better approximation to time, and work
vastly better with mixed up/down traffic and in the presence of GRO.

If I have any one central tenet: edge gateways need to transit all
kinds of traffic in both directions, efficiently. And not crash.

OK... so lacking support for byte limits in the code, and not having
4GB of memory to spare... and not being able to plug in real world
values into your test framework... what happens with 1000 packets?
Well, the SCE team just ran that benchmark. The full results are
published, and repeatable. And dismal, for dualpi, compared to the
state of the art. I'll write more on that, but the results are plain
as day.

B) "were connected to a modem using 100Mbps Fast Ethernet; the xDSL
line was configured at 48Mbps downstream and 12Mbps upstream; the
links between network elements consisted of at least 1GigE connections"

So you tested 4/1 down/up asymmetry, but you didn't try a bidirectional
load. The 1Gbit/35Mbit rrul_be test just performed by that team, as
well as the 200/10 test - both values shipping in the field -
demonstrated the problems that asymmetry induces. Problems so severe
that low rate videoconferencing on such a system, when busy, was
impossible.

While I would certainly recommend that ISPs NEVER ship anything with
more than a 10x1 ratio, it happens. More than 10x1 is the current
"standard" in the cable industry. Please start testing with that?

> Regarding asymmetric links, as you will see in the 2015 and 2019 papers,
> our original tests were conducted over Al-Lu's broadband testbed with
> real ADSL lines, real home routers, etc. When we switched to a Linux
> testbed, we checked we were getting identical results to the testbed
> that used real broadband kit, but I admit we omitted to emulate the
> asymmetric upstream. As I said, we can add asymmetric tests back again,
> and we should.

Thank you. I've also asked that y'all plug in realistic values for
present day buffering in both the cmts and cablemodems, and use the
rrul_be, rtt_fair_var, and rrul tests as a basic starting point for a
background traffic load (a sample flent invocation is sketched below).

DSL is *different*, and more like fiber, in that it is an isochronous
stream that has an error rate, but no retransmits. Request/grant
systems, such as wifi and cable, operate vastly differently. Worse,
wifi and LTE, especially, have a tendency to retry a lot, which leads
to very counter-intuitive behaviors - behaviors that long ago made me
dismiss reno/cubic/dctcp as appropriate, and conclude that BBR-like cc
protocols using a mixture of indicators, especially including rtt, are
the only way forward for these kinds of systems.

Packet aggregation is a thing. We need to get MUCH better about
dropping packets in the retry portion of the wireless macs, especially
for voip/videoconferencing/gaming traffic. There's a paper on that, and
work is in progress.

> Nonetheless, when testing Accurate ECN feedback specifically we have
> been watching for the reverse path, given AccECN is designed to handle
> ACK thinning, so we have to test that, esp. over WiFi.
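To be concrete about the background load I keep asking for: a minimal
flent run of rrul_be looks roughly like this (the hostname and output
filename here are placeholders; it needs a netserver instance running
on the far side of the link):

  # 60 seconds of 4 TCP flows up + 4 down, all marked best-effort, with
  # simultaneous latency probes.
  flent rrul_be -l 60 -H netperf.example.com \
        -t "200mbit-10mbit-cable" -o rrul_be.png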
In self defense, before anybody uses it any further in testing 'round
here: I would like to note that my netem "slot model", although a start
towards emulating request/grant systems better, and coupled with
*careful*, incremental, repeatable analysis via the trace stuff also
now in linux netem, can be used to improve the congestion control
behavior of transports. See:

https://lore.kernel.org/netdev/20190123200454.260121-3-priyarjha@google.com/#t

The slot model alone emphatically does not model wifi correctly for any
but the most limited scenarios. Grants and requests are coupled in
wifi, and driven by endpoint behavior. It's complete GIGO after the
first exchange if you trust in the slot model naively, without
recreating traces for every mod in your transport, and retesting,
retesting, retesting.

The linux commit for netem's slotting model provides a reference for an
overly ideal 1-2 station 802.11n emulation; it was incorrect and
unscalable, and I wish people would stop copy/pasting that into any
future work on the subject. 802.11ac is very different, and 802.11ax
different again. As one example, the limited number of packets you can
fit into an 802.11n txop makes SFQ (which ubnt uses) a better choice
than DRR, but DRR is a better approach for 802.11ac and later. (IMHO).
However! Most emulations of wifi assume that it's lossy (like a 1%
rate), which is also totally wrong. So the slot model was progress.

I don't know enough about lte, but the retry behavior there is left to
the operator to define, and it is usually set really high.

I've long said there is no such thing as a rate in wireless -
bandwidth/interval is a fiction, because over any given set of
intervals, in request/grant/retry-prone systems, bandwidth varies from
0 to a lot, on very irregular timescales.

Eliding the rest of this message.

--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net CTO, TekLibre, LLC Tel: 1-831-435-0729
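P.S. For anyone who wants to try the netem slot model anyway, with all
the caveats above in mind, the basic invocation looks like this (a
minimal sketch; the device name and the delay/packet/byte numbers are
purely illustrative, not a validated model of any real wifi or docsis
link):

  # Deliver packets in bursts: each transmission opportunity opens
  # after a delay drawn uniformly from [800us, 10ms] and carries at
  # most 42 packets or 64000 bytes - a crude stand-in for a txop or a
  # request/grant cycle.
  tc qdisc add dev eth0 root netem slot 800us 10ms packets 42 bytes 64000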