From: Sebastian Moeller
To: "David P. Reed"
Cc: Michael Welzl, ecn-sane@lists.bufferbloat.net
Subject: Re: [Ecn-sane] rtt-fairness question
Date: Thu, 14 Apr 2022 23:25:26 +0200
In-Reply-To: <1649955272.49298319@apps.rackspace.com>
References: <4430DD9F-2556-4D38-8BE2-6609265319AF@ifi.uio.no> <1649778681.721621839@apps.rackspace.com> <0026CF35-46DF-4C0C-8FEE-B5309246C1B7@ifi.uio.no> <08F92DA0-1D59-4E58-A289-3D35103CF78B@gmx.de> <1649955272.49298319@apps.rackspace.com>
List-Id: Discussion of explicit congestion notification's impact on the Internet

Just indulge me here for a few crazy ideas ;)

> On Apr 14, 2022, at 18:54, David P. Reed wrote:
>
> Am I to assume, then, that routers need not pay any attention to RTT to achieve RTT-fairness?

Part of RTT-bias seems caused by the simple fact that tight control loops work better than sloppy ones ;)

There seem to be three ways to try to remedy that to some degree:

1) the daft one:
define a reference RTT (larger than typically encountered) and have all TCPs respond as if encountering that delay -> until the path RTT exceeds that reference, things should be reasonably fair between TCP flows

2) the flows communicate with the bottleneck honestly:
if flows would communicate their RTT to the bottleneck, the bottleneck could partition its resources such that signaling (mark/drop) and buffer size are bespoke per-flow. In theory that can work, but it relies on either the RTT information being non-gameably linked to the protocol's operation* or everybody being fully veridical and honest
*) think a protocol that will only work if the best estimate of the RTT is communicated between the two sides continuously

3) the router being verbose:
If routers communicate the fill-state of their queue (global or per-flow does not matter all that much), flows in theory can do a better job at not putting way too much data in flight, remedying the cost of drops/marks that affects high RTT flows more than the shorter ones. (The router has little incentive to lie here; if it wanted to punish a flow it would be easier to simply drop its packets and be done with it.)

IMHO 3, while theoretically the least effective of the three, is the only one that has a reasonable chance of being employed... or rather is already deployed in the form of ECN (with mild effects).

> How does a server or client (at the endpoint) adjust RTT so that it is fair?
See 1) above, but who in their right mind would actually implement something like that (TCP Prague did that, but IMHO never in earnest but just to "address" the L4S bullet point RTT-bias reduction).

> Now RTT, technically, is just the sum of the instantaneous queue lengths in bytes along the path and the reverse path, plus a fixed wire-level delay. And routers along any path do not have correlated queue sizes.
>
> It seems to me that RTT adjustment requires collective real-time cooperation among all-or-most future users of that path. The path is partially shared by many servers and many users, none of whom directly speak to each other.
>
> And routers have very limited memory compared to their throughput-RTdelay product. So calculating the RTT using spin bits and UIDs for packets seems a bit much to expect all routers to do.

If posed like this, I guess the better question is what can/should routers be expected to do here: either equitably share their queues, or share their queues inequitably such that throughput is equitable. From a pure router point of view the first seems "fairest", but as fq_codel and cake show, within reason equitable capacity sharing is possible (so not perfectly and not for every possible RTT spread).

>
> So, what process measures the cross-interactions among all the users of all the paths, and what control-loop (presumably stable and TCP-compatible) actually converges to RTT fairness IRL.

Theoretically nothing; in reality, on a home link FQ+competent AQM goes a long way in that direction.

>
> Today, the basis of congestion control in the Internet is that each router is a controller of all endpoint flows that share a link, and each router is free to do whatever it takes to reduce its queue length to near zero as an average on all timescales larger than about 1/10 of a second (a magic number that is directly derived from measured human brain time resolution).
The usual caveat applies: be suspicious of numbers that are too round.... 100ms is in no way magic and also not "correct"; it is, however, a decent description of reaction times in a number of perceptual tasks that can be misinterpreted as showing things like the brain runs at 10Hz or similar...

>
> So, for any two machines separated by less than 1/10 of a light-second in distance, the total queueing delay has to stabilize in about 1/10 of a second. (I'm using a light-second in a fiber medium, not free-space, as the speed of light in fiber is a lot slower than the speed of light on microwaves, as Wall Street has recently started recognizing and investing in).
>
> I don't see how RTT-fairness can be achieved by some set of bits in the IP header. You can't shorten RTT below about 2/10 of a second in that desired system state. You can only "lengthen" RTT by delaying packets in source or endpoint buffers, because it's unreasonable to manage all the routers.
>
> And the endpoints that share a path can't talk to each other and reach a decision in on the order of 2/10 of a second.
>
> So at the very highest level, what is RTT-fairness's objective function optimizing, and how can it work?
>
> Can it be done without any change to routers?

Well, the goal here seems to be to undo the RTT-dependence of throughput, so a router can equalize per-flow throughput and thereby (from its own vantage point) enforce RTT independence, within the amount of memory available. And that already works today for all identifiable flows, but apparently at a computational cost that larger routers do not want to pay. But you knew all that.

>
> On Tuesday, April 12, 2022 3:07pm, "Michael Welzl" said:
>
> On Apr 12, 2022, at 8:52 PM, Sebastian Moeller wrote:
> Question: is QUIC actually using the spin bit as an essential part of the protocol?
> The spec says it's optional: https://www.rfc-editor.org/rfc/rfc9000.html#name-latency-spin-bit
> Otherwise endpoints might just game this if faking their RTT at a router yields an advantage...
> This was certainly discussed in the QUIC WG. Probably perceived as an unclear incentive, but I didn't really follow this.
> Cheers,
> Michael
>
> This is why pping's use of tcp timestamps is elegant, little incentive for the endpoints to fudge....
>
> Regards
> Sebastian
>
> On 12 April 2022 18:00:15 CEST, Michael Welzl wrote:
> Hi,
> Who or what are you objecting against? At least nothing that I described does what you suggest.
> BTW, just as a side point, for QUIC, routers can know the RTT today - using the spin bit, which was designed for that specific purpose.
> Cheers,
> Michael
>
> On Apr 12, 2022, at 5:51 PM, David P. Reed wrote:
> I strongly object to congestion control *in the network* attempting to measure RTT (which is an end-to-end comparative metric). Unless the current RTT is passed in each packet a router cannot enforce fairness. Period.
>
> Today, by packet drops and fair marking, information is passed to the sending nodes (eventually) about congestion. But the router can't know RTT today.
>
> The result of *requiring* RTT fairness would be to put the random bottleneck router (chosen because it is the slowest forwarder on a contended path) become the endpoint controller.
>
> That's the opposite of an "end-to-end resource sharing protocol".
>
> Now, I'm not saying it is impossible - what I'm saying it is asking all endpoints to register with an "Internet-wide" RTT real-time tracking and control service.
>
> This would be the technical equivalent of an ITU central control point.
>
> So, either someone will invent something I cannot imagine (a distributed, rapid-convergence algorithm that reflects to *every potential user* of a shared router along the current path the RTT's of ALL other users (and potential users)).
>
> IMHO, the wish for RTT fairness is like saying that the entire solar system's gravitational pull should be equalized so that all planets and asteroids have fair access to 1G gravity.
>
> On Friday, April 8, 2022 2:03pm, "Michael Welzl" said:
>
> Hi,
> FWIW, we have done some analysis of fairness and convergence of DCTCP in:
> Peyman Teymoori, David Hayes, Michael Welzl, Stein Gjessing: "Estimating an Additive Path Cost with Explicit Congestion Notification", IEEE Transactions on Control of Network Systems, 8(2), pp. 859-871, June 2021. DOI 10.1109/TCNS.2021.3053179
> Technical report (longer version):
> https://folk.universitetetioslo.no/michawe/research/publications/NUM-ECN_report_2019.pdf
> and there's also some in this paper, which first introduced our LGC mechanism:
> https://ieeexplore.ieee.org/document/7796757
> See the technical report on page 9, section D: a simple trick can improve DCTCP's fairness (if that's really the mechanism to stay with… I'm getting quite happy with the results we get with our LGC scheme :-) )
>
> Cheers,
> Michael
>
> On Apr 8, 2022, at 6:33 PM, Dave Taht wrote:
> I have managed to drop most of my state regarding the state of various
> dctcp-like solutions. At one level it's good to have not been keeping
> up, washing my brain clean, as it were.
> For some reason or another I
> went back to the original paper last week, and have been pounding
> through this one again:
>
> Analysis of DCTCP: Stability, Convergence, and Fairness
>
> "Instead, we propose subtracting α/2 from the window size for each marked ACK,
> resulting in the following simple window update equation:
>
> One result of which I was most proud recently was of demonstrating
> perfect rtt fairness in a range of 20ms to 260ms with fq_codel
> (https://forum.mikrotik.com/viewtopic.php?t=179307) - and I'm pretty
> interested in 2-260ms, but haven't got around to it.
>
> Now, one early result from the sce vs l4s testing I recall was severe
> latecomer convergence problems - something like 40s to come into flow
> balance - but I can't remember what presentation, paper, or rtt that
> was from. ?
>
> Another one has been various claims towards some level of rtt
> unfairness being ok, but not the actual ratio, nor (going up to the
> paper's proposal above) whether that method had been tried.
>
> My opinion has long been that any form of marking should look more
> closely at the observed RTT than any fixed rate reduction method, and
> compensate the paced rate to suit. But that's presently just reduced
> to an opinion, not having kept up with progress on prague, dctcp-sce,
> or bbrv2. As one example of ignorance, are 2 packets still paced back
> to back? DRR++ + early marking seems to lead to one packet being
> consistently unmarked and the other marked.
>
> --
> I tried to build a better future, a few times:
> https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
>
> Dave Täht CEO, TekLibre, LLC
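
P.S. A rough sketch of the RTT bias discussed at the top of this message, and of option 1) (the reference RTT). It assumes the well-known square-root approximation for a Reno-like AIMD flow, rate ~ (MSS/RTT) * sqrt(3/(2p)), and that both flows see the same mark/drop probability p at a shared queue. The function names, the 300 ms reference and all numeric values are made up for illustration; this is not how TCP Prague or any other stack actually implements it.

```python
import math


def mathis_rate(mss_bytes: float, rtt_s: float, loss_prob: float) -> float:
    """Approximate steady-state throughput (bytes/s) of a Reno-like AIMD flow.

    Uses the square-root approximation: rate ~ (MSS / RTT) * sqrt(3 / (2 * p)).
    """
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss_prob)


def reference_rtt_rate(mss_bytes: float, rtt_s: float,
                       loss_prob: float, ref_rtt_s: float) -> float:
    """Hypothetical 'option 1': the sender reacts as if its RTT were
    ref_rtt_s whenever the real path RTT is shorter than that reference."""
    return mathis_rate(mss_bytes, max(rtt_s, ref_rtt_s), loss_prob)


if __name__ == "__main__":
    mss, p = 1460.0, 0.01               # illustrative values, not measurements
    short_rtt, long_rtt = 0.020, 0.260  # the 20 ms / 260 ms spread from the thread
    ref = 0.300                         # hypothetical reference RTT

    bias = mathis_rate(mss, short_rtt, p) / mathis_rate(mss, long_rtt, p)
    fixed = (reference_rtt_rate(mss, short_rtt, p, ref)
             / reference_rtt_rate(mss, long_rtt, p, ref))
    print(f"plain AIMD rate ratio, short vs long RTT: {bias:.2f}")
    print(f"rate ratio with the reference RTT applied: {fixed:.2f}")
```

Under these assumptions the bias is simply the ratio of the two RTTs, and it disappears once both flows sit below the reference; a flow whose path RTT exceeds the reference is left unchanged, which is why option 1) only helps up to that point.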