From: Sebastian Moeller
To: "David P. Reed"
Cc: starlink@lists.bufferbloat.net
Date: Sun, 31 Jul 2022 13:58:41 +0200
Subject: Re: [Starlink] Finite-Buffer M/G/1 Queues with Time and Space Priorities
In-Reply-To: <1659123485.059828918@apps.rackspace.com>
Message-Id: <05EB1373-AD05-4CB6-BD92-C444038D3A67@gmx.de>

Hi David,

interesting food for thought...

> On Jul 29, 2022, at 21:38, David P. Reed via Starlink wrote:
>
> From: "Bless, Roland (TM)"
> models from queueing theory is that they only work for load < 1, whereas
> we are using the network with load values ~1 (i.e., around one) due to
> congestion control feedback loops that drive the bottleneck link
> to saturation (unless you consider application-limited traffic sources).
>
> Let me remind people here that there is some kind of really weird thinking going on here about what should be typical behavior in the Internet when it is working well.
>
> No, the goal of the Internet is not to saturate all bottlenecks at maximum capacity. That is the opposite of the goal, and it is the opposite of a sane operating point.
>
> Every user seeks low response time, typically a response time on the order of the unloaded delay in the network, for ALL traffic (whether it's the response to a file transfer or a voice frame or a WWW request). *
>
> Queueing is always suboptimal, if you can achieve goodput without introducing any queueing delay. Because a queue built up at any link delays *all* traffic sharing that link, the overall cost to all users goes up radically when multiple streams share a link, because the queueing *delay* gets multiplied by the number of flows affected!
>
> So the most desirable operating point (which Kleinrock and his students recently demonstrated with his "power metric") is to have each queue in every link average < 1 packet in length. (Big or small packets, doesn't matter, actually.)
>
> Now the bigger issue is that this is unachievable when the flows in the network are bursty. Poisson being the least bursty, and easiest to analyze, of the random processes generating flows. Typical Internet usage is incredibly bursty at all time scales, though - the burstiness is fractal when observed for real (at least if you look at time scales from 1 ms to 1 day as your unit of analysis). Fractal random processes of this sort are not Poisson at all.

[SM] In this context I like the framing from the CoDel ACM paper, with the queue acting as shock absorber for bursts; as you indicate, bursts are unavoidable in a network with unsynchronized senders. So it seems prudent to engineer with bursts as a use-case (however undesirable) in mind, rather than simply declaring bursts undesirable and requiring endpoints not to be bursty, as L4S seems to do*.

> So what is the best one ought to try to do?
>
> Well, "keeping utilization at 100%" is never what real network operators seek. Never, ever. Instead, congestion control is focused on latency control, not optimizing utilization.

[SM] I thought that these are not orthogonal goals and one needs to pick an operating point in the throughput<->latency gradient somehow? This becomes more relevant for smaller links like internet access links than for backbone links. It is relatively easy to drive my 100/40 link into saturation by normal usage, so I have a clear goal of keeping latency acceptable under saturating loads.
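To make that throughput<->latency trade-off concrete, here is a toy M/M/1 sketch of Kleinrock's "power" metric (throughput divided by delay). It assumes Poisson arrivals and a single queue, which, as noted above, real traffic is not, so treat the numbers as qualitative only:

# Toy M/M/1 illustration of Kleinrock's power metric (throughput / delay).
# Purely a sketch; real Internet traffic is bursty/fractal, not Poisson.

def mm1_stats(rho, mu=1.0):
    """Return (throughput, mean delay, power, mean packets in system) at offered load rho."""
    lam = rho * mu                     # arrival rate
    delay = 1.0 / (mu - lam)           # mean sojourn time, 1/(mu - lambda)
    power = lam / delay                # Kleinrock's power: throughput / delay
    occupancy = rho / (1.0 - rho)      # mean number of packets in the system
    return lam, delay, power, occupancy

for rho in (0.1, 0.3, 0.5, 0.7, 0.9, 0.99):
    lam, d, p, n = mm1_stats(rho)
    print(f"rho={rho:4.2f}  throughput={lam:4.2f}  delay={d:6.2f}  power={p:5.3f}  avg pkts in system={n:6.2f}")

# In this toy model power peaks at rho = 0.5, where the average number in the
# system is exactly 1 -- consistent with the "average < 1 packet per queue"
# operating point above, and far away from driving the bottleneck to saturation,
# where delay (and occupancy) blow up while power collapses.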
> The only folks who seem to focus on utilization is the bean-counting fraternity, because they seem to think the only cost is the wires, so you want the wires to be full.

[SM] Pithy, yet I am sure the bean counters also account for the cost of ports/interfaces ;)

> That, in my opinion, and even in most accounting systems that consider the whole enterprise rather than the wires/fibers/airtime alone, is IGNORANT and STUPID.
>
> However, academics and vendors of switches care nothing about latency at network scale. They focus on wirespeed as the only metric.
>
> Well, in the old Bell Telephone days, the metric of the Bell System that really mattered was not utilization on every day. Instead it was avoiding outages due to peak load. That often was "Mother's Day" - a few hours out of one day once a year. Because an outage on Mother's Day (busy signals) meant major frustration!

[SM] If one designs for a (rare) worst-case scenario, one is in the clear most of the time. I wish that were possible with my internet access link though... I get a sync of 116.7/37.0 Mbps, which I shape down to a gross 105.0/36.0; it turns out it is not that hard to saturate that link occasionally with just normal usage by a family of five, so I clearly am far away from 90% reserve capacity, and I have little chance of expanding the capacity by a factor of 10 within my budget...

> Why am I talking about this?
>
> Because I have been trying for decades (and I am not alone) to apply a "Clue-by-Four" to the thick skulls of folks who don't think about the Internet at scale, or even won't think about an Enterprise Internet at scale (or Starlink at scale). And it doesn't sink in.
>
> Andrew Odlyzko, a brilliant mathematician at Bell Labs for most of his career, also tried to point out that the utilization of the "bottleneck links" in any enterprise, up to the size of ATT in the old days, was typically tuned to < 10% of saturation at almost any time. Why? Because the CEO freaked out at the quality of service of this critical infrastructure (which means completing tasks quickly, when load is unusual) and fired people.
>
> And in fact, the wires are the cheapest resource - the computers and people connected by those resources that can't do work while waiting for queueing delay are vastly more expensive to leave idle. Networks don't do "work" that matters. Queueing isn't "efficient". It's evil.
>
> Which is why dropping packets rather than queueing them is *good*, if the sender will slow down and can resend them. Intentionally dropped packets should be nonzero under load, if an outsider is observing to measure quality.
>
> I call this brain-miswiring about optimizing throughput to fill a bottleneck link the Hotrodder Fallacy. That's the idea that one should optimize like a drag racer optimizes his car - to burn up the tires and the engine to meet an irrelevant metric for automobiles. A nice hobby that has never improved any actual vehicle. (Even F1 racing is far more realistic, given you want your cars to last for the lifetime of the race.)
>
> A problem with much of the "network research" community is that it never has actually looked at what networks are used for and tried to solve those problems. Instead, they define irrelevant problems and encourage all students and professors to pursue irrelevancy.
>
> Now let's look at RRUL. While it nicely looks at latency for small packets under load, it actually disregards the performance of the load streams, which are only used to "fill the pipe".

[SM] I respectfully disagree. They are used to simulate those "fill the pipe" flows that do happen in edge networks... think multiple machines downloading multi-gigabyte update packages (OS, games, software, ...) whenever they feel like it. The sparse latency measurement flows simulate low-rate/sparse interactive traffic... But note that depending on the situation a nominally sparse flow can use up quite some capacity: I talked to a gamer who observed, in Riot Games' Valorant (a multi-player online game with 10-20 players), traffic at 20 Mbps with cyclic bursts 128 times a second. On a slow link that becomes a noticeable capacity hog.
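A rough back-of-the-envelope on those numbers (the 20 Mbps and 128 bursts/second figures are as quoted above and only approximate; the link rates below are made-up examples):

# Rough arithmetic for the "nominally sparse but bursty" game-traffic example.
# Average rate and burst frequency as quoted above; link rates are invented.

BURST_RATE_HZ = 128            # bursts per second
AVG_RATE_BPS = 20e6            # average throughput of the flow

burst_bits = AVG_RATE_BPS / BURST_RATE_HZ
burst_interval_ms = 1000.0 / BURST_RATE_HZ
print(f"burst size ~{burst_bits / 8 / 1000:.1f} kB, one burst every {burst_interval_ms:.1f} ms")

for link_mbps in (1000, 100, 50, 25):          # hypothetical bottleneck rates
    drain_ms = burst_bits / (link_mbps * 1e6) * 1000
    share = AVG_RATE_BPS / (link_mbps * 1e6)
    print(f"{link_mbps:4d} Mbit/s link: each burst takes {drain_ms:5.2f} ms to drain "
          f"({share:4.0%} of the link on average)")

# On a gigabit link each ~19.5 kB burst drains in a fraction of a millisecond;
# on a 25 Mbit/s link it occupies the wire for over 6 ms of the 7.8 ms burst
# interval -- exactly the "noticeable capacity hog" effect described above.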
> Fortunately, they are TCP, so they rate-limit themselves by window adjustment. But they are speed-unlimited TCP streams that are meaningless.

[SM] Flent will however present information about those flows if instructed to do so (IIRC by the --socket-stats argument):

                                           avg      median    99th %    # data pts
 Ping (ms) ICMP 1.1.1.1 (extra)        :   13.26     11.70     29.30 ms        1393
 Ping (ms) avg                         :   32.17       N/A       N/A ms        1607
 Ping (ms)::ICMP                       :   32.76     30.60     48.02 ms        1395
 Ping (ms)::UDP 0 (0)                  :   32.64     30.52     46.55 ms        1607
 Ping (ms)::UDP 1 (0)                  :   31.39     29.90     45.98 ms        1607
 Ping (ms)::UDP 2 (0)                  :   32.85     30.82     47.04 ms        1607
 Ping (ms)::UDP 3 (0)                  :   31.72     30.25     46.49 ms        1607
 Ping (ms)::UDP 4 (0)                  :   31.37     29.78     45.61 ms        1607
 Ping (ms)::UDP 5 (0)                  :   31.36     29.74     45.13 ms        1607
 Ping (ms)::UDP 6 (0)                  :   32.85     30.71     47.34 ms        1607
 Ping (ms)::UDP 7 (0)                  :   33.16     31.08     47.93 ms        1607
 TCP download avg                      :    7.82       N/A       N/A Mbits/s   1607
 TCP download sum                      :   62.55       N/A       N/A Mbits/s   1607
 TCP download::0 (0)                   :    7.86      7.28     13.81 Mbits/s   1607
 TCP download::1 (0)                   :    8.18      7.88     13.98 Mbits/s   1607
 TCP download::2 (0)                   :    7.62      7.05     13.81 Mbits/s   1607
 TCP download::3 (0)                   :    7.73      7.37     13.23 Mbits/s   1607
 TCP download::4 (0)                   :    7.58      7.07     13.51 Mbits/s   1607
 TCP download::5 (0)                   :    7.92      7.37     14.03 Mbits/s   1607
 TCP download::6 (0)                   :    8.07      7.58     14.33 Mbits/s   1607
 TCP download::7 (0)                   :    7.59      6.96     13.94 Mbits/s   1607
 TCP totals                            :   93.20       N/A       N/A Mbits/s   1607
 TCP upload avg                        :    3.83       N/A       N/A Mbits/s   1607
 TCP upload sum                        :   30.65       N/A       N/A Mbits/s   1607
 TCP upload::0 (0)                     :    3.82      3.86      9.57 Mbits/s   1607
 TCP upload::0 (0)::tcp_cwnd           :   14.31     14.00     23.00            856
 TCP upload::0 (0)::tcp_delivery_rate  :    3.67      3.81      4.95            855
 TCP upload::0 (0)::tcp_pacing_rate    :    4.72      4.85      6.93            855
 TCP upload::0 (0)::tcp_rtt            :   42.48     41.36     65.32            851
 TCP upload::0 (0)::tcp_rtt_var        :    2.83      2.38      9.90            851
 TCP upload::1 (0)                     :    3.90      3.94     16.49 Mbits/s   1607
 TCP upload::1 (0)::tcp_cwnd           :   14.46     14.00     23.00            857
 TCP upload::1 (0)::tcp_delivery_rate  :    3.75      3.83      5.74            856
 TCP upload::1 (0)::tcp_pacing_rate    :    4.81      4.89      8.15            856
 TCP upload::1 (0)::tcp_rtt            :   42.12     41.07     63.10            852
 TCP upload::1 (0)::tcp_rtt_var        :    2.74      2.36      8.36            852
 TCP upload::2 (0)                     :    3.85      3.96      5.11 Mbits/s   1607
 TCP upload::2 (0)::tcp_cwnd           :   14.15     14.00     22.00            852
 TCP upload::2 (0)::tcp_delivery_rate  :    3.69      3.81      4.93            851
 TCP upload::2 (0)::tcp_pacing_rate    :    4.73      4.91      6.55            851
 TCP upload::2 (0)::tcp_rtt            :   41.73     41.09     56.97            851
 TCP upload::2 (0)::tcp_rtt_var        :    2.59      2.29      7.71            851
 TCP upload::3 (0)                     :    3.81      3.95      5.32 Mbits/s   1607
 TCP upload::3 (0)::tcp_cwnd           :   13.90     14.00     21.00            851
 TCP upload::3 (0)::tcp_delivery_rate  :    3.66      3.82      4.89            851
 TCP upload::3 (0)::tcp_pacing_rate    :    4.67      4.87      6.36            851
 TCP upload::3 (0)::tcp_rtt            :   41.44     41.09     56.46            847
 TCP upload::3 (0)::tcp_rtt_var        :    2.74      2.46      8.27            847
 TCP upload::4 (0)                     :    3.77      3.88      5.35 Mbits/s   1607
 TCP upload::4 (0)::tcp_cwnd           :   13.86     14.00     21.00            852
 TCP upload::4 (0)::tcp_delivery_rate  :    3.61      3.75      4.87            852
 TCP upload::4 (0)::tcp_pacing_rate    :    4.63      4.83      6.46            852
 TCP upload::4 (0)::tcp_rtt            :   41.74     41.18     57.27            850
 TCP upload::4 (0)::tcp_rtt_var        :    2.73      2.45      8.38            850
 TCP upload::5 (0)                     :    3.83      3.93      5.60 Mbits/s   1607
 TCP upload::5 (0)::tcp_cwnd           :   13.98     14.00     22.00            851
 TCP upload::5 (0)::tcp_delivery_rate  :    3.68      3.80      5.05            851
 TCP upload::5 (0)::tcp_pacing_rate    :    4.69      4.82      6.65            851
 TCP upload::5 (0)::tcp_rtt            :   41.50     40.91     56.42            847
 TCP upload::5 (0)::tcp_rtt_var        :    2.68      2.34      8.24            847
 TCP upload::6 (0)                     :    3.86      3.97      5.60 Mbits/s   1607
 TCP upload::6 (0)::tcp_cwnd           :   14.27     14.00     22.00            850
 TCP upload::6 (0)::tcp_delivery_rate  :    3.71      3.83      5.07            850
 TCP upload::6 (0)::tcp_pacing_rate    :    4.74      4.90      6.77            850
 TCP upload::6 (0)::tcp_rtt            :   42.03     41.66     55.81            850
 TCP upload::6 (0)::tcp_rtt_var        :    2.71      2.49      7.85            850
 TCP upload::7 (0)                     :    3.81      3.92      5.18 Mbits/s   1607
 TCP upload::7 (0)::tcp_cwnd           :   14.01     14.00     22.00            850
 TCP upload::7 (0)::tcp_delivery_rate  :    3.67      3.82      4.94            849
 TCP upload::7 (0)::tcp_pacing_rate    :    4.57      4.69      6.52            850
 TCP upload::7 (0)::tcp_rtt            :   42.62     42.16     56.20            847
 TCP upload::7 (0)::tcp_rtt_var        :    2.50      2.19      8.02            847
 cpu_stats_root@192.168.42.1::load     :    0.31      0.30      0.75           1286

While the tcp_rtt is smoothed, it still tells something about the latency of the load-bearing flows.
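For reference, tcp_rtt is the kernel's smoothed RTT estimate (srtt), not per-packet latency; a minimal sketch of the classic RFC 6298-style EWMA (gain of 1/8 assumed, sample trace invented) shows how much a short spike gets damped:

# tcp_rtt above is the *smoothed* RTT (srtt). Minimal sketch of an RFC 6298-style
# estimator with gain 1/8, to show why brief latency spikes are damped in these
# numbers. The RTT samples below are invented.

ALPHA = 1.0 / 8.0                      # srtt gain

def update_srtt(srtt, sample):
    """One EWMA step: srtt <- (1 - alpha) * srtt + alpha * sample."""
    return (1.0 - ALPHA) * srtt + ALPHA * sample

samples_ms = [41, 42, 41, 90, 41, 42, 41, 41]   # one 90 ms spike in otherwise ~41 ms RTTs
srtt = samples_ms[0]
for s in samples_ms[1:]:
    srtt = update_srtt(srtt, s)
    print(f"sample {s:3d} ms -> srtt {srtt:5.1f} ms")

# The single 90 ms spike moves srtt by only ~6 ms, so worst-case per-packet
# latency is understated -- which is why the raw ping flows in the table remain
# the better latency signal.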
> Actual situations (like what happens when someone starts using BitTorrent while another in the same household is playing a twitch multi-user FPS) don't actually look like RRUL. Because in fact the big load is ALSO fractal. BitTorrent demand isn't constant over time - far from it. It's bursty.

[SM] And this is where having an FQ scheduler for ingress and egress really helps: it can isolate most of the fall-out from bursty traffic onto the bursty traffic itself. However, occasionally a user actually evaluates the bursty traffic as more important than the rest (my example from above with bursty real-time traffic of a game), in which case FQ tends to result in unhappiness if the capacity share of the affected flow is such that the bursts are partly dropped (and even if they are just spread out in time too much).

> Everything is bursty at different timescales in the Internet. There are no CBR flows.

[SM] Probably true, but I think on the scale of a few seconds/minutes things can be "constant" enough, no?

> So if we want to address the real congestion problems, we need realistic thinking about what the real problem is.
>
> Unfortunately this is not achieved by the kind of thinking that created diffserv, sadly. Because everything is bursty, just with different timescales in some cases. Even "flash override" priority traffic is incredibly bursty.

[SM] I thought the rationale for "flash override" is not that its traffic pattern is any different (smoother) from other traffic classes, but simply that delivery of such marked packets has the highest priority and the network should do what it can to expedite such packets; if that comes at the cost of other packets, so be it... (Some link technologies even allow pre-empting packets already in transfer to expedite higher-priority packets.) Personally, I like strict precedence: it is both unforgiving and easy to predict, and pretty much useless for a shared medium like the internet, at least as an end-to-end policy.
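A minimal sketch of what strict precedence means operationally (queue contents invented), i.e. why it is easy to predict yet unforgiving:

# Strict precedence: always serve the highest-priority non-empty queue.
# Easy to predict, but any sustained high-priority backlog starves everything
# below it completely. Queue contents are invented for illustration.

from collections import deque

queues = {                                        # lower number = higher priority
    0: deque(f"hi-{i}" for i in range(5)),        # "flash override"-like class
    1: deque(f"lo-{i}" for i in range(5)),        # everything else
}

def dequeue_strict(queues):
    for prio in sorted(queues):
        if queues[prio]:
            return prio, queues[prio].popleft()
    return None, None

for _ in range(6):
    prio, pkt = dequeue_strict(queues)
    print(f"sent {pkt} from priority {prio}")

# All five hi-* packets go out before a single lo-* packet is served -- fine
# inside a single administrative domain, but as an end-to-end policy on a
# shared medium that starvation property is exactly the problem noted above.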
> Coming back to Starlink - Starlink apparently is being designed by folks who really do not understand these fundamental ideas. Instead, they probably all worked in researchy environments where the practical realities of being part of a worldwide public Internet were ignored.

[SM] Also in a world where user-facing tests and evaluations emphasize maximal throughput rates a lot, as these are easy to measure and follow the simple "larger is better" principle consumers are trained to understand.

> (The FCC folks are almost as bad. I have found no-one at FCC engineering who understands fractal burstiness - even w.r.t. the old Bell System.)

*) It might appear that I have a bone to pick with L4S (which I have), but it really is just a great example of engineering malpractice, especially not designing for the existing internet, but assuming one can simply "require" a more L4S-compatible internet through the power of IETF drafts. Case in point: L4S wants to bound the maximum burst duration for compliant senders, which, even if it worked, still leaves the problem that unsynchronized senders can and will still occasionally add up to extended periods at line rate.

> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink