[Starlink] Finite-Buffer M/G/1 Queues with Time and Space Priorities

David P. Reed dpreed at deepplum.com
Sun Jul 31 16:22:26 EDT 2022


Sebastian - of course we agree far more than we disagree, and this seems like healthy debate, focused on actual user benefit at scale, which is where I hope the Internet focuses. The "bufferbloat" community has been really kicking it for users, in my opinion.
 
[Off topic: I don't have time for IETF's process, now that it is such a parody of captured bureaucratic process rather than "rough consensus and working code", which has been missing for at least two decades of corporate control of the IAB. None of us "old farts" ever thought emulating ITU-style bureaucracy would solve any important problem the Internet was trying to solve.]
 
On Sunday, July 31, 2022 7:58am, "Sebastian Moeller" <moeller0 at gmx.de> said:



> Hi David,
> 
> interesting food for thought...
> 
> 
> > On Jul 29, 2022, at 21:38, David P. Reed via Starlink
> <starlink at lists.bufferbloat.net> wrote:
> >
> > From: "Bless, Roland (TM)" <roland.bless at kit.edu>
> > models from
> > queueing theory is that they only work for load < 1, whereas
> > we are using the network with load values ~1 (i.e., around one) due to
> > congestion control feedback loops that drive the bottleneck link
> > to saturation (unless you consider application limited traffic sources).
> >
> > Let me remind people here that there is some kind of really weird thinking
> going on here about what should be typical behavior in the Internet when it is
> working well.
> >
> > No, the goal of the Internet is not to saturate all bottlenecks at maximum
> capacity. That is the opposite of the goal, and it is the opposite of a sane
> operating point.
> >
> > Every user seeks low response time, typically a response time on the order of
> the unloaded delay in the network, for ALL traffic. (whether it's the response to
> a file transfer or a voice frame or a WWW request). *
> >
> > Queueing is always suboptimal, if you can achieve goodput without introducing
> any queueing delay. Because a queue built up at any link delays *all* traffic
> sharing that link, so the overall cost to all users goes up radically when
> multiple streams share a link, because the queueing *delay* gets multiplied by the
> number of flows affected!
> >
> > So the most desirable operating point (which Kleinrock and his students
> recently demonstrated with his "power metric") is to have each queue in every link
> average < 1 packet in length. (big or small packets, doesn't matter,
> actually).
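[To make the power-metric point concrete, here is a back-of-envelope sketch in Python. It assumes an M/M/1 queue, which, as I say below, is far kinder than real fractal traffic; the point is only that Kleinrock's "power" (throughput divided by delay) peaks exactly where the mean number of packets in the system is 1:

# Illustrative only: Kleinrock "power" for an M/M/1 queue with service rate mu.
# Mean time in system T = 1/(mu*(1-rho)), mean packets in system N = rho/(1-rho),
# so power = rho/T is proportional to rho*(1-rho), which is maximized at
# rho = 0.5 - exactly where N = 1.

def mm1_stats(rho, mu=1.0):
    delay = 1.0 / (mu * (1.0 - rho))        # mean time in system
    packets_in_system = rho / (1.0 - rho)   # queue plus the one in service
    power = rho / delay                     # throughput / delay
    return delay, packets_in_system, power

for rho in (0.3, 0.5, 0.7, 0.9, 0.99):
    d, n, p = mm1_stats(rho)
    print(f"rho={rho:4.2f}  delay={d:6.2f}  packets in system={n:6.2f}  power={p:5.3f}")

Running it shows power falling off sharply as utilization climbs past 0.5, while delay and occupancy blow up.]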
> >
> > Now the bigger issue is that this is unachievable when the flows in the
> network are bursty. Poisson being the least bursty, and easiest to analyze of the
> random processes generating flows. Typical Internet usage is incredibly bursty at
> all time scales, though - the burstiness is fractal when observed for real (at
> least if you look at time scales from 1 ms. to 1 day as your unit of analysis). 
> Fractal random processes of this sort are not Poisson at all.
> 
> [SM] In this context I like the framing from the CoDel ACM paper, with the queue
> acting as a shock absorber for bursts; as you indicate, bursts are unavoidable in a
> network with unsynchronized senders. So it seems prudent to engineer with bursts
> as a use case (however undesirable) in mind (compared to simply declaring bursts
> undesirable and requiring endpoints not to be bursty, as L4S seems to do*).
> 
> 
> > So what is the best one ought to try to do?
> >
> > Well, "keeping utilization at 100%" is never what real network operators
> seek. Never, ever. Instead, congestion control is focused on latency control, not
> optimizing utilization.
> 
> [SM] I thought that these are not orthogonal goals and one needs to pick an
> operating point in the throughput<->latency gradient somehow? This becomes
> relevant for smaller links like internet access links more than for backbone
> links. It is relatively easy to drive my 100/40 link into saturation by normal
> usage, so I have a clear goal of keeping latency acceptable under saturating
> loads.
> 
> 
> > The only folks who seem to focus on utilization are the bean-counting
> fraternity, because they seem to think the only cost is the wires, so you want the
> wires to be full.
> 
> [SM] Pithy, yet I am sure the bean counters also account for the cost of
> ports/interfaces ;)
> 
> > That, in my opinion, and even in most accounting systems that consider the
> whole enterprise rather than the wires/fibers/airtime alone, is IGNORANT and
> STUPID.
> >
> > However, academics and vendors of switches care nothing about latency at
> network scale. They focus on wirespeed as the only metric.
> >
> > Well, in the old Bell Telephone days, the metric of the Bell System that
> really mattered was not utilization on every day. Instead it was avoiding outages
> due to peak load. That often was "Mother's Day" - a few hours out of one day once
> a year. Because an outage on Mother's day (busy signals) meant major frustration!
> 
> [SM] If one designs for a (rare) worst-case scenario, one is in the clear most of
> the time. I wish that were possible with my internet access link though... I get a
> sync of 116.7/37.0 Mbps, which I shape down to a gross 105.0/36.0. It turns out it
> is not that hard to saturate that link occasionally with just normal usage by a
> family of five, so I am clearly far away from 90% reserve capacity, and I have
> little chance of expanding the capacity by a factor of 10 within my budget...
> 
> 
> > Why am I talking about this?
> >
> > Because I have been trying for decades (and I am not alone) to apply a
> "Clue-by-Four" to the thick skulls of folks who don't think about the Internet at
> scale, or even won't think about an Enterprise Internet at scale (or Starlink at
> scale). And it doesn't sink in.
> >
> > Andrew Odlyzko, a brilliant mathematician at Bell Labs for most of his career,
> also tried to point out that the utilization of the "bottleneck links" in any
> enterprise, up to the size of ATT in the old days, was typically tuned to < 10%
> of saturation at almost any time. Why? Because the CEO freaked out at the quality
> of service of this critical infrastructure (which means completing tasks quickly,
> when load is unusual) and fired people.
> >
> > And in fact, the wires are the cheapest resource - the computers and people
> connected by those resources that can't do work while waiting for queueing delay
> are vastly more expensive to leave idle. Networks don't do "work" that matters.
> Queueing isn't "efficient". It's evil.
> >
> > Which is why dropping packets rather than queueing them is *good*, if the
> sender will slow down and can resend them. Intentionally dropped packets should be
> nonzero under load, if an outsider is observing to measure quality.
> >
> > I call this brain-miswiring about optimizing throughput to fill a bottleneck
> link the Hotrodder Fallacy. That's the idea that one should optimize like a drag
> racer optimizes his car - to burn up the tires and the engine to meet an
> irrelevant metric for automobiles. A nice hobby that has never improved any actual
> vehicle. (Even F1 racing is far more realistic, given you want your cars to last
> for the lifetime of the race).
> >
> > A problem with much of the "network research" community is that it never has
> actually looked at what networks are used for and tried to solve those problems.
> Instead, they define irrelevant problems and encourage all students and professors
> to pursue irrelevancy.
> >
> > Now let's look at RRUL. While it nicely looks at latency for small packets
> under load, it actually disregards the performance of the load streams, which are
> only used to "fill the pipe".
> 
> [SM] I respectfully disagree. They are used to simulate those "fill the pipe"
> flows that do happen in edge networks... think multiple machines downloading
> multi-gigabyte update packages (OS, games, software, ...) whenever they feel
> like it. The sparse latency measurement flows simulate low-rate/sparse
> interactive traffic...
> But note that depending on the situation a nominally sparse flow can use up quite
> some capacity. I talked to a gamer who observed, in Riot Games' Valorant (a
> multi-player online game with 10-20 players), traffic at 20 Mbps with cyclic
> bursts 128 times a second. On a slow link that becomes a noticeable capacity hog.
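[Putting rough numbers on that example, using only the figures Sebastian reports (20 Mbit/s delivered as 128 bursts per second) and, purely for illustration, the 36 Mbit/s shaped uplink he mentions above:

# Back-of-envelope, numbers as reported in the thread.
rate_bps = 20e6              # average rate of the game traffic
bursts_per_second = 128      # reported burst frequency
burst_bits = rate_bps / bursts_per_second
print(burst_bits / 8e3, "kB per burst")                            # ~19.5 kB

link_bps = 36e6              # the shaped uplink figure quoted earlier, as an example slow link
print(1e3 * burst_bits / link_bps, "ms of line time per burst")    # ~4.3 ms

So each burst occupies such a link for several milliseconds, 128 times a second - "sparse" by average rate, but not at all sparse on the timescale a queue sees.]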
> 
> > Fortunately, they are TCP, so they rate limit themselves by window
> adjustment. But they are speed unlimited TCP streams that are meaningless.
> 
> [SM] Flent will however present information about those flows if instructed to do
> so (IIRC by the --socket-stats argument):
> 
> 
>                                           avg   median   99th %            # data pts
>  Ping (ms) ICMP 1.1.1.1 (extra)       :  13.26    11.70    29.30 ms              1393
>  Ping (ms) avg                        :  32.17      N/A      N/A ms              1607
>  Ping (ms)::ICMP                      :  32.76    30.60    48.02 ms              1395
>  Ping (ms)::UDP 0 (0)                 :  32.64    30.52    46.55 ms              1607
>  Ping (ms)::UDP 1 (0)                 :  31.39    29.90    45.98 ms              1607
>  Ping (ms)::UDP 2 (0)                 :  32.85    30.82    47.04 ms              1607
>  Ping (ms)::UDP 3 (0)                 :  31.72    30.25    46.49 ms              1607
>  Ping (ms)::UDP 4 (0)                 :  31.37    29.78    45.61 ms              1607
>  Ping (ms)::UDP 5 (0)                 :  31.36    29.74    45.13 ms              1607
>  Ping (ms)::UDP 6 (0)                 :  32.85    30.71    47.34 ms              1607
>  Ping (ms)::UDP 7 (0)                 :  33.16    31.08    47.93 ms              1607
>  TCP download avg                     :   7.82      N/A      N/A Mbits/s         1607
>  TCP download sum                     :  62.55      N/A      N/A Mbits/s         1607
>  TCP download::0 (0)                  :   7.86     7.28    13.81 Mbits/s         1607
>  TCP download::1 (0)                  :   8.18     7.88    13.98 Mbits/s         1607
>  TCP download::2 (0)                  :   7.62     7.05    13.81 Mbits/s         1607
>  TCP download::3 (0)                  :   7.73     7.37    13.23 Mbits/s         1607
>  TCP download::4 (0)                  :   7.58     7.07    13.51 Mbits/s         1607
>  TCP download::5 (0)                  :   7.92     7.37    14.03 Mbits/s         1607
>  TCP download::6 (0)                  :   8.07     7.58    14.33 Mbits/s         1607
>  TCP download::7 (0)                  :   7.59     6.96    13.94 Mbits/s         1607
>  TCP totals                           :  93.20      N/A      N/A Mbits/s         1607
>  TCP upload avg                       :   3.83      N/A      N/A Mbits/s         1607
>  TCP upload sum                       :  30.65      N/A      N/A Mbits/s         1607
>  TCP upload::0 (0)                    :   3.82     3.86     9.57 Mbits/s         1607
>  TCP upload::0 (0)::tcp_cwnd          :  14.31    14.00    23.00                  856
>  TCP upload::0 (0)::tcp_delivery_rate :   3.67     3.81     4.95                  855
>  TCP upload::0 (0)::tcp_pacing_rate   :   4.72     4.85     6.93                  855
>  TCP upload::0 (0)::tcp_rtt           :  42.48    41.36    65.32                  851
>  TCP upload::0 (0)::tcp_rtt_var       :   2.83     2.38     9.90                  851
>  TCP upload::1 (0)                    :   3.90     3.94    16.49 Mbits/s         1607
>  TCP upload::1 (0)::tcp_cwnd          :  14.46    14.00    23.00                  857
>  TCP upload::1 (0)::tcp_delivery_rate :   3.75     3.83     5.74                  856
>  TCP upload::1 (0)::tcp_pacing_rate   :   4.81     4.89     8.15                  856
>  TCP upload::1 (0)::tcp_rtt           :  42.12    41.07    63.10                  852
>  TCP upload::1 (0)::tcp_rtt_var       :   2.74     2.36     8.36                  852
>  TCP upload::2 (0)                    :   3.85     3.96     5.11 Mbits/s         1607
>  TCP upload::2 (0)::tcp_cwnd          :  14.15    14.00    22.00                  852
>  TCP upload::2 (0)::tcp_delivery_rate :   3.69     3.81     4.93                  851
>  TCP upload::2 (0)::tcp_pacing_rate   :   4.73     4.91     6.55                  851
>  TCP upload::2 (0)::tcp_rtt           :  41.73    41.09    56.97                  851
>  TCP upload::2 (0)::tcp_rtt_var       :   2.59     2.29     7.71                  851
>  TCP upload::3 (0)                    :   3.81     3.95     5.32 Mbits/s         1607
>  TCP upload::3 (0)::tcp_cwnd          :  13.90    14.00    21.00                  851
>  TCP upload::3 (0)::tcp_delivery_rate :   3.66     3.82     4.89                  851
>  TCP upload::3 (0)::tcp_pacing_rate   :   4.67     4.87     6.36                  851
>  TCP upload::3 (0)::tcp_rtt           :  41.44    41.09    56.46                  847
>  TCP upload::3 (0)::tcp_rtt_var       :   2.74     2.46     8.27                  847
>  TCP upload::4 (0)                    :   3.77     3.88     5.35 Mbits/s         1607
>  TCP upload::4 (0)::tcp_cwnd          :  13.86    14.00    21.00                  852
>  TCP upload::4 (0)::tcp_delivery_rate :   3.61     3.75     4.87                  852
>  TCP upload::4 (0)::tcp_pacing_rate   :   4.63     4.83     6.46                  852
>  TCP upload::4 (0)::tcp_rtt           :  41.74    41.18    57.27                  850
>  TCP upload::4 (0)::tcp_rtt_var       :   2.73     2.45     8.38                  850
>  TCP upload::5 (0)                    :   3.83     3.93     5.60 Mbits/s         1607
>  TCP upload::5 (0)::tcp_cwnd          :  13.98    14.00    22.00                  851
>  TCP upload::5 (0)::tcp_delivery_rate :   3.68     3.80     5.05                  851
>  TCP upload::5 (0)::tcp_pacing_rate   :   4.69     4.82     6.65                  851
>  TCP upload::5 (0)::tcp_rtt           :  41.50    40.91    56.42                  847
>  TCP upload::5 (0)::tcp_rtt_var       :   2.68     2.34     8.24                  847
>  TCP upload::6 (0)                    :   3.86     3.97     5.60 Mbits/s         1607
>  TCP upload::6 (0)::tcp_cwnd          :  14.27    14.00    22.00                  850
>  TCP upload::6 (0)::tcp_delivery_rate :   3.71     3.83     5.07                  850
>  TCP upload::6 (0)::tcp_pacing_rate   :   4.74     4.90     6.77                  850
>  TCP upload::6 (0)::tcp_rtt           :  42.03    41.66    55.81                  850
>  TCP upload::6 (0)::tcp_rtt_var       :   2.71     2.49     7.85                  850
>  TCP upload::7 (0)                    :   3.81     3.92     5.18 Mbits/s         1607
>  TCP upload::7 (0)::tcp_cwnd          :  14.01    14.00    22.00                  850
>  TCP upload::7 (0)::tcp_delivery_rate :   3.67     3.82     4.94                  849
>  TCP upload::7 (0)::tcp_pacing_rate   :   4.57     4.69     6.52                  850
>  TCP upload::7 (0)::tcp_rtt           :  42.62    42.16    56.20                  847
>  TCP upload::7 (0)::tcp_rtt_var       :   2.50     2.19     8.02                  847
>  cpu_stats_root at 192.168.42.1::load :   0.31     0.30     0.75                 1286
> 
> 
> While the tcp_rtt is smoothed, it still tells something about the latency of the
> load bearing flows.
>
 
I agree. I have used flent enough to have poked around at its options, and yes, the data is there.
But RRUL's assumption that there is always "more" to send on the load-generating TCP flows explores only one kind of case. Suppose you have three streaming video watchers at HD rates who don't quite fill up the downlink, yet are actually "buffering" sufficiently most of the time. How well do they share the downlink so that none of them pause unnecessarily? Maybe FQ_codel works well here, maybe it doesn't. If you want a variant, imagine a small business office with 20 staff doing Zoom conferences with customers. Zoom is actually somewhat bursty on the uplink side. It can tolerate sub-100 msec. outages.
My point is that there are real kinds of burstiness besides "click driven", and they have different time-scales of variability. Studying these seems vastly more useful than one more paper based on RRUL as the only point of study (and better still than just measuring the pipe's maximum throughput and utilization).
 
> 
> >
> > Actual situations (like what happens when someone starts using BitTorrent
> while another in the same household is playing a twitch Multi-user FPS) don't
> actually look like RRUL. Because in fact the big load is ALSO fractal. Bittorrent
> demand isn't constant over time - far from it. It's bursty.
> 
> [SM] And this is where having an FQ scheduler for ingress and egress really
> helps,
I think FQ_codel is great! An insight I get from it is that "flow fairness" plus dropping works pretty well for today's Internet traffic, providing very good responsiveness as the user sees it.
However, I think QUIC, which lacks "flows" that are visible at the queue manager, will become problematic. Not necessarily at the "Home Router" end - but at the cloud endpoints that serve many, many users.
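To illustrate what I mean by flows being "visible" to the queue manager, here is a toy sketch of flow-queueing in Python. It is not the real fq_codel code (which uses a keyed hash on the 5-tuple, deficit round-robin with byte quantums, and CoDel on each per-flow queue); it only shows the classification step that gives flow isolation:

import hashlib
from collections import deque

NUM_QUEUES = 1024
queues = [deque() for _ in range(NUM_QUEUES)]

def flow_hash(src_ip, dst_ip, src_port, dst_port, proto):
    """Map a 5-tuple to a queue index (sha1 stands in for the kernel's hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % NUM_QUEUES

def enqueue(five_tuple, packet):
    queues[flow_hash(*five_tuple)].append(packet)

def dequeue_one_round():
    """Naive round-robin over the non-empty per-flow queues."""
    for q in queues:
        if q:
            yield q.popleft()

The point about QUIC is that everything multiplexed inside one QUIC connection shares a single UDP 5-tuple, so at a box like this it all lands in one queue: flow isolation is per connection, not per stream.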
 
 
> it can isolate most of the fall-out from bursty traffic onto the bursty
> traffic itself. However, occasionally a user actually evaluates the bursty
> traffic as more important than the rest (my example from above with bursty
> real-time traffic of a game) in which case FQ tends to result in unhappiness if
> the capacity share of the affected flow is such that the bursts are partly
> dropped (and even if they are just spread out in time too much).
> 
> > Everything is bursty at different timescales in the Internet. There are no
> CBR flows.
> 
> [SM] Probably true, but I think on the scale of a few seconds/minutes things can
> be "constant" enough, no?
Even in the case of your family's single edge connection, the peak-to-average ratio over intervals of one hour or more, and probably minute-to-minute as well, shows burstiness. There may be a few times each day when the "Mother's Day" event happens, but I bet your average usage in every hour, and probably every minute, is < 10%. What happens when (if your family works like many) you sit down to dinner? And then get back to work?
 
I bet you buy the fastest "up-to" speed you can afford, but not because your average is very high at all.
Right?

> 
> > So if we want to address the real congestion problems, we need realistic
> thinking about what the real problem is.
> >
> > Unfortunately this is not achieved by the kind of thinking that created
> diffserv, sadly. Because everything is bursty, just with different timescales in
> some cases. Even "flash override" priority traffic is incredibly bursty.
> 
> [SM] I thought the rationale for "flash override" is not that its traffic pattern
> is any different (smoother) from other traffic classes, but simply that delivery
> of such marked packets has highest priority and the network should do what it can
> to expedite such packets; if that comes at the cost of other packets, so be it...
> (some link technologies even allow pre-empting packets already in transfer to
> expedite higher-priority packets). Personally, I like strict precedence: it is
> both unforgiving and easy to predict, and pretty much useless for a shared medium
> like the internet, at least as an end-to-end policy.
>
Actually, if one uses EDF (earliest-deadline-first) scheduling, it is provably optimal in a particular sense: if it is possible to meet all the deadlines with some schedule, then an EDF schedule will also meet all the deadlines.
That is from the early 1970's work on "real-time scheduling" disciplines.
"Strict precedence" is the case of EDF when the deadline for ALL packets is the send time + a network-wide "latency bound". If you used EDF with a latency bound of, say, 100 msec for all packets, and each packet was sequenced in deadline-order, the network would be VERY predictable and responsive.
 
The imagined need for "flash override" priority would go away, unless the emergency somehow required sub-100 msec. latency, if all flows backed off using TCP AIMD backoff and bottleneck links dropped packets rather than queueing them.
 
No one has actually tried this at scale, but all the theory suggests it would be brilliantly stable and predictable.
(How it would work if the network is constantly being DDoSed everywhere isn't a fair question - no scheduling algorithm can work under constant DDoS; you need meta-network policing to find the culprits and shut them off ASAP.)
 
> 
> > Coming back to Starlink - Starlink apparently is being designed by folks who
> really do not understand these fundamental ideas. Instead, they probably all
> worked in researchy environments where the practical realities of being part of a
> worldwide public Internet were ignored.
> 
> [SM] Also in a world where user-facing tests and evaluations will emphasize
> maximal throughput rates a lot, as these are easy to measure and follow the
> simple "larger is better" principle consumers are trained to understand.
> 
> 
> > (The FCC folks are almost as bad. I have found no one at FCC engineering who
> understands fractal burstiness - even with respect to the old Bell System).
> 
> 
> 
> 
> *) It might appear that I have a bone to pick with L4S (which I have), but it
> really is just a great example of engineering malpractice, especially not
> designing for the existing internet, but assuming one can simply "require" a more
> L4S-compatible internet through the power of IETF drafts. Case in point, L4S wants
> to bound the maximum burst duration for compliant senders, which, even if it
> worked, still leaves the problem that unsynchronized senders can and will still
> occasionally add up to extended periods at line rate.
 
I totally agree with the L4S bias you have. It seems wrong-headed to require every participant in the Internet to behave when you don't even tell them why they have to behave or what behaving means. My concern about IETF bureaucracy emulation applies here, as well.
 
[If BT wants to try L4S across all of BT's customers and take the flak when it fails miserably, it becomes "working code" when it finally works. Then they can get a rough consensus, rather than they and others "dictating" what L4S must be.
 
They have not simulated actual usage (I admit that simulating "actual usage" is hard to do when you don't even know what your users are actually doing today, as I've mentioned above). That suggests a "pilot" experimental process. Even ECN, RED, diffserv and MBONE were pilots, which is where it was learned that they don't work at scale, and why no one seriously deploys them to this day; if they actually worked, there would be a race among users to demand them.]
> 
> 
> 