[Starlink] Starlink hidden buffers

Bjørn Ivar Teigen bjorn at domos.no
Wed May 24 11:26:52 EDT 2023


This discussion is fascinating and made me think of a couple of points I
really wish more people would grok:

1. What matters for the amount of queuing is the ratio of load to
capacity, or demand/supply, if you like. This ratio, at any point in time,
determines how quickly a queue fills or empties: measured as delay, the
queue grows at a rate of (load/capacity - 1). Drops in capacity are
equivalent to spikes in load from this point of view.

This means the rate adaptation of WiFi and LTE, and link changes in the
Starlink network, have far greater potential to cause latency spikes than
TCP, even when many users connect at the same time. WiFi rates can go from
1000 to 1 from one packet to the next, and whenever that happens there
simply isn't time for TCP or any other end-to-end congestion controller to
react. In the presence of capacity-seeking traffic there will, inevitably,
be a latency spike (or packet loss) when link capacity drops.

I'm presenting a paper on this at ICC next week, and the preprint is here:
https://arxiv.org/abs/2111.00488
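
The claim in point 1 can be illustrated with a minimal fluid-queue sketch.
This is not the model from the paper; the rates (90 and 100 Mbit/s), the
100 ms disturbance, and the names queue_delays and step_profile are all
assumptions chosen for illustration. The backlog grows at (load - capacity)
and the delay is the backlog divided by the current capacity.

```python
def queue_delays(load, capacity, dt=0.001):
    """Fluid queue: backlog grows at (load - capacity), never below zero.

    load and capacity are equal-length lists of rates in Mbit/s, one entry
    per time step of dt seconds. Returns the queuing delay in seconds at
    each step, computed as backlog / current capacity.
    """
    backlog = 0.0  # Mbit waiting in the buffer
    delays = []
    for rate_in, rate_out in zip(load, capacity):
        backlog = max(0.0, backlog + (rate_in - rate_out) * dt)
        delays.append(backlog / rate_out)
    return delays


def step_profile(base, changed, start, length, total):
    """Rate profile equal to base, except changed for length steps from start."""
    return [changed if start <= i < start + length else base
            for i in range(total)]


STEPS = 1000  # 1 s at 1 ms resolution (illustrative)
steady_load = [90.0] * STEPS
steady_capacity = [100.0] * STEPS

# Case A: capacity halves for 100 ms, load unchanged (ratio 90/50 = 1.8).
drop = queue_delays(steady_load, step_profile(100.0, 50.0, 500, 100, STEPS))

# Case B: load doubles for 100 ms, capacity unchanged (ratio 180/100 = 1.8).
spike = queue_delays(step_profile(90.0, 180.0, 500, 100, STEPS),
                     steady_capacity)

print(f"peak delay, capacity drop: {max(drop) * 1000:.1f} ms")
print(f"peak delay, load spike:    {max(spike) * 1000:.1f} ms")
```

Both cases have the same load/capacity ratio during the disturbance, and in
this sketch both peak at about 80 ms of queuing delay, even though the
backlog in bits differs by a factor of two. No congestion controller is
modelled, which matches the situation where the change happens faster than
an end-to-end control loop can react.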

2. If you can describe how the ratio of demand to supply (or load/capacity)
changes over time (i.e., how much and how quickly it can change), then we
can use queuing theory (and/or simulations) to work out the utilization
vs. queuing delay trade-off, including transient behaviour. Handling
transients is what FQ excels at.

Because of the need for frequent link changes in the Starlink network,
there will be a need for more buffering than in your typical (relatively)
static network. Not only because the load changes quickly, but because the
capacity does as well. This causes rapid changes in the load-to-capacity
ratio, which will cause queues and/or packet loss unless it's planned
*really* well. I'm not going to say that is impossible, but it's certainly
hard.

Some queuing and deliberate under-utilization are needed to achieve reliable
QoE in a system like that.
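
As a rough illustration of the utilization vs. queuing delay trade-off, here
is a sketch using the textbook M/M/1 formula for mean waiting time,
Wq = rho / (1 - rho) * service_time. This is a steady-state model with
Poisson arrivals and exponential service times, so it is an assumption
swapped in for illustration and says nothing about transients or about how
Starlink actually behaves. The function name and the link parameters
(1500-byte packets, 100 Mbit/s) are hypothetical.

```python
def mm1_mean_queuing_delay(utilization, service_time):
    """Mean time a packet waits in an M/M/1 queue, excluding its own service.

    utilization is rho = arrival rate / service rate, and must be in [0, 1).
    service_time is the mean time to transmit one packet, in seconds.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization) * service_time


# Hypothetical link: a 1500-byte packet at 100 Mbit/s takes 0.12 ms to send.
SERVICE_TIME = 1500 * 8 / 100e6  # seconds

for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    wait_ms = mm1_mean_queuing_delay(rho, SERVICE_TIME) * 1000
    print(f"utilization {rho:.2f}: mean queuing delay {wait_ms:.2f} ms")
```

Even in this simple model the delay grows without bound as utilization
approaches 1, which is why leaving some capacity unused buys a lot of
latency headroom. A link whose capacity changes quickly would need a
transient analysis or a simulation on top of this.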

Just my two cents!

Cheers,
Bjørn Ivar Teigen

On Sat, 13 May 2023 at 12:10, Ulrich Speidel via Starlink <
starlink at lists.bufferbloat.net> wrote:

> Here's a bit of a question to you all. See what you make of it.
>
> I've been thinking a bit about the latencies we see in the Starlink
> network. This is why this list exists (right, Dave?). So what do we know?
>
> 1) We know that RTTs can be in the 100s of ms even in what appear to be
> bent-pipe scenarios where the physical one-way path should be well under
> 3000 km, with physical RTT under 20 ms.
> 2) We know from plenty of traceroutes that these RTTs accrue in the
> Starlink network, not between the Starlink handover point (POP) and the
> Internet.
> 3) We know that they aren't an artifact of the Starlink WiFi router (our
> traceroutes were done through their Ethernet adaptor, which bypasses the
> router), so they must be delays on the satellites or the teleports.
> 4) We know that processing delay isn't a huge factor because we also see
> RTTs well under 30 ms.
> 5) That leaves queuing delays.
>
> This issue has been known for a while now. Starlink have been innovating
> their heart out around pretty much everything here - and yet, this
> bufferbloat issue hasn't changed, despite Dave proposing what appears to
> be an easy fix compared to a lot of other things they have done. So what
> are we possibly missing here?
>
> Going back to first principles: The purpose of a buffer on a network
> device is to act as a shock absorber against sudden traffic bursts. If I
> want to size that buffer correctly, I need to know at the very least
> (paraphrasing queueing theory here) something about my packet arrival
> process.
>
> If I look at conventional routers, then that arrival process involves
> traffic generated by a user population that changes relatively slowly:
> WiFi users come and go. One at a time. Computers in a company get turned
> on and off and rebooted, but there are no instantaneous jumps in load -
> you don't suddenly have a hundred users in the middle of watching
> Netflix turning up that weren't there a second ago. Most of what we know
> about Internet traffic behaviour is based on this sort of network, and
> this is what we've designed our queuing systems around, right?
>
> Observation: Starlink potentially breaks that paradigm. Why? Imagine a
> satellite X handling N users that are located closely together in a
> fibre-less rural town watching a range of movies. Assume that N is
> relatively large. Say these users are currently handled through ground
> station teleport A some distance away to the west (bent pipe with
> switching or basic routing on the satellite). X is in view of both A and
> the N users, but with X being a LEO satellite, that bliss doesn't last.
> Say X is moving to the (south- or north-)east and out of A's range.
> Before connection is lost, the N users migrate simultaneously to a new
> satellite Y that has moved into view of both A and themselves. Y is
> doing so from the west and is also catering to whatever users it can see
> there, and let's suppose has been using A for a while already.
>
> The point is that the user load on X and Y from users other than our N
> friends could be quite different. E.g., one of them could be over the
> ocean with few users, the other over countryside with a lot of
> customers. The TCP stacks of our N friends are (hopefully) somewhat
> adapted to the congestion situation on X with their cwnds open to
> reasonable sizes, but they are now thrown onto a completely different
> congestion scenario on Y. Similarly, say that Y had fewer than N users
> before the handover. For existing users on Y, there is now a huge surge
> of competing traffic that wasn't there a second ago - surging far faster
> than we would expect this to happen in a conventional network because
> there is no slow start involved.
>
> This seems to explain the huge jumps you see on Starlink in TCP goodput
> over time.
>
> But could this be throwing a few spanners into the works in terms of
> queuing? Does it invalidate what we know about queues and queue
> management? Would surges like these justify larger buffers?
>
> --
> ****************************************************************
> Dr. Ulrich Speidel
>
> School of Computer Science
>
> Room 303S.594 (City Campus)
>
> The University of Auckland
> u.speidel at auckland.ac.nz
> http://www.cs.auckland.ac.nz/~ulrich/
> ****************************************************************


-- 
Bjørn Ivar Teigen, Ph.D.
Head of Research
+47 47335952 | bjorn at domos.ai | www.domos.ai