From: Ulrich Speidel <u.speidel@auckland.ac.nz>
To: David Lang <david@lang.hm>
Cc: "starlink@lists.bufferbloat.net" <starlink@lists.bufferbloat.net>
Subject: Re: [Starlink] Starlink hidden buffers
Date: Sun, 14 May 2023 18:06:42 +1200
Message-ID: <48b00469-0dbb-54c4-bedb-3aecbf714a1a@auckland.ac.nz>
In-Reply-To: <0no84q43-s4n6-45n8-50or-12o3rq104n99@ynat.uz>
On 14/05/2023 10:57 am, David Lang wrote:
> On Sat, 13 May 2023, Ulrich Speidel via Starlink wrote:
>
>> Here's a bit of a question to you all. See what you make of it.
>>
>> I've been thinking a bit about the latencies we see in the Starlink
>> network. This is why this list exists (right, Dave?). So what do we know?
>>
>> 1) We know that RTTs can be in the hundreds of ms even in what appear
>> to be bent-pipe scenarios where the physical one-way path should be
>> well under 3000 km, with a physical RTT under 20 ms.
>> 2) We know from plenty of traceroutes that these RTTs accrue inside
>> the Starlink network, not between the Starlink handover point (POP)
>> and the Internet.
>> 3) We know that they aren't an artifact of the Starlink WiFi router
>> (our traceroutes were done through their Ethernet adapter, which
>> bypasses the router), so they must be delays on the satellites or at
>> the teleports.
>
> The Ethernet adapter bypasses the WiFi, but not the router; you have
> to cut the cable and replace the plug to bypass the router.
Good point - but you still don't get the WiFi buffering here. Or at
least we don't seem to, looking at the difference between running with
and without the adapter.
>
>> 4) We know that processing delay isn't a huge factor because we also
>> see RTTs well under 30 ms.
>> 5) That leaves queuing delays.
>>
>> This issue has been known for a while now. Starlink have been
>> innovating their heart out around pretty much everything here - and
>> yet, this bufferbloat issue hasn't changed, despite Dave proposing
>> what appears to be an easy fix compared to a lot of other things they
>> have done. So what are we possibly missing here?
>>
>> Going back to first principles: The purpose of a buffer on a network
>> device is to act as a shock absorber against sudden traffic bursts.
>> If I want to size that buffer correctly, I need to know at the very
>> least (paraphrasing queueing theory here) something about my packet
>> arrival process.
>
> The question is over what timeframe. If you have a huge buffer, you
> can buffer tens of seconds of traffic and eventually send it. That
> will make benchmarks look good, but not the user experience. The free
> fall in RAM prices and benchmark scores that heavily penalized any
> dropped packets encouraged buffers to grow larger than is sane.
>
> It's still a good question to define what is sane: the longer the
> buffer, the more chance of finding time to catch up, but having
> packets in the buffer that have already timed out (e.g. DNS queries
> tend to time out after 3 seconds, and TCP will give up and send
> replacement packets, making the initial packets meaningless) is
> counterproductive. What is the acceptable delay to your users?
>
> Here at the bufferbloat project, we tend to say that buffers past a
> few tens of ms worth of traffic are probably bad, and we are aiming
> for single-digit ms in many cases.
Taken as read.
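For concreteness, here's a quick back-of-envelope in Python - the rates
and delay targets below are illustrative assumptions, not anything
measured on Starlink - showing what a delay target means in buffer bytes:

# Back-of-envelope only: how many bytes of FIFO buffer drain within a
# given queueing-delay target at a given bottleneck rate. Rates and
# targets are illustrative assumptions, not Starlink figures.

def buffer_bytes(rate_bits_per_s: float, delay_target_s: float) -> float:
    """Bytes that can drain in delay_target_s at the given link rate."""
    return rate_bits_per_s * delay_target_s / 8

for rate_mbps in (50, 100, 200):
    for target_ms in (5, 20, 200):
        b = buffer_bytes(rate_mbps * 1e6, target_ms / 1e3)
        print(f"{rate_mbps:3d} Mbit/s, {target_ms:3d} ms target -> {b/1e3:7.0f} kB")

So a single-digit-ms target at 100 Mbit/s is on the order of 100 kB or
less, versus several MB for the multi-hundred-ms delays we're seeing.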
>
>> If I look at conventional routers, then that arrival process involves
>> traffic generated by a user population that changes relatively
>> slowly: WiFi users come and go, one at a time. Computers in a company
>> get turned on and off and rebooted, but there are no instantaneous
>> jumps in load - you don't suddenly have a hundred users in the middle
>> of watching Netflix turn up who weren't there a second ago. Most of
>> what we know about Internet traffic behaviour is based on this sort
>> of network, and this is what we've designed our queuing systems
>> around, right?
>
> Not true - for businesses, every hour as meetings start and let out,
> and as people arrive in the morning or come back from lunch, you see
> very sharp changes in the traffic.
And herein lies the crunch: all of the things you list happen over much
longer timeframes than a switch to a different satellite. Also, folk
coming back from lunch would start with something like cwnd=10, whereas
users whose TCP connections get switched over to a different satellite
by some underlying tunnelling protocol could arrive with a much larger
cwnd.
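To illustrate the difference - purely invented numbers, and assuming the
switched flow keeps its old cwnd for a moment:

# Sketch of the point above, with invented numbers: a flow whose cwnd
# grew on one path is moved mid-flight to a path with a smaller
# bandwidth-delay product; the excess in-flight bytes can only sit in
# a buffer until the sender backs off.

def standing_queue_ms(cwnd_bytes, new_rate_bps, new_rtt_s):
    """Extra delay from bytes in flight beyond the new path's BDP."""
    bdp_bytes = new_rate_bps * new_rtt_s / 8
    excess = max(0.0, cwnd_bytes - bdp_bytes)
    return 1e3 * excess * 8 / new_rate_bps

# Fresh flow after lunch: cwnd = 10 * 1500 B = 15 kB -> no standing queue.
print(standing_queue_ms(15_000, 100e6, 0.020))      # 0.0 ms
# Established flow moved to a (hypothetical) 50 Mbit/s, 20 ms path with
# 2 MB still in flight:
print(standing_queue_ms(2_000_000, 50e6, 0.020))    # ~300 ms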
>
> at home you have fewer changes in users, but you may also have less
> bandwidth (although many tech enthusiasts have more bandwidth than
> many companies; two of my last three jobs have had <400 Mb at their
> main office with hundreds of employees, while many people would
> consider that 'slow' for home use). As such, a parent arriving home
> with a couple of kids will make a drastic change to the network usage
> in a very short time.
I think you've missed my point - I'm talking about changes in the
network mid-flight, not people coming home and getting started over a
period of a few minutes. The change you see in a handover is sudden,
probably with a sub-second ramp-up. And it's something that doesn't just
happen when people come home or return from lunch - it happens every few
minutes.
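If you want to look for this in your own traces, something along these
lines is enough to pick out the sudden steps (hypothetical trace format:
timestamped RTT samples from ping, not a tool we actually use):

# Rough illustration only: flag consecutive RTT samples that jump by
# more than a threshold, as a crude stand-in for "a handover just
# happened". Input is assumed to be (unix_time, rtt_ms) pairs.

def find_rtt_steps(samples, threshold_ms=20.0):
    """Return (time, delta_ms) wherever the RTT jumps by > threshold_ms."""
    steps = []
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        if abs(r1 - r0) > threshold_ms:
            steps.append((t1, r1 - r0))
    return steps

samples = [(0.0, 25.0), (1.0, 27.0), (2.0, 80.0), (3.0, 78.0), (4.0, 30.0)]
print(find_rtt_steps(samples))   # [(2.0, 53.0), (4.0, -48.0)]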
>
>
> but the active queueing systems that we are designing (cake, fq_codel)
> handle these conditions very well because they don't try to guess what
> the usage is going to be, they just look at the packets that they have
> to process and figure out how to dispatch them in the best way.
Understood - I've followed your work.
>
> because we have observed that latency tends to be more noticeable for
> short connections (DNS, checking if cached web pages are up to date,
> etc.), our algorithms give a slight priority to new, low-traffic
> connections over long-running, high-traffic connections rather than
> just splitting the bandwidth evenly across all connections, and can
> even go further and split bandwidth between endpoints, not just
> connections (with "endpoint" being a configurable definition)
>
> without active queue management, the default is FIFO, which allows the
> high-user-impact, short-connection packets to sit in a queue behind
> the low-user-impact bulk data transfers. For benchmarks,
> a-packet-is-a-packet and they all count, so until you have enough
> buffering that you start having expired packets in flight, it doesn't
> matter, but for the user experience there can be a huge difference.
All understood - you're preaching to the converted. It's just that I
think Starlink may be playing in a different ballpark.
Put another way: if a protocol (TCP) that reasonably expects its current
cwnd to remain usable for now is put into a situation where there are
relatively frequent, huge and lasting step changes in available BDP
within sub-second periods, are your underlying assumptions still valid?
I suspect they're handing over whole cells at a time, not individual
users - a rough back-of-envelope for that case is below.
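Here's what I mean, again with entirely invented parameters (number of
flows, per-flow cwnd, link rate - none of these are Starlink figures):

# Toy back-of-envelope, all parameters assumed for illustration: N
# established flows, each keeping its old cwnd, get shifted onto one
# downlink at the same instant. With a plain FIFO, the standing queue is
# the summed excess over the new path's BDP.

N_FLOWS    = 50         # flows handed over together (assumption)
CWND_BYTES = 500_000    # in-flight data per flow before the switch
LINK_BPS   = 200e6      # new downlink capacity in bits/s (assumption)
BASE_RTT_S = 0.020      # physical RTT on the new path

bdp_bytes   = LINK_BPS * BASE_RTT_S / 8
inflight    = N_FLOWS * CWND_BYTES
queue_bytes = max(0.0, inflight - bdp_bytes)

print(f"BDP on new path: {bdp_bytes / 1e6:.2f} MB")
print(f"arriving burst : {inflight / 1e6:.2f} MB")
print(f"standing queue : {queue_bytes / 1e6:.2f} MB "
      f"(~{1e3 * queue_bytes * 8 / LINK_BPS:.0f} ms of extra delay)")

Even with fairly generous assumptions, that's most of a second of queue
from a single handover event - the same order as the hundreds-of-ms RTTs
mentioned at the top.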
>
> David Lang
>
--
****************************************************************
Dr. Ulrich Speidel
School of Computer Science
Room 303S.594 (City Campus)
The University of Auckland
u.speidel@auckland.ac.nz
http://www.cs.auckland.ac.nz/~ulrich/
****************************************************************