[Starlink] [NNagain] When Flows Collide?

Fri Mar 8 21:57:04 EST 2024

This is what the bufferbloat effort has been fighting. The default assumption 
was that 'data is important, don't throw it away, hang on to it and send it 
later'.

In practice, this has proven to be suboptimal: the buffers grew large enough 
that the data being buffered was retransmitted anyway (among other problems).

And because the data was buffered, new data arriving was delayed behind the 
buffered data.
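
As a rough illustration of that delay, here is a minimal sketch assuming a 
single FIFO buffer draining at a fixed link rate. The function name and the 
numbers are hypothetical and are not taken from any measurement in this thread.

```python
def queue_delay_ms(buffered_bytes: int, link_bits_per_sec: float) -> float:
    """Time a newly arrived packet waits behind data already in a FIFO buffer.

    Assumes a single queue drained at a constant rate (a simplification).
    """
    return buffered_bytes * 8 * 1000 / link_bits_per_sec


# Hypothetical example: 1 MB already buffered on a 10 Mbit/s uplink.
print(queue_delay_ms(1_000_000, 10_000_000))  # 800.0
```

Under these assumptions a DNS lookup arriving behind that buffer waits most of 
a second before it is even transmitted.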

This is measurable as 'latency under load' for light connections. So while 
latency isn't everything, it turns out to be a good proxy for detecting when the 
standard queuing mechanisms are failing to give you good performance.
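
One simple way to put a number on this, sketched here under the assumption that 
you already have round-trip-time samples taken while the link is idle and while 
it is saturated. The function name and the sample values are made up for 
illustration.

```python
from statistics import median


def latency_increase_ms(idle_rtts_ms, loaded_rtts_ms):
    """Median RTT under load minus median RTT at idle, in milliseconds."""
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


# Hypothetical samples: RTTs of a light flow without and with a bulk transfer.
idle = [20, 21, 19, 22, 20]
loaded = [180, 220, 200, 210, 190]
print(latency_increase_ms(idle, loaded))  # 180
```

A large difference suggests that the light flow is queuing behind bulk data 
somewhere on the path.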

It turns out that not all data is equally important. Active Queue Management is 
the art of deciding priorities, both in deciding what data to throw away and in 
allowing some later-arriving data to be transmitted ahead of data from another 
connection that arrived before it.

With fq_codel and cake, this involves tracking the different connections and 
their behavior. Connections that send relatively little data (DNS lookups, video 
chat) have priority over connections that send a lot of data (ISO downloads), 
not based on classifying the data, but by watching the behavior.

Connections with a lot of data can buffer a bit, but aren't allowed to use all 
the available buffer space. After they have used 'their share', packets get 
marked/dropped to signal the sender to slow down.
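
The behavior described above can be sketched as a toy scheduler. This is not 
the fq_codel or cake code. It is a deliberately simplified illustration with 
hypothetical names (SparseFirstScheduler, per_flow_limit). It assumes packets 
arrive already labeled with a flow id, and it uses a fixed per-flow packet cap 
as a stand-in for CoDel, which in the real algorithm decides marks/drops from 
how long packets have sat in the queue.

```python
from collections import deque


class SparseFirstScheduler:
    """Toy flow-queuing scheduler: per-flow queues, newly active flows first."""

    def __init__(self, quantum=1500, per_flow_limit=10):
        self.quantum = quantum                # bytes of credit per round
        self.per_flow_limit = per_flow_limit  # stand-in for CoDel's drop logic
        self.queues = {}                      # flow_id -> deque of packet sizes
        self.deficit = {}                     # flow_id -> remaining byte credit
        self.new_flows = deque()              # flows that just became active
        self.old_flows = deque()              # flows that used up their credit
        self.dropped = 0

    def enqueue(self, flow_id, size):
        q = self.queues.setdefault(flow_id, deque())
        if len(q) >= self.per_flow_limit:
            self.dropped += 1                 # heavy flow exceeded 'its share'
            return False
        q.append(size)
        if flow_id not in self.new_flows and flow_id not in self.old_flows:
            self.deficit[flow_id] = self.quantum
            self.new_flows.append(flow_id)
        return True

    def dequeue(self):
        while self.new_flows or self.old_flows:
            active = self.new_flows if self.new_flows else self.old_flows
            flow_id = active[0]
            q = self.queues[flow_id]
            if not q:
                active.popleft()              # idle; re-enters as new later
                continue
            if self.deficit[flow_id] < q[0]:
                # Out of credit: refill and go to the back of the old list.
                self.deficit[flow_id] += self.quantum
                active.popleft()
                self.old_flows.append(flow_id)
                continue
            size = q.popleft()
            self.deficit[flow_id] -= size
            return flow_id, size
        return None


# Hypothetical traffic: a bulk flow queues five packets, then a DNS lookup arrives.
sched = SparseFirstScheduler()
for _ in range(5):
    sched.enqueue("bulk", 1500)
sched.enqueue("dns", 100)
order = []
while (pkt := sched.dequeue()) is not None:
    order.append(pkt[0])
print(order)  # ['bulk', 'dns', 'bulk', 'bulk', 'bulk', 'bulk']
```

The DNS packet goes out second even though four bulk packets arrived before it, 
and nothing in the code looks at what kind of data either flow carries.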

While it is possible for an implementation to 'cheat' by detecting the latency 
probes and prioritizing them, measuring the latency on real data works as well, 
and an implementation can't cheat on that without actually addressing the 
problem.

That's why you see such a significant focus on latency in this group. It's not 
latency for the sake of latency; it's latency as a sign that new and sparse 
flows can get a reasonable share of bandwidth even in the face of heavy/hostile 
users on the same links.

David Lang

On Fri, 8 Mar 2024, Jack Haverty via Nnagain wrote:

> Date: Fri, 8 Mar 2024 15:44:05 -0800
> From: Jack Haverty via Nnagain <nnagain at lists.bufferbloat.net>
> To: nnagain at lists.bufferbloat.net, Starlink at lists.bufferbloat.net
> Cc: Jack Haverty <jack at 3kitty.org>
> Subject: [NNagain] When Flows Collide?
> 
> It's great to see that latency is getting attention as well as action to 
> control it.  But it's only part of the bigger picture of Internet 
> performance.
>
> While performance across a particular network is interesting, most uses of 
> the Internet involve data flowing through several separate networks.  That's 
> pretty much the definition of "Internet".  The endpoints might be some kind 
> of LAN in a home or corporate IT facility or public venue.   In between there 
> might be fiber, radio, satellite, or other (even whimsically avian!?) 
> networks carrying a user's data.   This kind of system configuration has 
> existed since the genesis of The Internet and seems likely to continue. 
> Technology has advanced a lot, with bigger and bigger "pipes" invented to 
> carry more data, but fundamental issues remain.
>
> System configurations we used in the early research days were real 
> experiments to be measured and tested, or often just "thought experiments" to 
> imagine how the system would behave, what algorithms would be appropriate, 
> and what protocols had to exist to coordinate the activities of all the 
> components.
>
> One such configuration was very simple.  Imagine there are three very fast 
> computers, each attached to a very fast LAN.   The computers and LAN can send 
> and receive data as fast as you can imagine, so that they are not a limiting 
> factor.   The LANs are attached to some "ISP" which isn't as fast (in 
> bandwidth or latency) as a LAN.  ISPs are interconnected at various points, 
> forming a somewhat rich mesh of topology with several, or many, possible 
> routes from any source to any destination.
>
> Now imagine a user configuration in which two of the computers send a 
> constant stream of data to the third computer at a predefined rate.  Perhaps 
> it is a UDP datagram every N milliseconds, each datagram containing a frame 
> of video.  If N=20 it corresponds to a 50Hz frame rate, which is common for 
> video.
>
> Somewhere along the way to that common destination, those two data streams 
> collide, and there is a bottleneck.   All the data coming in cannot fit in 
> the pipe going out.  Something has to give.
>
> Thought experiment -- What should happen?  Does the bottleneck discard 
> datagrams it can't handle?  How does it decide which ones to discard?   Does 
> the bottleneck buffer the excess datagrams, hoping that the situation is just 
> temporary?   Does the bottleneck somehow signal back to the sources to reduce 
> their data rate?  Does the bottleneck discard datagrams that it knows won't 
> reach the destination in time to be useful?  Does the bottleneck trigger some 
> kind of network reconfiguration, perhaps to route "low priority" data along 
> some alternate path to free up capacity for the video streams that require 
> low latency?
>
> Real experiment -- set up such a configuration and observe what happens, 
> especially from the end-users' perspectives.  What kind of video does the 
> end-user see?
>
> Second thought experiment -- Using the same configuration, send data using 
> TCP instead of UDP.  This adds more mechanisms, but now in the end-users' 
> computers.  How should the ISPs and TCPs involved behave?  How should they 
> cooperate?  What should happen?  What mechanisms (algorithms, protocols, 
> etc.) are needed to make the system behave that way?
>
> Second real Experiment -- How do the specific TCP implementations actually 
> behave?  What kind of video quality do the end users experience?  What kind 
> of data flows actually travel through the network components?
>
> Of course we all observe such real experiments every day, whenever we see or 
> participate in various kinds of videoconferences.  Perhaps someone has 
> instrumented and gathered performance data...?
>
> These questions were discussed and debated at great length more than 40 years 
> ago as TCP V4 was designed.  We couldn't figure out the appropriate 
> algorithms and protocols, and didn't have computer equipment or 
> communications capabilities to implement anything more than the simplest 
> mechanisms anyway.   So the topic became an item on the "future study" list.
>
> But we did put various "placeholder" mechanisms in place in TCP/IP V4, as a 
> reminder that a "real" solution was needed for some future next generation 
> release.  Time-to-live (TTL) would likely need to be based on actual time 
> instead of hops - which were silly but the best we could do with available 
> equipment at the time.  Source Quench (SQ) needed to be replaced by a more 
> effective mechanism, and include details of how all the components should act 
> when sending or receiving an SQ.   Routing needed to be expanded to add the 
> ability to send different data flows over different routes, so that bulk and 
> interactive data could more readily coexist.   Lots of such issues to be 
> resolved.
>
> In the meanwhile, the general consensus was that everything would work OK as 
> long as the traffic flows only rarely created "bottleneck" situations, and 
> such events would be short and transitory.   There wasn't a lot of data flow 
> yet; the Internet was still an Experiment.  We figured we'd be OK for a while 
> as the research continued and found solutions.
>
> Meanwhile, the Web happened.  Videoconferencing, vlogs, and other generators 
> of high traffic exploded.  Clouds have formed, with users now interacting 
> with very remote computers instead of the ones on their desks or down the 
> hall.
>
> As Dorothy would say, "We're not in Kansas anymore".
>
> Jack Haverty
>
> On 3/8/24 12:31, Dave Taht via Nnagain wrote:
>> I am deeply appreciative of everyone's efforts here over the past 3
>> years, and within starlink burning the midnight oil on their 20ms
>> goal, (especially nathan!!!!) to make all the progress made on their
>> systems in these past few months. I was so happy to burn about 12
>> minutes, publicly, taking apart Oleg's results here, last week:
>> 
>> https://www.youtube.com/watch?v=N0Tmvv5jJKs&t=1760s
>> 
>> But couldn't then and still can't talk better to the whys and the
>> problems remaining. (It's not a kernel problem, actually)
>> 
>> As for starlink/space support of us, bufferbloat.net, and/or lowering
>> latency across the internet in general, I don't know. I keep hoping a
>> used tesla motor for my boat will arrive in the mail one day, that's
>> all. :)
>> 
>> It is my larger hope that with this news, all the others doing FWA,
>> and for that matter, cable, and fiber, will also get on the stick,
>> finally. Maybe someone in the press will explain bufferbloat. Who
>> knows what the coming days hold!?
>> 
>> 13 herbs and spices....
>> 
>> On Fri, Mar 8, 2024 at 3:10 PM the keyboard of geoff goodfellow via
>> Starlink<starlink at lists.bufferbloat.net>  wrote:
>>> it would be a super good and appreciative gesture if they would disclose 
>>> what/if any of the stuff they are making use of and then also to make a 
>>> donation :)
>>> 
>>> On Fri, Mar 8, 2024 at 12:50 PM J Pan<Pan at uvic.ca>  wrote:
>>>> they benefited a lot from this mailing list and the research and even
>>>> user community at large
>>>> --
>>>> J Pan, UVic CSc, ECS566, 250-472-5796 (NO VM),Pan at UVic.CA, 
>>>> Web.UVic.CA/~pan
>>>> 
>>>> 
>>>> On Fri, Mar 8, 2024 at 11:40 AM the keyboard of geoff goodfellow via
>>>> Starlink<starlink at lists.bufferbloat.net>  wrote:
>>>>> Super excited to be able to share some of what we have been working on 
>>>>> over the last few months!
>>>>> EXCERPT:
>>>>> 
>>>>> Starlink engineering teams have been focused on improving the 
>>>>> performance of our network with the goal of delivering a service with 
>>>>> stable 20 millisecond (ms) median latency and minimal packet loss.
>>>>> 
>>>>> Over the past month, we have meaningfully reduced median and worst-case 
>>>>> latency for users around the world. In the United States alone, we 
>>>>> reduced median latency by more than 30%, from 48.5ms to 33ms during 
>>>>> hours of peak usage. Worst-case peak hour latency (p99) has dropped by 
>>>>> over 60%, from over 150ms to less than 65ms. Outside of the United 
>>>>> States, we have also reduced median latency by up to 25% and worst-case 
>>>>> latencies by up to 35%...
>>>>> 
>>>>> [...]
>>>>> https://api.starlink.com/public-files/StarlinkLatency.pdf
>>>>> via
>>>>> https://twitter.com/Starlink/status/1766179308887028005
>>>>> &
>>>>> https://twitter.com/VirtuallyNathan/status/1766179789927522460
>>>>> 
>>>>> 
>>>>> --
>>>>> Geoff.Goodfellow at iconia.com
>>>>> living as The Truth is True
>>>>> 
>>>>> _______________________________________________
>>>>> Starlink mailing list
>>>>> Starlink at lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/starlink
>>> 
>>> --
>>> Geoff.Goodfellow at iconia.com
>>> living as The Truth is True
>>> 
>> 
>> 
>
>
_______________________________________________
Nnagain mailing list
Nnagain at lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/nnagain