[Starlink] It's still the starlink latency...

Eugene Y Chang eugene.chang at ieee.org
Tue Sep 27 00:06:17 EDT 2022


> Speedtest also does nothing to measure how well a given
> videoconference or voip session might go. There isn't a test (at least
> not when last I looked) in the FCC broadband measurements for just
> videoconferencing, and their latency under load test for many years
> now, is buried deep in the annual report.


> I have a new one - prototyped in some starlink tests so far, and
> elsewhere - called "SPOM" - steady packets over milliseconds, which,
> when run simultaneously with capacity seeking traffic, might be a
> better predictor of videoconferencing performance.


The challenge is that most people cannot observe when network behavior is causing problems. Even in a videoconference, usually only one person speaks at a time, so timing problems are invisible. Since 2020, I have been using Jamulus, an app that lets 2 to 100+ users (typically 2 to 30) play music together in real time. The participants can easily hear the delay and the fluctuation of the delay. Steady, reasonable latency (<60 ms) is manageable. High, fluctuating latency is disruptive. (And yes, all the traffic is time-sensitive UDP.)

SPOM is a good start. I would be interested in SPOM measurements taken while the videoconference is stuttering or hanging. Note that often only one user in the conference is affected. If SPOM is not running on that affected user’s link, we probably won’t get the interesting data.
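
To make "steady versus fluctuating" concrete, here is a minimal sketch of how per-packet delays from an evenly paced probe stream might be summarized. This is not the SPOM tool itself; the function name, the nearest-rank percentile, and the sample values are my own assumptions for illustration.

```python
import math
import statistics


def summarize_delays(delays_ms):
    """Summarize per-packet delays (ms), given in arrival order."""
    if len(delays_ms) < 2:
        raise ValueError("need at least two delay samples")
    ordered = sorted(delays_ms)
    # Nearest-rank 99th percentile: the smallest sample that is >= 99% of all samples.
    rank = max(1, math.ceil(0.99 * len(ordered)))
    # Jitter here means the mean absolute change between consecutive packets.
    steps = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return {
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
        "jitter": statistics.fmean(steps),
    }


steady = summarize_delays([40] * 10)              # hypothetical quiet link
bursty = summarize_delays([40, 40, 200, 40, 40])  # hypothetical single spike
print(steady)
print(bursty)
```

Both sample streams have the same median, but only the second has a large jitter and maximum, which is the difference musicians hear and a median-only report hides.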


Gene
----------------------------------------------
Eugene Chang
IEEE Senior Life Member
eugene.chang at ieee.org
781-799-0233 (in Honolulu)



> On Sep 26, 2022, at 2:35 PM, Dave Taht <dave.taht at gmail.com> wrote:
> 
> On Mon, Sep 26, 2022 at 2:45 PM Bruce Perens via Starlink
> <starlink at lists.bufferbloat.net> wrote:
>> 
>> That's a good maxim: Don't believe a speed test that is hosted by your own ISP.
> 
> A network designed for speedtest.net is a network... designed for
> speedtest. Starlink seemingly was designed for speedtest - the 15
> second "cycle" to sense/change their bandwidth setting is just within
> the 20s cycle speedtest terminates at, and speedtest returns the last
> number for the bandwidth. It is a brutal test - using 8 or more flows
> - much harder on the network than your typical web page load which,
> while that is often 15 or so, most never run long enough to get out of
> slow start. At least some of qualifying for the RDOF money was
> achieving 100mbits down on "speedtest".
> 
> A knowledgeable user concerned about web PLT should be looking at the
> first 3 s of a given test, and even then once the bandwidth cracks
> 20Mbit, it's of no help for most web traffic ( we've been citing mike
> belshe's original work here a lot,
> and more recent measurements still show that )
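
A rough sketch of the slow-start arithmetic behind this point. The assumptions are mine and idealized: an initial window of 10 segments, a 1460-byte MSS, the window doubling every round trip, no loss, and no handshake or server think time.

```python
def slow_start_rounds(transfer_bytes, init_cwnd_segments=10, mss=1460):
    """Round trips needed to deliver transfer_bytes while the window doubles each RTT."""
    cwnd = init_cwnd_segments * mss  # bytes that may be sent in the first round trip
    sent = 0
    rounds = 0
    while sent < transfer_bytes:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds


def slow_start_time_ms(transfer_bytes, rtt_ms):
    """Lower bound on transfer time when round trips, not link rate, are the limit."""
    return slow_start_rounds(transfer_bytes) * rtt_ms


small_object = slow_start_rounds(100_000)    # a 100 kB object
large_object = slow_start_rounds(1_000_000)  # a 1 MB object
print(small_object, large_object)
print(slow_start_time_ms(100_000, rtt_ms=50))
```

Under these assumptions the number of round trips depends only on the object size, so once the link is fast enough to carry each window, extra bandwidth does not shorten the transfer; a lower RTT does.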
> 
> Speedtest also does nothing to measure how well a given
> videoconference or voip session might go. There isn't a test (at least
> not when last I looked) in the FCC broadband measurements for just
> videoconferencing, and their latency under load test for many years
> now, is buried deep in the annual report.
> 
> I hope that with both ookla and samknows more publicly recording and
> displaying latency under load (still, sigh, I think only displaying
> the last number and only sampling every 250ms) that we can shift the
> needle on this, but I started off this thread complaining nobody was
> picking up on those numbers... and neither service tests the worst
> case scenario of a simultaneous up/download, which was the principal
> scenario we explored with the flent "rrul" series of tests, which were
> originally designed to emulate and deeply understand what bittorrent
> was doing to networks, and our principal tool in designing new fq and
> aqm and transport CCs, along with the rtt_fair test for testing near
> and far destinations at the same time.
> 
> My model has always been a family of four, one person uploading,
> another doing web, one doing videoconferencing,
> and another doing voip or gaming, and no test anyone has emulates
> that. With 16 wifi devices
> per household, the rrul scenario is actually not "worst case", but
> increasingly the state of things "normally".
> 
> Another irony about speedtest is that users are inspired^Wtrained to
> use it when the "network feels slow", and self-initiate something that
> makes it worse, for both them and their portion of the network.
> 
> Since the internet architecture board met last year, (
> https://www.iab.org/activities/workshops/network-quality/ ) there
> seems to be an increasing amount of work on better metrics and tests
> for QoE, with stuff like apple's responsiveness test, etc.
> 
> I have a new one - prototyped in some starlink tests so far, and
> elsewhere - called "SPOM" - steady packets over milliseconds, which,
> when run simultaneously with capacity seeking traffic, might be a
> better predictor of videoconferencing performance.
> 
> There's also a really good "P99" conference coming up for those, that
> like me, are OCD about a few sigmas.
> 
>> 
>> On Mon, Sep 26, 2022 at 2:36 PM Eugene Y Chang via Starlink <starlink at lists.bufferbloat.net> wrote:
>>> 
>>> Thank you for the dialog.
>>> This discussion with regards to Starlink is interesting as it confirms my guesses about the gap between Starlink's overly simplified, over-optimistic marketing and the reality as they acquire subscribers.
>>> 
>>> I am actually interested in a more perverse issue. I am seeing latency and bufferbloat as a consequence of significant under-provisioning. It doesn’t matter that the ISP is selling a fiber drop if parts of their network are under-provisioned. Two end points can be less than 5 miles apart and see 120+ ms latency. Two Labor Days ago (a holiday) the max latency was 230+ ms. The pattern I see suggests digital redlining. The older communities appear to have much more severe under-provisioning.
>>> 
>>> Another observation. Running speedtest appears to go from the edge of the network by layer 2 to the speedtest host operated by the ISP. Yup, bypasses the (suspected overloaded) routers.
>>> 
>>> Anyway, just observing.
>>> 
>>> Gene
>>> ----------------------------------------------
>>> Eugene Chang
>>> IEEE Senior Life Member
>>> eugene.chang at ieee.org
>>> 781-799-0233 (in Honolulu)
>>> 
>>> 
>>> 
>>> On Sep 26, 2022, at 11:20 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>> 
>>> Hi Gene,
>>> 
>>> 
>>> On Sep 26, 2022, at 23:10, Eugene Y Chang <eugene.chang at ieee.org> wrote:
>>> 
>>> Comments inline below.
>>> 
>>> Gene
>>> ----------------------------------------------
>>> Eugene Chang
>>> IEEE Senior Life Member
>>> eugene.chang at ieee.org
>>> 781-799-0233 (in Honolulu)
>>> 
>>> 
>>> 
>>> On Sep 26, 2022, at 11:01 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>> 
>>> Hi Eugene,
>>> 
>>> 
>>> On Sep 26, 2022, at 22:54, Eugene Y Chang via Starlink <starlink at lists.bufferbloat.net> wrote:
>>> 
>>> Ok, we are getting into the details. I agree.
>>> 
>>> Every node in the path has to implement this to be effective.
>>> 
>>> 
>>> Amazingly, the biggest bang for the buck comes from fixing those nodes that actually contain a network path's bottleneck. Often these are pretty stable. So yes, for fully guaranteed service quality all nodes would need to participate, but for improving things noticeably it is sufficient to improve the usual bottlenecks, e.g. for many internet access links the home gateway is a decent point to implement better buffer management. (In short, the problem is over-sized and under-managed buffers, and one of the best solutions is better/smarter buffer management.)
>>> 
>>> 
>>> This is not completely true.
>>> 
>>> 
>>> [SM] You are likely right, trying to summarize things leads to partially incorrect generalizations.
>>> 
>>> 
>>> Say the bottleneck is at node N. During the period of congestion, the upstream node N-1 will have to buffer. When node N recovers, the bufferbloat at N-1 will be blocking until the bufferbloat drains. Etc. etc.  Making node N better will reduce the extent of the backup at N-1, but N-1 should implement the better code.
>>> 
>>> 
>>> [SM] It is the node that builds up the queue that profits most from better queue management.... (again I generalize, the node with the queue itself probably does not care all that much, but the endpoints will profit if the queue experiencing node deals with that queue more gracefully).
>>> 
>>> 
>>> 
>>> 
>>> 
>>> In fact, every node in the path has to have the same prioritization or the scheme becomes ineffective.
>>> 
>>> 
>>> Yes and no. One of the clearest winners has been flow queueing, IMHO not because it is the most optimal capacity sharing scheme, but because it is the least pessimal scheme, allowing all flows (or none) to make forward progress. You can interpret that as a scheme in which flows below their capacity share are prioritized, but I am not sure that is the best way to look at these things.
>>> 
>>> 
>>> The hardest part is getting competing ISPs to implement and coordinate.
>>> 
>>> 
>>> [SM] Yes, but it turned out even with non-cooperating ISPs there is a lot end-users can do unilaterally on their side to improve both ingress and egress congestion. Admittedly especially ingress congestion would be even better handled with cooperation of the ISP.
>>> 
>>> Bufferbloat and handoff between ISPs will be hard. The only way to fix this is to get the unwashed public to care. Then they can say “we don’t care about the technical issues, just fix it.” Until then …..
>>> 
>>> 
>>> [SM] Well, we do this one home network at a time (not because that is efficient or ideal, but simply because it is possible). Maybe, if you have not done so already, try OpenWrt with sqm-scripts (and maybe cake-autorate in addition) on your home internet access link for, say, a week and let us know if/how your experience changed?
>>> 
>>> Regards
>>> Sebastian
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Regards
>>> Sebastian
>>> 
>>> 
>>> 
>>> Gene
>>> ----------------------------------------------
>>> Eugene Chang
>>> IEEE Senior Life Member
>>> eugene.chang at ieee.org
>>> 781-799-0233 (in Honolulu)
>>> 
>>> 
>>> 
>>> On Sep 26, 2022, at 10:48 AM, David Lang <david at lang.hm> wrote:
>>> 
>>> software updates can do far more than just improve recovery.
>>> 
>>> In practice, large data transfers are less sensitive to latency than smaller data transfers (i.e. downloading a CD image vs a video conference), software can ensure better fairness in preventing a bulk transfer from hurting the more latency sensitive transfers.
>>> 
>>> (the example below is not completely accurate, but I think it gets the point across)
>>> 
>>> When buffers become excessively large, you have the situation where a video call is going to generate a small amount of data at a regular interval, but a bulk data transfer is able to dump a huge amount of data into the buffer instantly.
>>> 
>>> If you just do FIFO, then you get a small chunk of video call, then several seconds worth of CD transfer, followed by the next small chunk of the video call.
>>> 
>>> But the software can prevent the one app from hogging so much of the connection and let the chunk of video call in sooner, avoiding the impact to the real time traffic. Historically this has required the admin classify all traffic and configure equipment to implement different treatment based on the classification (and this requires trust in the classification process), the bufferbloat team has developed options (fq_codel and cake) that can ensure fairness between applications/servers with little or no configuration, and no trust in other systems to properly classify their traffic.
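
The FIFO versus fair-sharing contrast above can be sketched with a toy simulation. To be clear about the assumptions: this is plain two-flow round robin, not fq_codel or cake, the link sends one packet per tick, and the burst size and video interval are made-up numbers.

```python
from collections import deque


def video_delays(scheduler, ticks=300, bulk_burst=100, video_interval=10):
    """Queueing delay (in ticks) of each video packet on a 1-packet-per-tick link."""
    fifo = deque()                                  # single shared queue
    per_flow = {"bulk": deque(), "video": deque()}  # one queue per flow
    last = "video"                                  # flow served most recently
    delays = []
    for now in range(ticks):
        # Arrivals: the bulk flow dumps its burst at tick 0, video is evenly paced.
        arrivals = [("bulk", now)] * bulk_burst if now == 0 else []
        if now % video_interval == 0:
            arrivals.append(("video", now))
        for flow, arrived in arrivals:
            if scheduler == "fifo":
                fifo.append((flow, arrived))
            else:
                per_flow[flow].append(arrived)
        # Departure: at most one packet leaves per tick.
        if scheduler == "fifo":
            if not fifo:
                continue
            flow, arrived = fifo.popleft()
        else:
            other = "bulk" if last == "video" else "video"
            flow = other if per_flow[other] else last  # alternate when possible
            if not per_flow[flow]:
                continue
            arrived = per_flow[flow].popleft()
            last = flow
        if flow == "video":
            delays.append(now - arrived)
    return delays


fifo_worst = max(video_delays("fifo"))
rr_worst = max(video_delays("rr"))
print("worst video delay, FIFO:", fifo_worst, "ticks")
print("worst video delay, round robin:", rr_worst, "ticks")
```

In this toy setup the first video packet waits behind the whole bulk burst under FIFO, while with per-flow queues it waits for at most one bulk packet, and no traffic classification was needed.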
>>> 
>>> The one thing that Cake needs to work really well is to be able to know what the data rate available is. With Starlink, this changes frequently and cake integrated into the starlink dish/router software would be far better than anything that can be done externally as the rate changes can be fed directly into the settings (currently they are only indirectly detected)
>>> 
>>> David Lang
>>> 
>>> 
>>> On Mon, 26 Sep 2022, Eugene Y Chang via Starlink wrote:
>>> 
>>> You already know this. Bufferbloat is a symptom and not the cause. Bufferbloat grows when there are (1) periods of low or no bandwidth or (2) periods of insufficient bandwidth (aka network congestion).
>>> 
>>> If I understand this correctly, just a software update cannot make bufferbloat go away. It might improve the speed of recovery (e.g. throw away all time sensitive UDP messages).
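
The relationship between backlog, bandwidth, and delay in the two cases above can be written down as simple arithmetic. A minimal sketch, assuming a single FIFO buffer drained at a constant link rate; the 1 MB backlog and the two rates are hypothetical numbers, not measurements.

```python
def queue_delay_ms(backlog_bytes, link_mbit_per_s):
    """Time for a packet at the tail of the queue to reach the wire, in ms."""
    if link_mbit_per_s <= 0:
        raise ValueError("link rate must be positive")
    bits = backlog_bytes * 8
    return bits / (link_mbit_per_s * 1_000_000) * 1000


# The same 1 MB backlog costs ten times more delay when the rate drops tenfold.
fast = queue_delay_ms(1_000_000, 100)
slow = queue_delay_ms(1_000_000, 10)
print(f"{fast:.0f} ms at 100 Mbit/s, {slow:.0f} ms at 10 Mbit/s")
```

The backlog is what software can limit; the drop in rate is what it cannot prevent.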
>>> 
>>> Gene
>>> ----------------------------------------------
>>> Eugene Chang
>>> IEEE Senior Life Member
>>> eugene.chang at ieee.org
>>> 781-799-0233 (in Honolulu)
>>> 
>>> 
>>> 
>>> On Sep 26, 2022, at 10:04 AM, Bruce Perens <bruce at perens.com> wrote:
>>> 
>>> Please help to explain. Here's a draft to start with:
>>> 
>>> Starlink Performance Not Sufficient for Military Applications, Say Scientists
>>> 
>>> The problem is not availability: Starlink works where nothing but another satellite network would. It's not bandwidth, although others have questions about sustaining bandwidth as the customer base grows. It's latency and jitter. As load increases, latency, the time it takes for a packet to get through, increases more than it should. The scientists who have fought bufferbloat, a major cause of latency on the internet, know why. SpaceX needs to upgrade their system to use the scientists' Open Source modifications to Linux to fight bufferbloat, and thus reduce latency. This is mostly just using a newer version, but there are some tunable parameters. Jitter is a change in the speed of getting a packet through the network during a connection, which is inevitable in satellite networks, but will be improved by making use of the bufferbloat-fighting software, and probably with the addition of more satellites.
>>> 
>>> We've done all of the work, SpaceX just needs to adopt it by upgrading their software, said scientist Dave Taht. Jim Gettys, Taht's collaborator and creator of the X Window System, chimed in: <fill in here please>
>>> Open Source luminary Bruce Perens said: sometimes Starlink's latency and jitter make it inadequate to remote-control my ham radio station. But the military is experimenting with remote-control of vehicles on the battlefield and other applications that can be demonstrated, but won't happen at scale without adoption of bufferbloat-fighting strategies.
>>> 
>>> On Mon, Sep 26, 2022 at 12:59 PM Eugene Chang <eugene.chang at alum.mit.edu> wrote:
>>> The key issue is most people don’t understand why latency matters. They don’t see it or feel its impact.
>>> 
>>> First, we have to help people see the symptoms of latency and how it impacts something they care about.
>>> - gamers care but most people may think it is frivolous.
>>> - musicians care but that is mostly for a hobby.
>>> - business should care because of productivity but they don’t know how to “see” the impact.
>>> 
>>> Second, there needs to be a “OMG, I have been seeing the action of latency all this time and never knew it! I was being shafted.” Once you have this awakening, you can get all the press you want for free.
>>> 
>>> Most of the time when business apps are developed, “we” hide the impact of poor performance (aka latency) or they hide from the discussion because the developers don’t have a way to fix the latency. Maybe businesses don’t care because any employees affected are just considered poor performers. (In bad economic times, the poor performers are just laid off.) For employees, if they happen to be at a location with bad latency, they don’t know that latency is hurting them. Unfair but most people don’t know the issue is latency.
>>> 
>>> Talking and explaining why latency is bad is not as effective as showing why latency is bad. Showing has to be with something that has a personal impact.
>>> 
>>> Gene
>>> -----------------------------------
>>> Eugene Chang
>>> eugene.chang at alum.mit.edu
>>> +1-781-799-0233 (in Honolulu)
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Sep 26, 2022, at 6:32 AM, Bruce Perens via Starlink <starlink at lists.bufferbloat.net> wrote:
>>> 
>>> If you want to get attention, you can get it for free. I can place articles with various press if there is something interesting to say. Did this all through the evangelism of Open Source. All we need to do is write, sign, and publish a statement. What they actually write is less relevant if they publish a link to our statement.
>>> 
>>> Right now I am concerned that the Starlink latency and jitter is going to be a problem even for remote controlling my ham station. The US Military is interested in doing much more, which they have demonstrated, but I don't see happening at scale without some technical work on the network. Being able to say this isn't ready for the government's application would be an attention-getter.
>>> 
>>> Thanks
>>> 
>>> Bruce
>>> 
>>> On Mon, Sep 26, 2022 at 9:21 AM Dave Taht via Starlink <starlink at lists.bufferbloat.net> wrote:
>>> These days, if you want attention, you gotta buy it. A 50k half page
>>> ad in the wapo or NYT riffing off of "It's the latency, Stupid!",
>>> signed by the kinds of luminaries we got for the fcc wifi fight, would
>>> go a long way towards shifting the tide.
>>> 
>>> On Mon, Sep 26, 2022 at 8:29 AM Dave Taht <dave.taht at gmail.com> wrote:
>>> 
>>> 
>>> On Mon, Sep 26, 2022 at 8:20 AM Livingood, Jason
>>> <Jason_Livingood at comcast.com> wrote:
>>> 
>>> 
>>> The awareness & understanding of latency & impact on QoE is nearly unknown among reporters. IMO maybe there should be some kind of background briefings for reporters - maybe like a simple YouTube video explainer that is short & high level & visual? Otherwise reporters will just continue to focus on what they know...
>>> 
>>> 
>>> That's a great idea. I have visions of crashing the washington
>>> correspondents dinner, but perhaps
>>> there is some set of gatherings journalists regularly attend?
>>> 
>>> 
>>> On 9/21/22, 14:35, "Starlink on behalf of Dave Taht via Starlink" <starlink-bounces at lists.bufferbloat.net on behalf of starlink at lists.bufferbloat.net> wrote:
>>> 
>>> I still find it remarkable that reporters are still missing the
>>> meaning of the huge latencies for starlink, under load.
>>> 
>>> 
>>> 
>>> 
>>> --
>>> FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
>>> Dave Täht CEO, TekLibre, LLC
>>> 
>>> 
>>> 
>>> 
>>> --
>>> FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
>>> Dave Täht CEO, TekLibre, LLC
>>> _______________________________________________
>>> Starlink mailing list
>>> Starlink at lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/starlink
>>> 
>>> 
>>> --
>>> Bruce Perens K6BP
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Bruce Perens K6BP
>>> 
>>> 
>> 
>> 
>> 
>> --
>> Bruce Perens K6BP
> 
> 
> 
> --
> FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
> Dave Täht CEO, TekLibre, LLC
