[Bloat] [Cerowrt-devel] [Starlink] Little's Law mea culpa, but not invalidating my main point

Dave Taht dave.taht at gmail.com
Sun Sep 26 15:40:42 EDT 2021


administrivia: I'm taking codel and cerowrt-devel off the cc. While I
did finally figure out (just now) how to convince mailman to accept
this many recipients (privacy->recipients->0), I don't actually
administer any of the other lists at bufferbloat.net anymore, and am
very out of practice at sysadmin in general. I'm also hoping the
starlink archive is now visible, and that I can send an interesting
2MB attachment... here goes....

I keep hoping to find an answer to the long-standing issue of getting
google to index the lovely information that flows by here (if anyone
has the chops). Anyway, cross-posting was useful back in the age of
netnews, but the subset of folk that are not on these other lists is,
I think, near zero.

Anyway, moving on to comment:

On Sun, Sep 26, 2021 at 11:24 AM David P. Reed <dpreed at deepplum.com> wrote:
>
> Pretty good list, thanks for putting this together.
>
>
>
> The only thing I'd add, and I'm not able to formulate it very elegantly, is this personal insight: One that I would research, because it can be a LOT more useful in the end-to-end control loop than stuff like ECN, L4S, RED, ...
>
>
>
> Fact: Detecting congestion by allowing a queue to build up is a very lagging indicator of incipient congestion in the forwarding system. The delay added to all paths by that queue buildup slows down the control loop's ability to respond by slowing the sources. It's the control loop delay that creates both instability and continued congestion growth.
>
> Observation: current forwarders forget what they have forwarded as soon as it is transmitted. This loses all the information about incipient congestion and "fairness" among multiple sources. Yet, there is no need to forget recent history at all after the packets have been transmitted.

Current "FIFO" forwarders do. Flow queuing based approaches (fq_codel,
cake, fq_pie) retain information about the stochastically hashed flow
(1024 flows usually), and its needed mark or drop rate, even when that
queue has been empty for many hundreds of ms. There was an interesting
experiment using the sch_fq based approach where millions of flows
were tracked (and intermixed) via rbtree and then fed to a single AQM,
but it was so long ago I don't remember who did it or where the code
went. We could have fed forward some slope or some other data into
that AQM implementation...
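
To make the "state outlives the packets" point concrete, here is a
minimal Python sketch. It is not the fq_codel code: FlowTable,
FlowState, marks and last_mark are hypothetical names, crc32 is only a
stand-in for the real flow hash, and the only figure taken from the
text is the 1024-bucket table size.

```python
import zlib
from collections import deque
from dataclasses import dataclass, field

NUM_BUCKETS = 1024  # the "1024 flows usually" figure from the text


@dataclass
class FlowState:
    queue: deque = field(default_factory=deque)  # packets waiting in this bucket
    marks: int = 0          # hypothetical stand-in for the AQM's mark/drop state
    last_mark: float = 0.0  # when the AQM last marked or dropped for this bucket


class FlowTable:
    def __init__(self, buckets=NUM_BUCKETS):
        self.buckets = [FlowState() for _ in range(buckets)]

    def lookup(self, five_tuple):
        # Stochastic hashing: distinct flows may collide in one bucket.
        key = "|".join(map(str, five_tuple)).encode()
        return self.buckets[zlib.crc32(key) % len(self.buckets)]

    def enqueue(self, five_tuple, packet):
        self.lookup(five_tuple).queue.append(packet)

    def dequeue(self, five_tuple):
        state = self.lookup(five_tuple)
        return state.queue.popleft() if state.queue else None

    def note_mark(self, five_tuple, now):
        state = self.lookup(five_tuple)
        state.marks += 1
        state.last_mark = now


table = FlowTable()
flow = ("10.0.0.1", "10.0.0.2", 5000, 443, "tcp")
table.enqueue(flow, b"payload")
table.note_mark(flow, now=1.0)
table.dequeue(flow)
# The queue is now empty, but the bucket still remembers its mark history.
state = table.lookup(flow)
print(len(state.queue), state.marks, state.last_mark)
```

The bucket's mark history survives the dequeue, which is the property a
plain FIFO lacks.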

>
>
>
> An idea I keep proposing is the idea of remembering the last K seconds of packets, their flow ids (source and destination), the arrival time and departure time, and their channel occupancy on the outbound shared link. Then using this information to reflect incipient congestion information to the flows that need controlling, to be used in their control loops.
>

I would rather like an internet where the queues were always empty and
yet ran at 99% utilization all the time. :)

e2e packet pacing (as in linux's sch_fq) is a really good starting
foothold here, but pacing in != pacing out, and certain L2 transport
types, notably wifi, need a queue to be most efficient. Whilst you
point at the idea of an e2e approach, directing this sort of history
at an L2 optimization seems useful.

To direct this at the starlink problem, they presently re-optimize
their networks on roughly a 15 second interval, as shown in this hires
(3ms resolution) irtt plot, attached.

Vertically - peak delay on this plot is at about 400ms; you can see a
bufferbloat mountain about 2/3 of the way through from a speedtest,
and all sorts of jitter artifacts from either the plotter! or the
data! which I haven't got around to poking into. Based on the recent
history of packet traversals, it would be my hope they could more
rapidly focus more or fewer beams at a site, at a far smaller
interval, as 15 sec is optimal for speedtest, but sub-1sec is optimal
for web traffic.

With the introduction of the starlink laser satellite links, treating
this as a centrally optimizable problem makes my head hurt, and I had
(still have) hope for a remy-like approach to find the best solutions,
once the right variables can be found.

>
>
> So far, no one has taken me up on doing the research to try this in the field.

Simulation would be doable. I don't think ipv4 or ipv6 have enough
bits, and I'd start over with a globally synced mac (l2) layer,
punching down a utc timestamp and a 32 bit "flowid" from the l3 and
more bits for congestion info, and try to figure out something
intelligent to do to map from l2 to l3 before the packet left the
ground.
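
For a simulation, such an L2 tag could be sketched as below. Only the
utc timestamp and the 32 bit flowid come from the text above; the
64-bit nanosecond timestamp, the 16 bits of congestion info, the field
order, and the crc32-based mapping from the l3 5-tuple are all
assumptions made for illustration.

```python
import struct
import time
import zlib

# Assumed layout, network byte order, no padding:
#   64-bit UTC timestamp (ns) | 32-bit flow id | 16 bits of congestion info
TAG_FORMAT = "!QIH"
TAG_SIZE = struct.calcsize(TAG_FORMAT)


def flow_id_from_l3(src, dst, sport, dport, proto):
    """Fold an l3/l4 5-tuple into a 32-bit flow id (crc32 is a stand-in)."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return zlib.crc32(key) & 0xFFFFFFFF


def pack_tag(utc_ns, flow_id, congestion):
    """Build the hypothetical L2 tag that would precede the payload."""
    return struct.pack(TAG_FORMAT, utc_ns, flow_id, congestion & 0xFFFF)


def unpack_tag(data):
    """Return (utc_ns, flow_id, congestion) from the front of a frame."""
    return struct.unpack(TAG_FORMAT, data[:TAG_SIZE])


fid = flow_id_from_l3("192.0.2.1", "198.51.100.7", 40000, 443, 6)
tag = pack_tag(time.time_ns(), fid, congestion=0x0003)
print(TAG_SIZE, unpack_tag(tag))
```

With these assumed widths the tag costs 14 bytes per frame, which is
the sort of overhead a simulation would need to weigh.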

> Note: the signalling can be simple (sending ECN flags on all flows that transit the queue, even though there is no backlog, yet, when the queue is empty but transient overload seems likely), but the key thing is that we already assume that recent history of packets is predictive of future overflow.

In particular, thinking about this as "transient overload seems
likely" is helpful to me.
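
One way to read "transient overload seems likely" is as a test on
recent forwarding history rather than on queue depth. The Python sketch
below is only a guess at the shape of such a thing: ForwardingHistory,
the 0.9 threshold, and the use of bytes sent as a proxy for channel
occupancy are assumptions, and it orders flows heaviest first, where
the proposal above simply signals all of them.

```python
from collections import defaultdict, deque


class ForwardingHistory:
    """Remember the last window_s seconds of already-forwarded packets."""

    def __init__(self, window_s, link_bps, threshold=0.9):
        self.window_s = window_s
        self.link_bps = link_bps
        self.threshold = threshold  # assumed utilization trigger
        self.records = deque()      # (arrival, departure, flow_id, size_bytes)

    def record(self, flow_id, arrival, departure, size_bytes):
        # Assumes packets are recorded in departure order.
        self.records.append((arrival, departure, flow_id, size_bytes))

    def _expire(self, now):
        while self.records and self.records[0][1] < now - self.window_s:
            self.records.popleft()

    def utilization(self, now):
        """Fraction of link capacity used over the window ending at now."""
        self._expire(now)
        bits = sum(size * 8 for _, _, _, size in self.records)
        return bits / (self.link_bps * self.window_s)

    def flows_to_signal(self, now):
        """Flow ids to mark, heaviest first, or [] if there is headroom."""
        if self.utilization(now) < self.threshold:
            return []
        per_flow = defaultdict(int)
        for _, _, flow_id, size in self.records:
            per_flow[flow_id] += size
        return sorted(per_flow, key=per_flow.get, reverse=True)


history = ForwardingHistory(window_s=1.0, link_bps=1_000_000)
for i in range(100):
    flow = "a" if i < 80 else "b"
    history.record(flow, arrival=i * 0.01, departure=i * 0.01, size_bytes=1250)
print(history.flows_to_signal(now=1.0))
```

Nothing here looks at a backlog: the decision is made purely from
packets that have already left, which is the point of the proposal.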

>
> This can be implemented locally on any routing path that tends to be a bottleneck link. Such as the uplink of a home network. It should work with TCP as is if the signalling causes window reduction (at first, just signal by dropping packets prematurely, but if TCP will handle ECN aggressively - a single ECN mark causing window reduction, then it will help that, too).
>
>
>
> The insight is that from an "information and control theory" perspective, the packets that have already been forwarded are incredibly valuable for congestion prediction.

It's a good starting point, yes.

>
>
>
> Please, if possible, if anyone actually works on this and publishes, give me credit for suggesting this.

ENOFUNDING. But I promise to remember next year, as elsewhere we are
building up a large string of tools and simulations for rfc3168, l4s, or
sce ecn-enabled traffic, aqms, and endpoints. If we ever get the time
I'd wanted to add MIT's "ABC" idea into the mix, but most of what I'd
like to do is add to a tcp some sort of more dramatic response to
multiple rfc3168 CE marks, set by an fq_codel derived algorithm,
within an rtt.

The simplest take I have on the ecn signalling problem of any sort is
merely: we do not have enough bits in the ipv4 or ipv6 headers to
tackle this, and any form of early congestion signalling has to be
isolated from conventional network traffic.

>
> Just because I've been suggesting it for about 15 years now, and being ignored. It would be a mitzvah.
>
>
>
>
>
> On Thursday, September 23, 2021 1:46pm, "Bob McMahon" <bob.mcmahon at broadcom.com> said:
>
> Hi All,
> I do appreciate this thread as well. As a test & measurement guy here are my conclusions around network performance. Thanks in advance for any comments.
>
> Congestion can be mitigated the following ways
> o) Size queues properly to minimize/negate bloat (easier said than done with tech like WiFi)
> o) Use faster links on the service side such that a queue's service rate exceeds the arrival rate, no congestion even in bursts, if possible
> o) Drop entries during oversubscribed states (queue processing can't "speed up" like water flow through a constricted pipe, must drop)
> o) Identify aggressor flows per congestion if possible
> o) Forwarding planes can signal back to the sources "earlier" to minimize queue build ups per a "control loop request" asking sources to pace their writes
> o) transport layers use techniques a la BBR
> o) Use "home gateways" that support tech like FQ_CODEL
> Latency can be mitigated the following ways
> o) Mitigate or eliminate congestion, particularly around queueing delays
> o) End host apps can use TCP_NOTSENT_LOWAT along with write()/select() to reduce host sends of "better never than late" messages
> o) Move servers closer to the clients per fundamental limit of the speed of light (i.e. propagation delay of energy over the wave guides), a la CDNs
> (Except if you're a HFT, separate servers across geography and make sure to have exclusive user rights over the lowest latency links)
>
> Transport control loop(s)
> o) Transport layer control loops are non linear systems so network tooling will struggle to emulate "end user experience"
> o) 1/2 RTT does not equal OWD used to compute the bandwidth delay product, imbalance and effects need to be measured
> o) forwarding planes signaling congestion to sources wasn't designed in TCP originally but the industry trend seems to be moving towards this per things like L4S
> Photons, radio & antenna design
> o) Find experts who have experience & knowledge, e.g. many do here
> o) Photons don't really have mass nor size, at least per my limited understanding of particle physics and QED though, I must admit, came from reading things on the internet
>
> Bob
>
> On Mon, Sep 20, 2021 at 7:40 PM Vint Cerf <vint at google.com> wrote:
>>
>> see https://mediatrust.com/
>> v
>>
>> On Mon, Sep 20, 2021 at 10:28 AM Steve Crocker <steve at shinkuro.com> wrote:
>>>
>>> Related but slightly different: Attached is a slide some of my colleagues put together a decade ago showing the number of DNS lookups involved in displaying CNN's front page.
>>> Steve
>>>
>>> On Mon, Sep 20, 2021 at 8:18 AM Valdis Klētnieks <valdis.kletnieks at vt.edu> wrote:
>>>>
>>>> On Sun, 19 Sep 2021 18:21:56 -0700, Dave Taht said:
>>>> > what actually happens during a web page load,
>>>>
>>>> I'm pretty sure that nobody actually understands that anymore, in any
>>>> more than handwaving levels.
>>>>
>>>> I have a nice Chrome extension called IPvFoo that actually tracks the IP
>>>> addresses contacted during the load of the displayed page. I'll let you make
>>>> a guess as to how many unique IP addresses were contacted during a load
>>>> of https://www.cnn.com
>>>>
>>>> ...
>>>>
>>>>
>>>> ...
>>>>
>>>>
>>>> ...
>>>>
>>>>
>>>> 145, at least half of which appeared to be analytics.  And that's only the
>>>> hosts that were contacted by my laptop for HTTP, and doesn't count DNS, or
>>>> load-balancing front ends, or all the back-end boxes.  As I commented over on
>>>> NANOG, we've gotten to a point similar to that of AT&T long distance, where 60%
>>>> of the effort of connecting a long distance phone call was the cost of
>>>> accounting and billing for the call.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Starlink mailing list
>>>> Starlink at lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/starlink
>>>
>>
>>
>> --
>> Please send any postal/overnight deliveries to:
>> Vint Cerf
>> 1435 Woodhurst Blvd
>> McLean, VA 22102
>> 703-448-0965
>> until further notice
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel



--
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw

Dave Täht CEO, TekLibre, LLC
-------------- next part --------------
A non-text attachment was scrubbed...
Name: starlink-3ms-irtt-20min (1).png
Type: image/png
Size: 2013513 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/bloat/attachments/20210926/0f92be49/attachment-0001.png>

