[Starlink] SatNetLab: A call to arms for the next global Internet testbed

David Lang david at lang.hm
Sat Jul 10 16:27:28 EDT 2021


Any buffer sizing based on the number of packets is wrong. Base your buffer size 
on transmit time and you have a chance of being reasonable.
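
A rough sketch of that idea (nothing from this thread: the 10 ms drain target 
and all the names below are assumptions for illustration). Sizing by transmit 
time makes the byte limit scale with the link rate, while a fixed packet count 
does not:

/* Sketch: size a queue by how long it takes to drain, not by packet count.
 * The 10 ms target and these names are illustrative assumptions. */
#include <stdio.h>
#include <stdint.h>

/* Bytes of buffer the link can drain in the target time. */
static uint64_t buffer_bytes(uint64_t link_bits_per_sec, double target_sec)
{
    return (uint64_t)(link_bits_per_sec * target_sec / 8.0);
}

int main(void)
{
    const double target_sec = 0.010;      /* assumed 10 ms drain target */
    uint64_t rates[] = { 1000000ULL, 100000000ULL, 1000000000ULL };

    for (int i = 0; i < 3; i++) {
        uint64_t bytes = buffer_bytes(rates[i], target_sec);
        /* Compare: a fixed 1000-packet FIFO of 1500-byte frames is ~1.5 MB
         * at any rate, which is ~12 s of delay at 1 Mbit/s but only ~12 ms
         * at 1 Gbit/s. */
        printf("%10llu bit/s -> %8llu bytes (~%llu full-size packets)\n",
               (unsigned long long)rates[i],
               (unsigned long long)bytes,
               (unsigned long long)(bytes / 1500));
    }
    return 0;
}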

In cases like wifi, where packets aren't sent individually but in blobs of 
packets going to the same destination, you want to buffer at least a blob's 
worth of packets to each destination so that when your transmit slot comes up, 
you can maximize it.

Wifi has the added issue that the blob headers are sent at a much lower data 
rate than the data itself, so you can cram a LOT of data into a blob without 
making a significant difference in the airtime used. So you really do want to 
be able to send full blobs (not at the cost of delaying transmission if you 
don't have a full blob, a mistake some people make, but you do want to buffer 
enough to fill the blobs).
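
To make that concrete, here is a tiny sketch (the per-destination ring, the 
64-frame and 64 KB aggregate limits, and every name here are made-up 
assumptions, not any real driver API): when the transmit slot comes up you 
send whatever is queued, up to the aggregate limits, and you never hold 
traffic back waiting for a full blob.

/* Sketch: "fill the blob, but never wait for one". */
#include <stdio.h>

#define MAX_AGG_FRAMES 64       /* assumed per-blob frame limit */
#define MAX_AGG_BYTES  65535    /* assumed per-blob byte limit */
#define RING_SIZE      256

struct dest_queue {             /* one queue per destination */
    int frame_len[RING_SIZE];
    int head, count;
};

/* Called when this destination's transmit slot comes up: drain whatever is
 * queued, up to the limits.  A short queue is sent immediately rather than
 * held back in the hope of a fuller blob later. */
static int build_aggregate(struct dest_queue *q, int *agg_bytes)
{
    int frames = 0;
    *agg_bytes = 0;
    while (q->count > 0 && frames < MAX_AGG_FRAMES) {
        int len = q->frame_len[q->head];
        if (*agg_bytes + len > MAX_AGG_BYTES)
            break;
        *agg_bytes += len;
        q->head = (q->head + 1) % RING_SIZE;
        q->count--;
        frames++;
    }
    return frames;
}

int main(void)
{
    struct dest_queue q = { .head = 0, .count = 0 };
    for (int i = 0; i < 10; i++) {                 /* queue 10 x 1500 bytes */
        q.frame_len[(q.head + q.count) % RING_SIZE] = 1500;
        q.count++;
    }
    int bytes;
    int frames = build_aggregate(&q, &bytes);
    printf("sent %d frames, %d bytes in one blob\n", frames, bytes);
    return 0;
}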

And given that dropped packets result in timeouts and retransmissions that 
affect the rest of the network, it's not obviously wrong for a lossy hop like 
wifi to retry a failed transmission; it just needs to not retry too many times.
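
Something along these lines, say (the retry cap, the age cap, and the 
try_transmit() stub are all invented for the sketch; real drivers track this 
per hardware queue):

/* Sketch: retry a failed transmission, but not forever and not once the
 * frame has gone stale. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define MAX_RETRIES  4          /* assumed per-frame retry cap */
#define MAX_AGE_MS   100.0      /* assumed staleness cutoff */

static bool try_transmit(const char *frame)   /* stand-in for the lossy hop */
{
    (void)frame;
    return false;
}

static double age_ms(struct timespec enq, struct timespec now)
{
    return (now.tv_sec - enq.tv_sec) * 1000.0 +
           (now.tv_nsec - enq.tv_nsec) / 1e6;
}

/* Retry a few times, but give up once the frame is old enough that the
 * sender has probably retransmitted it anyway. */
static bool send_with_bounded_retry(const char *frame, struct timespec enq)
{
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (age_ms(enq, now) > MAX_AGE_MS)
            return false;                     /* stale: drop, don't retry */
        if (try_transmit(frame))
            return true;
    }
    return false;                             /* retry budget exhausted */
}

int main(void)
{
    struct timespec enq;
    clock_gettime(CLOCK_MONOTONIC, &enq);
    char frame[1500] = { 0 };
    printf("delivered: %s\n",
           send_with_bounded_retry(frame, enq) ? "yes" : "no");
    return 0;
}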

David Lang


  On Sat, 10 Jul 2021, Rodney W. Grimes wrote:

> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
> From: Rodney W. Grimes <starlink at gndrsh.dnsmgr.net>
> To: Dave Taht <dave.taht at gmail.com>
> Cc: starlink at lists.bufferbloat.net, Ankit Singla <asingla at ethz.ch>,
>     Sam Kumar <samkumar at cs.berkeley.edu>
> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet
>      testbed
> 
>> While it is good to have a call to arms, like this:
> ...  much information removed as I only want to reply to one very
>     narrow, but IMHO, very real problem in our networks today ...
> 
>> Here's another piece of pre-history - alohanet - the TTL field was the
>> "time to live" field. The intent was that the packet would indicate
>> how much time it would be valid before it was discarded. It didn't
>> work out, and was replaced by hopcount, which of course switched
>> networks ignore and is only semi-useful for detecting loops and the
>> like.
>
> TTL works perfectly fine where the original assumptions hold: that a
> device along a network path only hangs on to a packet for a
> reasonably short duration, and that there is not some "retry"
> mechanism in place that is causing this time to explode.  BSD,
> and as far as I can recall, almost ALL original IP stacks had
> a Q depth limit of 50 packets on egress interfaces.  Everything
> pretty much worked well and the net was happy.  Then these base
> assumptions got blasted in the name of "measurable bandwidth" and
> the concept that packets are so precious we must not lose them,
> at almost any cost.  Linux crammed the per-interface Q up to 1000,
> and wifi decided that it was reasonable to retry at the link layer so
> many times that I have seen packets that are >60 seconds old.
>
> Proposed FIX:  Any device that transmits packets and does not
> already have an inherent FIXED transmission time MUST consider
> the current TTL of that packet and give up if > 10 ms * TTL elapses
> while it is trying to transmit.  AND change the default interface Q
> size in Linux to 50 for fifo; the codel, etc. AQM stuff is fine
> at 1000 as it has delay targets that prevent the issue that
> initially bumping this to 1000 caused.
>
> ... end of Rods Rant ...
>
> --
> Rod Grimes                                                 rgrimes at freebsd.org
> _______________________________________________
> Starlink mailing list
> Starlink at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
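
For what it's worth, a rough sketch of the per-packet deadline Rod proposes 
above (a transmitter with no fixed transmission time gives up once 
10 ms * TTL has elapsed) might look like this; the names and the polling loop 
are illustrative assumptions, not a patch to any real stack:

/* Sketch: give up on transmitting a packet once 10 ms * TTL has elapsed. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static double elapsed_ms(struct timespec start, struct timespec now)
{
    return (now.tv_sec - start.tv_sec) * 1000.0 +
           (now.tv_nsec - start.tv_nsec) / 1e6;
}

static bool link_ready(void) { return false; }   /* stand-in for a busy medium */

/* Keep trying to transmit, but for no more than 10 ms per remaining TTL hop. */
static bool transmit_with_ttl_deadline(uint8_t ttl)
{
    const double deadline_ms = 10.0 * ttl;
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        if (link_ready())
            return true;                         /* got it out */
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (elapsed_ms(start, now) > deadline_ms)
            return false;                        /* stale: drop the packet */
    }
}

int main(void)
{
    /* A packet arriving with TTL 64 would get at most 640 ms on this hop. */
    printf("sent: %s\n", transmit_with_ttl_deadline(64) ? "yes" : "no");
    return 0;
}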


