[Starlink] SatNetLab: A call to arms for the next global Internet testbed

Rodney W. Grimes starlink at gndrsh.dnsmgr.net
Tue Jul 13 08:39:51 EDT 2021


> On Mon, 12 Jul 2021, David P. Reed wrote:
> 
> >> From: David Lang <david at lang.hm>
> >> 
> >> Wifi has the added issue that the blob headers go out at a much lower data rate
> >> than the data itself, so you can cram a LOT of data into a blob without making a
> >> significant difference in the airtime used. So you really do want to be able to
> >> send full blobs (not at the cost of delaying transmission if you don't have a
> >> full blob, a mistake some people make, but you do want to buffer enough to fill
> >> the blobs).
> > This happens fairly naturally if the senders in the LAN take turns and transmit whatever they have accumulated while waiting their turn. Capping the total airtime in a cycle limits short-message latency, which is why small packets are helpful.
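
Purely as a back-of-the-envelope sketch of that aggregation arithmetic (the overhead and rate numbers below are illustrative assumptions, not measurements from any particular radio): the per-transmission overhead is roughly fixed, so bigger aggregates cost almost nothing extra in airtime.

# Illustrative only: assumed fixed per-transmission overhead (preamble,
# PHY header sent at a low legacy rate) vs. payload at a high MCS rate.
PREAMBLE_US = 40          # assumed fixed overhead per transmission, microseconds
PAYLOAD_RATE_MBPS = 600   # assumed payload data rate

def airtime_us(payload_bytes):
    """Airtime of one transmission: fixed overhead + payload at the data rate."""
    return PREAMBLE_US + payload_bytes * 8 / PAYLOAD_RATE_MBPS

for size in (1500, 15000, 65000):   # one packet vs. progressively larger blobs
    t = airtime_us(size)
    print(f"{size:6d} bytes -> {t:7.1f} us  (~{size * 8 / t:.0f} Mb/s effective)")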
> 
> I was thinking in terms of the downstream (from Internet) side: the senders 
> there have no idea about wifi or timeslots, since they are sending from several 
> hops away from the bottleneck.
> 
> >> and given that dropped packets result in timeouts and retransmissions that
> >> affect the rest of the network, it's not obviously wrong for a lossy hop like
> >> wifi to retry a failed transmission; it just needs to not retry too many times.
> >> 
> > Absolutely right, though not perfect. Local retransmit on a link (or WLAN 
> > domain) benefits if the link has a high bit-error rate. On the other hand, 
> > from an information theoretic point of view it's better, if you can, to use 
> > FEC, erasure coding, or just lower the attempted signalling rate. If you 
> > have an estimator of Bit Error Rate on the link (which gives you a packet 
> > error rate), there's a reasonable bound on the number of retransmits on an 
> > individual packet at the link level that doesn't kill end-to-end latency. I 
> > forget how the formula is derived. It's also important, as BER increases, to use 
> > shorter packet frames.
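
A rough sketch of where such a retry bound comes from (all numbers below are assumptions for illustration; this is not the derivation referred to above): from a BER estimate you get a packet error rate, and from that both the expected number of transmissions and how many retries it takes before the residual loss is negligible.

def per(bit_error_rate, frame_bytes):
    """Packet error rate assuming independent bit errors."""
    return 1 - (1 - bit_error_rate) ** (8 * frame_bytes)

def retries_for_residual_loss(p, target=1e-3):
    """Smallest retry count n such that p**(n+1) <= target residual loss."""
    n = 0
    while p ** (n + 1) > target:
        n += 1
    return n

for ber in (1e-6, 1e-5, 1e-4):          # assumed link BERs, 1500-byte frames
    p = per(ber, 1500)
    print(f"BER={ber:g}: PER={p:.3f}, ~{retries_for_residual_loss(p)} retries "
          f"for <0.1% residual loss, mean transmissions {1 / (1 - p):.2f}")

The same numbers also show the shorter-frames point: PER grows roughly with frame length, so as BER rises, smaller frames keep the retry count sane.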
> 
> FEC works if the problem is a bit error rate, but on radio links you have a 
> hidden transmitter problem. When another station that can't hear you, and that 
> is a bit more powerful than you (as far as the receiver is concerned), starts 
> transmitting, you don't just lose a few bits, you lose the entire transmission 
> (or a large chunk of it).
> 
> Lowering the bit rate when the problem is interference from other stations 
> actually makes the problem worse: it stretches out your airtime and makes it more 
> likely that you will be stepped on (one of the reasons why wifi performance falls 
> off a cliff as it gets congested).
> 
> David Lang
> 

It wasn't suggested to lower the bit rate; it was suggested to make the
packets smaller, which actually does address the hidden transmitter problem
to some degree since it *would* reduce your airtime occupancy, but the damn
wifi link-layer aggregation gets in your way because it blows them back up.
When I have to deal with/use wifi in a hidden-transmitter-prone situation I
always crank down the Fragmentation Threshold setting from the default of
2346 bytes to the minimum, often 256, with good results.
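
A rough airtime comparison of those two fragment sizes (the 24 Mb/s rate and the ~30 us of fixed per-frame overhead below are assumed, illustrative numbers; on Linux the threshold itself is typically set via the frag parameter in iwconfig or iw, depending on tooling):

RATE_MBPS = 24       # assumed PHY rate
OVERHEAD_US = 30     # assumed fixed per-frame overhead (preamble, ACK, etc.)

def frame_airtime_us(frame_bytes):
    """Time one frame occupies the channel: fixed overhead + bits at the rate."""
    return OVERHEAD_US + frame_bytes * 8 / RATE_MBPS

for size in (2346, 256):
    print(f"{size:4d}-byte frame: ~{frame_airtime_us(size):6.1f} us on the air")

Fragmenting adds overhead overall, but each transmission exposes a much smaller collision window to the hidden station, and a collision costs one small fragment instead of the whole packet.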

Rod Grimes

> > End-to-end retransmit is not the optimal way to correct link errors - the 
> > end-to-end checksum and retransmit in TCP have confused people over the years 
> > into thinking link reliability can be omitted! That was never the reason TCP 
> > does end-to-end error checking. People got confused about that. As Dave Taht 
> > can recount based on discussions with Steve Crocker and me (ARPANET and 
> > TCP/IP), the point of end-to-end checks is to make sure that *overall* the 
> > system doesn't introduce errors, including in buffer memory, software that 
> > doesn't quite work, etc. The TCP retransmission is mostly about recovering 
> > from packet drops and things like duplicated packets resulting from routing 
> > changes, etc.
> >
> > So fix link errors at link level (but remember that retransmit with checksum 
> > isn't really optimal there - there are better ways if BER is high or the error 
> > might be because of software or hardware bugs which tend to be non-random).
> >
> >
> >
> >
> >> David Lang
> >> 
> >>
> >>   On Sat, 10 Jul 2021, Rodney W. Grimes wrote:
> >> 
> >>> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
> >>> From: Rodney W. Grimes <starlink at gndrsh.dnsmgr.net>
> >>> To: Dave Taht <dave.taht at gmail.com>
> >>> Cc: starlink at lists.bufferbloat.net, Ankit Singla <asingla at ethz.ch>,
> >>>     Sam Kumar <samkumar at cs.berkeley.edu>
> >>> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet
> >>>      testbed
> >>>
> >>>> While it is good to have a call to arms, like this:
> >>> ...  much information removed as I only want to reply to one very
> >>>     narrow, but IMHO very real, problem in our networks today ...
> >>>
> >>>> Here's another piece of pre-history - alohanet - the TTL field was the
> >>>> "time to live" field. The intent was that the packet would indicate
> >>>> how much time it would be valid before it was discarded. It didn't
> >>>> work out, and was replaced by hopcount, which of course switched
> >>>> networks ignore and is only semi-useful for detecting loops and the
> >>>> like.
> >>>
> >>> TTL works perfectly fine where the original assumptions hold: that a
> >>> device along a network path only hangs on to a packet for a
> >>> reasonably short duration, and that there is not some "retry"
> >>> mechanism in place that is causing this time to explode.  BSD,
> >>> and as far as I can recall almost ALL original IP stacks, had
> >>> a Q depth limit of 50 packets on egress interfaces.  Everything
> >>> pretty much worked well and the net was happy.  Then these base
> >>> assumptions got blasted in the name of "measurable bandwidth" and
> >>> the notion that packets are so precious we must not lose them,
> >>> at almost any cost.  Linux crammed the per-interface Q up to 1000,
> >>> and wifi decided that it was reasonable to retry at the link layer so
> >>> many times that I have seen packets that are >60 seconds old.
> >>>
> >>> Proposed FIX:  Any device that transmits packets and does not
> >>> already have an inherent FIXED transmission time MUST consider
> >>> the current TTL of that packet and give up if more than 10 ms * TTL
> >>> elapses while it is trying to transmit.  AND change the default Q
> >>> size in Linux to 50 for fifo; the codel etc. AQM stuff is fine
> >>> at 1000 as it has delay targets that prevent the issue that
> >>> initially bumping this to 1000 caused.
> >>>
> >>> ... end of Rods Rant ...
> >>>
> >>> --
> >>> Rod Grimes                                                 rgrimes at freebsd.org
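
For concreteness, a minimal sketch of the give-up rule proposed in the quoted rant above. The names and structure are hypothetical illustrations of the idea, not any real driver's or stack's API:

import time

MS_PER_TTL = 0.010  # 10 ms of queue+retry budget per unit of remaining TTL

class QueuedPacket:
    """A packet sitting on an egress queue, remembering when it arrived."""
    def __init__(self, data, ttl):
        self.data = data
        self.ttl = ttl
        self.enqueued_at = time.monotonic()

    def budget_exceeded(self):
        """True once the packet has waited longer than 10 ms * TTL."""
        return time.monotonic() - self.enqueued_at > MS_PER_TTL * self.ttl

def try_transmit(pkt, send):
    """Retry the link-layer send, but drop the packet once its time budget is gone."""
    while not pkt.budget_exceeded():
        if send(pkt.data):      # hypothetical link-layer send attempt
            return True
        time.sleep(0.001)       # back off briefly before retrying
    return False                # give up; let the endpoints recover end to end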