+1 re fixing close to the source of error, unless applications can deal with
packet loss without retransmission, like real-time speech.

v

On Mon, Jul 12, 2021 at 9:23 PM David P. Reed wrote:

> > From: David Lang
> >
> > Wifi has the added issue that the blob headers are at a much lower data
> > rate than the data itself, so you can cram a LOT of data into a blob
> > without making a significant difference in the airtime used, so you
> > really do want to be able to send full blobs (not at the cost of delaying
> > transmission if you don't have a full blob, a mistake some people make,
> > but you do want to buffer enough to fill the blobs)
>
> This happens fairly naturally if the senders in the LAN take turns and
> transmit what they have accumulated while waiting their turn. Capping the
> total airtime in a cycle limits short message latency, which is why small
> packets are helpful.
>
> > and given that dropped packets result in timeouts and retransmissions
> > that affect the rest of the network, it's not obviously wrong for a lossy
> > hop like wifi to retry a failed transmission, it just needs to not retry
> > too many times.
>
> Absolutely right, though not perfect. Local retransmit on a link (or WLAN
> domain) helps if the link has a high bit-error rate. On the other hand,
> it's better, if you can, to use FEC or erasure coding, or just lower the
> attempted signalling rate, from an information-theoretic point of view. If
> you have an estimator of the Bit Error Rate on the link (which gives you a
> packet error rate), there's a reasonable bound on the number of retransmits
> of an individual packet at the link level that doesn't kill end-to-end
> latency. I forget how the formula is derived. It's also important as BER
> increases to use shorter packet frames.
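
A rough sketch of the kind of bound described above. This is not the formula David refers to (the thread does not give it), only one simple way to get a number. It assumes independent bit errors, so a frame of n bits has packet error rate PER = 1 - (1 - BER)^n, and it caps retries at the smallest k for which the residual loss PER^(k+1) falls under a chosen target. The function names, the target loss and the example figures are all illustrative assumptions.

```python
import math


def packet_error_rate(ber: float, frame_bits: int) -> float:
    """PER for a frame of frame_bits bits, assuming independent bit errors."""
    return 1.0 - (1.0 - ber) ** frame_bits


def max_retries(per: float, target_loss: float) -> int:
    """Smallest retry count k such that per ** (k + 1) <= target_loss."""
    if per <= 0.0:
        return 0  # a loss-free link needs no retries
    if per >= 1.0:
        raise ValueError("link never delivers a frame; retries cannot help")
    # per ** attempts <= target_loss  <=>  attempts >= log(target) / log(per)
    attempts = math.ceil(math.log(target_loss) / math.log(per))
    return max(attempts - 1, 0)


def expected_transmissions(per: float) -> float:
    """Mean transmissions per delivered frame if retries were unlimited."""
    return 1.0 / (1.0 - per)


if __name__ == "__main__":
    ber = 1e-5  # assumed link bit error rate
    for frame_bytes in (1500, 150):  # long vs. short frame
        per = packet_error_rate(ber, frame_bytes * 8)
        print(frame_bytes, round(per, 4), max_retries(per, 1e-3),
              round(expected_transmissions(per), 3))
```

Under these assumptions a 1500-byte frame at BER 1e-5 has a PER of about 0.11, and the shorter frame has a much lower PER, which is the point about shorter frames at higher BER. Real links have bursty, correlated errors, so treat the independence assumption as optimistic.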
>
> End-to-end retransmit is not the optimal way to correct link errors - the
> end-to-end checksum and retransmit in TCP have confused people over the
> years into thinking link reliability can be omitted! That was never the
> reason TCP does end-to-end error checking. People got confused about that.
> As Dave Taht can recount, based on discussions with Steve Crocker and me
> (ARPANET and TCP/IP), the point of end-to-end checks is to make sure that
> *overall* the system doesn't introduce errors, including in buffer memory,
> software that doesn't quite work, etc. The TCP retransmission is mostly
> about recovering from packet drops and things like duplicated packets
> resulting from routing changes, etc.
>
> So fix link errors at link level (but remember that retransmit with
> checksum isn't really optimal there - there are better ways if BER is high
> or the error might be caused by software or hardware bugs, which tend to be
> non-random).
>
> > David Lang
> >
> > On Sat, 10 Jul 2021, Rodney W. Grimes wrote:
> >
> >> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
> >> From: Rodney W. Grimes
> >> To: Dave Taht
> >> Cc: starlink@lists.bufferbloat.net, Ankit Singla ,
> >>     Sam Kumar
> >> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global
> >>     Internet testbed
> >>
> >>> While it is good to have a call to arms, like this:
> >> ... much information removed as I only want to reply to 1 very
> >> narrow, but IMHO, very real problem in our networks today ...
> >>
> >>> Here's another piece of pre-history - alohanet - the TTL field was the
> >>> "time to live" field. The intent was that the packet would indicate
> >>> how much time it would be valid before it was discarded. It didn't
> >>> work out, and was replaced by hopcount, which of course switched
> >>> networks ignore and is only semi-useful for detecting loops and the
> >>> like.
> >>
> >> TTL works perfectly fine where the original assumptions hold: that a
> >> device along a network path only hangs on to a packet for a
> >> reasonably short duration, and that there is not some "retry"
> >> mechanism in place that is causing this time to explode. BSD,
> >> and as far as I can recall, almost ALL original IP stacks had
> >> a Q depth limit of 50 packets on egress interfaces. Everything
> >> pretty much worked well and the net was happy. Then these base
> >> assumptions got blasted in the name of "measurable bandwidth" and
> >> the concept that packets are so precious we must not lose them,
> >> at almost any cost. Linux crammed the per interface Q up to 1000,
> >> wifi decided that it was reasonable to retry at the link layer so
> >> many times that I have seen packets that are >60 seconds old.
> >>
> >> Proposed FIX: Any device that transmits packets that does not
> >> already have an inherent FIXED transmission time MUST consider
> >> the current TTL of that packet and give up if > 10 ms * TTL elapses
> >> while it is trying to transmit. AND change the default if Q
> >> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine
> >> at 1000 as it has delay targets that prevent the issue that
> >> initially bumping this to 1000 caused.
> >>
> >> ... end of Rods Rant ...
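
A minimal sketch of the retry budget in Rod's proposed fix: keep retrying a transmission, but give up once 10 ms * TTL has elapsed. The names `send_with_ttl_budget` and `try_send` and the injectable `clock` parameter are hypothetical, chosen for illustration; a real driver would also back off between attempts instead of retrying immediately.

```python
import time
from typing import Callable

MS_PER_TTL_UNIT = 10  # from the proposal: the budget is 10 ms * TTL


def send_with_ttl_budget(
    try_send: Callable[[], bool],
    ttl: int,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Retry try_send() until it succeeds or 10 ms * ttl has elapsed.

    try_send is a hypothetical callback that makes one link-level
    transmission attempt and returns True if the frame was acknowledged.
    Returns False when the budget runs out, meaning: drop the packet.
    """
    deadline = clock() + ttl * MS_PER_TTL_UNIT / 1000.0
    while True:
        if try_send():
            return True
        if clock() >= deadline:
            return False  # budget exhausted: give up on this packet
```

With a TTL of 64 the packet gets at most 0.64 s of retry time at this hop, which is far below the >60 second packet ages mentioned above.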
> >>
> >> --
> >> Rod Grimes                                 rgrimes@freebsd.org
> >> _______________________________________________
> >> Starlink mailing list
> >> Starlink@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/starlink
> >
> > End of Starlink Digest, Vol 4, Issue 21
> > ***************************************

--
Please send any postal/overnight deliveries to:
Vint Cerf
1435 Woodhurst Blvd
McLean, VA 22102
703-448-0965

until further notice