+1 re fixing close to source of error unless applications can deal with
packet loss without retransmission - like real-time speech.

v


On Mon, Jul 12, 2021 at 9:23 PM David P. Reed <dpreed@deepplum.com> wrote:

> > From: David Lang <david@lang.hm>
> >
> > Wifi has the added issue that the blob headers are at a much lower data
> > rate than the data itself, so you can cram a LOT of data into a blob
> > without making a significant difference in the airtime used, so you
> > really do want to be able to send full blobs (not at the cost of delaying
> > transmission if you don't have a full blob, a mistake some people make,
> > but you do want to buffer enough to fill the blobs)
> This happens naturally if the senders in the LAN take turns and each
> transmits what it has accumulated while waiting its turn.
> Capping the total airtime in a cycle limits short message latency, which is
> why small packets are helpful.
>
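A rough sketch of the airtime arithmetic behind the two points above, in case it helps. The overhead and data-rate figures are hypothetical placeholders, not measurements of any particular wifi standard, and the function names are made up for illustration. The model is simply a fixed per-transmission overhead (the low-rate header) plus payload time at the data rate.

```python
def airtime_us(payload_bytes, data_rate_mbps, overhead_us):
    """Airtime of one transmission in microseconds.

    bits / (Mbit/s) gives microseconds, so the payload term needs no scaling.
    """
    return overhead_us + payload_bytes * 8 / data_rate_mbps


def efficiency(payload_bytes, data_rate_mbps, overhead_us):
    """Fraction of the airtime that actually carries payload."""
    payload_us = payload_bytes * 8 / data_rate_mbps
    return payload_us / airtime_us(payload_bytes, data_rate_mbps, overhead_us)


# Hypothetical numbers, chosen only to show the shape of the tradeoff.
OVERHEAD_US = 100.0   # assumed fixed cost of the low-rate header
RATE_MBPS = 400.0     # assumed payload data rate

if __name__ == "__main__":
    for size in (1500, 15000, 60000):
        t = airtime_us(size, RATE_MBPS, OVERHEAD_US)
        e = efficiency(size, RATE_MBPS, OVERHEAD_US)
        print(f"{size:6d} bytes: {t:7.1f} us airtime, {e:.3f} efficiency")
```

Under these assumed numbers a ten times larger blob costs only about three times the airtime, which is the argument for filling blobs. The same function also shows why capping airtime per turn matters: the largest blob holds the medium ten times longer than the smallest, and every waiting station's short message sits behind it.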
> >
> > and given that dropped packets result in timeouts and retransmissions
> > that affect the rest of the network, it's not obviously wrong for a lossy
> > hop like wifi to retry a failed transmission, it just needs to not retry
> > too many times.
> >
> Absolutely right, though not perfect. Local retransmit on a link (or WLAN
> domain) benefits if the link has a high bit-error rate. On the other hand,
> it's better, if you can, to use FEC or erasure coding, or just lower the
> attempted signalling rate, from an information-theoretic point of view. If
> you have an estimator of Bit Error Rate on the link (which gives you a
> packet error rate), there's a reasonable bound on the number of retransmits
> on an individual packet at the link level that doesn't kill end-to-end
> latency. I forget how the formula is derived. It's also important as BER
> increases to use shorter packet frames.
>
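A minimal sketch of the BER-to-retransmit arithmetic described above. This is not the formula referred to in the quoted text, which is not given there. It assumes independent, uniformly distributed bit errors, which is exactly the assumption that fails for the non-random errors caused by software or hardware bugs. All function names are made up for illustration.

```python
def packet_error_rate(ber, frame_bytes):
    """Probability that a frame has at least one bit error,
    assuming independent bit errors at rate `ber`."""
    return 1.0 - (1.0 - ber) ** (8 * frame_bytes)


def expected_transmissions(per):
    """Mean number of attempts until success with unlimited retries."""
    if per >= 1.0:
        raise ValueError("link never succeeds")
    return 1.0 / (1.0 - per)


def attempts_for_residual_loss(per, target_loss, max_attempts=1000):
    """Smallest number of attempts n such that per**n <= target_loss."""
    if per >= 1.0:
        raise ValueError("link never succeeds")
    n, residual = 1, per
    while residual > target_loss and n < max_attempts:
        residual *= per
        n += 1
    return n


if __name__ == "__main__":
    ber = 1e-5  # hypothetical bit error rate
    for frame_bytes in (1500, 300):
        per = packet_error_rate(ber, frame_bytes)
        n = attempts_for_residual_loss(per, 1e-6)
        print(f"{frame_bytes:5d} B frame: PER={per:.4f}, "
              f"mean tx={expected_transmissions(per):.3f}, "
              f"attempts for 1e-6 residual loss={n}")
```

The attempt count times the per-attempt airtime gives the worst-case latency a hop adds, which is what has to stay small relative to the end-to-end budget. The shorter frame has a lower packet error rate at the same BER, which is the reason to shrink frames as BER rises.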
> End to end retransmit is not the optimal way to correct link errors - the
> end-to-end checksum and retransmit in TCP have confused people over the
> years into thinking link reliability can be omitted! That was never the
> reason TCP does end-to-end error checking. People got confused about that.
> As Dave Taht can recount based on discussions with Steve Crocker and me
> (ARPANET and TCP/IP) the point of end-to-end checks is to make sure that
> *overall* the system doesn't introduce errors, including in buffer memory,
> software that doesn't quite work, etc. The TCP retransmission is mostly
> about recovering from packet drops and things like duplicated packets
> resulting from routing changes, etc.
>
> So fix link errors at link level (but remember that retransmit with
> checksum isn't really optimal there - there are better ways if BER is high
> or the error might be because of software or hardware bugs which tend to be
> non-random).
>
>
>
>
> > David Lang
> >
> >
> >   On Sat, 10 Jul 2021, Rodney W. Grimes wrote:
> >
> >> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
> >> From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net>
> >> To: Dave Taht <dave.taht@gmail.com>
> >> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>,
> >>     Sam Kumar <samkumar@cs.berkeley.edu>
> >> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global
> >>     Internet testbed
> >>
> >>> While it is good to have a call to arms, like this:
> >> ...  much information removed as I only want to reply to 1 very
> >>     narrow, but IMHO, very real problem in our networks today ...
> >>
> >>> Here's another piece of pre-history - alohanet - the TTL field was the
> >>> "time to live" field. The intent was that the packet would indicate
> >>> how much time it would be valid before it was discarded. It didn't
> >>> work out, and was replaced by hopcount, which of course switched
> >>> networks ignore and is only semi-useful for detecting loops and the
> >>> like.
> >>
> >> TTL works perfectly fine where the original assumptions hold: that a
> >> device along a network path only hangs on to a packet for a
> >> reasonably short duration, and that there is not some "retry"
> >> mechanism in place that is causing this time to explode.  BSD,
> >> and as far as I can recall, almost ALL original IP stacks had
> >> a Q depth limit of 50 packets on egress interfaces.  Everything
> >> pretty much worked well and the net was happy.  Then these base
> >> assumptions got blasted in the name of "measurable bandwidth" and
> >> the concept that packets are so precious we must not lose them,
> >> at almost any cost.  Linux crammed the per interface Q up to 1000,
> >> wifi decided that it was reasonable to retry at the link layer so
> >> many times that I have seen packets that are >60 seconds old.
> >>
> >> Proposed FIX:  Any device that transmits packets that does not
> >> already have an inherent FIXED transmission time MUST consider
> >> the current TTL of that packet and give up if > 10 ms * TTL elapses
> >> while it is trying to transmit.  AND change the default Q
> >> size in LINUX to 50 for fifo; the codel, etc AQM stuff is fine
> >> at 1000 as it has delay targets that prevent the issue that
> >> initially bumping this to 1000 caused.
> >>
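The proposed rule above can be sketched as a retry loop with a TTL-derived time budget. This is only an illustration under stated assumptions: `transmit_with_ttl_budget`, `try_send`, and the injectable `now` clock are hypothetical names, not part of any real driver or stack, and a real link layer would do this in firmware or the driver rather than in a busy loop.

```python
import time

SECONDS_PER_TTL = 0.010  # the 10 ms per TTL unit from the proposal


def transmit_with_ttl_budget(ttl, try_send, now=time.monotonic):
    """Retry `try_send` until it succeeds or 10 ms * ttl has elapsed.

    `try_send` is a hypothetical callable returning True on success.
    Returns (delivered, attempts). The packet always gets one attempt.
    """
    deadline = now() + SECONDS_PER_TTL * ttl
    attempts = 0
    while True:
        attempts += 1
        if try_send():
            return True, attempts
        if now() >= deadline:
            # Budget exhausted: drop the packet instead of holding it.
            return False, attempts


if __name__ == "__main__":
    # Hypothetical flaky link that succeeds on the third attempt.
    state = {"calls": 0}

    def flaky_send():
        state["calls"] += 1
        return state["calls"] >= 3

    print(transmit_with_ttl_budget(64, flaky_send))
```

Since the IP TTL field is 8 bits, the budget under this rule can never exceed 2.55 seconds, and a packet sent with a TTL of 64 would be held for at most 0.64 seconds at any one hop.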
> >> ... end of Rods Rant ...
> >>
> >> --
> >> Rod Grimes                                      rgrimes@freebsd.org
> >> _______________________________________________
> >> Starlink mailing list
> >> Starlink@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/starlink


-- 
Please send any postal/overnight deliveries to:
Vint Cerf
1435 Woodhurst Blvd
McLean, VA 22102
703-448-0965

until further notice