From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 9F7C63CB37
 for ; Mon, 12 Jul 2021 21:27:55 -0400 (EDT)
Received: by mail-wm1-x32b.google.com with SMTP id k32so9326919wms.4
 for ; Mon, 12 Jul 2021 18:27:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20161025;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc;
 bh=0zij07iZ/IDpZG6T8YiZPKrNGTp5C/jvCuctoafDcX0=;
 b=kV7dnhxXoeCpi7UC+vhsyhrFI1lTPljl55PB+H3dmbXafSCs4nwyTZo6/Jk7M2a1kG
 so+17vg7m2gWtCHzLj8J6Q5nmhWUOF+0TJrKaPpYe2wfE9/u9HAVGEOeR05YdSWSnGpa
 vczLMJug6jlmdiMfhOfvCXq+0IGZHrOfHDuRgvFnqbd/n6LYiCVhASHfNVkF9UoGIzr2
 sLnafm/q1mFNKnO0xlksMmsLs+g2lLUKeqYahC9oo0YLM7zgLqhVA9HVHdQDmalANOri
 B20XTkkKsSh3Qj/21S4Mtg4HKMlcsHWHEQVmi+EnZFajS0YA63kYkeAidrbuFbpelrnH
 gT3w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=0zij07iZ/IDpZG6T8YiZPKrNGTp5C/jvCuctoafDcX0=;
 b=byyk6CNuKWeNVsJkwn+TSD9UFemwU4IKLPOshUN+zlBGmroG4CENcydixcFdLFX7/h
 ij/gn8eltM8NtaaITdJ1/mht0/geXFGnUKPgeTpLaq0MhjTuc/Tcm1reckTBoncGu3lf
 Qvi80NNMj4ALsmCMnX1gO1ewvPOD26N1CWtc0mqAYTl7nx5bePivtqe5IkKckCzmOF6D
 5Nt4jncmD2/7DBD28Enzyl6sm126YI4h6RsodsgtDOwtIk26y3YfZjH1uCLcR2iq80mg
 IO8mwUdEiSgR5XB1Mk14RCXdXePkgORow4fztsFd31iwqLYGYK20aRn8nvHRfYkpEdcM
 BpWQ==
X-Gm-Message-State: AOAM532AUwFulHTKpbTZn8ALq47ycXJ/RUc73yt6/gIzJOxiM6/U+DtO
 7zQ6/NtWduaBJm3Nhwlzbt4f4OKfCJIZ3LUEtsgxg8jx65k=
X-Google-Smtp-Source: ABdhPJyOlxKys2Dd9vP3crcvqsad9ieCLqKtN1glsd88EHe+Mrm5g5Rb0uS0b6+RCMMj8QccrHxqGnhYkG0yHNywzc0=
X-Received: by 2002:a7b:c3d3:: with SMTP id t19mr2001468wmj.156.1626139674295;
 Mon, 12 Jul 2021 18:27:54 -0700 (PDT)
MIME-Version: 1.0
References:
 <1626139405.16213728@apps.rackspace.com>
In-Reply-To: <1626139405.16213728@apps.rackspace.com>
From: Vint Cerf
Date: Mon, 12 Jul 2021 21:27:10 -0400
Message-ID:
To: "David P. Reed"
Cc: starlink@lists.bufferbloat.net
Content-Type: multipart/alternative; boundary="000000000000a4a6f205c6f725cf"
Subject: Re: [Starlink] SatNetLab: A call to arms for the next global
 Internet testbed
X-BeenThere: starlink@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Starlink has bufferbloat. Bad."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 13 Jul 2021 01:27:55 -0000

--000000000000a4a6f205c6f725cf
Content-Type: text/plain; charset="UTF-8"

+1 re fixing close to source of error unless applications can deal with
packet loss without retransmission - like real-time speech.

v

On Mon, Jul 12, 2021 at 9:23 PM David P. Reed wrote:

> > From: David Lang
> >
> > Wifi has the added issue that the blob headers are at a much lower data rate
> > than the data itself, so you can cram a LOT of data into a blob without making a
> > significant difference in the airtime used, so you really do want to be able to
> > send full blobs (not at the cost of delaying transmission if you don't have a
> > full blob, a mistake some people make, but you do want to buffer enough to fill
> > the blobs)
> This happens naturally if the senders in the LAN take turns and transmit
> what they have accumulated while waiting their turn. Capping the total
> airtime in a cycle limits short message latency, which is why small
> packets are helpful.
>
> >
> > and given that dropped packets result in timeouts and retransmissions that
> > affect the rest of the network, it's not obviously wrong for a lossy hop like
> > wifi to retry a failed transmission, it just needs to not retry too many times.
> >
> Absolutely right, though not perfect. Local retransmit on a link (or WLAN
> domain) benefits if the link has a high bit-error rate. On the other hand,
> it's better, if you can, to use FEC, or erasure coding, or just lower the
> attempted signalling rate, from an information theoretic point of view. If
> you have an estimator of Bit Error Rate on the link (which gives you a
> packet error rate), there's a reasonable bound on the number of retransmits
> on an individual packet at the link level that doesn't kill end-to-end
> latency. I forget how the formula is derived. It's also important as BER
> increases to use shorter packet frames.
>
> End to end retransmit is not the optimal way to correct link errors - the
> end-to-end checksum and retransmit in TCP has confused people over the
> years into thinking link reliability can be omitted! That was never the
> reason TCP does end-to-end error checking. People got confused about that.
> As Dave Taht can recount based on discussions with Steve Crocker and me
> (ARPANET and TCP/IP) the point of end-to-end checks is to make sure that
> *overall* the system doesn't introduce errors, including in buffer memory,
> software that doesn't quite work, etc. The TCP retransmission is mostly
> about recovering from packet drops and things like duplicated packets
> resulting from routing changes, etc.
>
> So fix link errors at link level (but remember that retransmit with
> checksum isn't really optimal there - there are better ways if BER is high
> or the error might be because of software or hardware bugs which tend to be
> non-random).
>
>
> > David Lang
> >
> >
> >   On Sat, 10 Jul 2021, Rodney W. Grimes wrote:
> >
> >> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
> >> From: Rodney W. Grimes
> >> To: Dave Taht
> >> Cc: starlink@lists.bufferbloat.net, Ankit Singla ,
> >>     Sam Kumar
> >> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet
> >>      testbed
> >>
> >>> While it is good to have a call to arms, like this:
> >> ...  much information removed as I only want to reply to 1 very
> >>     narrow, but IMHO, very real problem in our networks today ...
> >>
> >>> Here's another piece of pre-history - alohanet - the TTL field was the
> >>> "time to live" field. The intent was that the packet would indicate
> >>> how much time it would be valid before it was discarded. It didn't
> >>> work out, and was replaced by hopcount, which of course switched
> >>> networks ignore and is only semi-useful for detecting loops and the
> >>> like.
> >>
> >> TTL works perfectly fine where the original assumptions hold: that a
> >> device along a network path only hangs on to a packet for a
> >> reasonably short duration, and that there is not some "retry"
> >> mechanism in place that is causing this time to explode.  BSD,
> >> and as far as I can recall, almost ALL original IP stacks had
> >> a Q depth limit of 50 packets on egress interfaces.  Everything
> >> pretty much worked well and the net was happy.  Then these base
> >> assumptions got blasted in the name of "measurable bandwidth" and
> >> the concept that packets are so precious we must not lose them,
> >> at almost any cost.  Linux crammed the per interface Q up to 1000,
> >> wifi decided that it was reasonable to retry at the link layer so
> >> many times that I have seen packets that are >60 seconds old.
> >>
> >> Proposed FIX:  Any device that transmits packets that does not
> >> already have an inherent FIXED transmission time MUST consider
> >> the current TTL of that packet and give up if > 10 ms * TTL elapses
> >> while it is trying to transmit.  AND change the default of Q
> >> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine
> >> at 1000 as it has delay targets that prevent the issue that
> >> initially bumping this to 1000 caused.
> >>
> >> ... end of Rod's Rant ...
> >>
> >> --
> >> Rod Grimes                                                 rgrimes@freebsd.org
> >> _______________________________________________
> >> Starlink mailing list
> >> Starlink@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/starlink
> >
> >
> > ------------------------------
> >
> > Subject: Digest Footer
> >
> > _______________________________________________
> > Starlink mailing list
> > Starlink@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/starlink
> >
> >
> > ------------------------------
> >
> > End of Starlink Digest, Vol 4, Issue 21
> > ***************************************
> >
>
> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
>

--
Please send any postal/overnight deliveries to:
Vint Cerf
1435 Woodhurst Blvd
McLean, VA 22102
703-448-0965

until further notice

--000000000000a4a6f205c6f725cf
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
+1 re fixing close to source of error unless applications can deal with packet loss without retransmission - like real-time speech.

v


On Mon, Jul 12, 2021 at 9:23 PM David P. Reed <dpreed@deepplum.com> wrote:
> From: David Lang <david@lang.hm>
>
> Wifi has the added issue that the blob headers are at a much lower data rate
> than the data itself, so you can cram a LOT of data into a blob without making a
> significant difference in the airtime used, so you really do want to be able to
> send full blobs (not at the cost of delaying transmission if you don't have a
> full blob, a mistake some people make, but you do want to buffer enough to fill
> the blobs)
This happens naturally if the senders in the LAN take turns and transmit what they have accumulated while waiting their turn. Capping the total airtime in a cycle limits short message latency, which is why small packets are helpful.

>
> and given that dropped packets result in timeouts and retransmissions that
> affect the rest of the network, it's not obviously wrong for a lossy hop like
> wifi to retry a failed transmission, it just needs to not retry too many times.
>
Absolutely right, though not perfect. Local retransmit on a link (or WLAN domain) benefits if the link has a high bit-error rate. On the other hand, it's better, if you can, to use FEC, or erasure coding, or just lower the attempted signalling rate, from an information theoretic point of view. If you have an estimator of Bit Error Rate on the link (which gives you a packet error rate), there's a reasonable bound on the number of retransmits on an individual packet at the link level that doesn't kill end-to-end latency. I forget how the formula is derived. It's also important as BER increases to use shorter packet frames.
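
One simple way to reason about such a bound is sketched below. It is not necessarily the derivation referred to above: it assumes independent bit errors, so a frame of n bits fails with probability 1 - (1 - BER)^n, and it treats each retry as an independent attempt. The function names and the "residual loss" target are illustrative choices, not part of the thread.

```python
def packet_error_rate(ber: float, frame_bits: int) -> float:
    """Probability that at least one bit of a frame is corrupted.

    Assumes independent bit errors, which real links only approximate.
    """
    return 1.0 - (1.0 - ber) ** frame_bits


def retransmit_limit(per: float, residual_loss: float) -> int:
    """Smallest retry count r such that all r + 1 attempts fail with
    probability per ** (r + 1) <= residual_loss."""
    if not 0.0 <= per < 1.0:
        raise ValueError("per must be in [0, 1)")
    if residual_loss <= 0.0:
        raise ValueError("residual_loss must be positive")
    retries = 0
    all_fail = per  # probability that the first attempt fails
    while all_fail > residual_loss:
        all_fail *= per  # one more independent attempt
        retries += 1
    return retries


def expected_attempts(per: float) -> float:
    """Mean transmissions per delivered frame if retries were unlimited."""
    return 1.0 / (1.0 - per)


if __name__ == "__main__":
    ber = 1e-4  # hypothetical link estimate
    for frame_bits in (12000, 1200):
        per = packet_error_rate(ber, frame_bits)
        print(frame_bits, round(per, 3), retransmit_limit(per, 1e-3))
```

At the same BER, a shorter frame has a lower packet error rate and so needs fewer retries to reach the same residual loss, which is consistent with the advice to shorten frames as BER rises.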

End to end retransmit is not the optimal way to correct link errors - the end-to-end checksum and retransmit in TCP has confused people over the years into thinking link reliability can be omitted! That was never the reason TCP does end-to-end error checking. People got confused about that. As Dave Taht can recount based on discussions with Steve Crocker and me (ARPANET and TCP/IP) the point of end-to-end checks is to make sure that *overall* the system doesn't introduce errors, including in buffer memory, software that doesn't quite work, etc. The TCP retransmission is mostly about recovering from packet drops and things like duplicated packets resulting from routing changes, etc.

So fix link errors at link level (but remember that retransmit with checksum isn't really optimal there - there are better ways if BER is high or the error might be because of software or hardware bugs which tend to be non-random).




> David Lang
>
>
>   On Sat, 10 Jul 2021, Rodney W. Grimes wrote:
>
>> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
>> From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net>
>> To: Dave Taht <dave.taht@gmail.com>
>> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>,
>>     Sam Kumar <samkumar@cs.berkeley.edu>
>> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet
>>      testbed
>>
>>> While it is good to have a call to arms, like this:
>> ...  much information removed as I only want to reply to 1 very
>>     narrow, but IMHO, very real problem in our networks today ...
>>
>>> Here's another piece of pre-history - alohanet - the TTL field was the
>>> "time to live" field. The intent was that the packet would indicate
>>> how much time it would be valid before it was discarded. It didn't
>>> work out, and was replaced by hopcount, which of course switched
>>> networks ignore and is only semi-useful for detecting loops and the
>>> like.
>>
>> TTL works perfectly fine where the original assumptions hold: that a
>> device along a network path only hangs on to a packet for a
>> reasonably short duration, and that there is not some "retry"
>> mechanism in place that is causing this time to explode.  BSD,
>> and as far as I can recall, almost ALL original IP stacks had
>> a Q depth limit of 50 packets on egress interfaces.  Everything
>> pretty much worked well and the net was happy.  Then these base
>> assumptions got blasted in the name of "measurable bandwidth" and
>> the concept that packets are so precious we must not lose them,
>> at almost any cost.  Linux crammed the per interface Q up to 1000,
>> wifi decided that it was reasonable to retry at the link layer so
>> many times that I have seen packets that are >60 seconds old.
>>
>> Proposed FIX:  Any device that transmits packets that does not
>> already have an inherent FIXED transmission time MUST consider
>> the current TTL of that packet and give up if > 10 ms * TTL elapses
>> while it is trying to transmit.  AND change the default of Q
>> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine
>> at 1000 as it has delay targets that prevent the issue that
>> initially bumping this to 1000 caused.
>>
>> ... end of Rod's Rant ...
>>
>> --
>> Rod Grimes                                                 rgrimes@freebsd.org
>> _______________________________________________
>> Starlink mailing list
>> Starlink@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/starlink
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
>
>
> ------------------------------
>
> End of Starlink Digest, Vol 4, Issue 21
> ***************************************
>
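
The "Proposed FIX" quoted above (give up on a packet once more than 10 ms * TTL has elapsed, and cap a plain FIFO at 50 packets) can be sketched roughly as follows. This is only an illustration of the rule as stated in the quote. The class and function names are hypothetical, and measuring elapsed time from the moment the packet was queued on this device is an assumption, since the proposal only says "while it is trying to transmit".

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

MS_PER_TTL = 10   # from the quoted proposal: budget of 10 ms per TTL unit
FIFO_LIMIT = 50   # from the quoted proposal: FIFO depth of 50 packets


@dataclass
class Packet:
    ttl: int            # IP TTL / hop limit seen when the packet was queued
    enqueued_at: float  # seconds on a monotonic clock (assumed start of the budget)
    payload: bytes = b""


def transmit_budget_s(ttl: int) -> float:
    """Time this hop may spend on the packet, in seconds."""
    return ttl * MS_PER_TTL / 1000.0


def should_give_up(pkt: Packet, now: float) -> bool:
    """True once the packet has waited longer than 10 ms * TTL."""
    return (now - pkt.enqueued_at) > transmit_budget_s(pkt.ttl)


class BoundedFifo:
    """Tail-drop FIFO that also discards packets whose budget has expired."""

    def __init__(self, limit: int = FIFO_LIMIT) -> None:
        self.limit = limit
        self.queue: deque = deque()

    def enqueue(self, pkt: Packet) -> bool:
        if len(self.queue) >= self.limit:
            return False  # queue full: drop instead of adding delay
        self.queue.append(pkt)
        return True

    def dequeue(self, now: float) -> Optional[Packet]:
        """Return the next packet still within budget, skipping stale ones."""
        while self.queue:
            pkt = self.queue.popleft()
            if not should_give_up(pkt, now):
                return pkt
        return None
```

Under this reading a packet arriving with TTL 64 may be held for at most 0.64 s at one hop, far below the >60 second packet ages mentioned in the quote.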


_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink


--
Please send any postal/overnight deliveries to:
Vint Cerf
1435 Woodhurst Blvd
McLean, VA 22102
703-448-0965

until further notice

--000000000000a4a6f205c6f725cf--