* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed [not found] <mailman.5.1626019201.21244.starlink@lists.bufferbloat.net> @ 2021-07-13 1:23 ` David P. Reed 2021-07-13 1:27 ` Vint Cerf 2021-07-13 1:57 ` David Lang 0 siblings, 2 replies; 15+ messages in thread From: David P. Reed @ 2021-07-13 1:23 UTC (permalink / raw) To: starlink; +Cc: starlink > From: David Lang <david@lang.hm> > > Wifi has the added issue that the blob headers are at a much lower data rate > than the dta itself, so you can cram a LOT of data into a blob without making a > significant difference in the airtime used, so you really do want to be able to > send full blobs (not at the cost of delaying tranmission if you don't have a > full blob, a mistake some people make, but you do want to buffer enough to fill > the blobs) This happens naturally if the senders in the LAN take turns and transmit what they have accumulated while waiting their turn, fairly naturally. Capping the total airtime in a cycle limits short message latency, which is why small packets are helpful. > > and given that dropped packets results in timeouts and retransmissions that > affect the rest of the network, it's not obviously wrong for a lossy hop like > wifi to retry a failed transmission, it just needs to not retry too many times. > Absolutely right, though not perfect. local retransmit on a link (or WLAN domain) benefits if the link has a high bit-error rate. On the other hand, it's better if you can to use FEC, or erasure coding or just lower the attempted signalling rate, from an information theoretic point of view. If you have an estimator of Bit Error Rate on the link (which gives you a packet error rate), there's a reasonable bound on the number of retransmits on an individual packet at the link level that doesn't kill end-to-end latency. I forget how the formula is derived. It's also important as BER increases to use shorter packet frames. End to end retransmit is not the optimal way to correct link errors - the end-to-end checksum and retransmit in TCP has confused people over the years into thinking link reliability can be omitted! That was never the reason TCP does end-to-end error checking. People got confused about that. As Dave Taht can recount based on discussions with Steve Crocker and me (ARPANET and TCP/IP) the point of end-to-end checks is to make sure that *overall* the system doesn't introduce errors, including in buffer memory, software that doesn't quite work, etc. The TCP retransmission is mostly about recovering from packet drops and things like duplicated packets resulting from routing changes, etc. So fix link errors at link level (but remember that retransmit with checksum isn't really optimal there - there are better ways if BER is high or the error might be because of software or hardware bugs which tend to be non-random). > David Lang > > > On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > >> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) >> From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> >> To: Dave Taht <dave.taht@gmail.com> >> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, >> Sam Kumar <samkumar@cs.berkeley.edu> >> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet >> testbed >> >>> While it is good to have a call to arms, like this: >> ... much information removed as I only one to reply to 1 very >> narrow, but IMHO, very real problem in our networks today ... 
>> >>> Here's another piece of pre-history - alohanet - the TTL field was the >>> "time to live" field. The intent was that the packet would indicate >>> how much time it would be valid before it was discarded. It didn't >>> work out, and was replaced by hopcount, which of course switched >>> networks ignore and isonly semi-useful for detecting loops and the >>> like. >> >> TTL works perfectly fine where the original assumptions that a >> device along a network path only hangs on to a packet for a >> reasonable short duration, and that there is not some "retry" >> mechanism in place that is causing this time to explode. BSD, >> and as far as I can recall, almost ALL original IP stacks had >> a Q depth limit of 50 packets on egress interfaces. Everything >> pretty much worked well and the net was happy. Then these base >> assumptions got blasted in the name of "measurable bandwidth" and >> the concept of packets are so precious we must not loose them, >> at almost any cost. Linux crammed the per interface Q up to 1000, >> wifi decided that it was reasable to retry at the link layer so >> many times that I have seen packets that are >60 seconds old. >> >> Proposed FIX: Any device that transmits packets that does not >> already have an inherit FIXED transmission time MUST consider >> the current TTL of that packet and give up if > 10mS * TTL elapses >> while it is trying to transmit. AND change the default if Q >> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine >> at 1000 as it has delay targets that present the issue that >> initially bumping this to 1000 caused. >> >> ... end of Rods Rant ... >> >> -- >> Rod Grimes rgrimes@freebsd.org >> _______________________________________________ >> Starlink mailing list >> Starlink@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/starlink > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink > > > ------------------------------ > > End of Starlink Digest, Vol 4, Issue 21 > *************************************** > ^ permalink raw reply [flat|nested] 15+ messages in thread
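A back-of-the-envelope version of the retry bound David Reed describes above, assuming independent bit errors (an idealization): a BER estimate gives a frame error rate, and the retry cap falls out of either the latency budget you will spend on one frame or the residual loss you are willing to hand to the end-to-end path. Every constant below is illustrative, not from any real driver.

    # Link-level retry budget from a BER estimate (independent-error model;
    # all constants are illustrative).
    import math

    def frame_error_rate(ber, frame_bits):
        """Probability that a single transmission of the frame is corrupted."""
        return 1.0 - (1.0 - ber) ** frame_bits

    def retry_limit(ber, frame_bits, per_attempt_airtime_s, latency_budget_s,
                    residual_loss_target=1e-3):
        """Retry cap: small enough to respect the latency budget, large enough
        to push residual frame loss below the target (whichever binds first)."""
        fer = frame_error_rate(ber, frame_bits)
        max_by_latency = max(1, int(latency_budget_s // per_attempt_airtime_s))
        if fer <= 0.0:
            return 1, fer
        need_for_loss = max(1, math.ceil(math.log(residual_loss_target) / math.log(fer)))
        return min(max_by_latency, need_for_loss), fer

    # e.g. BER 1e-5, 1500-byte frame, ~1 ms per attempt, 20 ms of latency to spend
    limit, fer = retry_limit(1e-5, 1500 * 8, 1e-3, 20e-3)
    print(f"frame error rate ~{fer:.2f}; cap link retries at {limit}")

The same formula also shows why shorter frames help as BER rises: frame_bits drives the per-attempt error rate exponentially.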
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 1:23 ` [Starlink] SatNetLab: A call to arms for the next global> Internet testbed David P. Reed @ 2021-07-13 1:27 ` Vint Cerf 2021-07-13 1:57 ` David Lang 1 sibling, 0 replies; 15+ messages in thread From: Vint Cerf @ 2021-07-13 1:27 UTC (permalink / raw) To: David P. Reed; +Cc: starlink [-- Attachment #1: Type: text/plain, Size: 6072 bytes --] +1 re fixing close to source of error unless applications can deal with packet loss without retransmission - like real-time speech. v On Mon, Jul 12, 2021 at 9:23 PM David P. Reed <dpreed@deepplum.com> wrote: > > From: David Lang <david@lang.hm> > > > > Wifi has the added issue that the blob headers are at a much lower data > rate > > than the dta itself, so you can cram a LOT of data into a blob without > making a > > significant difference in the airtime used, so you really do want to be > able to > > send full blobs (not at the cost of delaying tranmission if you don't > have a > > full blob, a mistake some people make, but you do want to buffer enough > to fill > > the blobs) > This happens naturally if the senders in the LAN take turns and transmit > what they have accumulated while waiting their turn, fairly naturally. > Capping the total airtime in a cycle limits short message latency, which is > why small packets are helpful. > > > > > and given that dropped packets results in timeouts and retransmissions > that > > affect the rest of the network, it's not obviously wrong for a lossy hop > like > > wifi to retry a failed transmission, it just needs to not retry too many > times. > > > Absolutely right, though not perfect. local retransmit on a link (or WLAN > domain) benefits if the link has a high bit-error rate. On the other hand, > it's better if you can to use FEC, or erasure coding or just lower the > attempted signalling rate, from an information theoretic point of view. If > you have an estimator of Bit Error Rate on the link (which gives you a > packet error rate), there's a reasonable bound on the number of retransmits > on an individual packet at the link level that doesn't kill end-to-end > latency. I forget how the formula is derived. It's also important as BER > increases to use shorter packet frames. > > End to end retransmit is not the optimal way to correct link errors - the > end-to-end checksum and retransmit in TCP has confused people over the > years into thinking link reliability can be omitted! That was never the > reason TCP does end-to-end error checking. People got confused about that. > As Dave Taht can recount based on discussions with Steve Crocker and me > (ARPANET and TCP/IP) the point of end-to-end checks is to make sure that > *overall* the system doesn't introduce errors, including in buffer memory, > software that doesn't quite work, etc. The TCP retransmission is mostly > about recovering from packet drops and things like duplicated packets > resulting from routing changes, etc. > > So fix link errors at link level (but remember that retransmit with > checksum isn't really optimal there - there are better ways if BER is high > or the error might be because of software or hardware bugs which tend to be > non-random). > > > > > > David Lang > > > > > > On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > > > >> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) > >> From: Rodney W. 
Grimes <starlink@gndrsh.dnsmgr.net> > >> To: Dave Taht <dave.taht@gmail.com> > >> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, > >> Sam Kumar <samkumar@cs.berkeley.edu> > >> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global > Internet > >> testbed > >> > >>> While it is good to have a call to arms, like this: > >> ... much information removed as I only one to reply to 1 very > >> narrow, but IMHO, very real problem in our networks today ... > >> > >>> Here's another piece of pre-history - alohanet - the TTL field was the > >>> "time to live" field. The intent was that the packet would indicate > >>> how much time it would be valid before it was discarded. It didn't > >>> work out, and was replaced by hopcount, which of course switched > >>> networks ignore and isonly semi-useful for detecting loops and the > >>> like. > >> > >> TTL works perfectly fine where the original assumptions that a > >> device along a network path only hangs on to a packet for a > >> reasonable short duration, and that there is not some "retry" > >> mechanism in place that is causing this time to explode. BSD, > >> and as far as I can recall, almost ALL original IP stacks had > >> a Q depth limit of 50 packets on egress interfaces. Everything > >> pretty much worked well and the net was happy. Then these base > >> assumptions got blasted in the name of "measurable bandwidth" and > >> the concept of packets are so precious we must not loose them, > >> at almost any cost. Linux crammed the per interface Q up to 1000, > >> wifi decided that it was reasable to retry at the link layer so > >> many times that I have seen packets that are >60 seconds old. > >> > >> Proposed FIX: Any device that transmits packets that does not > >> already have an inherit FIXED transmission time MUST consider > >> the current TTL of that packet and give up if > 10mS * TTL elapses > >> while it is trying to transmit. AND change the default if Q > >> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine > >> at 1000 as it has delay targets that present the issue that > >> initially bumping this to 1000 caused. > >> > >> ... end of Rods Rant ... > >> > >> -- > >> Rod Grimes > rgrimes@freebsd.org > >> _______________________________________________ > >> Starlink mailing list > >> Starlink@lists.bufferbloat.net > >> https://lists.bufferbloat.net/listinfo/starlink > > > > > > ------------------------------ > > > > Subject: Digest Footer > > > > _______________________________________________ > > Starlink mailing list > > Starlink@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/starlink > > > > > > ------------------------------ > > > > End of Starlink Digest, Vol 4, Issue 21 > > *************************************** > > > > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink > -- Please send any postal/overnight deliveries to: Vint Cerf 1435 Woodhurst Blvd McLean, VA 22102 703-448-0965 until further notice [-- Attachment #2: Type: text/html, Size: 8383 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 1:23 ` [Starlink] SatNetLab: A call to arms for the next global> Internet testbed David P. Reed 2021-07-13 1:27 ` Vint Cerf @ 2021-07-13 1:57 ` David Lang 2021-07-13 12:39 ` Rodney W. Grimes 1 sibling, 1 reply; 15+ messages in thread From: David Lang @ 2021-07-13 1:57 UTC (permalink / raw) To: David P. Reed; +Cc: starlink On Mon, 12 Jul 2021, David P. Reed wrote: >> From: David Lang <david@lang.hm> >> >> Wifi has the added issue that the blob headers are at a much lower data rate >> than the dta itself, so you can cram a LOT of data into a blob without making a >> significant difference in the airtime used, so you really do want to be able to >> send full blobs (not at the cost of delaying tranmission if you don't have a >> full blob, a mistake some people make, but you do want to buffer enough to fill >> the blobs) > This happens naturally if the senders in the LAN take turns and transmit what they have accumulated while waiting their turn, fairly naturally. Capping the total airtime in a cycle limits short message latency, which is why small packets are helpful. I was thinking in terms of the downstream (from Internet) side, the senders there have no idea about wifi or timeslots, they are sending from several hops away from the bottleneck >> and given that dropped packets results in timeouts and retransmissions that >> affect the rest of the network, it's not obviously wrong for a lossy hop like >> wifi to retry a failed transmission, it just needs to not retry too many times. >> > Absolutely right, though not perfect. local retransmit on a link (or WLAN > domain) benefits if the link has a high bit-error rate. On the other hand, > it's better if you can to use FEC, or erasure coding or just lower the > attempted signalling rate, from an information theoretic point of view. If you > have an estimator of Bit Error Rate on the link (which gives you a packet > error rate), there's a reasonable bound on the number of retransmits on an > individual packet at the link level that doesn't kill end-to-end latency. I > forget how the formula is derived. It's also important as BER increases to use > shorter packet frames. FEC works if the problem is a bit error rate, but on radio links you have a hidden transmitter problem. When another station that can't hear you starts transmitting that is a bit more powerful than you (as far as the receiver is concerned), you don't just lose a few bits, you lose the entire transmission (or a large chunk of it). lowering the bit rate when the problem is interference from other stations actually makes the problem worse as you make it more likely that you will be stepped on (one of the reasons why wifi performance falls off a cliff as it gets congested) David Lang > End to end retransmit is not the optimal way to correct link errors - the > end-to-end checksum and retransmit in TCP has confused people over the years > into thinking link reliability can be omitted! That was never the reason TCP > does end-to-end error checking. People got confused about that. As Dave Taht > can recount based on discussions with Steve Crocker and me (ARPANET and > TCP/IP) the point of end-to-end checks is to make sure that *overall* the > system doesn't introduce errors, including in buffer memory, software that > doesn't quite work, etc. The TCP retransmission is mostly about recovering > from packet drops and things like duplicated packets resulting from routing > changes, etc. 
> > So fix link errors at link level (but remember that retransmit with checksum > isn't really optimal there - there are better ways if BER is high or the error > might be because of software or hardware bugs which tend to be non-random). > > > > >> David Lang >> >> >> On Sat, 10 Jul 2021, Rodney W. Grimes wrote: >> >>> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) >>> From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> >>> To: Dave Taht <dave.taht@gmail.com> >>> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, >>> Sam Kumar <samkumar@cs.berkeley.edu> >>> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet >>> testbed >>> >>>> While it is good to have a call to arms, like this: >>> ... much information removed as I only one to reply to 1 very >>> narrow, but IMHO, very real problem in our networks today ... >>> >>>> Here's another piece of pre-history - alohanet - the TTL field was the >>>> "time to live" field. The intent was that the packet would indicate >>>> how much time it would be valid before it was discarded. It didn't >>>> work out, and was replaced by hopcount, which of course switched >>>> networks ignore and isonly semi-useful for detecting loops and the >>>> like. >>> >>> TTL works perfectly fine where the original assumptions that a >>> device along a network path only hangs on to a packet for a >>> reasonable short duration, and that there is not some "retry" >>> mechanism in place that is causing this time to explode. BSD, >>> and as far as I can recall, almost ALL original IP stacks had >>> a Q depth limit of 50 packets on egress interfaces. Everything >>> pretty much worked well and the net was happy. Then these base >>> assumptions got blasted in the name of "measurable bandwidth" and >>> the concept of packets are so precious we must not loose them, >>> at almost any cost. Linux crammed the per interface Q up to 1000, >>> wifi decided that it was reasable to retry at the link layer so >>> many times that I have seen packets that are >60 seconds old. >>> >>> Proposed FIX: Any device that transmits packets that does not >>> already have an inherit FIXED transmission time MUST consider >>> the current TTL of that packet and give up if > 10mS * TTL elapses >>> while it is trying to transmit. AND change the default if Q >>> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine >>> at 1000 as it has delay targets that present the issue that >>> initially bumping this to 1000 caused. >>> >>> ... end of Rods Rant ... >>> >>> -- >>> Rod Grimes rgrimes@freebsd.org >>> _______________________________________________ >>> Starlink mailing list >>> Starlink@lists.bufferbloat.net >>> https://lists.bufferbloat.net/listinfo/starlink >> >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> Starlink mailing list >> Starlink@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/starlink >> >> >> ------------------------------ >> >> End of Starlink Digest, Vol 4, Issue 21 >> *************************************** >> > > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink ^ permalink raw reply [flat|nested] 15+ messages in thread
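The "slower makes it worse" point in David Lang's message can be made concrete with a toy collision model: if hidden-node transmissions arrive at random while our frame is in the air, the chance of being stepped on grows with the frame's airtime, so dropping the PHY rate stretches the exposure window. Rates and arrival counts below are invented purely for illustration.

    # Toy hidden-transmitter model: Poisson arrivals from a station we cannot
    # hear; our frame is lost if one overlaps it. Illustrative numbers only.
    import math

    def airtime_s(frame_bytes, phy_mbps, fixed_overhead_us=100.0):
        return fixed_overhead_us * 1e-6 + frame_bytes * 8 / (phy_mbps * 1e6)

    def p_stepped_on(frame_bytes, phy_mbps, hidden_frames_per_s, hidden_airtime_s=0.001):
        # Vulnerable for our own airtime plus the tail of a hidden frame that
        # may already be in flight when we start.
        window = airtime_s(frame_bytes, phy_mbps) + hidden_airtime_s
        return 1.0 - math.exp(-hidden_frames_per_s * window)

    for rate in (54, 24, 6):   # slowing down lengthens the exposure window
        print(f"{rate:>2} Mb/s: {p_stepped_on(1500, rate, hidden_frames_per_s=50):.3f}")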
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 1:57 ` David Lang @ 2021-07-13 12:39 ` Rodney W. Grimes 2021-07-13 18:01 ` David Lang 0 siblings, 1 reply; 15+ messages in thread From: Rodney W. Grimes @ 2021-07-13 12:39 UTC (permalink / raw) To: David Lang; +Cc: David P. Reed, starlink > On Mon, 12 Jul 2021, David P. Reed wrote: > > >> From: David Lang <david@lang.hm> > >> > >> Wifi has the added issue that the blob headers are at a much lower data rate > >> than the dta itself, so you can cram a LOT of data into a blob without making a > >> significant difference in the airtime used, so you really do want to be able to > >> send full blobs (not at the cost of delaying tranmission if you don't have a > >> full blob, a mistake some people make, but you do want to buffer enough to fill > >> the blobs) > > This happens naturally if the senders in the LAN take turns and transmit what they have accumulated while waiting their turn, fairly naturally. Capping the total airtime in a cycle limits short message latency, which is why small packets are helpful. > > I was thinking in terms of the downstream (from Internet) side, the senders > there have no idea about wifi or timeslots, they are sending from several hops > away from the bottleneck > > >> and given that dropped packets results in timeouts and retransmissions that > >> affect the rest of the network, it's not obviously wrong for a lossy hop like > >> wifi to retry a failed transmission, it just needs to not retry too many times. > >> > > Absolutely right, though not perfect. local retransmit on a link (or WLAN > > domain) benefits if the link has a high bit-error rate. On the other hand, > > it's better if you can to use FEC, or erasure coding or just lower the > > attempted signalling rate, from an information theoretic point of view. If you > > have an estimator of Bit Error Rate on the link (which gives you a packet > > error rate), there's a reasonable bound on the number of retransmits on an > > individual packet at the link level that doesn't kill end-to-end latency. I > > forget how the formula is derived. It's also important as BER increases to use > > shorter packet frames. > > FEC works if the problem is a bit error rate, but on radio links you have a > hidden transmitter problem. When another station that can't hear you starts > transmitting that is a bit more powerful than you (as far as the receiver is > concerned), you don't just lose a few bits, you lose the entire transmission (or > a large chunk of it). > > lowering the bit rate when the problem is interference from other stations > actually makes the problem worse as you make it more likely that you will be > stepped on (one of the reasons why wifi performance falls off a cliff as it gets > congested) > > David Lang > It wasnt suggested "lowering the bit rate", it was suggested to make the packets smaller, which actually does address the hidden transmitter problem to some degree as it *would* reduce your air time occupancy, but the damn wifi LL aggregation gets in your way cause it blows them back up. When I am having to deal/use wifi in a hidden transmitter prone situation I always crank down the Fragmentation Threshold setting from the default of 2346 bytes to the often the minimum of 256 with good results. Rod Grimes > > End to end retransmit is not the optimal way to correct link errors - the > > end-to-end checksum and retransmit in TCP has confused people over the years > > into thinking link reliability can be omitted! 
That was never the reason TCP > > does end-to-end error checking. People got confused about that. As Dave Taht > > can recount based on discussions with Steve Crocker and me (ARPANET and > > TCP/IP) the point of end-to-end checks is to make sure that *overall* the > > system doesn't introduce errors, including in buffer memory, software that > > doesn't quite work, etc. The TCP retransmission is mostly about recovering > > from packet drops and things like duplicated packets resulting from routing > > changes, etc. > > > > So fix link errors at link level (but remember that retransmit with checksum > > isn't really optimal there - there are better ways if BER is high or the error > > might be because of software or hardware bugs which tend to be non-random). > > > > > > > > > >> David Lang > >> > >> > >> On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > >> > >>> Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) > >>> From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> > >>> To: Dave Taht <dave.taht@gmail.com> > >>> Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, > >>> Sam Kumar <samkumar@cs.berkeley.edu> > >>> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet > >>> testbed > >>> > >>>> While it is good to have a call to arms, like this: > >>> ... much information removed as I only one to reply to 1 very > >>> narrow, but IMHO, very real problem in our networks today ... > >>> > >>>> Here's another piece of pre-history - alohanet - the TTL field was the > >>>> "time to live" field. The intent was that the packet would indicate > >>>> how much time it would be valid before it was discarded. It didn't > >>>> work out, and was replaced by hopcount, which of course switched > >>>> networks ignore and isonly semi-useful for detecting loops and the > >>>> like. > >>> > >>> TTL works perfectly fine where the original assumptions that a > >>> device along a network path only hangs on to a packet for a > >>> reasonable short duration, and that there is not some "retry" > >>> mechanism in place that is causing this time to explode. BSD, > >>> and as far as I can recall, almost ALL original IP stacks had > >>> a Q depth limit of 50 packets on egress interfaces. Everything > >>> pretty much worked well and the net was happy. Then these base > >>> assumptions got blasted in the name of "measurable bandwidth" and > >>> the concept of packets are so precious we must not loose them, > >>> at almost any cost. Linux crammed the per interface Q up to 1000, > >>> wifi decided that it was reasable to retry at the link layer so > >>> many times that I have seen packets that are >60 seconds old. > >>> > >>> Proposed FIX: Any device that transmits packets that does not > >>> already have an inherit FIXED transmission time MUST consider > >>> the current TTL of that packet and give up if > 10mS * TTL elapses > >>> while it is trying to transmit. AND change the default if Q > >>> size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine > >>> at 1000 as it has delay targets that present the issue that > >>> initially bumping this to 1000 caused. > >>> > >>> ... end of Rods Rant ... 
> >>> > >>> -- > >>> Rod Grimes rgrimes@freebsd.org > >>> _______________________________________________ > >>> Starlink mailing list > >>> Starlink@lists.bufferbloat.net > >>> https://lists.bufferbloat.net/listinfo/starlink > >> > >> > >> ------------------------------ > >> > >> Subject: Digest Footer > >> > >> _______________________________________________ > >> Starlink mailing list > >> Starlink@lists.bufferbloat.net > >> https://lists.bufferbloat.net/listinfo/starlink > >> > >> > >> ------------------------------ > >> > >> End of Starlink Digest, Vol 4, Issue 21 > >> *************************************** > >> > > > > > > _______________________________________________ > > Starlink mailing list > > Starlink@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/starlink > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink > > ^ permalink raw reply [flat|nested] 15+ messages in thread
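Whether shrinking frames actually wins depends on how lossy the air is; there is a classic tradeoff between per-frame overhead (which argues for large frames) and per-bit exposure (which argues for small ones). A quick sketch under a flat BER model with invented overhead numbers; aggregation, as discussed above, shifts the balance back toward large transmissions.

    # Expected goodput fraction vs. frame size under a flat bit-error-rate model.
    # overhead_bytes stands in for PHY/MAC framing; all values illustrative.
    def efficiency(payload_bytes, ber, overhead_bytes=60):
        total_bits = (payload_bytes + overhead_bytes) * 8
        p_intact = (1.0 - ber) ** total_bits
        return p_intact * payload_bytes / (payload_bytes + overhead_bytes)

    for size in (256, 512, 1500, 4000):
        print(f"{size:>4} bytes: {efficiency(size, 1e-6):.3f} at BER 1e-6, "
              f"{efficiency(size, 1e-4):.3f} at BER 1e-4")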
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 12:39 ` Rodney W. Grimes @ 2021-07-13 18:01 ` David Lang 2021-07-13 18:06 ` Ben Greear 0 siblings, 1 reply; 15+ messages in thread From: David Lang @ 2021-07-13 18:01 UTC (permalink / raw) To: Rodney W. Grimes; +Cc: David Lang, David P. Reed, starlink On Tue, 13 Jul 2021, Rodney W. Grimes wrote: > It wasnt suggested "lowering the bit rate", it was suggested to make the > packets smaller, which actually does address the hidden transmitter problem > to some degree as it *would* reduce your air time occupancy, but the damn > wifi LL aggregation gets in your way cause it blows them back up. When I > am having to deal/use wifi in a hidden transmitter prone situation I always > crank down the Fragmentation Threshold setting from the default of 2346 bytes > to the often the minimum of 256 with good results. The problem is that with wifi at modern data rates, you have a header at a low data rate and then data at a much higher data rate (in extreme cases, a >50x difference), so the amount of data that you send has a pretty minor difference in the airtime used. So you really do want to send a large amount of data per transmission to minimize the overhead IT's not quite as bad if you have disabled 802.11b speeds on the entire network as that raises the header/housekeeping transmissions from 1Mb/s to 11Mb/s David Lang ^ permalink raw reply [flat|nested] 15+ messages in thread
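Rough numbers make David Lang's point: the contention, preamble/header and acknowledgement cost is paid per transmit opportunity at a fixed (slow) rate, while the payload rides at the MCS rate, so airtime efficiency climbs steeply with the amount of data aggregated into one transmission. The timings below are illustrative stand-ins, not 802.11 constants.

    # Airtime efficiency of one transmit opportunity vs. bytes aggregated into it.
    # The fixed cost stands in for contention + preamble/header + block-ack time,
    # paid at a low legacy rate; values are rough illustrations.
    def txop_efficiency(agg_bytes, phy_mbps=400, fixed_overhead_us=300.0):
        data_us = agg_bytes * 8 / phy_mbps        # phy_mbps == bits per microsecond
        return data_us / (data_us + fixed_overhead_us)

    for agg in (1500, 6000, 64000):               # one MSDU vs. progressively bigger aggregates
        print(f"{agg:>6} bytes aggregated: {txop_efficiency(agg):.0%} of airtime carries data")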
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 18:01 ` David Lang @ 2021-07-13 18:06 ` Ben Greear 2021-07-13 18:13 ` David Lang 0 siblings, 1 reply; 15+ messages in thread From: Ben Greear @ 2021-07-13 18:06 UTC (permalink / raw) To: David Lang, Rodney W. Grimes; +Cc: starlink, David P. Reed On 7/13/21 11:01 AM, David Lang wrote: > On Tue, 13 Jul 2021, Rodney W. Grimes wrote: > >> It wasnt suggested "lowering the bit rate", it was suggested to make the >> packets smaller, which actually does address the hidden transmitter problem >> to some degree as it *would* reduce your air time occupancy, but the damn >> wifi LL aggregation gets in your way cause it blows them back up. When I >> am having to deal/use wifi in a hidden transmitter prone situation I always >> crank down the Fragmentation Threshold setting from the default of 2346 bytes >> to the often the minimum of 256 with good results. > > The problem is that with wifi at modern data rates, you have a header at a low data rate and then data at a much higher data rate (in extreme cases, a >50x > difference), so the amount of data that you send has a pretty minor difference in the airtime used. So you really do want to send a large amount of data per > transmission to minimize the overhead > > IT's not quite as bad if you have disabled 802.11b speeds on the entire network as that raises the header/housekeeping transmissions from 1Mb/s to 11Mb/s The quiesce period waiting for medium access also takes some time, so that is another reason to try to put lots of frames on air in the same tx operation... David, I'm curious about the rate-ctrl aspect of this. Have you found any implementations of rate-ctrl that try harder to decrease amsdu groupings and/or keep MCS higher (maybe based on RSSI?) in a congested environment to deal better with hidden node problems? Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 18:06 ` Ben Greear @ 2021-07-13 18:13 ` David Lang 2021-07-13 18:25 ` Ben Greear 0 siblings, 1 reply; 15+ messages in thread From: David Lang @ 2021-07-13 18:13 UTC (permalink / raw) To: Ben Greear; +Cc: David Lang, Rodney W. Grimes, starlink, David P. Reed [-- Attachment #1: Type: text/plain, Size: 2144 bytes --] On Tue, 13 Jul 2021, Ben Greear wrote: > On 7/13/21 11:01 AM, David Lang wrote: >> On Tue, 13 Jul 2021, Rodney W. Grimes wrote: >> >>> It wasnt suggested "lowering the bit rate", it was suggested to make the >>> packets smaller, which actually does address the hidden transmitter >>> problem >>> to some degree as it *would* reduce your air time occupancy, but the damn >>> wifi LL aggregation gets in your way cause it blows them back up. When I >>> am having to deal/use wifi in a hidden transmitter prone situation I >>> always >>> crank down the Fragmentation Threshold setting from the default of 2346 >>> bytes >>> to the often the minimum of 256 with good results. >> >> The problem is that with wifi at modern data rates, you have a header at a >> low data rate and then data at a much higher data rate (in extreme cases, a >> >50x difference), so the amount of data that you send has a pretty minor >> difference in the airtime used. So you really do want to send a large >> amount of data per transmission to minimize the overhead >> >> IT's not quite as bad if you have disabled 802.11b speeds on the entire >> network as that raises the header/housekeeping transmissions from 1Mb/s to >> 11Mb/s > > The quiesce period waiting for medium access also takes some time, so that is > another reason to try to put lots of frames on air in the same tx operation... yep, mentally I lump that into the header/housekeeping functionality as that's all fixed-time no matter how much data you are transmitting. > David, I'm curious about the rate-ctrl aspect of this. Have you found any > implementations of rate-ctrl that try harder to decrease amsdu groupings > and/or keep MCS higher (maybe based on RSSI?) in a congested environment to > deal better with hidden node problems? I have not experimented with that. I help run the network at the SCALE conference each year (3500 geeks with their gear over ~100k sq ft of conference center with >100 APs running openwrt), and I'm open to suggestions for monitoring/tweaking the network stack, as long as it's easy to revert to default if we run into grief. David Lang ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 18:13 ` David Lang @ 2021-07-13 18:25 ` Ben Greear 2021-07-13 21:23 ` David Lang 0 siblings, 1 reply; 15+ messages in thread From: Ben Greear @ 2021-07-13 18:25 UTC (permalink / raw) To: David Lang; +Cc: Rodney W. Grimes, starlink, David P. Reed On 7/13/21 11:13 AM, David Lang wrote: > On Tue, 13 Jul 2021, Ben Greear wrote: > >> On 7/13/21 11:01 AM, David Lang wrote: >>> On Tue, 13 Jul 2021, Rodney W. Grimes wrote: >>> >>>> It wasnt suggested "lowering the bit rate", it was suggested to make the >>>> packets smaller, which actually does address the hidden transmitter problem >>>> to some degree as it *would* reduce your air time occupancy, but the damn >>>> wifi LL aggregation gets in your way cause it blows them back up. When I >>>> am having to deal/use wifi in a hidden transmitter prone situation I always >>>> crank down the Fragmentation Threshold setting from the default of 2346 bytes >>>> to the often the minimum of 256 with good results. >>> >>> The problem is that with wifi at modern data rates, you have a header at a low data rate and then data at a much higher data rate (in extreme cases, a >50x >>> difference), so the amount of data that you send has a pretty minor difference in the airtime used. So you really do want to send a large amount of data per >>> transmission to minimize the overhead >>> >>> IT's not quite as bad if you have disabled 802.11b speeds on the entire network as that raises the header/housekeeping transmissions from 1Mb/s to 11Mb/s >> >> The quiesce period waiting for medium access also takes some time, so that is another reason to try to put lots of frames on air in the same tx operation... > > yep, mentally I lump that into the header/housekeeping functionality as that's all fixed-time no matter how much data you are transmitting. Not exactly fixed though? With a few transmitters trying to get on medium, air-time may theoretically improve a bit for first few transmitters due to random backoff timers finding clear air after shorter over-all quiesce period, but I would image it gets pretty bad with several hundred radios detecting collisions and increasing their backoff before accessing medium again? > >> David, I'm curious about the rate-ctrl aspect of this. Have you found any implementations of rate-ctrl that try harder to decrease amsdu groupings and/or >> keep MCS higher (maybe based on RSSI?) in a congested environment to deal better with hidden node problems? > > I have not experimented with that. I help run the network at the SCALE conference each year (3500 geeks with their gear over ~100k sq ft of conference center > with >100 APs running openwrt), and I'm open to suggestions for monitoring/tweaking the network stack, as long as it's easy to revert to default if we run into > grief. Mucking with rate-ctrl typically involves hacking firmware in modern APs. Impossible for most due to lack of source, and tricky stuff even for those with source. And if you do it wrong, you can completely ruin a network. Maybe something like MTK AX chipset will someday offer a host-based rate-ctrl where experimenters could put some effort in this direction. I don't know of any other chipset that would have any chance of user-based rate-ctrl. Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 15+ messages in thread
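For the "several hundred radios" case Ben raises, a one-line slotted-contention approximation shows how quickly things fall apart: if N backlogged stations each transmit in a contended slot with probability 1/CW, a slot is useful only when exactly one of them does. (Real 802.11 backoff then grows CW after collisions, trading throughput for even more delay; this sketch ignores that.)

    # Slotted-contention approximation: N backlogged stations, each transmitting
    # in a contended slot with probability 1/CW; success needs exactly one sender.
    def p_slot_success(n_stations, cw=16):
        p = 1.0 / cw
        return n_stations * p * (1.0 - p) ** (n_stations - 1)

    for n in (5, 50, 200):
        print(f"{n:>3} stations: {p_slot_success(n):.4f} of contended slots succeed")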
* Re: [Starlink] SatNetLab: A call to arms for the next global> Internet testbed 2021-07-13 18:25 ` Ben Greear @ 2021-07-13 21:23 ` David Lang 0 siblings, 0 replies; 15+ messages in thread From: David Lang @ 2021-07-13 21:23 UTC (permalink / raw) To: Ben Greear; +Cc: David Lang, Rodney W. Grimes, starlink, David P. Reed [-- Attachment #1: Type: text/plain, Size: 3556 bytes --] On Tue, 13 Jul 2021, Ben Greear wrote: >>>> The problem is that with wifi at modern data rates, you have a header at >>>> a low data rate and then data at a much higher data rate (in extreme >>>> cases, a >50x difference), so the amount of data that you send has a >>>> pretty minor difference in the airtime used. So you really do want to >>>> send a large amount of data per transmission to minimize the overhead >>>> >>>> IT's not quite as bad if you have disabled 802.11b speeds on the entire >>>> network as that raises the header/housekeeping transmissions from 1Mb/s >>>> to 11Mb/s >>> >>> The quiesce period waiting for medium access also takes some time, so that >>> is another reason to try to put lots of frames on air in the same tx >>> operation... >> >> yep, mentally I lump that into the header/housekeeping functionality as >> that's all fixed-time no matter how much data you are transmitting. > > Not exactly fixed though? With a few transmitters trying to get on medium, > air-time may theoretically improve a bit for first few transmitters due to > random backoff timers finding clear air after shorter over-all quiesce period, > but I would image it gets pretty bad with several hundred radios detecting > collisions and increasing their backoff before accessing medium again? the quiesce period and the header transmission time are airtime that is used by a transmission that does not transmit any data (header at the beginning of the message, quiesce period at the end of the message). Neither of them depend on how much data you are sending. >>> David, I'm curious about the rate-ctrl aspect of this. Have you found any >>> implementations of rate-ctrl that try harder to decrease amsdu groupings >>> and/or keep MCS higher (maybe based on RSSI?) in a congested environment >>> to deal better with hidden node problems? >> >> I have not experimented with that. I help run the network at the SCALE >> conference each year (3500 geeks with their gear over ~100k sq ft of >> conference center with >100 APs running openwrt), and I'm open to >> suggestions for monitoring/tweaking the network stack, as long as it's easy >> to revert to default if we run into grief. > > Mucking with rate-ctrl typically involves hacking firmware in modern APs. > Impossible for most due to lack of > source, and tricky stuff even for those with source. And if you do it wrong, > you can completely ruin a > network. > > Maybe something like MTK AX chipset will someday offer a host-based rate-ctrl > where experimenters could put some > effort in this direction. I don't know of any other chipset that would have > any chance of user-based rate-ctrl. Everything I work with has to work with unmodified clients, and it's the clients that have the most hidden transmitter problems. If you are doing infrastructure wifi (especially point-to-point links) there is a ton of stuff that you can do with antennas to cut down the signal from transmitters you don't care about and enhance the signal from the ones you do. careful analysis of your building structure can let you use AP positioning and antennas to reduce interference between APs and different sets of clients. 
but I spoke up because the assumption that wifi data errors are primarily things that can be solved by error correction and slowing down the bit rate are only true in the weak-signal environment, and are very counter-productive in a dense/urban environment, which is FAR more common than the weak-signal envioronment today. David Lang ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Starlink] SatNetLab: A call to arms for the next global Internet testbed @ 2021-07-09 19:19 Dave Taht 2021-07-10 11:49 ` Rodney W. Grimes 0 siblings, 1 reply; 15+ messages in thread From: Dave Taht @ 2021-07-09 19:19 UTC (permalink / raw) To: starlink; +Cc: Ankit Singla, Sam Kumar While it is good to have a call to arms, like this: https://people.inf.ethz.ch/asingla/papers/satnetlab.pdf One of my favorite of "kelly johnson's rules" is "The designers of the airplane must be on the shop floor while the first ones are built". I strongly feel that new transports, packet scheduling, aqm, and congestion control mechanisms be co-designed with the l1, l2, and l3 people all locked in the same room, with representatives also from open source, hardware and software vendors, and academia - with enough skills to actually implement and test the design as it evolves. Going to the call for big fat centralized testbeds in this paper... "the center cannot hold, it all falls apart" - Yeats (0) yes, while I do think long term funding and some centralization is needed for longitudinal studies, by the time a large testbed like planet lab is built, it is almost always obsolete, with difficult hurdles for j.random.user to meet, and further, offering services over some centralized facility just for tests doesn't scale, and without the services it offered being entirely open source, well... planetlab closed, and cerowrt - a widely distributed effort - survived and nearly everything from that bargain basement project made it out into the linux kernel, into open source, and now fq_codel, in particular, is into billions of devices. Along the way we spawned three ietf working groups (aqm, homenet, and babel), and the ipv6 stuff in openwrt is still (IMHO), the most well thought out of nearly anything along the edge "out there". IPv6 still suffers from not having had kelly johnson along to ride herd on everyone. it is far better to opt for the most decentralized environment you can and engage with every kind of engineer along the way. Much like how the original IMPs were spread to universities across the entire united states, a future "satnet" should be spread across the world, and should be closely tied to the actual users and innovators also using it, much like how in the 80s ipv4 outran the ISO stack for usefulness to real people for real users. ISO was "the future" for so long that the future outran it, and until the "kobe revolt" ousted the folk in charge of that top down design from iana mgmt (where vint had that famous incident baring the "IP on everything t-shirt")... then real forward progress on commercializing the internet proceeded rapidly. Anyway, in the satnetlab paper: The author points to some good work at l3 but completely missed the real world changes that happened in the past decade at all layers of the stack. In passing I note that bbr and delay based e2e congestion controls work much better with five tuple DRR, SFQ, QFQ, or SQF at the bottleneck links in the network. Ankit is right in that BGP ran out of napkins long ago and is so ossified as to be pretty useless for inter connecting new portions of the internet. Centralized IGPs like OSPF are probably not the right thing either, and my bet as to a routing protocol worth leveraging (at least some of) has been the distance-vector protocol "babel" which has (so far as I know) the only working implementation of source specific routing and a more or less working RTT metric. 
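For readers who have not met them, the flow-queueing schemes name-checked a few sentences back ("five tuple DRR, SFQ, QFQ, or SQF") share one small core idea: hash each packet's five-tuple to its own queue and serve the queues round-robin with a byte deficit, so a single capacity-seeking flow cannot starve sparse ones. A minimal sketch of deficit round robin, not the code of any particular qdisc:

    # Minimal deficit-round-robin over per-flow queues keyed by a five-tuple hash.
    from collections import defaultdict, deque

    class DRR:
        def __init__(self, quantum=1514):
            self.quantum = quantum
            self.queues = defaultdict(deque)   # flow id -> packet sizes (bytes)
            self.deficit = defaultdict(int)
            self.active = deque()              # round-robin order of backlogged flows

        def enqueue(self, five_tuple, packet_bytes):
            fid = hash(five_tuple) % 1024      # toy hash; real qdiscs salt this
            if not self.queues[fid]:
                self.active.append(fid)
            self.queues[fid].append(packet_bytes)

        def dequeue(self):
            while self.active:
                fid = self.active[0]
                self.deficit[fid] += self.quantum
                q, sent = self.queues[fid], []
                while q and q[0] <= self.deficit[fid]:
                    self.deficit[fid] -= q[0]
                    sent.append(q.popleft())
                self.active.rotate(-1)         # this flow goes to the back of the round
                if not q:                      # drained: forget it until it re-appears
                    self.active.remove(fid)
                    self.deficit[fid] = 0
                if sent:
                    return fid, sent
            return None

    # Toy run: a bulk flow with five full-size packets vs. one small DNS-ish packet.
    drr = DRR()
    for _ in range(5):
        drr.enqueue(("10.0.0.1", "10.0.0.2", 6, 1111, 80), 1514)
    drr.enqueue(("10.0.0.3", "10.0.0.2", 17, 2222, 53), 100)
    print(drr.dequeue())   # one bulk packet
    print(drr.dequeue())   # the small flow gets out right after, which is the point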
The other big thing that makes me a bit crazy is that network designs are NOT laws of nature!, they are protocol agreements and engineering changes, and at every change you need to recheck your assumptions at all the other layers in the stack.... [1] Here's another piece of pre-history - alohanet - the TTL field was the "time to live" field. The intent was that the packet would indicate how much time it would be valid before it was discarded. It didn't work out, and was replaced by hopcount, which of course switched networks ignore and isonly semi-useful for detecting loops and the like. Thought: A new satcom network l2, could actually record the universal originating time of a packet from a world-wide gps clock (64 bit header), and thus measuring transit times becomes easy. Didn't have gps back in the 60s and 70s.... ... To try and pick on everyone equally (including myself! I wasn't monitoring the fq_codel deployment in the cloud closely and it turns out at least one increasingly common virtualized driver (virtio-net) doesn't have bql in it, leading to "sloshy behavior" of tcp, with > 16MB of data living out on the tx and rx rings) [1] l3 folk talk about "mice and elephants" far too much when talking about network traffic. Years ago we added a new taxonomy to that, "Ants", which scurry around almost invisibly keeping the network ecosystem healthy and mostly need to happen around or below what we think of as l2. It's easy to show what happens to a network without "ants" in it - block ARP for a minute and an ethernet network will fail. Similarly, ipv6 ND. Block address assignment via DHCP... or dns... or take a hard look at "management frames" in wifi, or if you want to make your head really hurt, take a deep dive into 3gpp, and see if you can come out the other side with your brain intact. To me the "ants" are the most important part of the network ecosystem, hardly studied, totally necessary. A misunderstanding about the nature of buffers led to the continuously unrolling, seemingly endless disaster that is ethernet over powerline with its interaction of hardware flow control, variable rates, and buffering [2,3] I don't want to talk to GPON today. And then: there are all sorts of useful secrets buried in the history of the Internet, Aloha, and Arpanet that seem to have been lost on a lot of people... that we sort of, have been counting on being always there... (going back to there being a lot of useful experiments that can get done on cheap hardware, in a decentralized fashion) Example: Recently I was told that at least one breed of "thread" wireless chip did not have exponential backoff in the (rom!) firmware, which meant (to me) that any significant density your office's lightbulb array would suffer congestion collapse on the next firmware update.... ... and to test that out required a deep knowledge and expensive gear to sniff the radio and sitting side by side with an EE type to decode the signals (an exercise I recommend to any CS major), reverse decompiling the firmware (which I recommend to EE types)... or buying 16 of these 8 dollar chips, designing an experiment with a dense mesh, and beating the heck out of it with real traffic, which is what I planned to do when I got around to it. I figured the results would be hilarious for a while... 
but then I would probably end up worrying about the financial future of whatever company actually tried to ship these chips, qty millions, into the field, or the millions of customers sometimes unable to flick their lightbulbs on or off for no apparent reason... and thus I haven't got around to powering them up, and then filing bug reports and so forth, and climbing through 9 layers of VPs committed to long term buying decisions. Have a good weekend everyone. I'm tapped out. Have a poem. 0) Turning and turning in the widening gyre The falcon cannot hear the falconer; Things fall apart; the center cannot hold; Mere anarchy is loosed upon the world. - https://www.sparknotes.com/lit/things/quotes/ 1) https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf 2) Interactions between TCP and Ethernet flow controlover Netgear XAVB2001 HomePlug AV links http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.278.6149&rep=rep1&type=pdf 3) Buffer size estimationTP LINK TL-PA211KITHomePlug AV adapters https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.300.7521&rep=rep1&type=pdf -- Latest Podcast: https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/ Dave Täht CTO, TekLibre, LLC ^ permalink raw reply [flat|nested] 15+ messages in thread
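The origination-time thought in the message above is cheap to express: stamp each packet at the first hop with a 64-bit count of nanoseconds since the GPS epoch, and any later hop with a synchronized clock can read transit time directly and drop the packet once a validity budget is blown. A sketch of the header arithmetic; the field layout, the 500 ms budget and the leap-second handling are all invented for illustration.

    # Illustrative 64-bit origination-time stamp and transit-time check.
    import struct
    import time

    GPS_EPOCH_OFFSET_NS = 315_964_800 * 1_000_000_000   # 1980-01-06 vs. Unix epoch;
                                                         # leap seconds ignored here

    def stamp(payload: bytes, now_unix_ns: int) -> bytes:
        """Prefix the payload with nanoseconds-since-GPS-epoch as a 64-bit field."""
        return struct.pack("!Q", now_unix_ns - GPS_EPOCH_OFFSET_NS) + payload

    def transit_ns(packet: bytes, now_unix_ns: int) -> int:
        (origin,) = struct.unpack("!Q", packet[:8])
        return (now_unix_ns - GPS_EPOCH_OFFSET_NS) - origin

    def expired(packet: bytes, now_unix_ns: int, budget_ns: int = 500_000_000) -> bool:
        """Drop once the packet has been in flight longer than its validity budget."""
        return transit_ns(packet, now_unix_ns) > budget_ns

    pkt = stamp(b"hello", time.time_ns())
    print(transit_ns(pkt, time.time_ns()), "ns in transit so far")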
* Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed 2021-07-09 19:19 [Starlink] SatNetLab: A call to arms for the next global " Dave Taht @ 2021-07-10 11:49 ` Rodney W. Grimes 2021-07-10 20:27 ` David Lang 0 siblings, 1 reply; 15+ messages in thread From: Rodney W. Grimes @ 2021-07-10 11:49 UTC (permalink / raw) To: Dave Taht; +Cc: starlink, Ankit Singla, Sam Kumar > While it is good to have a call to arms, like this: ... much information removed as I only want to reply to one very narrow, but IMHO very real, problem in our networks today ... > Here's another piece of pre-history - alohanet - the TTL field was the > "time to live" field. The intent was that the packet would indicate > how much time it would be valid before it was discarded. It didn't > work out, and was replaced by hopcount, which of course switched > networks ignore and isonly semi-useful for detecting loops and the > like. TTL works perfectly fine where the original assumptions hold: that a device along a network path only hangs on to a packet for a reasonably short duration, and that there is not some "retry" mechanism in place causing this time to explode. BSD, and as far as I can recall, almost ALL original IP stacks had a Q depth limit of 50 packets on egress interfaces. Everything pretty much worked well and the net was happy. Then these base assumptions got blasted in the name of "measurable bandwidth" and the concept that packets are so precious we must not lose them, at almost any cost. Linux crammed the per-interface Q up to 1000, and wifi decided that it was reasonable to retry at the link layer so many times that I have seen packets that are >60 seconds old. Proposed FIX: Any device that transmits packets that does not already have an inherent FIXED transmission time MUST consider the current TTL of that packet and give up if > 10 ms * TTL elapses while it is trying to transmit. AND change the default interface Q size in Linux to 50 for fifo; the codel, etc. AQM stuff is fine at 1000 as it has delay targets that prevent the issue that initially bumping this to 1000 caused. ... end of Rods Rant ... -- Rod Grimes rgrimes@freebsd.org ^ permalink raw reply [flat|nested] 15+ messages in thread
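Rod's proposed rule is small enough to write down: remember when each packet entered the device and abandon it, retries included, once 10 ms times its TTL has elapsed. A sketch under exactly those assumptions:

    # Sketch of the give-up rule: hold/retry a packet for at most 10 ms per TTL unit.
    import time
    from collections import deque

    PER_HOP_BUDGET_S = 0.010          # the 10 ms per TTL unit proposed above

    class DeadlineQueue:
        def __init__(self):
            self.q = deque()          # entries: (enqueue_time, ttl, packet)

        def enqueue(self, packet, ttl):
            self.q.append((time.monotonic(), ttl, packet))

        def next_to_send(self):
            """Next packet still worth transmitting; anything that has overstayed
            its 10 ms * TTL budget is dropped and left to the endpoints to recover."""
            now = time.monotonic()
            while self.q:
                t_in, ttl, pkt = self.q.popleft()
                if now - t_in <= PER_HOP_BUDGET_S * ttl:
                    return pkt, ttl
            return None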
* Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed 2021-07-10 11:49 ` Rodney W. Grimes @ 2021-07-10 20:27 ` David Lang 2021-07-19 15:50 ` George Burdell 0 siblings, 1 reply; 15+ messages in thread From: David Lang @ 2021-07-10 20:27 UTC (permalink / raw) To: Rodney W. Grimes; +Cc: Dave Taht, starlink, Ankit Singla, Sam Kumar any buffer sizing based on the number of packets is wrong. Base your buffer size on transmit time and you have a chance of being reasonable. In cases like wifi where packets aren't sent individually, but are sent in blobs of packets going to the same destination, you want to buffer at least a blob's worth of packets to each destination so that when your transmit slot comes up, you can maximize it. Wifi has the added issue that the blob headers are at a much lower data rate than the data itself, so you can cram a LOT of data into a blob without making a significant difference in the airtime used, so you really do want to be able to send full blobs (not at the cost of delaying transmission if you don't have a full blob, a mistake some people make, but you do want to buffer enough to fill the blobs) and given that dropped packets result in timeouts and retransmissions that affect the rest of the network, it's not obviously wrong for a lossy hop like wifi to retry a failed transmission, it just needs to not retry too many times. David Lang On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) > From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> > To: Dave Taht <dave.taht@gmail.com> > Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, > Sam Kumar <samkumar@cs.berkeley.edu> > Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet > testbed > >> While it is good to have a call to arms, like this: > ... much information removed as I only one to reply to 1 very > narrow, but IMHO, very real problem in our networks today ... > >> Here's another piece of pre-history - alohanet - the TTL field was the >> "time to live" field. The intent was that the packet would indicate >> how much time it would be valid before it was discarded. It didn't >> work out, and was replaced by hopcount, which of course switched >> networks ignore and isonly semi-useful for detecting loops and the >> like. > > TTL works perfectly fine where the original assumptions that a > device along a network path only hangs on to a packet for a > reasonable short duration, and that there is not some "retry" > mechanism in place that is causing this time to explode. BSD, > and as far as I can recall, almost ALL original IP stacks had > a Q depth limit of 50 packets on egress interfaces. Everything > pretty much worked well and the net was happy. Then these base > assumptions got blasted in the name of "measurable bandwidth" and > the concept of packets are so precious we must not loose them, > at almost any cost. Linux crammed the per interface Q up to 1000, > wifi decided that it was reasable to retry at the link layer so > many times that I have seen packets that are >60 seconds old. > > Proposed FIX: Any device that transmits packets that does not > already have an inherit FIXED transmission time MUST consider > the current TTL of that packet and give up if > 10mS * TTL elapses > while it is trying to transmit. AND change the default if Q > size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine > at 1000 as it has delay targets that present the issue that > initially bumping this to 1000 caused.
> > ... end of Rods Rant ... > > -- > Rod Grimes rgrimes@freebsd.org > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink ^ permalink raw reply [flat|nested] 15+ messages in thread
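Sizing "on transmit time" rather than on packets, as David Lang suggests above, just means converting a delay target into bytes at the current link rate (the same idea BQL applies to driver rings). The arithmetic below shows why a fixed packet count cannot be right at both 1 Mb/s and 1 Gb/s; the 5 ms target is an illustrative choice.

    # Buffer limits expressed as transmit time instead of packet counts.
    def byte_limit(link_bits_per_s, target_delay_s=0.005):     # 5 ms of queue, say
        return int(link_bits_per_s * target_delay_s / 8)

    for rate_mbps in (1, 20, 1000):
        limit = byte_limit(rate_mbps * 1e6)
        print(f"{rate_mbps:>5} Mb/s -> {limit:>7} bytes (~{limit // 1514} full-size packets)")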
* Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed 2021-07-10 20:27 ` David Lang @ 2021-07-19 15:50 ` George Burdell 2021-12-11 9:18 ` Ankit Singla 0 siblings, 1 reply; 15+ messages in thread From: George Burdell @ 2021-07-19 15:50 UTC (permalink / raw) To: David Lang; +Cc: Rodney W. Grimes, starlink, Ankit Singla, Sam Kumar On Sat, Jul 10, 2021 at 01:27:28PM -0700, David Lang wrote: > any buffer sizing based on the number of packets is wrong. Base your buffer > size on transmit time and you have a chance of being reasonable. This is very true. Packets have a dynamic range of 64 bytes to 64k (GRO) and sizing queues in terms of packets leads to bad behavior on mixed up and down traffic particularly. Also... people doing AQM and TCP designs tend to almost always test one way traffic only, and this leads to less than desirable behavior on real world traffic. Strike that. Terrible behavior! a pure single queue AQM struggles mightily to find a good hit rate when there are a ton of acks, dns, gaming, voip, etc, mixed in with the capacity seeking flows. Nearly every AQM paper you read never tests real, bidir traffic. It's a huge blind spot, which is why the bufferbloat effort *starts* with the rrul test and related on testing any new idea we have. bfifos are better, but harder to implement in hardware. A fun trick: If you are trying to optimize your traffic for R/T communications rather than speedtest, you can clamp your tcp "mss" to smaller than 600 bytes *at the router*, and your network gets better. (we really should get around to publishing something on that, when you are plagued by a giant upstream FIFO, filling it with smaller packets really helps, and it's something a smart user could easily do regardless of the ISP's preferences) > > In cases like wifi where packets aren't sent individually, but are sent in > blobs of packets going to the same destination, yes... > you want to buffer at least > a blobs worth of packets to each destination so that when your transmit slot > comes up, you can maximize it. Nooooooo! This is one of those harder tradeoffs that is pretty counter intuitive. You want per station queuing, yes. However the decision as to how much service time you want to grant each station is absolutely not in maximizing the transmit slot, but in maximizing the number of stations you can serve in reasonable time. Simple (and inaccurate) example: 100 stations at 4ms txop each, stuffed full of *udp* data, is 400ms/round. (plus usually insane numbers of retries). This breaks a lot of things, and doesn't respect the closely coupled nature of tcp (please re-read the codel paper!). Cutting the txop in this case to 1ms cuts interstation service time... at the cost of "bandwidth" that can't be stuffed into the slow header + wifi data rate equation. but what you really want to do is give the sparsest stations quicker access to the media so they can ramp up to parity (and usually complete their short flows much faster, and then get off) I run with BE 2.4ms txops and announce the same in the beacon. I'd be willing to bet your scale conference network would work much better if you did that also. 
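The interstation arithmetic behind that 2.4 ms choice is worth writing out: with every station holding a full transmit opportunity, the worst-case time to get back around to any one station is simply stations times txop, so the txop length is the knob that keeps per-round latency sane at conference-scale station counts.

    # Worst-case time to cycle through all stations when each uses a full txop.
    def round_time_ms(stations, txop_ms):
        return stations * txop_ms

    for txop_ms in (4.0, 2.4, 1.0):
        print(f"txop {txop_ms} ms: 100 busy stations -> {round_time_ms(100, txop_ms):.0f} ms per round")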
(It would be better if we could scale txop size to the load, but fq_codel on wifi already does the sparse-station optimization, which translates into many shorter txops than you would see from other wifi schedulers, and the bulk of the problem I see is the *stations*.) Lastly, you need to defer constructing the blob as long as possible, so you can shoot at, mark, or reschedule (FQ) the packets in there at the last moment before they are committed to the hardware. Ideally you would not construct any blob at all until a few microseconds before the transmit opportunity (a toy sketch of this appears after this message). Shifting this back to Starlink - they have a marvelous opportunity to do just this, in the dishy, as they are half duplex and could defer grabbing the packets from a sch_cake buffer until precisely before that txop to the sat arrives. (My guess would be no more than 400us based on what I understand of the ARM chip they are using.) This would be much better than what we could do in the ath9k, where we were forced to always have "one in the hardware, one ready to go" due to limitations in that chip. We're making some progress on the openwifi fpga here, btw... > > Wifi has the added issue that the blob headers are at a much lower data rate > than the dta itself, so you can cram a LOT of data into a blob without > making a significant difference in the airtime used, so you really do want > to be able to send full blobs (not at the cost of delaying tranmission if > you don't have a full blob, a mistake some people make, but you do want to > buffer enough to fill the blobs) > > and given that dropped packets results in timeouts and retransmissions that > affect the rest of the network, it's not obviously wrong for a lossy hop > like wifi to retry a failed transmission, it just needs to not retry too > many times. > > David Lang > > > On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > > >Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) > >From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> > >To: Dave Taht <dave.taht@gmail.com> > >Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, > > Sam Kumar <samkumar@cs.berkeley.edu> > >Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet > > testbed > > > >>While it is good to have a call to arms, like this: > >... much information removed as I only one to reply to 1 very > > narrow, but IMHO, very real problem in our networks today ... > > > >>Here's another piece of pre-history - alohanet - the TTL field was the > >>"time to live" field. The intent was that the packet would indicate > >>how much time it would be valid before it was discarded. It didn't > >>work out, and was replaced by hopcount, which of course switched > >>networks ignore and isonly semi-useful for detecting loops and the > >>like. > > > >TTL works perfectly fine where the original assumptions that a > >device along a network path only hangs on to a packet for a > >reasonable short duration, and that there is not some "retry" > >mechanism in place that is causing this time to explode. BSD, > >and as far as I can recall, almost ALL original IP stacks had > >a Q depth limit of 50 packets on egress interfaces. Everything > >pretty much worked well and the net was happy. Then these base > >assumptions got blasted in the name of "measurable bandwidth" and > >the concept of packets are so precious we must not loose them, > >at almost any cost.
Linux crammed the per interface Q up to 1000, > >wifi decided that it was reasable to retry at the link layer so > >many times that I have seen packets that are >60 seconds old. > > > >Proposed FIX: Any device that transmits packets that does not > >already have an inherit FIXED transmission time MUST consider > >the current TTL of that packet and give up if > 10mS * TTL elapses > >while it is trying to transmit. AND change the default if Q > >size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine > >at 1000 as it has delay targets that present the issue that > >initially bumping this to 1000 caused. > > > >... end of Rods Rant ... > > > >-- > >Rod Grimes rgrimes@freebsd.org > >_______________________________________________ > >Starlink mailing list > >Starlink@lists.bufferbloat.net > >https://lists.bufferbloat.net/listinfo/starlink > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink ^ permalink raw reply [flat|nested] 15+ messages in thread
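A toy sketch of the deferred blob construction described near the end of the message above (and suggested there for the dishy). Everything here is hypothetical: the Packet class, the codel_like_verdict() helper, and the 5 ms target are stand-ins invented for illustration, not the real mac80211, ath9k, or sch_cake code paths. The only point it shows is that if the aggregate is assembled immediately before the transmit opportunity, stale packets can still be dropped or rescheduled at the last moment.

# Hypothetical sketch: build the aggregate ("blob") only when the transmit
# opportunity is imminent, so queue management gets a last look at each packet.
import time
from collections import deque
from dataclasses import dataclass

@dataclass
class Packet:
    data: bytes
    enqueue_time: float

def codel_like_verdict(pkt: Packet, now: float, target_delay: float = 0.005) -> bool:
    """Stand-in for a real AQM decision: keep packets younger than the target."""
    return (now - pkt.enqueue_time) <= target_delay

def build_aggregate(queue: deque, max_bytes: int) -> list:
    """Called microseconds before the txop; drops stale packets at the last moment."""
    now = time.monotonic()
    blob, size = [], 0
    while queue and size + len(queue[0].data) <= max_bytes:
        pkt = queue.popleft()
        if not codel_like_verdict(pkt, now):
            continue  # too old: dropping it now is cheaper than wasting airtime on it
        blob.append(pkt)
        size += len(pkt.data)
    return blob

# Usage sketch: a driver would call build_aggregate(station_queue, 65535)
# right before its transmit slot instead of pre-building frames far in advance.

The same shape applies to the dishy suggestion above: hold packets in the qdisc and pull only a txop's worth a few hundred microseconds before the uplink slot.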
* Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed 2021-07-19 15:50 ` George Burdell @ 2021-12-11 9:18 ` Ankit Singla 2021-12-13 1:16 ` Dave Taht 0 siblings, 1 reply; 15+ messages in thread From: Ankit Singla @ 2021-12-11 9:18 UTC (permalink / raw) To: George Burdell Cc: David Lang, Rodney W. Grimes, starlink, Ankit Singla, Sam Kumar [-- Attachment #1: Type: text/plain, Size: 8133 bytes --] Sorry I didn’t engage with this, folks — probably came across as rude, but just had a large and unexpected career shift ongoing (https://twitter.com/stub_AS/status/1469283183132876809?s=20), and didn’t feel up to it, especially as I’m largely abandoning my research along these lines due to these developments. In any case, I have a lot of respect for you folks educating everyone on latency and buffer bloat, and have been following Dave (Taht)’s great work in the space for a while. Best, Ankit On Jul 19, 2021, at 17:50, George Burdell <gb@teklibre.net> wrote: On Sat, Jul 10, 2021 at 01:27:28PM -0700, David Lang wrote: any buffer sizing based on the number of packets is wrong. Base your buffer size on transmit time and you have a chance of being reasonable. This is very true. Packets have a dynamic range of 64 bytes to 64k (GRO) and sizing queues in terms of packets leads to bad behavior on mixed up and down traffic particularly. Also... people doing AQM and TCP designs tend to almost always test one way traffic only, and this leads to less than desirable behavior on real world traffic. Strike that. Terrible behavior! a pure single queue AQM struggles mightily to find a good hit rate when there are a ton of acks, dns, gaming, voip, etc, mixed in with the capacity seeking flows. Nearly every AQM paper you read never tests real, bidir traffic. It's a huge blind spot, which is why the bufferbloat effort *starts* with the rrul test and related on testing any new idea we have. bfifos are better, but harder to implement in hardware. A fun trick: If you are trying to optimize your traffic for R/T communications rather than speedtest, you can clamp your tcp "mss" to smaller than 600 bytes *at the router*, and your network gets better. (we really should get around to publishing something on that, when you are plagued by a giant upstream FIFO, filling it with smaller packets really helps, and it's something a smart user could easily do regardless of the ISP's preferences) In cases like wifi where packets aren't sent individually, but are sent in blobs of packets going to the same destination, yes... you want to buffer at least a blobs worth of packets to each destination so that when your transmit slot comes up, you can maximize it. Nooooooo! This is one of those harder tradeoffs that is pretty counter intuitive. You want per station queuing, yes. However the decision as to how much service time you want to grant each station is absolutely not in maximizing the transmit slot, but in maximizing the number of stations you can serve in reasonable time. Simple (and inaccurate) example: 100 stations at 4ms txop each, stuffed full of *udp* data, is 400ms/round. (plus usually insane numbers of retries). This breaks a lot of things, and doesn't respect the closely coupled nature of tcp (please re-read the codel paper!). Cutting the txop in this case to 1ms cuts interstation service time... at the cost of "bandwidth" that can't be stuffed into the slow header + wifi data rate equation.
but what you really want to do is give the sparsest stations quicker access to the media so they can ramp up to parity (and usually complete their short flows much faster, and then get off) I run with BE 2.4ms txops and announce the same in the beacon. I'd be willing to bet your scale conference network would work much better if you did that also. (It would be better if we could scale txop size to the load, but fq_codel on wifi already does the sparse station optimization which translates into many shorter txops than you would see from other wifi schedulers, and the bulk of the problem I see is the *stations*) lastly, you need to defer constructing the blob as long as possible, so you can shoot at, mark, or reschedule (FQ), the packets in there at the last moment before they are committed to the hardware. Ideally you would not construct any blob at all until a few microseconds before the transmit opportunity. Shifting this back to starlink - they have a marvelous opportunity to do just this, in the dishy, as they are half duplex and could defer grabbing the packets from a sch_cake buffer until precisely before that txop to the sat arrives. (my guess would be no more than 400us based on what I understand of the arm chip they are using) This would be much better than what we could do in the ath9k where we were forced to always have "one in the hardware, one ready to go" due to limitations in that chip. We're making some progress on the openwifi fpga here, btw... Wifi has the added issue that the blob headers are at a much lower data rate than the dta itself, so you can cram a LOT of data into a blob without making a significant difference in the airtime used, so you really do want to be able to send full blobs (not at the cost of delaying tranmission if you don't have a full blob, a mistake some people make, but you do want to buffer enough to fill the blobs) and given that dropped packets results in timeouts and retransmissions that affect the rest of the network, it's not obviously wrong for a lossy hop like wifi to retry a failed transmission, it just needs to not retry too many times. David Lang On Sat, 10 Jul 2021, Rodney W. Grimes wrote: Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net<mailto:starlink@gndrsh.dnsmgr.net>> To: Dave Taht <dave.taht@gmail.com<mailto:dave.taht@gmail.com>> Cc: starlink@lists.bufferbloat.net<mailto:starlink@lists.bufferbloat.net>, Ankit Singla <asingla@ethz.ch<mailto:asingla@ethz.ch>>, Sam Kumar <samkumar@cs.berkeley.edu<mailto:samkumar@cs.berkeley.edu>> Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed While it is good to have a call to arms, like this: ... much information removed as I only one to reply to 1 very narrow, but IMHO, very real problem in our networks today ... Here's another piece of pre-history - alohanet - the TTL field was the "time to live" field. The intent was that the packet would indicate how much time it would be valid before it was discarded. It didn't work out, and was replaced by hopcount, which of course switched networks ignore and isonly semi-useful for detecting loops and the like. TTL works perfectly fine where the original assumptions that a device along a network path only hangs on to a packet for a reasonable short duration, and that there is not some "retry" mechanism in place that is causing this time to explode. BSD, and as far as I can recall, almost ALL original IP stacks had a Q depth limit of 50 packets on egress interfaces. 
Everything pretty much worked well and the net was happy. Then these base assumptions got blasted in the name of "measurable bandwidth" and the concept of packets are so precious we must not loose them, at almost any cost. Linux crammed the per interface Q up to 1000, wifi decided that it was reasable to retry at the link layer so many times that I have seen packets that are >60 seconds old. Proposed FIX: Any device that transmits packets that does not already have an inherit FIXED transmission time MUST consider the current TTL of that packet and give up if > 10mS * TTL elapses while it is trying to transmit. AND change the default if Q size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine at 1000 as it has delay targets that present the issue that initially bumping this to 1000 caused. ... end of Rods Rant ... -- Rod Grimes rgrimes@freebsd.org<mailto:rgrimes@freebsd.org> _______________________________________________ Starlink mailing list Starlink@lists.bufferbloat.net<mailto:Starlink@lists.bufferbloat.net> https://lists.bufferbloat.net/listinfo/starlink _______________________________________________ Starlink mailing list Starlink@lists.bufferbloat.net<mailto:Starlink@lists.bufferbloat.net> https://lists.bufferbloat.net/listinfo/starlink [-- Attachment #2: Type: text/html, Size: 61972 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed 2021-12-11 9:18 ` Ankit Singla @ 2021-12-13 1:16 ` Dave Taht 0 siblings, 0 replies; 15+ messages in thread From: Dave Taht @ 2021-12-13 1:16 UTC (permalink / raw) To: Ankit Singla; +Cc: George Burdell, starlink, Sam Kumar On Sun, Dec 12, 2021 at 1:54 PM Ankit Singla <ankit.singla@inf.ethz.ch> wrote: > > Sorry I didn’t engage with this, folks — probably came across as rude, but just had a large and unexpected career shift ongoing (https://twitter.com/stub_AS/status/1469283183132876809?s=20), and didn’t feel up to it, especially, as I’m largely abandoning my research along these lines due to these developments. Congrats on your new gig! It'll be very different from what you're used to. I have a tendency to cc the author(s) of any given paper I like in the hope that it sparks discussion. We get very little "hallway conversation" in this COVID era, and certainly back in June, with the wild success of that LEO conference, I'd hoped for an explosion and exposition of new ideas for moving our civilization out of LEO finally. I've spent the last few months in my all too few idle moments trying to envision new ideas for new payloads, new communication technologies, and services with a vastly lowered cost to orbit, and newly cost-effective means to fling useful small payloads to and from the closer NEOs (my oft-lonely hobby for over 30 years now). > In any case, I have a lot of respect for you folks educating everyone on latency and buffer bloat, and have been following Dave (Taht)’s great work in the space for awhile. thx. I am really hoping someone cracks the cross-world sat-to-sat laser congestion problem!!! Maybe you'll find a way to implement some of our new P4 stuff at scale? I'm pretty happy with proposing cake + bql + some sort of packet-pair based beam allocator (a toy sketch of the packet-pair idea follows this message), but without the ability to hack on the dishy directly, I have mostly given up. I don't feel like writing a simulator. I am just waiting patiently for an observable change in the measurements and fiddling now a bit with DTN. I too have to admit that the "call of the data center" has been very, very loud (and profitable) of late. But my heart's in space, and teeny routers. > Best, > Ankit > > On Jul 19, 2021, at 17:50, George Burdell <gb@teklibre.net> wrote: > > On Sat, Jul 10, 2021 at 01:27:28PM -0700, David Lang wrote: > > any buffer sizing based on the number of packets is wrong. Base your buffer > size on transmit time and you have a chance of being reasonable. > > > This is very true. Packets have a dynamic range of 64 bytes to 64k (GRO) and > sizing queues in terms of packets leads to bad behavior on mixed up and > down traffic particularly. > > Also... people doing AQM and TCP designs tend to almost always > test one way traffic only, and this leads to less than desirable behavior > on real world traffic. Strike that. Terrible behavior! a pure > single queue AQM struggles mightily to find a good hit rate when there are a > ton of acks, dns, gaming, voip, etc, mixed in with the capacity seeking > flows. > > Nearly every AQM paper you read never tests real, bidir traffic. It's > a huge blind spot, which is why the bufferbloat effort *starts* with > the rrul test and related on testing any new idea we have. > > bfifos are better, but harder to implement in hardware.
> > A fun trick: If you are trying to optimize your traffic for R/T communications > rather than speedtest, you can clamp your tcp "mss" to smaller than 600 > bytes *at the router*, and your network gets better. > > (we really should get around to publishing something on that, when you are > plagued by a giant upstream FIFO, filling it with smaller packets really > helps, and it's something a smart user could easily do regardless of the > ISP's preferences) > > > In cases like wifi where packets aren't sent individually, but are sent in > blobs of packets going to the same destination, > > > yes... > > you want to buffer at least > a blobs worth of packets to each destination so that when your transmit slot > comes up, you can maximize it. > > > Nooooooo! This is one of those harder tradeoffs that is pretty counter > intuitive. You want per station queuing, yes. However the decision as to > how much service time you want to grant each station is absolutely > not in maximizing the transmit slot, but in maximizing the number of > stations you can serve in reasonable time. Simple (and inaccurate) example: > > 100 stations at 4ms txop each, stuffed full of *udp* data, is 400ms/round. > (plus usually insane numbers of retries). > > This breaks a lot of things, > and doesn't respect the closely coupled nature of tcp (please re-read > the codel paper!). Cutting the txop in this case to 1ms cuts interstation > service time... at the cost of "bandwidth" that can't be stuffed into > the slow header + wifi data rate equation. > > but what you really want to do is give the sparsest stations quicker > access to the media so they can ramp up to parity (and usually > complete their short flows much faster, and then get off) > > I run with BE 2.4ms txops and announce the same in the beacon. I'd > be willing to bet your scale conference network would work > much better if you did that also. (It would be better if we could > scale txop size to the load, but fq_codel on wifi already > does the sparse station optimization which translates into many > shorter txops than you would see from other wifi schedulers, and > the bulk of the problem I see is the *stations*) > > lastly, you need to defer constructing the blob as long as possible, > so you can shoot at, mark, or reschedule (FQ), the packets in there > at the last moment before they are committed to the hardware. > > Ideally you would not construct any blob at all until a few microseconds > before the transmit opportunity. > > Shifting this back to starlink - they have a marvelous opportunity > to do just this, in the dishy, as they are half duplex and could > defer grabbing the packets from a sch_cake buffer until precisely > before that txop to the sat arrives. > (my guess would be no more than 400us based on what I understand > of the arm chip they are using) > > This would be much better than what we could do in the ath9k > where we were forced to always have "one in the hardware, one > ready to go" due to limitations in that chip. We're making > some progress on the openwifi fpga here, btw... 
> > > Wifi has the added issue that the blob headers are at a much lower data rate > than the dta itself, so you can cram a LOT of data into a blob without > making a significant difference in the airtime used, so you really do want > to be able to send full blobs (not at the cost of delaying tranmission if > you don't have a full blob, a mistake some people make, but you do want to > buffer enough to fill the blobs) > > and given that dropped packets results in timeouts and retransmissions that > affect the rest of the network, it's not obviously wrong for a lossy hop > like wifi to retry a failed transmission, it just needs to not retry too > many times. > > David Lang > > > On Sat, 10 Jul 2021, Rodney W. Grimes wrote: > > Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT) > From: Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> > To: Dave Taht <dave.taht@gmail.com> > Cc: starlink@lists.bufferbloat.net, Ankit Singla <asingla@ethz.ch>, > Sam Kumar <samkumar@cs.berkeley.edu> > Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet > testbed > > While it is good to have a call to arms, like this: > > ... much information removed as I only one to reply to 1 very > narrow, but IMHO, very real problem in our networks today ... > > Here's another piece of pre-history - alohanet - the TTL field was the > "time to live" field. The intent was that the packet would indicate > how much time it would be valid before it was discarded. It didn't > work out, and was replaced by hopcount, which of course switched > networks ignore and isonly semi-useful for detecting loops and the > like. > > > TTL works perfectly fine where the original assumptions that a > device along a network path only hangs on to a packet for a > reasonable short duration, and that there is not some "retry" > mechanism in place that is causing this time to explode. BSD, > and as far as I can recall, almost ALL original IP stacks had > a Q depth limit of 50 packets on egress interfaces. Everything > pretty much worked well and the net was happy. Then these base > assumptions got blasted in the name of "measurable bandwidth" and > the concept of packets are so precious we must not loose them, > at almost any cost. Linux crammed the per interface Q up to 1000, > wifi decided that it was reasable to retry at the link layer so > many times that I have seen packets that are >60 seconds old. > > Proposed FIX: Any device that transmits packets that does not > already have an inherit FIXED transmission time MUST consider > the current TTL of that packet and give up if > 10mS * TTL elapses > while it is trying to transmit. AND change the default if Q > size in LINUX to 50 for fifo, the codel, etc AQM stuff is fine > at 1000 as it has delay targets that present the issue that > initially bumping this to 1000 caused. > > ... end of Rods Rant ... 
> > -- > Rod Grimes rgrimes@freebsd.org > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink > > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink -- I tried to build a better future, a few times: https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org Dave Täht CEO, TekLibre, LLC ^ permalink raw reply [flat|nested] 15+ messages in thread
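Since the reply above mentions a packet-pair based beam allocator, here is a minimal sketch of the classic packet-pair dispersion estimate such an allocator might start from. It is illustrative only: the probe size and the arrival gaps are invented sample numbers, and this is not Starlink's, cake's, or BQL's actual code.

# Minimal packet-pair dispersion estimate: two packets sent back to back
# are spread out by the bottleneck, so the spacing seen at the receiver
# bounds the bottleneck rate. Purely illustrative numbers below.
from statistics import median

def pair_estimate_bps(pkt_bytes: int, arrival_gap_s: float) -> float:
    """Bottleneck rate implied by one packet pair's arrival spacing."""
    return pkt_bytes * 8 / arrival_gap_s

def robust_estimate_bps(pkt_bytes: int, arrival_gaps_s: list) -> float:
    """Median over many pairs, to damp cross-traffic and scheduling noise."""
    return median(pair_estimate_bps(pkt_bytes, g) for g in arrival_gaps_s)

if __name__ == "__main__":
    # Assumed measurements: 1200-byte probes, receive-side gaps in seconds.
    gaps = [0.00096, 0.00101, 0.00352, 0.00099, 0.00104]
    print(f"~{robust_estimate_bps(1200, gaps)/1e6:.1f} Mbit/s estimated bottleneck")

Taking the median rather than the mean is the usual design choice here, since a single pair stretched by cross-traffic (the 3.52 ms gap above) would otherwise drag the estimate well below the true bottleneck rate.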
end of thread, other threads:[~2021-12-13 1:16 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <mailman.5.1626019201.21244.starlink@lists.bufferbloat.net> 2021-07-13 1:23 ` [Starlink] SatNetLab: A call to arms for the next global> Internet testbed David P. Reed 2021-07-13 1:27 ` Vint Cerf 2021-07-13 1:57 ` David Lang 2021-07-13 12:39 ` Rodney W. Grimes 2021-07-13 18:01 ` David Lang 2021-07-13 18:06 ` Ben Greear 2021-07-13 18:13 ` David Lang 2021-07-13 18:25 ` Ben Greear 2021-07-13 21:23 ` David Lang 2021-07-09 19:19 [Starlink] SatNetLab: A call to arms for the next global " Dave Taht 2021-07-10 11:49 ` Rodney W. Grimes 2021-07-10 20:27 ` David Lang 2021-07-19 15:50 ` George Burdell 2021-12-11 9:18 ` Ankit Singla 2021-12-13 1:16 ` Dave Taht