Cake - FQ_codel the next generation
* Re: [Cake] cake and nat in openwrt... on by default?
  2020-04-05  7:57  0% ` Kevin Darbyshire-Bryant
@ 2020-04-05 15:22  1%   ` Dave Taht
  0 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-04-05 15:22 UTC (permalink / raw)
  To: Kevin Darbyshire-Bryant; +Cc: Cake List

On Sun, Apr 5, 2020 at 12:57 AM Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
>
>
>
> > On 5 Apr 2020, at 05:17, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > I see cake is moving to the upstreamed version. As best as I recall,
> > nat mode was on by default in the openwrt code, but not the upstreamed
> > code.
> >
> > People not setting nat mode on would explain a few things i've seen
> > 'round the intertubes this week.
>
> From sch_cake repo and hence ‘out of tree’ cake
>
>         if (tb[TCA_CAKE_NAT]) {
> #if IS_REACHABLE(CONFIG_NF_CONNTRACK)
>                 q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
>                 q->flow_mode |= CAKE_FLOW_NAT_FLAG *
>                         !!nla_get_u32(tb[TCA_CAKE_NAT]);
> #else
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0)
>                 NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
>                                     "No conntrack support in kernel");
> #endif
>                 return -EOPNOTSUPP;
> #endif
>         }
>
>
> From kernel 5.4 as found in openwrt build dir
>
>         if (tb[TCA_CAKE_NAT]) {
> #if IS_ENABLED(CONFIG_NF_CONNTRACK)
>                 q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
>                 q->flow_mode |= CAKE_FLOW_NAT_FLAG *
>                         !!nla_get_u32(tb[TCA_CAKE_NAT]);
> #else
>                 NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
>                                     "No conntrack support in kernel");
>                 return -EOPNOTSUPP;
> #endif
>
>
>
> cake_init(…) in both does:
>
> q->flow_mode  = CAKE_FLOW_TRIPLE;
>
>
> So openwrt doesn’t, by default, enable NAT mode in cake.
>
> I honestly don’t think that there are enough instances of cake out there, let alone instances of cake from openwrt, let alone instances of cake from master which switched to upstream cake 2-3 days ago, to make any sort of difference anyway.

I'd still be willing to bet, then, that the majority of instances were
not turning nat mode on when they should have been.

>
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
> Cheers,
>
> Kevin D-B
>
> gpg: 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
>


-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


* Re: [Cake] cake and nat in openwrt... on by default?
  @ 2020-04-05  7:57  0% ` Kevin Darbyshire-Bryant
  2020-04-05 15:22  1%   ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: Kevin Darbyshire-Bryant @ 2020-04-05  7:57 UTC (permalink / raw)
  To: Dave Taht; +Cc: Cake List




> On 5 Apr 2020, at 05:17, Dave Taht <dave.taht@gmail.com> wrote:
> 
> I see cake is moving to the upstreamed version. As best as I recall,
> nat mode was on by default in the openwrt code, but not the upstreamed
> code.
> 
> People not setting nat mode on would explain a few things i've seen
> 'round the intertubes this week.

From sch_cake repo and hence ‘out of tree’ cake

        if (tb[TCA_CAKE_NAT]) {
#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
                q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
                q->flow_mode |= CAKE_FLOW_NAT_FLAG *
                        !!nla_get_u32(tb[TCA_CAKE_NAT]);
#else
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0)
                NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
                                    "No conntrack support in kernel");
#endif
                return -EOPNOTSUPP;
#endif
        }


From kernel 5.4 as found in openwrt build dir

        if (tb[TCA_CAKE_NAT]) {
#if IS_ENABLED(CONFIG_NF_CONNTRACK)
                q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
                q->flow_mode |= CAKE_FLOW_NAT_FLAG *
                        !!nla_get_u32(tb[TCA_CAKE_NAT]);
#else
                NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
                                    "No conntrack support in kernel");
                return -EOPNOTSUPP;
#endif



cake_init(…) in both does:

q->flow_mode  = CAKE_FLOW_TRIPLE;


So openwrt doesn’t, by default, enable NAT mode in cake.
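
A minimal user-space sketch of what the two snippets above do to q->flow_mode, assuming nothing beyond what is quoted. The numeric values given to CAKE_FLOW_TRIPLE and CAKE_FLOW_NAT_FLAG are assumptions for illustration (check sch_cake.c for the real definitions), and flow_mode_default() and apply_nat_attr() are hypothetical stand-ins for cake_init() and the TCA_CAKE_NAT branch:

```c
#include <stdint.h>

/* Assumed values, for illustration only; see sch_cake.c for the real ones. */
#define CAKE_FLOW_TRIPLE   7u
#define CAKE_FLOW_NAT_FLAG 64u

/* Stand-in for cake_init(): triple-isolate mode, NAT flag left clear. */
uint32_t flow_mode_default(void)
{
	return CAKE_FLOW_TRIPLE;
}

/* Stand-in for the TCA_CAKE_NAT branch: clear the flag first, then set
 * it again only when the attribute value is non-zero. */
uint32_t apply_nat_attr(uint32_t flow_mode, uint32_t attr)
{
	flow_mode &= ~CAKE_FLOW_NAT_FLAG;
	flow_mode |= CAKE_FLOW_NAT_FLAG * !!attr;
	return flow_mode;
}
```

The point of the sketch is that the flag only changes inside the attribute branch, so a qdisc created without that attribute keeps the default with the NAT flag clear.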

I honestly don’t think that there are enough instances of cake out there, let alone instances of cake from openwrt, let alone instances of cake from master which switched to upstream cake 2-3 days ago, to make any sort of difference anyway.

> 
> --
> Make Music, Not War
> 
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake


Cheers,

Kevin D-B

gpg: 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A




* Re: [Cake] [Bloat]  New board that looks interesting
  2020-04-04 16:27  1%     ` Aaron Wood
@ 2020-04-04 17:36  1%       ` Dave Taht
  0 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-04-04 17:36 UTC (permalink / raw)
  To: Aaron Wood; +Cc: Cake List, David P. Reed, Make-Wifi-fast, bloat

I think I'll wait for y'all to try it and report back. I trust my
apu2s, and I actually kind of like that they lack a graphics chip and
need to be configured via serial port.

In other news I've started testing Ubuntu 20.04, which among other
things, has wireguard in it. I've been really frustrated with the
state of distributions lately, trying to get any complex thing done
has required snaps and docker containers and I really prefer running
stuff natively when possible. Tools that I still rely on like mrtg and
smokeping are undermaintained, trying to get zoneminder to co-exist
and co-install with anything else (notably jitsi thus far) has been a
real PITA.

I am pleased at the increasing size of the ipv6 deployment, my phone
got it last month....

I think I've found a babel bug with default routes...

and I fired up a kernel build to go hack on the ax200 chips.

On Sat, Apr 4, 2020 at 9:27 AM Aaron Wood <woody77@gmail.com> wrote:
>
> The comparison of chipset performance link (to OpenWrt forums) that went out had this chip, the J4105, as the fastest.  Able to do a gigabit with cake (nearly able to do it in both directions).
>
> I think this has replaced the apu2 as the board I’m going with as my edge router.
>
> On Sat, Apr 4, 2020 at 9:10 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> Historically I've found the "Celeron" chips rather weak, but it's just
>> a brand. I haven't the foggiest idea how well this variant will
>> perform.
>>
>> The intel ethernet chips are best of breed in linux, however. It's
>> been my hope that the 211 variant with the timed networking support
>> would show up in the field (sch_etx) so we could fiddle with that,
>> (the apu2s aren't using that version) but I cannot for the life of me
>> remember the right keywords to look it up at the moment. this feature
>> lets you program when a packet emerges from the driver and is sort of
>> a whole new ballgame when it comes to scheduling - there hasn't been
>> an aqm designed for it, and you can do fq by playing tricks with the
>> sent timestamp.
>>
>> All the other features look rather nice on this board.
>>
>> On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com> wrote:
>> >
>> > Thanks! I ordered one just now. In my experience, this company does rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really useful. What's the state of play in Linux/OpenWRT for Intel 9560 capabilities regarding AQM?
>> >
>> > On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com> said:
>> >
>> > > _______________________________________________
>> > > Cake mailing list
>> > > Cake@lists.bufferbloat.net
>> > > https://lists.bufferbloat.net/listinfo/cake
>> > > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
>> > >
>> > > quad-core Celeron J4105 1.5-2.5 GHz x64
>> > > 8GB Ram
>> > > 2x i211t intel ethernet controllers
>> > > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
>> > > intel built-in graphics
>> > > onboard ARM Cortex-M0 and RPi & Arduino headers
>> > > m.2 and PCIe adapters
>> > > <$200
>> > >
>> >
>> >
>> > _______________________________________________
>> > Bloat mailing list
>> > Bloat@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/bloat
>>
>>
>>
>> --
>> Make Music, Not War
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>
> --
> - Sent from my iPhone.



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


* Re: [Cake] [Bloat]  New board that looks interesting
  2020-04-04 16:10  1%   ` [Cake] [Bloat] " Dave Taht
@ 2020-04-04 16:27  1%     ` Aaron Wood
  2020-04-04 17:36  1%       ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: Aaron Wood @ 2020-04-04 16:27 UTC (permalink / raw)
  To: Dave Taht; +Cc: Cake List, David P. Reed, Make-Wifi-fast, bloat


The comparison of chipset performance link (to OpenWrt forums) that went
out had this chip, the J4105, as the fastest.  Able to do a gigabit with
cake (nearly able to do it in both directions).

I think this has replaced the apu2 as the board I’m going with as my edge
router.

On Sat, Apr 4, 2020 at 9:10 AM Dave Taht <dave.taht@gmail.com> wrote:

> Historically I've found the "Celeron" chips rather weak, but it's just
> a brand. I haven't the foggiest idea how well this variant will
> perform.
>
> The intel ethernet chips are best of breed in linux, however. It's
> been my hope that the 211 variant with the timed networking support
> would show up in the field (sch_etx) so we could fiddle with that,
> (the apu2s aren't using that version) but I cannot for the life of me
> remember the right keywords to look it up at the moment. this feature
> lets you program when a packet emerges from the driver and is sort of
> a whole new ballgame when it comes to scheduling - there hasn't been
> an aqm designed for it, and you can do fq by playing tricks with the
> sent timestamp.
>
> All the other features look rather nice on this board.
>
> On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Thanks! I ordered one just now. In my experience, this company does
> rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really
> useful. What's the state of play in Linux/OpenWRT for Intel 9560
> capabilities regarding AQM?
> >
> > On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com>
> said:
> >
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> > > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
> > >
> > > quad-core Celeron J4105 1.5-2.5 GHz x64
> > > 8GB Ram
> > > 2x i211t intel ethernet controllers
> > > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
> > > intel built-in graphics
> > > onboard ARM Cortex-M0 and RPi & Arduino headers
> > > m.2 and PCIe adapters
> > > <$200
> > >
> >
> >
> > _______________________________________________
> > Bloat mailing list
> > Bloat@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/bloat
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>
-- 
- Sent from my iPhone.



* Re: [Cake] [Bloat]  New board that looks interesting
  2020-04-04 14:47  1% ` David P. Reed
@ 2020-04-04 16:10  1%   ` Dave Taht
  2020-04-04 16:27  1%     ` Aaron Wood
  0 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-04-04 16:10 UTC (permalink / raw)
  To: David P. Reed; +Cc: Aaron Wood, Cake List, Make-Wifi-fast, bloat

Historically I've found the "Celeron" chips rather weak, but it's just
a brand. I haven't the foggiest idea how well this variant will
perform.

The intel ethernet chips are best of breed in linux, however. It's
been my hope that the 211 variant with the timed networking support
would show up in the field (sch_etx) so we could fiddle with that,
(the apu2s aren't using that version) but I cannot for the life of me
remember the right keywords to look it up at the moment. this feature
lets you program when a packet emerges from the driver and is sort of
a whole new ballgame when it comes to scheduling - there hasn't been
an aqm designed for it, and you can do fq by playing tricks with the
sent timestamp.

All the other features look rather nice on this board.

On Sat, Apr 4, 2020 at 7:47 AM David P. Reed <dpreed@deepplum.com> wrote:
>
> Thanks! I ordered one just now. In my experience, this company does rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really useful. What's the state of play in Linux/OpenWRT for Intel 9560 capabilities regarding AQM?
>
> On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com> said:
>
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> > https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
> >
> > quad-core Celeron J4105 1.5-2.5 GHz x64
> > 8GB Ram
> > 2x i211t intel ethernet controllers
> > intel 9560 802.11ac (wave2) wifi/bluetooth chipset
> > intel built-in graphics
> > onboard ARM Cortex-M0 and RPi & Arduino headers
> > m.2 and PCIe adapters
> > <$200
> >
>
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


* Re: [Cake] New board that looks interesting
  @ 2020-04-04 14:47  1% ` David P. Reed
  2020-04-04 16:10  1%   ` [Cake] [Bloat] " Dave Taht
  0 siblings, 1 reply; 34+ results
From: David P. Reed @ 2020-04-04 14:47 UTC (permalink / raw)
  To: Aaron Wood; +Cc: cake, bloat, Make-Wifi-fast

Thanks! I ordered one just now. In my experience, this company does rather neat stuff. Their XMOS based microphone array (ReSpeaker) is really useful. What's the state of play in Linux/OpenWRT for Intel 9560 capabilities regarding AQM?

On Saturday, April 4, 2020 12:12am, "Aaron Wood" <woody77@gmail.com> said:

> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
> https://www.seeedstudio.com/ODYSSEY-X86J4105800-p-4445.html
> 
> quad-core Celeron J4105 1.5-2.5 GHz x64
> 8GB Ram
> 2x i211t intel ethernet controllers
> intel 9560 802.11ac (wave2) wifi/bluetooth chipset
> intel built-in graphics
> onboard ARM Cortex-M0 and RPi & Arduino headers
> m.2 and PCIe adapters
> <$200
> 




* Re: [Cake] tc-cake(8) needs to explain a common mistake
  2020-04-03 20:44  0%   ` Sebastian Moeller
@ 2020-04-03 21:37  1%     ` Alexander E. Patrakov
  0 siblings, 0 replies; 34+ results
From: Alexander E. Patrakov @ 2020-04-03 21:37 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Dave Täht, Cake List

On Sat, Apr 4, 2020 at 1:44 AM Sebastian Moeller <moeller0@gmx.de> wrote:

> >>
> >> Example 1: the ADSL modem connects at 18 Mbit/s, but the ISP further
> >> throttles the speed to 15 Mbit/s because that's what the user pays
> >> for, and does so with a shaper that has bufferbloat. Then, the "adsl"
> >> keyword is likely not appropriate, because the ISP's shaper operates
> >> on the IP level. The bandwidth needs to be set slightly below 15
> >> Mbit/s.
>
>         Let's run the number shall we? I simply make a few assumptions here to get things started, but the exact numbers really do not matter too much. With that said, let's assume TCP/IPv4 and ATM/AAL5, PPPoE, LLC/SNAP, RFC-2684;
> Overhead (bytes): PPP (2), PPPoE (6), Ethernet Header (14), Ethernet PAD [8] (0), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 40
>
> Let's see what the link will be able to deliver for "full" MTU 1500 packets (quotes as the MTU1500 will only carry to the PPPoE endpoint, internet MTU is going to be 1492)
> gross-rate * ((payload size) / (on the wire size)) = net speedtest result (let's use this as proxy as this is what people can easily verify/check)
>
> MTU1500: 18.000 * ((1500-20-20-8) / (ceil((1500-8+40)/48)*53)) = 15.410
> MTU150: 18.000 * ((150-20-20-8) / (ceil((150-8+40)/48)*53)) = 8.66037735849
> MTU75: 18.000 * ((75-20-20-8) / (ceil((75-8+40)/48)*53)) = 3.05660377358
>
>
> Now the IP-level shaper at ~80% of the link-speed, if it does not account for the ATM/AAL5 "celling" even if it gets the overhead correctly will give the following:
>
> MTU1500: 15.000 * ((1500-20-20-8) / (ceil((1500-8+40)/1)*1)) = 14.2167101828
> MTU150: 15.000 * ((150-20-20-8) / (ceil((150-8+40)/1)*1)) = 8.40659340659
> MTU75: 15.000 * ((75-20-20-8) / (ceil((75-8+40)/1)*1)) = 3.78504672897
>
> So for large enough packets static accounting for ATM/AAL5 works reasonably well, but for small packets it fails.
> That is why most ISP-grade equipment allows not only to configure the per-packet-overhead for end-user links but also can deal with ATM/AAL5. And as far as I understand most competent ISPs actually configure their traffic-shapers for ADSL links to do this, because DSLAMs are really more like L2-switches with fancy media-converters attached and deal not terribly well with overload and queueing into the switch fabric.
>
> That in turn leads to the following situation:
> MTU1500: 15.000 * ((1500-20-20-8) / (ceil((1500-8+40)/48)*53)) = 12.842
> MTU150: 15.000 * ((150-20-20-8) / (ceil((150-8+40)/48)*53)) = 7.217
> MTU75: 15.000 * ((75-20-20-8) / (ceil((75-8+40)/48)*53)) = 2.547
>
> which will obviously not cause packet buffering in the DSLAM for any packet size mix the link might encounter. AND that in turn means that the actual bottleneck link (the ISP's traffic shaper) still behaves like it would employ ATM/AAL5 encapsulation, and hence the end-user's SQM instance should do as well.

OK, bad (marginal) example, let's adjust it so that the user pays for
10 Mbit/s. Or replace with a 100BASE-TX link shaped (badly) to 50
Mbit/s by the ISP. The point is that sometimes the ISP shaper is what
matters, and ISPs like to sell bandwidth in packages with round
numbers.

And, is the accounting for ATM/AAL5 the default on equipment that ISPs
use for ADSL?

P.S. The example actually comes from my experience with the "Globe"
ISP in the Philippines during my trip there - I had to specifically
ask to increase the speed limit, it was initially 10 Mbit/s and then
upgraded to 15 Mbit/s. The modem always connected at 18 - 19 Mbit/s,
and there was a 21 Mbit/s value once (when I connected the modem to
the powerbank during power outage). And I am not sure that we are
always talking about competent ISPs.

<snip>

> P.S.: I am of the opinion, that https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details had very sane and un-cargo-culty advice about the overhead topic:
> "Getting [overhead and link layer accounting] exactly right is less important than getting it close, and over-estimating by a few bytes is generally better at keeping bufferbloat down than underestimating. With this in mind, to get started, set the Link Layer Adaptation options based on your connection to the Internet. "
>
> I am less sure about the paragraph you added recently, as it does not seem to consider all the applicable subtleties.

Should I undo it?

-- 
Alexander E. Patrakov
CV: http://pc.cd/PLz7


* Re: [Cake] tc-cake(8) needs to explain a common mistake
  2020-04-03 18:49  1% ` [Cake] tc-cake(8) needs to explain a common mistake Dave Taht
@ 2020-04-03 20:44  0%   ` Sebastian Moeller
  2020-04-03 21:37  1%     ` Alexander E. Patrakov
  0 siblings, 1 reply; 34+ results
From: Sebastian Moeller @ 2020-04-03 20:44 UTC (permalink / raw)
  To: Dave Täht; +Cc: Alexander E. Patrakov, Cake List



> On Apr 3, 2020, at 20:49, Dave Taht <dave.taht@gmail.com> wrote:
> 
> so nice to know cake has made it to russia!!!
> 
> On Fri, Apr 3, 2020 at 11:46 AM Alexander E. Patrakov
> <patrakov@gmail.com> wrote:
>> 
>> Hello,
>> 
>> there is a recurring cargo cult pattern in many forums (e.g. OpenWRT):
>> people keep suggesting various overhead compensation parameters to
>> tc-cake without checking what's the bottleneck. They just assume that
>> it is always related to the link-layer technology of the connection.
>> 
>> This assumption is mostly incorrect, and this needs to be explained in
>> the manual page to stop the cargo cult. E.g., here in Russia, in the
>> past year, I had a 1Gbit/s link (1000BASE-X) but they shaped my
>> connection down to 500 Mbit/s because that's the bandwidth that I paid
>> for. I.e. the link from my router to the ISP equipment was not the
>> bottleneck, it was the ISP's shaper.
>> 
>> How about the following addition to the tc-cake(8) manual page, just
>> before "Manual Overhead Specification"? Feel free to edit.
>> 
>> General considerations
>> -------------------------------
>> 
>> Do not blindly set the overhead compensation parameters to match the
>> internet connection link type and protocols running on it. Doing so
>> makes sense only if that link (and not something further in the path,
>> like the ISP's shaper) is indeed the bottleneck.

	Well, in general yes, but in reality a competent ISP will configure its shapers to account for the link properties, so assuming the access link's overhead/encapsulation gives a reasonable first guess at what values might be optimal.


>> 
>> Example 1: the ADSL modem connects at 18 Mbit/s, but the ISP further
>> throttles the speed to 15 Mbit/s because that's what the user pays
>> for, and does so with a shaper that has bufferbloat. Then, the "adsl"
>> keyword is likely not appropriate, because the ISP's shaper operates
>> on the IP level. The bandwidth needs to be set slightly below 15
>> Mbit/s.

	Let's run the numbers, shall we? I simply make a few assumptions here to get things started, but the exact numbers really do not matter too much. With that said, let's assume TCP/IPv4 and ATM/AAL5, PPPoE, LLC/SNAP, RFC-2684;
Overhead (bytes): PPP (2), PPPoE (6), Ethernet Header (14), Ethernet PAD [8] (0), ATM LLC (3), ATM SNAP (5), ATM pad (2), ATM AAL5 SAR (8) : Total 40

Let's see what the link will be able to deliver for "full" MTU 1500 packets (quotes as the MTU1500 will only carry to the PPPoE endpoint, internet MTU is going to be 1492)
gross-rate * ((payload size) / (on the wire size)) = net speedtest result (let's use this as proxy as this is what people can easily verify/check)

MTU1500: 18.000 * ((1500-20-20-8) / (ceil((1500-8+40)/48)*53)) = 15.410
MTU150: 18.000 * ((150-20-20-8) / (ceil((150-8+40)/48)*53)) = 8.66037735849
MTU75: 18.000 * ((75-20-20-8) / (ceil((75-8+40)/48)*53)) = 3.05660377358


Now take the IP-level shaper at 15 Mbit/s (~83% of the link speed). If it does not account for the ATM/AAL5 "celling", then even if it gets the per-packet overhead right it will deliver the following:

MTU1500: 15.000 * ((1500-20-20-8) / (ceil((1500-8+40)/1)*1)) = 14.2167101828
MTU150: 15.000 * ((150-20-20-8) / (ceil((150-8+40)/1)*1)) = 8.40659340659
MTU75: 15.000 * ((75-20-20-8) / (ceil((75-8+40)/1)*1)) = 3.78504672897

So for large enough packets static accounting for ATM/AAL5 works reasonably well, but for small packets it fails: at MTU75 the shaper admits 3.785 Mbit/s while the 18 Mbit/s ATM link can only carry 3.057 Mbit/s, so the excess queues up at the DSLAM.
That is why most ISP-grade equipment can not only configure the per-packet overhead for end-user links but also deal with ATM/AAL5. As far as I understand, most competent ISPs actually configure their traffic shapers for ADSL links to do this, because DSLAMs are really more like L2 switches with fancy media converters attached and do not deal terribly well with overload and queueing into the switch fabric.

That in turn leads to the following situation:
MTU1500: 15.000 * ((1500-20-20-8) / (ceil((1500-8+40)/48)*53)) = 12.842
MTU150: 15.000 * ((150-20-20-8) / (ceil((150-8+40)/48)*53)) = 7.217
MTU75: 15.000 * ((75-20-20-8) / (ceil((75-8+40)/48)*53)) = 2.547

which will obviously not cause packet buffering in the DSLAM for any packet size mix the link might encounter. AND that in turn means that the actual bottleneck link (the ISP's traffic shaper) still behaves like it would employ ATM/AAL5 encapsulation, and hence the end-user's SQM instance should do as well.
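
The three tables above can be reproduced with a short sketch. It only restates the formulas as written (40 bytes of per-packet overhead, 8 bytes of PPP/PPPoE inside the 1500-byte MTU, 20 bytes each for IPv4 and TCP); the function names goodput_atm() and goodput_no_cells() are made up for this illustration:

```c
/* Per-packet overhead from the list above: PPP 2 + PPPoE 6 + Ethernet 14
 * + LLC 3 + SNAP 5 + ATM pad 2 + AAL5 8 = 40 bytes. */
#define OVERHEAD 40

/* Goodput in Mbit/s when the bottleneck counts whole 53-byte ATM cells,
 * each carrying 48 bytes of AAL5 data. */
double goodput_atm(double rate_mbit, int mtu)
{
	int payload = mtu - 20 - 20 - 8;   /* minus IPv4, TCP, PPP+PPPoE */
	int framed  = mtu - 8 + OVERHEAD;  /* IP packet plus overhead */
	int cells   = (framed + 47) / 48;  /* integer ceil(framed / 48) */

	return rate_mbit * payload / (cells * 53);
}

/* Goodput when the shaper counts the same overhead but ignores cells. */
double goodput_no_cells(double rate_mbit, int mtu)
{
	int payload = mtu - 20 - 20 - 8;
	int framed  = mtu - 8 + OVERHEAD;

	return rate_mbit * payload / framed;
}
```

Comparing goodput_no_cells(15.0, 75) with goodput_atm(18.0, 75) shows the small-packet case where a shaper that ignores cell framing admits more than the ATM link can carry.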



>> 
>> Example 2: the ADSL modem connects at 18 Mbit/s, and the user pays for
>> "as fast as the modem can get" connection. Then, the "adsl" keyword is
>> relevant, and the bandwidth needs to be set to 18 Mbit/s.

	IMHO that is simply a less opaque version of example 1, but I agree here ATM/AAL5 accounting needs to be done...


>> 
>> Example 3: the user has a 100BASE-TX Ethernet connection, and pays for
>> the full 100 Mbit/s bandwidth (i.e. there is no shaper further up).
>> Then, the "ethernet" keyword is relevant, and the bandwidth needs to
>> be set to 100 Mbit/s.

	Yes, if the true bottleneck is an ethernet link, then ethernet overhead needs to be accounted for, so overhead 38 bytes (or 42 if VLANs are in use) would be correct. 
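
For reference, the 38 and 42 byte figures can be built up from the standard Ethernet framing fields. This is a sketch for illustration; the enum and function names are hypothetical, and eth_goodput() assumes TCP/IPv4 with 20-byte headers each, as in the ADSL calculation earlier:

```c
/* On-the-wire Ethernet framing around each IP packet, in bytes. */
enum {
	ETH_PREAMBLE_SFD = 8,  /* 7 preamble + 1 start-of-frame delimiter */
	ETH_HEADER       = 14, /* dst MAC 6 + src MAC 6 + EtherType 2 */
	ETH_FCS          = 4,  /* frame check sequence */
	ETH_IFG          = 12, /* inter-frame gap */
	ETH_VLAN_TAG     = 4   /* 802.1Q tag, only if VLANs are in use */
};

/* Per-packet overhead: 38 bytes without a VLAN tag, 42 with one. */
int eth_overhead(int vlan)
{
	int o = ETH_PREAMBLE_SFD + ETH_HEADER + ETH_FCS + ETH_IFG;

	return vlan ? o + ETH_VLAN_TAG : o;
}

/* TCP/IPv4 goodput in Mbit/s for a gross rate and an IP packet size. */
double eth_goodput(double rate_mbit, int ip_len, int vlan)
{
	return rate_mbit * (ip_len - 40) / (ip_len + eth_overhead(vlan));
}
```

With these assumptions a 100 Mbit/s link carrying 1500-byte packets yields roughly 94.9 Mbit/s of TCP payload.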


I fully concur that properly accounting for the properties of the true bottleneck is the challenge here. With few exceptions this is hard, as we have two dependent variables to set: the gross shaper rate and the per-packet overhead. Looking at the numbers, I have reluctantly become convinced that getting the per-packet overhead exactly right is not that important as long as one does not underestimate it, so if in doubt, I recommend erring on the side of too much (but I value low latency above maximum throughput, like most people starting to look at sqm).

Best Regards
	Sebastian

P.S.: I am of the opinion, that https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details had very sane and un-cargo-culty advice about the overhead topic:
"Getting [overhead and link layer accounting] exactly right is less important than getting it close, and over-estimating by a few bytes is generally better at keeping bufferbloat down than underestimating. With this in mind, to get started, set the Link Layer Adaptation options based on your connection to the Internet. "

I am less sure about the paragraph you added recently, as it does not seem to consider all the applicable subtleties.


>> 
>> --
>> Alexander E. Patrakov
>> CV: http://pc.cd/PLz7
> 
> 
> 
> -- 
> Make Music, Not War
> 
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



* Re: [Cake] tc-cake(8) needs to explain a common mistake
       [not found]     <CAN_LGv1h8Ut4bGm7ZgYaGV_Tbdy3ABW+epb_p6jeX=TxnAvH1g@mail.gmail.com>
@ 2020-04-03 18:49  1% ` Dave Taht
  2020-04-03 20:44  0%   ` Sebastian Moeller
  0 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-04-03 18:49 UTC (permalink / raw)
  To: Alexander E. Patrakov, Cake List

so nice to know cake has made it to russia!!!

On Fri, Apr 3, 2020 at 11:46 AM Alexander E. Patrakov
<patrakov@gmail.com> wrote:
>
> Hello,
>
> there is a recurring cargo cult pattern in many forums (e.g. OpenWRT):
> people keep suggesting various overhead compensation parameters to
> tc-cake without checking what's the bottleneck. They just assume that
> it is always related to the link-layer technology of the connection.
>
> This assumption is mostly incorrect, and this needs to be explained in
> the manual page to stop the cargo cult. E.g., here in Russia, in the
> past year, I had a 1Gbit/s link (1000BASE-X) but they shaped my
> connection down to 500 Mbit/s because that's the bandwidth that I paid
> for. I.e. the link from my router to the ISP equipment was not the
> bottleneck, it was the ISP's shaper.
>
> How about the following addition to the tc-cake(8) manual page, just
> before "Manual Overhead Specification"? Feel free to edit.
>
> General considerations
> -------------------------------
>
> Do not blindly set the overhead compensation parameters to match the
> internet connection link type and protocols running on it. Doing so
> makes sense only if that link (and not something further in the path,
> like the ISP's shaper) is indeed the bottleneck.
>
> Example 1: the ADSL modem connects at 18 Mbit/s, but the ISP further
> throttles the speed to 15 Mbit/s because that's what the user pays
> for, and does so with a shaper that has bufferbloat. Then, the "adsl"
> keyword is likely not appropriate, because the ISP's shaper operates
> on the IP level. The bandwidth needs to be set slightly below 15
> Mbit/s.
>
> Example 2: the ADSL modem connects at 18 Mbit/s, and the user pays for
> "as fast as the modem can get" connection. Then, the "adsl" keyword is
> relevant, and the bandwidth needs to be set to 18 Mbit/s.
>
> Example 3: the user has a 100BASE-TX Ethernet connection, and pays for
> the full 100 Mbit/s bandwidth (i.e. there is no shaper further up).
> Then, the "ethernet" keyword is relevant, and the bandwidth needs to
> be set to 100 Mbit/s.
>
> --
> Alexander E. Patrakov
> CV: http://pc.cd/PLz7



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729


* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
  @ 2020-03-28 23:15  1%       ` David P. Reed
  0 siblings, 0 replies; 34+ results
From: David P. Reed @ 2020-03-28 23:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Dave Taht, Make-Wifi-fast, Anthony Minessale II, Cake List,
	Ken Rice, cerowrt-devel, bloat

Regarding EDF.

I've been pushing folks to move latency sensitive computing in ALL OS's to a version of EDF since about 1976. This was when I was in grad school working on distributed computing on LANs. In fact, it is where I got the idea for my Ph.D. thesis (completed in 1978) which pointed out a bigger idea - that getting ACID consistency [ACID hadn't been invented then as a term, we called it atomic actions] on data in a distributed system being processed by concurrent distributed transactions can be done by using timestamps that behave like the "deadlines" in EDF. In fact, the scheduling of code in my thesis was a generalized version of EDF, approximated because of the impossibility of perfect synchronization.

The Croquet system was a real-time, edge-based decentralized system with no central server. We demonstrated it with a Second Life-style virtual world that worked entirely on a set of laptops that could be across the country from each other. It was based on an OS implemented in a variant of the Squeak programming language, where the scheduling and object model was not process based but message based, with replicated computation synchronized via a shared "timestamp" that was used for execution scheduling (essentially distributed EDF). The latency requirement for this distributed virtual world was on the order of 100 msec simultaneity for mouse clicks affecting all participating nodes across the country in a virtual 3D world, with sound, etc.

Croquet was built in 2 years by 3 people (starting from scratch).  And scheduling was never a problem, nor was variable network delay (our protocol was based on UDP frames synchronized by the same timestamps used to synchronize each object method execution).

The operating system model is one I created within that modified Squeak environment as part of its base "interpreter", which wasn't a loop, but a scheduler using EDF.
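
For readers who have not met EDF, here is a minimal sketch of the idea, not the Croquet code: a run queue ordered by deadline rather than by arrival, where the deadline doubles as the message timestamp. The class and task names are hypothetical, and a real scheduler would also have to deal with clocks, preemption and missed deadlines.

```python
import heapq
import itertools


class EDFScheduler:
    """Toy earliest-deadline-first run queue (all names are hypothetical)."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # stable order for equal deadlines

    def submit(self, deadline, name, action):
        # The deadline acts as the task's timestamp.
        heapq.heappush(self._heap, (deadline, next(self._tie), name, action))

    def run(self):
        """Run every queued task in deadline order; return names in run order."""
        order = []
        while self._heap:
            deadline, _, name, action = heapq.heappop(self._heap)
            action(deadline)
            order.append(name)
        return order


def demo():
    sched = EDFScheduler()
    log = []
    # Submitted out of order on purpose.
    sched.submit(30, "render", log.append)
    sched.submit(10, "mouse", log.append)
    sched.submit(20, "audio", log.append)
    return sched.run(), log
```

Calling `demo()` runs the three tasks in deadline order regardless of the order in which they were submitted.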

To make this work properly, the programming model has to be unified around this kind of scheduling.

And here's why I am mentioning this. To put EDF *only* into the networking stack, but leave the userspace application living with the stupid Linux timesharing system scheduler, optimized for people typing commands on terminals every few seconds and running batch compilation, is the *worst of all possible ways to use EDF*.

Because it creates a huge mess bridging those two ideas.

Croquet is a much more complicated thing than a teleconferencing system, because it actually lets end users write simple programs that control the user interactive experience, 30 frames per second across the entire US, replicated on each computer, in the Squeak variant of Smalltalk. And we did it with 3 coders in a couple of years. (yes, they are skilled people - me, David A. Smith, and the late Andreas Raab, who died way too young).

In contrast, trying to bridge between EDF and regular Linux processes running under the ordinary scheduler, even with "nice" and all kinds of hacks, just to do a video conferencing system with fixed, non-programmable behavior, would take far more design, far more lines of code, etc.

So this is why I think timesharing OS's are really obsolescent for modern distributed interactive systems. Yeah, "rsync" and "git" are nice for batch replication of files. And yeah, EDF can help make them perform faster in their file transferring.

But to make an immersive, real-time experience (which is what computing today is all about, on all time scales, even in the servers other than HPC) it is ALL wrong, and incrementally patching little pieces of Linux ain't gonna get there. Windows or BSD (macOS) ain't gonna do it either.

I'm old. Why is Linux living in the idea space of operating systems that preceded networking, distributed computing, media sharing?

My opinion, and it is only an opinion based on experience, is that it really is time for networking to stop focusing on file transfers, and OS's to stop focusing on timesharing behavior. The world is "live" and time-based. It may not be hard-real-time. But latency is what matters.

Since networking will remain separate from OS's, the interface concepts in both really need to be matched to get to that future.

It's why I pushed so hard for UDP, not reliable in-order streams alone. And in my view, though no one ever implemented it, those UDP packets will be carrying times, essential for synchronization of coordinated operations at all the endpoints of the computation.
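
A rough sketch of what "UDP packets carrying times" could look like on the wire, purely as an illustration: the 12-byte header layout (64-bit microsecond timestamp plus 32-bit sequence number) and the function names are assumptions, not anything specified in this thread. The receiver orders work by the carried timestamp rather than by arrival order.

```python
import struct

# Hypothetical header: 64-bit timestamp in microseconds, 32-bit sequence number,
# network byte order, no padding.
HEADER = struct.Struct("!QI")


def pack_message(timestamp_us, seq, payload):
    """Prefix a payload with its intended execution time and sequence number."""
    return HEADER.pack(timestamp_us, seq) + payload


def unpack_message(datagram):
    """Split a datagram into (timestamp_us, seq, payload)."""
    timestamp_us, seq = HEADER.unpack_from(datagram)
    return timestamp_us, seq, datagram[HEADER.size:]


def execution_order(datagrams):
    """Order received messages by carried timestamp, then sequence number."""
    return sorted((unpack_message(d) for d in datagrams),
                  key=lambda m: (m[0], m[1]))
```

Each endpoint that applies the same ordering to the same set of messages executes them in the same sequence, which is the property the timestamps are there to provide.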

I'd love to see that happen before this old guy dies. I think it will make it a whole lot easier to make networked programs work.

Decentralization isn't "blockchain". My thesis, in 1978, talked about one way to decentralize computation, not just data structures. And timing is critical.

Sorry for the rant. I'm tired of waiting for "backwards compatibility" with Unix version 1 to allow us to go forward. To me, Linux is a great version of a subset of the operating systems I worked on in the early 1970's. And little more.









On Saturday, March 28, 2020 3:58pm, "Toke Høiland-Jørgensen" <toke@redhat.com> said:

> Dave Taht <dave.taht@gmail.com> writes:
> 
>>> So: 1. We really should rethink how timing-sensitive algorithms are
>>> expressed, and it isn't gonna be good to base them on semaphores and
>>> threads that run at random rates. That means a very different OS
>>> conceptual framework. Can this share with, say, the Linux we know and
>>> love - yes, the hardware can be shared. One should be able to
>>> dedicate virtual processors that are not running Linux processes, but
>>> instead another computational model (dataflow?).
>>
>> Linux switched to an EDF model for networking in 5.0
> 
> Not entirely. There's EDT scheduling, and the TCP stack is mostly
> switched over, I think. But as always, Linux evolves piecemeal :)
> 
>>> 2. EBPF is interesting, because it is more secure, and is again
>>> focused on running code at kernel level, event-driven. I think it
>>> would be a seriously difficult lift to get it to the point where one
>>> could program the networked media processing in BPF.
>>
>> But there is huge demand for it, so people are writing way more in it
>> than i ever ever thought possible... or desirable.
> 
> Tell me about it.
> 
> We have seen a bit of interest for combining eBPF with realtime, though.
> With the upstreaming of the realtime code, support has landed for
> running eBPF even on realtime kernels. And we're starting to see a bit
> of interest for looking specifically at latency bounds for network
> processing (for TSN), including XDP. Nothing concrete yet, though.
> 
> -Toke
> 
> 



^ permalink raw reply	[relevance 1%]

* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
    2020-03-27 19:00  1% ` David P. Reed
@ 2020-03-28  6:53  1% ` Anthony Minessale II
  1 sibling, 0 replies; 34+ results
From: Anthony Minessale II @ 2020-03-28  6:53 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat, Make-Wifi-fast, Cake List, cerowrt-devel, Ken Rice

[-- Attachment #1: Type: text/plain, Size: 2517 bytes --]

Working on this a bit right now. I have the controls to tell the browser to
send less manually but not auto. We might add transport-cc as google seems
to have picked that one. We can have you on a call sometime to test.



On Fri, Mar 27, 2020 at 12:27 PM Dave Taht <dave.taht@gmail.com> wrote:

> sort of an outgrowth of this convo:
>
> https://lwn.net/SubscriberLink/815751/786d161d06a90f0e/
>
> I imagine worldwide videoconferencing quality could be much better if
> we could convince more folk to
> finally install sqm or upgrade to a working docsis 3.1 solution, etc.
> Maybe some rag somewhere will finally pick up on bufferbloat solutions
> and run with it? Or we can write some articles? Or reach out to school
> systems? Or?
>
> I've been fiddling with jitsi, and am about to give freeswitch a try.
> Last I looked freeswitch's otherwise pretty nifty conference bridge
> didn't dynamically adjust at all due to e2e signalling, but that was
> years ago. (?)
>
> I have to admit that p2p multiparty videoconferencing seems more
> plausible in a de-bufferbloated age, but
> haven't explored what tools are available. (?)
>
> There's also been this somewhat entertaining convo on the ietf mbone
> list:
> https://mailarchive.ietf.org/arch/msg/mboned/2thFQk_IYn38XCZBQavhUmOd6tk/
>
> Around me there has been this huge interest in "streaming". The user
> agreement for these (see restream.io's) is scary - and the copyright
> police have control... but I am very happy to report that even a
> couple really lousy long distance fq_codel'd ath9k links work *really*
> well (with facebook's implementation), where a non fq_codeled link
> (ath10k) failed miserably... and setting up a reflector in nginx also
> failed miserably.
>
> Anyone working on the ath10k AQL backport for openwrt as yet?
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>


-- 


Anthony Minessale II | President

FreeSWITCH Solutions | 17345 Civic Drive #2531 Brookfield, WI 53045

Email: anthm@freeswitch.com

Mobile: +12623098501

Website: https://www.FreeSWITCH.com <https://www.freeswitch.com/>


[-- Attachment #2: Type: text/html, Size: 6735 bytes --]

^ permalink raw reply	[relevance 1%]

* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
  2020-03-27 19:00  1% ` David P. Reed
  2020-03-27 19:12  1%   ` David Lang
  2020-03-27 19:36  1%   ` Dave Taht
@ 2020-03-27 20:32  1%   ` Dave Taht
    2 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-03-27 20:32 UTC (permalink / raw)
  To: David P. Reed
  Cc: bloat, Make-Wifi-fast, Cake List, cerowrt-devel,
	Anthony Minessale II, Ken Rice

I don't know to what extent the freeswitch guys would be interested in
this thread. I'd like to find a good list or forum to talk about the
state of the art in videoconferencing; the ietf rmcat and webrtc
lists are mostly dead. hangouts, jitsi, zoom, etc, seem to be pretty
good products nowadays (at least in my fq_codel'd environment), but
solid info on how to make them better in the home and for online
tele-learning is hard to find.

On Fri, Mar 27, 2020 at 12:00 PM David P. Reed <dpreed@deepplum.com> wrote:
>
> Congestion control for real-time video is quite different than for streaming. Streaming really is dealt with by a big enough (multi-second) buffering, and can in principle work great over TCP (if debloated).

Your encoder still has to adjust to the available bandwidth. The
facebook streaming application did this beautifully through my very
limited, highly shared 5mbit uplink - adjusting quickly to a parallel
rrul test in particular by skipping some frames, then lowering the
frame rate and quality. An early attempt of mine to merely reflect
rtmp streams did not, and neither did an attempt with "obs studio".

there was about 30 sec of delay in the facebook test - I figure some
of this is tuned to visible uplink buffer sizes (still seconds over
cell), but also to give the riaa a shot at censoring the audio. (a
commercial song crept in - over a mic! - which was detected as
infringing on one attempt, which automatically muted the audio and
keyed a nastygram from fb)

I'm going to poke into obs studio's underlying code (rtsp anyone?) at
some point, and really - udp with a head dropping aqm is the best
thing for transporting video, IMHO.

> UDP congestion control MUST be end-to-end and done in the application layer, which is usually outside the OS kernel. This makes it tricky, because you end up with latency variation due to the OS's process scheduler that is on the order of magnitude of the real-time requirements for air-to-air or light-to-light response (meaning the physical transition from sound or picture to and from the transducer).

We are so far from that point! encoder latencies today are in the
100+ms range. I always liked the opus codec because it can get down to
2.7ms encoding latencies, and a doubled frame rate camera 8ms.... but
video encoding rates I'm out of date on. (?)

One long deferred piece of webrtc/rmcat research I always meant to do
was audio and video on separate ports in the stream,
and using that 2.7ms opus clock and depending on fq at the bottleneck
to provide better congestion control information by treating the
smaller audio packets as a clock signal. Due to lack of port space and
a widespread perception that fq isn't out there, most
videoconferencing streams multiplex everything over the same port.
With ipv6 in place, well, port space is no longer a problem.
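
A minimal sketch of the "audio packets as a clock signal" idea, under loud assumptions: packets are sent at a fixed interval, none are lost or reordered, and the receiver timestamps each arrival in microseconds. The function names and the threshold are hypothetical; this is not taken from any existing congestion controller.

```python
def queue_delay_samples(arrivals_us, interval_us):
    """Relative one-way delay of each packet versus the first, in microseconds.

    arrivals_us[i] is the receive time of the i-th packet of a stream sent
    every interval_us. Growth in the result suggests a queue is building.
    """
    if not arrivals_us:
        return []
    base = arrivals_us[0]
    return [t - base - i * interval_us for i, t in enumerate(arrivals_us)]


def is_queue_building(arrivals_us, interval_us, threshold_us):
    """True if the latest packet is delayed by more than threshold_us
    relative to the least-delayed packet seen so far."""
    samples = queue_delay_samples(arrivals_us, interval_us)
    return bool(samples) and samples[-1] - min(samples) > threshold_us
```

With a 2700 us send interval, arrivals at 0, 2700, 6000 and 9500 us give relative delays of 0, 0, 600 and 1400 us, so a 1000 us threshold would flag a growing queue.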

>
> This creates a godawful mess when trying to do an app. Whether in WebRTC (peer to peer UDP) or in a Linux userspace app, the scheduler has huge variance in delay.

I figure the bounding scheduler latency is still well manageable below
a single 60fps frame.

> Now getting rid of bloat currently requires TCP to respond to congestion signalling. UDP in the kernel doesn't do that, and it doesn't tell userspace much either (you can try to detect packet drops in userspace, but coding that up is quite hard because the schedulers get in the way of measurement, and forget about ECN being seen in userspace).

ECN in userspace is easy on udp, except that most api's tend to
abstract into a file handle style abstraction and a single return of
data, not control information, and the api for getting tos options is
ugly. APIs that can return data and info (data, packetheader) =
getudp_someway() probably exist for more modern languages like go, but
rarely c or c++. Totally out of date on this, last I looked at the
google congestion control code base was in mozilla... 8 years ago!
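
As an illustration of reading ECN from userspace on UDP, here is a sketch using Python's `socket.recvmsg()` on Linux with IPv4. It assumes the kernel delivers the TOS byte as an `IP_TOS` control message once `IP_RECVTOS` is enabled; the fallback value 13 for `IP_RECVTOS` is the Linux constant and is an assumption on any other platform. The helper names are hypothetical.

```python
import socket

# Older Python builds may not export IP_RECVTOS; 13 is the Linux value.
IP_RECVTOS = getattr(socket, "IP_RECVTOS", 13)

# The two low bits of the TOS byte are the ECN field (RFC 3168).
ECN_NAMES = {0: "Not-ECT", 1: "ECT(1)", 2: "ECT(0)", 3: "CE"}


def ecn_from_ancdata(ancdata):
    """Return the ECN codepoint name from recvmsg() ancillary data, or None."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == socket.IP_TOS and data:
            return ECN_NAMES[data[0] & 0x03]
    return None


def make_receiver(port):
    """UDP socket that asks the kernel to report each packet's TOS byte."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVTOS, 1)
    sock.bind(("0.0.0.0", port))
    return sock


def recv_with_ecn(sock, bufsize=2048):
    """Receive one datagram together with its ECN marking."""
    data, ancdata, _flags, addr = sock.recvmsg(bufsize, socket.CMSG_SPACE(1))
    return data, addr, ecn_from_ancdata(ancdata)
```

A sender that wants its packets to be markable would set `IP_TOS` to an ECT codepoint on its own socket; a "CE" result at the receiver then means a queue along the path marked the packet instead of dropping it.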

As for doing udp semi-efficiently in batches...

sendmmsg, recvmmsg is a rather underused kernel api. And ugly as sin.
With some major limitations.


>
> This is OS architecture messiness, not a layer 2 or 3 issue.

To me the nightmare starts with most cpu context switch latencies
being 1000s of clocks nowadays.

>
> I've thought about this a lot. Here's my thoughts:
>
> I hate putting things in the kernel! It's insecure. But what this says is that for very historical and stupid reasons (related to the ideas of early timesharing systems like Unix and Multics) folks try to make real-time algorithms look like ordinary "processes" whose notion of controlling temporal behavior is abstracted away.

On the whole, with the rise of quic - in particular quic, as multiple
userspace libs have been emerging - we've got good bases to move
forward with more stuff in userspace.

>
> So:
> 1. We really should rethink how timing-sensitive algorithms are expressed, and it isn't gonna be good to base them on semaphores and threads that run at random rates. That means a very different OS conceptual framework. Can this share with, say, the Linux we know and love - yes, the hardware can be shared. One should be able to dedicate virtual processors that are not running Linux processes, but instead another computational model (dataflow?).

Linux switched to an EDF model for networking in 5.0

> An example of this (though clunky and unsupported by good tools) is in FreeBSD, it's called *netgraph*. It's a structured way to write reactive algorithms that are demand or arrival driven. It also has some security issues, and since it is heavily based on passing mbufs around it's really quirky. But I have found it useful for the kind of things that need to get done in teleconferencing voice and video.

Neat.

>
> 2. EBPF is interesting, because it is more secure, and is again focused on running code at kernel level, event-driven.  I think it would be a seriously difficult lift to get it to the point where one could program the networked media processing in BPF.

But there is huge demand for it, so people are writing way more in it
than i ever ever thought possible... or desirable.

>
> 3. One of the nice things about KVM (hardware virtualization) is that potentially it lets different low level machine models share a common machine. It occurs to me that using VIRTIO network devices and some kind of VIRTIO media processing devices, that a KVM virtual machine could be hooked up to the packet-level networking drivers in the end device, isolating the teleconferencing from the rest of the endpoint OS, and creating the right kind of near-bare-metal environment for managing the timing of network packets and the paths to the screen and audio that would be simple and clean and tightly scheduled. KVM could "own" one or more of the physical cores during the teleconference.

see also sch_etf and van's teaching nics about time -
https://netdevconf.info/0x12/news.html?keynote-recording-is-up there
has been a lot of progress on this front in the past few years -
having applications say when they want a packet to emerge -

but the offload discussion over on the linux list I referenced has
seemingly missed this idea entirely.
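
A hedged sketch of "applications saying when they want a packet to emerge", using the Linux SO_TXTIME socket option. Python's socket module does not export these constants, so the numeric values below are assumptions taken from the Linux headers for common architectures (x86, ARM); the helper names are hypothetical, and the requested time only has an effect when a qdisc that honors it (such as an ETF-style qdisc configured for the same clock) sits on the egress path.

```python
import socket
import struct

SO_TXTIME = 61          # assumed value from Linux asm-generic/socket.h
SCM_TXTIME = SO_TXTIME  # the control message type shares the option number
CLOCK_TAI = 11          # Linux clock id; must match the qdisc's clock


def enable_txtime(sock, clockid=CLOCK_TAI, flags=0):
    """Turn on per-packet transmit times for this socket.

    Packs the kernel's struct sock_txtime { clockid_t clockid; u32 flags; }.
    """
    sock.setsockopt(socket.SOL_SOCKET, SO_TXTIME,
                    struct.pack("iI", clockid, flags))


def txtime_cmsg(txtime_ns):
    """Ancillary data item requesting release of the packet at txtime_ns."""
    return (socket.SOL_SOCKET, SCM_TXTIME, struct.pack("Q", txtime_ns))


def send_at(sock, payload, addr, txtime_ns):
    """Send one datagram tagged with the time it should leave the host."""
    return sock.sendmsg([payload], [txtime_cmsg(txtime_ns)], 0, addr)
```

The design point is that the application states a departure time and leaves the pacing to the qdisc or NIC, instead of sleeping in userspace and hoping the scheduler wakes it on time.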


>
> You can see, though, that this isn't just a "network protocol design" problem. This is only partly a network protocol issue, but one that is coupled with the architecture of the end systems.
>
> I reminisce a little bit thinking back to the 1970's and 80's when TCP/IP and UDP/IP were being designed. Sadly, it was one of the big problems of communicating between the OS community and the protocol community that the OS community couldn't think outside the "timesharing" system box, and the protocol community thought of networking like phone calls (sessions). This is where the need for control of timing and buffering got lost. The timesharing folks largely thought of networks as for reliable timeless sequential "streams" of data that had no particular urgency. The network protocol folks were focused on ARQ.
> Only a few of us cared about end-to-end latency bounds (where ends meant keyboard click or audio sample to screen display change or speaker motion).

I recently got a usb keyboard with truly annoying latencies in it.
https://danluu.com/ 's work makes me feel better about people
perpetually ignoring this. An Apple II won the benchmark...

Last year I got a voice processor, with 70+ms usb latencies for audio
- useless for overdubs. same for all the usb based audio mixers made
today.

thunderbolt based  audio gear is hard to find and expensive, I have
been scrounging old firewire based stuff. (hey, if anyone has a RME
multiface card let me know off list)

> The packet speech guys did, but most networking guys wanted to toss them under the bus as annoying. And those of us doing distributed multinode algorithms did, but the remote login and FTP guys were skeptical that would ever matter.

Yep, we're annoying. But annoyed. And: I really think it would be a
less stressed and better communicating world if we got cell phone
audio latencies, in particular, back below 20ms.

> It's the latency, stupid. Not the reliability, nor the consistency, nor throughput. Unless both the OS and the path are focused on minimizing latency, a vast set of applications will suck. Unfortunately, both the OS and network communities are *stuck* in a world where latency is uncontrollable, and there are no tools for getting it better.

except ours! :)

>
>
> On Friday, March 27, 2020 1:27pm, "Dave Taht" <dave.taht@gmail.com> said:
>
> > sort of an outgrowth of this convo:
> >
> > https://lwn.net/SubscriberLink/815751/786d161d06a90f0e/
> >
> > I imagine worldwide videoconferencing quality could be much better if
> > we could convince more folk to
> > finally install sqm or upgrade to a working docsis 3.1 solution, etc.
> > Maybe some rag somewhere will finally pick up on bufferbloat solutions
> > and run with it? Or we can write some articles? Or reach out to school
> > systems? Or?
> >
> > I've been fiddling with jitsi, and am about to give freeswitch a try.
> > Last I looked freeswitch's otherwise pretty nifty conference bridge
> > didn't dynamically adjust at all due to e2e signalling, but that was
> > years ago. (?)
> >
> > I have to admit that p2p multiparty videoconferencing seems more
> > plausible in a de-bufferbloated age, but
> > haven't explored what tools are available. (?)
> >
> > There's also been this somewhat entertaining convo on the ietf mbone
> > list: https://mailarchive.ietf.org/arch/msg/mboned/2thFQk_IYn38XCZBQavhUmOd6tk/
> >
> > Around me there has been this huge interest in "streaming". The user
> > agreement for these (see restream.io's) is scary - and the copyright
> > police have control... but I am very happy to report that even a
> > couple really lousy long distance fq_codel'd ath9k links work *really*
> > well (with facebook's implementation), where a non fq_codeled link
> > (ath10k) failed miserably... and setting up a reflector in nginx also
> > failed miserably.
> >
> > Anyone working on the ath10k AQL backport for openwrt as yet?
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
>
>


-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
  2020-03-27 19:00  1% ` David P. Reed
  2020-03-27 19:12  1%   ` David Lang
@ 2020-03-27 19:36  1%   ` Dave Taht
  2020-03-27 20:32  1%   ` Dave Taht
  2 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-03-27 19:36 UTC (permalink / raw)
  To: David P. Reed; +Cc: bloat, Make-Wifi-fast, Cake List, cerowrt-devel

Of interest given some of what you say below, there is a huge
discussion on netdev about how to best implement
hardware offloads for network slicing:

https://www.spinics.net/lists/netdev/msg638836.html

Me, I always rolled my eyes up at all the network virtualization stuff
and ran from the room, screaming, given how much I care about low
latency. The udp vs tcp offload split has been nightmare enough. That
said, to this day I lack a clear idea how any multi-tenant dc
operation really works, I've generally assumed it was policers, and
have deployed sqm (now cake) instead on everything in the cloud that
seemed to need it.

On Fri, Mar 27, 2020 at 12:00 PM David P. Reed <dpreed@deepplum.com> wrote:
>
> Congestion control for real-time video is quite different than for streaming. Streaming really is dealt with by a big enough (multi-second) buffering, and can in principle work great over TCP (if debloated).
>
> UDP congestion control MUST be end-to-end and done in the application layer, which is usually outside the OS kernel. This makes it tricky, because you end up with latency variation due to the OS's process scheduler that is on the order of magnitude of the real-time requirements for air-to-air or light-to-light response (meaning the physical transition from sound or picture to and from the transducer).
>
> This creates a godawful mess when trying to do an app. Whether in WebRTC (peer to peer UDP) or in a Linux userspace app, the scheduler has huge variance in delay.
>
> Now getting rid of bloat currently requires TCP to respond to congestion signalling. UDP in the kernel doesn't do that, and it doesn't tell userspace much either (you can try to detect packet drops in userspace, but coding that up is quite hard because the schedulers get in the way of measurement, and forget about ECN being seen in userspace).
>
> This is OS architecture messiness, not a layer 2 or 3 issue.
>
> I've thought about this a lot. Here's my thoughts:
>
> I hate putting things in the kernel! It's insecure. But what this says is that for very historical and stupid reasons (related to the ideas of early timesharing systems like Unix and Multics) folks try to make real-time algorithms look like ordinary "processes" whose notion of controlling temporal behavior is abstracted away.
>
> So:
> 1. We really should rethink how timing-sensitive algorithms are expressed, and it isn't gonna be good to base them on semaphores and threads that run at random rates. That means a very different OS conceptual framework. Can this share with, say, the Linux we know and love - yes, the hardware can be shared. One should be able to dedicate virtual processors that are not running Linux processes, but instead another computational model (dataflow?).
> An example of this (though clunky and unsupported by good tools) is in FreeBSD, it's called *netgraph*. It's a structured way to write reactive algorithms that are demand or arrival driven. It also has some security issues, and since it is heavily based on passing mbufs around it's really quirky. But I have found it useful for the kind of things that need to get done in teleconferencing voice and video.
>
> 2. EBPF is interesting, because it is more secure, and is again focused on running code at kernel level, event-driven.  I think it would be a seriously difficult lift to get it to the point where one could program the networked media processing in BPF.
>
> 3. One of the nice things about KVM (hardware virtualization) is that potentially it lets different low level machine models share a common machine. It occurs to me that using VIRTIO network devices and some kind of VIRTIO media processing devices, that a KVM virtual machine could be hooked up to the packet-level networking drivers in the end device, isolating the teleconferencing from the rest of the endpoint OS, and creating the right kind of near-bare-metal environment for managing the timing of network packets and the paths to the screen and audio that would be simple and clean and tightly scheduled. KVM could "own" one or more of the physical cores during the teleconference.
>
> You can see, though, that this isn't just a "network protocol design" problem. This is only partly a network protocol issue, but one that is coupled with the architecture of the end systems.
>
> I reminisce a little bit thinking back to the 1970's and 80's when TCP/IP and UDP/IP were being designed. Sadly, it was one of the big problems of communicating between the OS community and the protocol community that the OS community couldn't think outside the "timesharing" system box, and the protocol community thought of networking like phone calls (sessions). This is where the need for control of timing and buffering got lost. The timesharing folks largely thought of networks as for reliable timeless sequential "streams" of data that had no particular urgency. The network protocol folks were focused on ARQ.
> Only a few of us cared about end-to-end latency bounds (where ends meant keyboard click or audio sample to screen display change or speaker motion). The packet speech guys did, but most networking guys wanted to toss them under the bus as annoying. And those of us doing distributed multinode algorithms did, but the remote login and FTP guys were skeptical that would ever matter.
>
> It's the latency, stupid. Not the reliability, nor the consistency, nor throughput. Unless both the OS and the path are focused on minimizing latency, a vast set of applications will suck. Unfortunately, both the OS and network communities are *stuck* in a world where latency is uncontrollable, and there are no tools for getting it better.
>
>
>
> On Friday, March 27, 2020 1:27pm, "Dave Taht" <dave.taht@gmail.com> said:
>
> > sort of an outgrowth of this convo:
> >
> > https://lwn.net/SubscriberLink/815751/786d161d06a90f0e/
> >
> > I imagine worldwide videoconferencing quality could be much better if
> > we could convince more folk to
> > finally install sqm or upgrade to a working docsis 3.1 solution, etc.
> > Maybe some rag somewhere will finally pick up on bufferbloat solutions
> > and run with it? Or we can write some articles? Or reach out to school
> > systems? Or?
> >
> > I've been fiddling with jitsi, and am about to give freeswitch a try.
> > Last I looked freeswitch's otherwise pretty nifty conference bridge
> > didn't dynamically adjust at all due to e2e signalling, but that was
> > years ago. (?)
> >
> > I have to admit that p2p multiparty videoconferencing seems more
> > plausible in a de-bufferbloated age, but
> > haven't explored what tools are available. (?)
> >
> > There's also been this somewhat entertaining convo on the ietf mbone
> > list: https://mailarchive.ietf.org/arch/msg/mboned/2thFQk_IYn38XCZBQavhUmOd6tk/
> >
> > Around me there has been this huge interest in "streaming". The user
> > agreement for these (see restream.io's) is scary - and the copyright
> > police have control... but I am very happy to report that even a
> > couple really lousy long distance fq_codel'd ath9k links work *really*
> > well (with facebook's implementation), where a non fq_codeled link
> > (ath10k) failed miserably... and setting up a reflector in nginx also
> > failed miserably.
> >
> > Anyone working on the ath10k AQL backport for openwrt as yet?
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
> >
>
>


-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
  2020-03-27 19:00  1% ` David P. Reed
@ 2020-03-27 19:12  1%   ` David Lang
  2020-03-27 19:36  1%   ` Dave Taht
  2020-03-27 20:32  1%   ` Dave Taht
  2 siblings, 0 replies; 34+ results
From: David Lang @ 2020-03-27 19:12 UTC (permalink / raw)
  To: David P. Reed
  Cc: Dave Taht, Make-Wifi-fast, Anthony Minessale II, Cake List,
	Ken Rice, cerowrt-devel, bloat

On Fri, 27 Mar 2020, David P. Reed wrote:

> 
> Congestion control for real-time video is quite different than for streaming. Streaming really is dealt with by a big enough (multi-second) buffering, and can in principle work great over TCP (if debloated).
>
> UDP congestion control MUST be end-to-end and done in the application layer, which is usually outside the OS kernel. This makes it tricky, because you end up with latency variation due to the OS's process scheduler that is on the order of magnitude of the real-time requirements for air-to-air or light-to-light response (meaning the physical transition from sound or picture to and from the transducer).

at some level this is correct, but if the link is clogged with TCP packets, it 
doesn't matter what your UDP application attempts to do, so installing cake to 
keep individual links from being too congested will give your UDP application 
a chance to operate.

David Lang

^ permalink raw reply	[relevance 1%]

* Re: [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks?
  @ 2020-03-27 19:00  1% ` David P. Reed
  2020-03-27 19:12  1%   ` David Lang
                     ` (2 more replies)
  2020-03-28  6:53  1% ` Anthony Minessale II
  1 sibling, 3 replies; 34+ results
From: David P. Reed @ 2020-03-27 19:00 UTC (permalink / raw)
  To: Dave Taht
  Cc: bloat, Make-Wifi-fast, Cake List, cerowrt-devel, Ken Rice,
	Anthony Minessale II

Congestion control for real-time video is quite different than for streaming. Streaming really is dealt with by a big enough (multi-second) buffering, and can in principle work great over TCP (if debloated).

UDP congestion control MUST be end-to-end and done in the application layer, which is usually outside the OS kernel. This makes it tricky, because you end up with latency variation due to the OS's process scheduler that is on the order of magnitude of the real-time requirements for air-to-air or light-to-light response (meaning the physical transition from sound or picture to and from the transducer).

This creates a godawful mess when trying to do an app. Whether in WebRTC (peer to peer UDP) or in a Linux userspace app, the scheduler has huge variance in delay.

Now getting rid of bloat currently requires TCP to respond to congestion signalling. UDP in the kernel doesn't do that, and it doesn't tell userspace much either (you can try to detect packet drops in userspace, but coding that up is quite hard because the schedulers get in the way of measurement, and forget about ECN being seen in userspace).
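
For concreteness, the bookkeeping half of userspace drop detection can be sketched as below, assuming the application puts its own increasing sequence number (starting at 0) in every datagram; the function name is hypothetical. This covers only the counting. The timing measurements that the process scheduler perturbs are the hard part and are not addressed here.

```python
def detect_losses(seqs):
    """Classify a stream of received sequence numbers.

    Returns (missing, reordered): sequence numbers not yet seen, and
    sequence numbers that arrived late after a gap had been recorded.
    Duplicates are ignored.
    """
    expected = 0
    missing = set()
    reordered = []
    for seq in seqs:
        if seq == expected:
            expected += 1
        elif seq > expected:
            # A gap: everything skipped is provisionally lost.
            missing.update(range(expected, seq))
            expected = seq + 1
        elif seq in missing:
            # A late arrival fills an earlier gap.
            missing.discard(seq)
            reordered.append(seq)
    return sorted(missing), reordered
```

For the arrival sequence 0, 1, 3, 4, 2, 7 this reports 5 and 6 as missing and 2 as reordered; deciding how long to wait before treating a gap as a real drop is a policy choice left out of the sketch.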

This is OS architecture messiness, not a layer 2 or 3 issue.

I've thought about this a lot. Here's my thoughts:

I hate putting things in the kernel! It's insecure. But what this says is that for very historical and stupid reasons (related to the ideas of early timesharing systems like Unix and Multics) folks try to make real-time algorithms look like ordinary "processes" whose notion of controlling temporal behavior is abstracted away.

So: 
1. We really should rethink how timing-sensitive algorithms are expressed, and it isn't gonna be good to base them on semaphores and threads that run at random rates. That means a very different OS conceptual framework. Can this share with, say, the Linux we know and love - yes, the hardware can be shared. One should be able to dedicate virtual processors that are not running Linux processes, but instead another computational model (dataflow?).
An example of this (though clunky and unsupported by good tools) is in FreeBSD, it's called *netgraph*. It's a structured way to write reactive algorithms that are demand or arrival driven. It also has some security issues, and since it is heavily based on passing mbufs around it's really quirky. But I have found it useful for the kind of things that need to get done in teleconferencing voice and video.

2. eBPF is interesting, because it is more secure, and is again focused on running code at kernel level, event-driven.  I think it would be a seriously difficult lift to get it to the point where one could program the networked media processing in BPF.

3. One of the nice things about KVM (hardware virtualization) is that potentially it lets different low level machine models share a common machine. It occurs to me that, using VIRTIO network devices and some kind of VIRTIO media processing devices, a KVM virtual machine could be hooked up to the packet-level networking drivers in the end device, isolating the teleconferencing from the rest of the endpoint OS, and creating the right kind of near-bare-metal environment for managing the timing of network packets and the paths to the screen and audio that would be simple and clean and tightly scheduled. KVM could "own" one or more of the physical cores during the teleconference.

You can see, though, that this isn't just a "network protocol design" problem. This is only partly a network protocol issue, but one that is coupled with the architecture of the end systems.

I reminisce a little bit thinking back to the 1970's and 80's when TCP/IP and UDP/IP were being designed. Sadly, it was one of the big problems of communicating between the OS community and the protocol community that the OS community couldn't think outside the "timesharing" system box, and the protocol community thought of networking like phone calls (sessions). This is where the need for control of timing and buffering got lost. The timesharing folks largely thought of networks as for reliable timeless sequential "streams" of data that had no particular urgency. The network protocol folks were focused on ARQ.
Only a few of us cared about end-to-end latency bounds (where ends meant keyboard click or audio sample to screen display change or speaker motion). The packet speech guys did, but most networking guys wanted to toss them under the bus as annoying. And those of us doing distributed multinode algorithms did, but the remote login and FTP guys were skeptical that would ever matter.

It's the latency, stupid. Not the reliability, nor the consistency, nor throughput. Unless both the OS and the path are focused on minimizing latency, a vast set of applications will suck. Unfortunately, both the OS and network communities are *stuck* in a world where latency is uncontrollable, and there are no tools for improving it.

 

On Friday, March 27, 2020 1:27pm, "Dave Taht" <dave.taht@gmail.com> said:

> sort of an outgrowth of this convo:
> 
> https://lwn.net/SubscriberLink/815751/786d161d06a90f0e/
> 
> I imagine worldwide videoconferencing quality could be much better if
> we could convince more folk to
> finally install sqm or upgrade to a working docsis 3.1 solution, etc.
> Maybe some rag somewhere will finally pick up on bufferbloat solutions
> and run with it? Or we can write some articles? Or reach out to school
> systems? Or?
> 
> I've been fiddling with jitsi, and am about to give freeswitch a try.
> Last I looked freeswitch's otherwise pretty nifty conference bridge
> didn't dynamically adjust at all due to e2e signalling, but that was
> years ago. (?)
> 
> I have to admit that p2p multiparty videoconferencing seems more
> plausible in a de-bufferbloated age, but
> haven't explored what tools are available. (?)
> 
> There's also been this somewhat entertaining convo on the ietf mbone
> list: https://mailarchive.ietf.org/arch/msg/mboned/2thFQk_IYn38XCZBQavhUmOd6tk/
> 
> Around me there has been this huge interest in "streaming". The user
> agreement for these (see restream.io's) is scary - and the copyright
> police have control... but I am very happy to report that even a
> couple really lousy long distance fq_codel'd ath9k links work *really*
> well (with facebook's implementation), where a non fq_codeled link
> (ath10k) failed miserably... and setting up a reflector in nginx also
> failed miserably.
> 
> Anyone working on the ath10k AQL backport for openwrt as yet?
> 
> --
> Make Music, Not War
> 
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
> 



^ permalink raw reply	[relevance 1%]

* [Cake] Fwd: [RFC PATCH 00/28]: Accurate ECN for TCP
       [not found]     <1584524612-24470-1-git-send-email-ilpo.jarvinen@helsinki.fi>
@ 2020-03-19 22:20  1% ` Dave Taht
  0 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-03-19 22:20 UTC (permalink / raw)
  To: Cake List

---------- Forwarded message ---------
From: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Date: Wed, Mar 18, 2020 at 2:45 AM
Subject: [RFC PATCH 00/28]: Accurate ECN for TCP
To: <netdev@vger.kernel.org>
Cc: Yuchung Cheng <ycheng@google.com>, Neal Cardwell
<ncardwell@google.com>, Eric Dumazet <eric.dumazet@gmail.com>, Olivier
Tilmans <olivier.tilmans@nokia-bell-labs.com>


Hi all,

Here's the full Accurate ECN implementation mostly based on
  https://tools.ietf.org/html/draft-ietf-tcpm-accurate-ecn-11

Comments would be highly appreciated. The GSO/TSO maze of bits
in particular is something I'm somewhat unsure if I got it
right (for a feature that has a software fallback).

There is an extensive set of packetdrill unit tests for most of
the functionality (I'll send separately to packetdrill).

Please note that this submission is not yet intended to be
included in net-next because some small changes still seem
possible to the spec.

 Documentation/networking/ip-sysctl.txt |  12 +-
 drivers/net/tun.c                      |   3 +-
 include/linux/netdev_features.h        |   3 +
 include/linux/skbuff.h                 |   2 +
 include/linux/tcp.h                    |  19 ++
 include/net/tcp.h                      | 221 ++++++++++---
 include/uapi/linux/tcp.h               |   9 +-
 net/ethtool/common.c                   |   1 +
 net/ipv4/bpf_tcp_ca.c                  |   2 +-
 net/ipv4/syncookies.c                  |  12 +
 net/ipv4/tcp.c                         |  10 +-
 net/ipv4/tcp_dctcp.c                   |   2 +-
 net/ipv4/tcp_dctcp.h                   |   2 +-
 net/ipv4/tcp_input.c                   | 558 ++++++++++++++++++++++++++++-----
 net/ipv4/tcp_ipv4.c                    |   8 +-
 net/ipv4/tcp_minisocks.c               |  84 ++++-
 net/ipv4/tcp_offload.c                 |  11 +-
 net/ipv4/tcp_output.c                  | 298 +++++++++++++++---
 net/ipv4/tcp_timer.c                   |   4 +-
 net/ipv6/syncookies.c                  |   1 +
 net/ipv6/tcp_ipv6.c                    |   4 +-
 net/netfilter/nf_log_common.c          |   4 +-

--
 i.

ps. My apologies if you got a duplicate copy of them. It seems that
answering "no" to git send-email asking "Send this email?" might
still have sent something out.



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Large number of Flows
       [not found]       ` <etPan.5e4ab6c5.653ea685.1b7f@surfglobal.net>
@ 2020-02-17 18:21  1%     ` Dave Taht
  0 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-02-17 18:21 UTC (permalink / raw)
  To: Mike, Cake List

On Mon, Feb 17, 2020 at 7:52 AM Mike <mike@surfglobal.net> wrote:
>
> So 1024 is the max queues that it supports as is, so if I had 1500 users with their own traffic shaping setup per user it would be unsupported without recompiling the kernel?  Is there a command to see how many is used and available?

I think we are starting from two very different reference points for
what we can accomplish. Cake's primary use case is for the CPE, or
customer edge router shaping egress and ingress to suit the customer's
needs, and usually below the ISP's (badly) shaped rate.

Each cake instance only does one bandwidth, with that 1024 queue default.

If you are looking to have a bandwidth per subscriber managed
centrally, it is certainly feasible and desirable to use cake from the
ISP premise. How I do it is I have one virtual interface per
"subscriber", managed by a route table (e.g. ip route add 192.168.1.99
dev cust1; ip -6 route add 2001:db8:aa::/48 dev cust1) to which I add
the cake instance, bandwidth parameter, etc.

Others like Preseem do things like a transparent bridge in between the
switch and the edge (DSL BRAS, etc.).

You can set up an htb or drr + one instance of cake per subscriber if you like.
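
As a rough sketch of the one-interface-per-subscriber approach, the Python below only builds the command strings and runs nothing. The interface name cust1 comes from the example above; the function name, the 2001:db8 documentation prefix and the 50mbit rate are placeholders I made up, and creating the virtual interface itself is left out.

```python
def subscriber_commands(ifname, prefixes, rate):
    """Return the ip/tc commands that route a subscriber's prefixes to
    their own virtual interface and attach one cake instance to it.

    Hypothetical helper: it does not create the interface and does not
    execute anything, it only formats strings.
    """
    cmds = []
    for prefix in prefixes:
        family = "-6 " if ":" in prefix else ""   # crude IPv6 detection
        cmds.append(f"ip {family}route replace {prefix} dev {ifname}")
    # One cake instance, one bandwidth, per subscriber interface.
    cmds.append(f"tc qdisc replace dev {ifname} root cake bandwidth {rate}")
    return cmds


for cmd in subscriber_commands("cust1",
                               ["192.168.1.99", "2001:db8:aa::/48"],
                               "50mbit"):
    print(cmd)
```

Using "replace" rather than "add" keeps the commands idempotent, so the same list can be re-applied when a subscriber's rate changes.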

> Also I saw on the fq_codel page they talk about issues with cores and netem but Cake doesn’t seem to use netem to delay packets etc based on the man page, so is the core issue still a factor?

Depends on your requirements. htb + anything or cake tend to lock the
processing to a single core. It doesn't in the case I describe above,
but I've not tried to push it past 10Gbit.

>
>
> On February 17, 2020 at 9:34:59 AM, Dave Taht (dave.taht@gmail.com) wrote:
>
> fq_codel, Cake etc, supports an infinite number of flows.
>
> It has a limited number of "queues" that can get mapped to flows, but
> it's usually ok if a collision happens.
>
> The 1024 queue tradeoff is based on the observation that usually a max
> of a few hundred active flows exist, and furthermore,
> excessive fair queueing tends to defeat the purpose of the aqm of
> keeping overall flow lengths short. Collisions of two fat flows are
> rare.
>
> You can recompile cake with more queues if you like (fq_codel has a
> soft limit of 64k queues). We don't have much data on 10GigE+
> behaviors. It was kind
> of my assumption more queues would help in the 40GigE world, but
> that's usually got hardware mq (64 or more), and what I'm seeing there
> is 64 default fq_codel instances, 64k
> queues essentially, and I think that's WAY too much....
>
>
> On Mon, Feb 17, 2020 at 6:07 AM Mike <mike@surfglobal.net> wrote:
> >
> > Will cake support a large number of flows like over a thousand per linux box without any modifications. I did see that there was a qdisc issue for fq_codel on a large scale. We would be using linux kernel 4.19 which has cake already in it. Any help or issues that might be encountered in scaling would be appreciated.
> >
> >
> >
> > Thanks
> > Mike Thompson
> >
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Large number of Flows
  @ 2020-02-17 14:34  1% ` Dave Taht
       [not found]       ` <etPan.5e4ab6c5.653ea685.1b7f@surfglobal.net>
  0 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-02-17 14:34 UTC (permalink / raw)
  To: Mike; +Cc: Cake List

fq_codel, Cake, etc. support an infinite number of flows.

It has a limited number of "queues" that can get mapped to flows, but
it's usually ok if a collision happens.

The 1024 queue tradeoff is based on the observation that usually a max
of a few hundred active flows exist, and furthermore,
excessive fair queueing tends to defeat the purpose of the aqm of
keeping overall flow lengths short. Collisions of two fat flows are
rare.
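
To put rough numbers on "collisions are rare", here is a back-of-envelope sketch in Python. It assumes flows are hashed uniformly and independently into the queues (a plain hash, as in fq_codel; cake's set-associative hash should do better than this), and both function names are made up for illustration.

```python
def p_any_collision(k, queues=1024):
    """Probability that at least two of k flows land in the same queue
    (the birthday problem), assuming a uniform independent hash."""
    p_none = 1.0
    for i in range(k):
        p_none *= (queues - i) / queues
    return 1.0 - p_none


def p_flow_shares_queue(n, queues=1024):
    """Probability that one particular flow shares its queue with at
    least one of the other n - 1 active flows."""
    return 1.0 - (1.0 - 1.0 / queues) ** (n - 1)


# Five simultaneous fat flows: chance that any two of them collide.
print(f"{p_any_collision(5):.4f}")
# 200 active flows: chance that a given flow has company in its queue.
print(f"{p_flow_shares_queue(200):.4f}")
```

Under these assumptions a handful of fat flows collide with each other only about one time in a hundred, while with a couple of hundred mostly thin flows a given flow shares a queue somewhat less than one time in five.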

You can recompile cake with more queues if you like (fq_codel has a
soft limit of 64k queues). We don't have much data on 10GigE+
behaviors. It was kind
of my assumption more queues would help in the 40GigE world, but
that's usually got hardware mq (64 or more), and what I'm seeing there
is 64 default fq_codel instances, 64k
queues essentially, and I think that's WAY too much....


On Mon, Feb 17, 2020 at 6:07 AM Mike <mike@surfglobal.net> wrote:
>
> Will cake support a large number of flows like over a thousand per linux box without any modifications.  I did see that there was a qdisc issue for fq_codel on a large scale.  We would be using linux kernel 4.19 which has cake already in it.  Any help or issues that might be encountered in scaling would be appreciated.
>
>
>
> Thanks
> Mike Thompson
>



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] [Make-wifi-fast]  Cake in mac80211
  2020-02-05 16:22  0%         ` Jonathan Morton
@ 2020-02-05 19:46  0%           ` Dave Taht
  0 siblings, 0 replies; 34+ results
From: Dave Taht @ 2020-02-05 19:46 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Bjørn Ivar Teigen, Cake List, Make-Wifi-fast

Jonathan Morton <chromatix99@gmail.com> writes:

>> On 5 Feb, 2020, at 6:06 pm, Dave Taht <dave@taht.net> wrote:
>> 
>>>    D) "cobalt" is proving out better in several respects than pure
>>>    codel,
>>>    and folding in some of that makes sense, except I don't know which
>>>    things are the most valuable considering wifi's other problems
>>> 
>>> Reading paper now. Thanks for the pointer.
>> 
>> I tend to think out that fq_codel is "good enough" in most
>> circumstances. The edge cases that cake handles better are a matter of a
>> few percentage points, vs orders of magnitude that we get with fq_codel
>> alone vs a vs a FIFO, and my focus of late has been to make things that
>> ate less cpu or were better offloadable than networked better. Others differ. 
>
> I think COBALT might be worth putting in, as it should have
> essentially no net cost and does behave a little better than stock
> Codel.  It's better at handling unresponsive traffic, in particular.

Cake, as a whole, benchmarks out at 2x+ more cpu than htb + fq_codel
does, while admittedly doing more stuff. 

There are 3 interrelated algorithms in cobalt

1) saturating arithmetic. I have no idea if current compilers do
saturated arith on either mips or arm boxes better than they used to, but
intel still doesn't. Hate wasting the cpu on it, and don't mind that the
counter overflows after 4 billion iterations on some workloads.

(I did upstream a mild improvement to the bulk dropper a few months back)

2) Blue - to me - unproven as yet - as I'd like to try saturating
arithmetic.

3) I *LIKE* the more graduated drop off in cobalt... in theory.

...

Also, in the case of wifi, we never implemented the bulk dropper that
the mainline code has, and should definitely do that. 

...

4) Increasingly I feel the need to drop unresponsive ecn flows more
robustly. I like what you stuck in your current SCE tree to make blue
kick in earlier. Needs benchmarks...

5) As for things like the invsqrt cache, meh, don't feel like that much
accuracy is required, costs an expensive memory access, wanted to see
how well pie and dualq worked. (Really wish P4 and BPF had an invsqrt
primitive.)

6) Same goes for set associativity.
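
For anyone unfamiliar with item 1, the difference between a saturating and a wrapping 32-bit counter looks like this. It is a sketch in Python emulating u32 arithmetic, not the way the cobalt code is actually written, and the function names are mine.

```python
U32_MAX = 0xFFFFFFFF


def wrap_add_u32(a, b):
    """Ordinary unsigned 32-bit add: overflow wraps around to zero."""
    return (a + b) & U32_MAX


def sat_add_u32(a, b):
    """Saturating unsigned 32-bit add: overflow sticks at the maximum,
    at the cost of an extra compare (or a dedicated instruction)."""
    return min(a + b, U32_MAX)


count = U32_MAX                    # a counter after ~4 billion increments
print(wrap_add_u32(count, 1))      # wraps back to the start
print(sat_add_u32(count, 1))       # stays pinned at the ceiling
```

The tradeoff is the one described in item 1: the wrapping version costs nothing extra but restarts the counter once in a very long while, and the saturating version pays on every update to avoid that.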

I LIKE competition! The more folk we have hacking on this stuff the
better it gets. :) I've helped get fq-pie mainlined to have another
reference for comparison, with some hope for seeing more stuff offloaded
on more devices....

But in the scheme of debloating things, and sticking to just wifi for
this paragraph, tend to feel that txop clamping, & reducing hw retries,
and doing saner things with multicast, are a bigger win than
improvements to fq_codel itself or cake.

I haven't done much work on fq_codel_fast of late, but I threw out
everything people didn't use, and put in new things that were needed
like gso splitting and an early version of SCE, but few have tried
it... and my original goal for it was to have a multi-core shaper
facility in it and more limited queues automatically when used as
a default qdisc - 64000 fq_codel'd (or cake!) queues seems like quite a
lot when you have 64 hw mqs. I'd be more comfortable if it autotuned...

(see also rss++)

... in terms of fantasizing ...

I'd like cake to be able to use RSS and shape across
multiple cores. My basic dream has generally been that a single
line for inbound shaping that worked with RSS would work miracles.

tc qdisc add dev eth0 ingress cake 100Mbit.

without needing to use tc mirred.

A lot of good things have happened over the last few years to make that
more feasible - listification as one example. For all I know it's easy
to do now....

Would love to see a hardware offload. Am looking forward to google's
preso on their ebpf+edt solution at netdevconf. Might be a game changer,
that... it feeds back into my old concept for the "bobbie" policer
much better if only timestamps worked from hw ingress to hw egress.

e2e: I'd really like to see BBRv1 gain RFC3168 and BBRv2 get SCE for
comparison purposes. I'm looking forward to the preliminary experiments
with mmwave radio (paper upcoming) because I think we're all thinking
about how that's going to work, wrong... 

And I'd like new grant money derived from a penny per user voluntary
donation from the billion+ machines running fq_codel...

And a pony.

It's my hope more people show up to go and explore all these options,
and collaborate and make a better, bufferbloat-free internet, somehow,
in my lifetime.



>
>  - Jonathan Morton

^ permalink raw reply	[relevance 0%]

* Re: [Cake] [Make-wifi-fast]  Cake in mac80211
  2020-02-05 16:06  0%       ` Dave Taht
  2020-02-05 16:16  0%         ` [Cake] [Make-wifi-fast] " Toke Høiland-Jørgensen
@ 2020-02-05 16:22  0%         ` Jonathan Morton
  2020-02-05 19:46  0%           ` Dave Taht
  1 sibling, 1 reply; 34+ results
From: Jonathan Morton @ 2020-02-05 16:22 UTC (permalink / raw)
  To: Dave Taht; +Cc: Bjørn Ivar Teigen, Cake List, Make-Wifi-fast

> On 5 Feb, 2020, at 6:06 pm, Dave Taht <dave@taht.net> wrote:
> 
>>    D) "cobalt" is proving out better in several respects than pure
>>    codel,
>>    and folding in some of that makes sense, except I don't know which
>>    things are the most valuable considering wifi's other problems
>> 
>> Reading paper now. Thanks for the pointer.
> 
> I tend to think out that fq_codel is "good enough" in most
> circumstances. The edge cases that cake handles better are a matter of a
> few percentage points, vs orders of magnitude that we get with fq_codel
> alone vs a vs a FIFO, and my focus of late has been to make things that
> ate less cpu or were better offloadable than networked better. Others differ. 

I think COBALT might be worth putting in, as it should have essentially no net cost and does behave a little better than stock Codel.  It's better at handling unresponsive traffic, in particular.

 - Jonathan Morton


^ permalink raw reply	[relevance 0%]

* Re: [Cake] [Make-wifi-fast]  Cake in mac80211
  2020-02-05 16:06  0%       ` Dave Taht
@ 2020-02-05 16:16  0%         ` Toke Høiland-Jørgensen
  2020-02-05 16:22  0%         ` Jonathan Morton
  1 sibling, 0 replies; 34+ results
From: Toke Høiland-Jørgensen @ 2020-02-05 16:16 UTC (permalink / raw)
  To: Dave Taht, Bjørn Ivar Teigen; +Cc: Cake List, Make-Wifi-fast

Dave Taht <dave@taht.net> writes:

> Bjørn Ivar Teigen <bjorn@domos.no> writes:
>
>> Thanks for the feedback!
>>
>> Some comments and questions added inline.
>>
>> On Tue, 4 Feb 2020 at 18:07, Dave Taht <dave.taht@gmail.com> wrote:
>>
>>     On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton
>>     <chromatix99@gmail.com> wrote:
>>     >
>>     > > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn@domos.no>
>>     wrote:
>>     > >
>>     > > Are there any plans, work or just comments on the idea of
>>     implementing cake in mac80211 as was done with fq_codel?
>>     >
>>     > To consider doing that, there'd have to be a concrete benefit to
>>     doing so.
>>     
>>     Research is research! :) Everything is worth trying! There's got
>>     to be
>>     some better ideas out there, and we have a long list of things we
>>     could have done to keep improving wifi had funding not run out.
>>     
>>     We barely scratched the surface of this list.
>>     
>>     https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
>>    
>>     
>>     > Most of Cake's most useful features, beyond what fq_codel
>>     already supports, are actually implied or even done better by the
>>     WiFi environment and the mac80211 layer adaptation (particularly
>>     airtime fairness).
>>     
>>     In my opinion(s)
>>     
>>     A) I think ack-filtering will help somewhat on 802.11n, but it's
>>     not
>>     worth the added cpu cost on an AP and I'd prefer hosts reduce
>>     their
>>     ack load in the tcp stack (IMHO, others may differ, it's worth
>>     trying)
>>     B) The underlying wifi scheduler essentially does per host fq
>>     better
>>     than cake can (because it's layer 2 vs layer 3), as per jonathan's
>>     comment above 
>>
>>     C) Instead of using a 8 way set associative hash and 1024 queues,
>>     fq_codel for wifi uses 4096 with a disambiguation pointer for
>>     collisions. Seems good enough.
>>     
>>
>> Didn't catch that before. Are the extra queues there because of the
>> different access categories on Wi-Fi? Seems like that would mean most
>> of them are not in use considering how little traffic is marked with
>> DSCP.
>
> I wasn't counting those. There's one set of 4k queues per access
> class.

Nit: not per access class; they're shared across the whole phy.

-Toke


^ permalink raw reply	[relevance 0%]

* Re: [Cake] Cake in mac80211
  2020-02-05 11:53  1%     ` Bjørn Ivar Teigen
@ 2020-02-05 16:06  0%       ` Dave Taht
  2020-02-05 16:16  0%         ` [Cake] [Make-wifi-fast] " Toke Høiland-Jørgensen
  2020-02-05 16:22  0%         ` Jonathan Morton
  0 siblings, 2 replies; 34+ results
From: Dave Taht @ 2020-02-05 16:06 UTC (permalink / raw)
  To: Bjørn Ivar Teigen; +Cc: Dave Taht, Cake List, Make-Wifi-fast

Bjørn Ivar Teigen <bjorn@domos.no> writes:

> Thanks for the feedback!
>
> Some comments and questions added inline.
>
> On Tue, 4 Feb 2020 at 18:07, Dave Taht <dave.taht@gmail.com> wrote:
>
>     On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton
>     <chromatix99@gmail.com> wrote:
>     >
>     > > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn@domos.no>
>     wrote:
>     > >
>     > > Are there any plans, work or just comments on the idea of
>     implementing cake in mac80211 as was done with fq_codel?
>     >
>     > To consider doing that, there'd have to be a concrete benefit to
>     doing so.
>     
>     Research is research! :) Everything is worth trying! There's got
>     to be
>     some better ideas out there, and we have a long list of things we
>     could have done to keep improving wifi had funding not run out.
>     
>     We barely scratched the surface of this list.
>     
>     https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
>    
>     
>     > Most of Cake's most useful features, beyond what fq_codel
>     already supports, are actually implied or even done better by the
>     WiFi environment and the mac80211 layer adaptation (particularly
>     airtime fairness).
>     
>     In my opinion(s)
>     
>     A) I think ack-filtering will help somewhat on 802.11n, but it's
>     not
>     worth the added cpu cost on an AP and I'd prefer hosts reduce
>     their
>     ack load in the tcp stack (IMHO, others may differ, it's worth
>     trying)
>     B) The underlying wifi scheduler essentially does per host fq
>     better
>     than cake can (because it's layer 2 vs layer 3), as per jonathan's
>     comment above 
>
>     C) Instead of using a 8 way set associative hash and 1024 queues,
>     fq_codel for wifi uses 4096 with a disambiguation pointer for
>     collisions. Seems good enough.
>     
>
> Didn't catch that before. Are the extra queues there because of the
> different access categories on Wi-Fi? Seems like that would mean most
> of them are not in use considering how little traffic is marked with
> DSCP.

I wasn't counting those. There's one set of 4k queues per access class.

I agree that access classes are rarely used, and am of the opinion
that they shouldn't actually be used on an n or ac AP, as better scheduling of the
BE class suffices. 802.11e is useful on well behaved clients for a few
things.

The number of queues was kind of picked as a function of the absolute maximum
number of stations wireless-n can take and a swag.

Our original conception was that we'd have one fairly small fq_codel
instance per station, dynamically arriving and departing as the station
did, which proved really problematic to implement - we were stuck on how
stateful it was and all kinds of locking issues, for nearly 2 years
before Michal Kazior came up with the simpler "lots of queues +
disambiguation pointer" idea.

Another idea unexplored is clamping the used and advertised (in the
beacon) txop size dynamically when under higher contention. I certainly
get better latency with a 2-3ms txop, but I never got around to
publishing those results in a coherent form. It also increases the
opportunities for an effective mu-mimo burst.

This to me is way better than explicitly choosing access classes.

My take on things for wifi 6 was that firmware needed to expose a per
station abstraction, and we needed to go back to the fq_codel instance
per station idea.

>
>     D) "cobalt" is proving out better in several respects than pure
>     codel,
>     and folding in some of that makes sense, except I don't know which
>     things are the most valuable considering wifi's other problems
>     
>
> Reading paper now. Thanks for the pointer.

I tend to think that fq_codel is "good enough" in most
circumstances. The edge cases that cake handles better are a matter of a
few percentage points, vs the orders of magnitude that we get with fq_codel
alone vs a FIFO, and my focus of late has been to make things that
eat less cpu or are better offloadable, rather than things that network better. Others differ. 


>     E) I'd like to dynamically increase the quantum size as a function
>     of
>     load or number of flows. 
>     
>
>     I'd really like benchmarks of the proprietary versions coming out.
>     Qualcomm has their own fq_codelish thing baked into their firmware
>     now... I have no idea what broadcom is doing... fq-pie?
>     
>
> I've started looking at benchmarking proprietary drivers with emphasis
> on queueing performance. If you have any tips,

I've been after eero in particular to publish results.

> or if you would like to
> co-author a paper (I'm working on a PhD), I am very interested.

I have been without a voice since toke graduated, so yes.

>
>     The librerouter is now available. I'd like to try that.
>     
>     Recently I benchmarked red rock cafe in mountain view, which had
>     the
>     best bufferbloat and rrul score of any cybercafe I'd ever tried -
>     they
>     have a mojo networks AP, which arista bought a while back. It was
>     lovely.... I have no idea what they do,
>     but whatever it was it was *good*. I'm really happy see
>     bufferbloat
>     getting fixes everywhere, but really need to add quic to the
>     benchmark
>     suite somehow in order to feel better about people not rewriting
>     tcp
>     headers to do what they want.
>     
>     more importantly:
>     
>     Would really like to get cracking on a wifi 6 version. So far, all
>     the
>     vendors are lying, there is no OFDMA support in anything we've
>     played
>     with. There are some new outer limits there (1000+ devices), a
>     need to
>     do gang scheduling, and per-station firmware, and I'm
>     profoundly unimpressed with proprietary vendor's efforts so far
>     and
>     wish they'd open up their firmware more so more of us could take a
>     crack at it....
>     
>
> I agree, there are some interesting problems arising there. Interested
> to follow the work if and when this happens. Any luck finding a
> company willing to work on open-source drivers for Wi-Fi 6?

Nope. Feel free to try harder! I keep thinking that with various parties
struggling so hard they might actually try to open things up...

>     I'd really like to get the intel (iwl) version, especially the
>     ax200
>     chips, ported over to the AQL + fq_codel interfaces, at least. The
>     first attempt went badly, last quarter. Needs eyeballs and time...
>     Would like to find some other wifi chip worth fixing - raspi 4?
>     Some
>     android wifi chip? what?
>     Don't know how the ath11k effort is going...
>     
>     In mainline...
>     I'd like to get the wifi codel target on 5ghz down from 20ms (too
>     much) to 10ms, (or as I run it here to 8ms) in mainline, or at
>     least
>     openwrt, but that would require some benchmarking by multiple
>     folk,
>     and I was waiting for the ath10k ATF code to go upstream first. At
>     least make it tunable.
>     
>
> Have done some testing myself and 10ms looks like the correct limit on
> 5GHz.

Yea! Put results somewhere... I've kind of made a mistake in that I
ran my own patched kernels and openwrt instances for years now and
didn't really notice what hadn't got done until some testing at the last
battlemesh. Getting AQL for ath10k upstream is one piece of fallout from that.


>     Overall, reducing hw retries to sanity would be a nice thing to
>     attempt in the ath9k, at least. Although the ongoing SCE work
>     (gradual
>     rate reduction) is interesting, I tend to think reducing hardware
>     retries (with increased loss) would have a more dramatic effect on
>     reducing wifi latencies.
>     Presently with the codel target of 20ms in both directions, I get
>     60-80ms tcp latencies (still better than most fiber!) over wifi
>     with a
>     20ms target at 70mbits. What happens at 300+, no idea. cynically I
>     think much of the internet is essentially running at a max rwind
>     or
>     swind rather than within the sawtooth.
>     
>
> Also interesting
>
>     doing something more sane to rate limit multicast would be good
>     also.
>     It was quite the long list in that google document, back in the
>     day we
>     thought the wifi industry might decide to collaborate in order to
>     meet
>     the 5G threat.
>     
>     > a Cake instance to the wifi interface as well, if you have a
>     need to do so.
>     
>     It certainly is feasible to do that. I do that now on several
>     802.11ac
>     devices that don't have the fq_codel for wifi hooks, preferring to
>     rate limit them well below capacity so as to ensure consistent low
>     latency. It's really neat to see people able to play world of
>     warcraft
>     and other games over
>     the wifi here. ( started deploying ubnt's uap mesh products,
>     reflashed
>     with openwrt, along portions of my wifi backbone . Looking forward
>     to
>     the AQL backport for those, but I hope someone else does it)
>     
>
> Have this setup at home and it really does make a difference, even
> with just normal browsing. Has bigger impact than I would have
> guessed!

Normal browsing rocks on fq_codel derived solutions.

See fig 14 and fig 24 on this cablelabs study

https://www-res.cablelabs.com/wp-content/uploads/2019/02/28094118/Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf

Why pie "won" to this day bothers me, as at the time it seemed feasible
to implement fq_codel decently on this class of devices. 

(and they weren't even benchmarking the final fq_codel version, but a
quite crippled sfq based one)

>
>     >
>     > - Jonathan Morton
>     > _______________________________________________
>     > Cake mailing list
>     > Cake@lists.bufferbloat.net
>     > https://lists.bufferbloat.net/listinfo/cake
>     
>     
>     
>     -- 
>     Make Music, Not War
>     
>     Dave Täht
>     CTO, TekLibre, LLC
>     http://www.teklibre.com
>     Tel: 1-831-435-0729

^ permalink raw reply	[relevance 0%]

* Re: [Cake] Cake in mac80211
  2020-02-04 17:07  1%   ` Dave Taht
@ 2020-02-05 11:53  1%     ` Bjørn Ivar Teigen
  2020-02-05 16:06  0%       ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: Bjørn Ivar Teigen @ 2020-02-05 11:53 UTC (permalink / raw)
  To: Dave Taht; +Cc: Cake List, Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 6802 bytes --]

Thanks for the feedback!

Some comments and questions added inline.

On Tue, 4 Feb 2020 at 18:07, Dave Taht <dave.taht@gmail.com> wrote:

> On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton <chromatix99@gmail.com>
> wrote:
> >
> > > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn@domos.no> wrote:
> > >
> > > Are there any plans, work or just comments on the idea of implementing
> > > cake in mac80211 as was done with fq_codel?
> >
> > To consider doing that, there'd have to be a concrete benefit to doing
> > so.
>
> Research is research! :) Everything is worth trying! There's got to be
> some better ideas out there, and we have a long list of things we
> could have done to keep improving wifi had funding not run out.
>
> We barely scratched the surface of this list.
>
>
> https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
>
> > Most of Cake's most useful features, beyond what fq_codel already
> > supports, are actually implied or even done better by the WiFi environment
> > and the mac80211 layer adaptation (particularly airtime fairness).
>
> In my opinion(s)
>
> A) I think ack-filtering will help somewhat on 802.11n, but it's not
> worth the added cpu cost on an AP and I'd prefer hosts reduce their
> ack load in the tcp stack (IMHO, others may differ, it's worth trying)
> B) The underlying wifi scheduler essentially does per host fq better
> than cake can (because it's layer 2 vs layer 3), as per jonathan's
> comment above

> C) Instead of using an 8-way set-associative hash and 1024 queues,
> fq_codel for wifi uses 4096 with a disambiguation pointer for
> collisions. Seems good enough.
>

Didn't catch that before. Are the extra queues there because of the
different access categories on Wi-Fi? Seems like that would mean most of
them are not in use considering how little traffic is marked with DSCP.

> D) "cobalt" is proving out better in several respects than pure codel,
> and folding in some of that makes sense, except I don't know which
> things are the most valuable considering wifi's other problems
>

Reading paper now. Thanks for the pointer.


> E) I'd like to dynamically increase the quantum size as a function of
> load or number of flows.
>

>
> I'd really like benchmarks of the proprietary versions coming out.
> Qualcomm has their own fq_codelish thing baked into their firmware
> now... I have no idea what broadcom is doing... fq-pie?
>

I've started looking at benchmarking proprietary drivers with emphasis on
queueing performance. If you have any tips, or if you would like to
co-author a paper (I'm working on a PhD), I am very interested.


>
> The librerouter is now available. I'd like to try that.
>
> Recently I benchmarked red rock cafe in mountain view, which had the
> best bufferbloat and rrul score of any cybercafe I'd ever tried - they
> have a mojo networks AP, which arista bought a while back. It was
> lovely.... I have no idea what they do,
> but whatever it was it was *good*. I'm really happy to see bufferbloat
> getting fixes everywhere, but really need to add quic to the benchmark
> suite somehow in order to feel better about people not rewriting tcp
> headers to do what they want.
>
> more importantly:
>
> Would really like to get cracking on a wifi 6 version. So far, all the
> vendors are lying, there is no OFDMA support in anything we've played
> with. There are some new outer limits there (1000+ devices), a need to
> do gang scheduling, and per-station firmware, and I'm
> profoundly unimpressed with proprietary vendors' efforts so far and
> wish they'd open up their firmware more so more of us could take a
> crack at it....
>

I agree, there are some interesting problems arising there. Interested to
follow the work if and when this happens. Any luck finding a company
willing to work on open-source drivers for Wi-Fi 6?


> I'd really like to get the intel  (iwl) version, especially the ax200
> chips, ported over to the AQL + fq_codel interfaces, at least.  The
> first attempt went badly, last quarter. Needs eyeballs and time...
> Would like to find some other wifi chip worth fixing - raspi 4? Some
> android wifi chip? what?
> Don't know how the ath11k effort is going...
>
> In mainline...
>  I'd like to get the wifi codel target on 5ghz down from 20ms (too
> much) to 10ms, (or as I run it here to 8ms) in mainline, or at least
> openwrt, but that would require some benchmarking by multiple folk,
> and I was waiting for the ath10k ATF code to go upstream first. At
> least make it tunable.
>

Have done some testing myself and 10ms looks like the correct limit on 5GHz.


>
> Overall, reducing hw retries to sanity would be a nice thing to
> attempt in the ath9k, at least. Although the ongoing SCE work (gradual
> rate reduction) is interesting, I tend to think reducing hardware
> retries (with increased loss) would have a more dramatic effect on
> reducing wifi latencies.
> Presently with the codel target of 20ms in both directions, I get
> 60-80ms tcp latencies (still better than most fiber!) over wifi with a
> 20ms target at 70mbits. What happens at 300+, no idea. Cynically, I
> think much of the internet is essentially running at a max rwind or
> swind rather than within the sawtooth.
>

Also interesting


>
> doing something more sane to rate limit multicast would be good also.
> It was quite the long list in that google document, back in the day we
> thought the wifi industry might decide to collaborate in order to meet
> the 5G threat.
>
> > a Cake instance to the wifi interface as well, if you have a need to do
> > so.
>
> It certainly is feasible to do that. I do that now on several 802.11ac
> devices that don't have the fq_codel for wifi hooks, preferring to
> rate limit them well below capacity so as to ensure consistent low
> latency. It's really neat to see people able to play world of warcraft
> and other games over the wifi here. (I started deploying ubnt's uap
> mesh products, reflashed with openwrt, along portions of my wifi
> backbone. Looking forward to the AQL backport for those, but I hope
> someone else does it.)
>

Have this setup at home and it really does make a difference, even with
just normal browsing. Has bigger impact than I would have guessed!

>
> >
> >  - Jonathan Morton
> > _______________________________________________
> > Cake mailing list
> > Cake@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cake
>
>
>
> --
> Make Music, Not War
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-435-0729
>


-- 
Bjørn Ivar Teigen
Head of Research
Domos, Machine Learning for the Home
www.domos.no

[-- Attachment #2: Type: text/html, Size: 9676 bytes --]

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Cake in mac80211
  2020-02-04 15:25  0% ` Jonathan Morton
@ 2020-02-04 17:07  1%   ` Dave Taht
  2020-02-05 11:53  1%     ` Bjørn Ivar Teigen
  0 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-02-04 17:07 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Bjørn Ivar Teigen, Cake List, Make-Wifi-fast

On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton <chromatix99@gmail.com> wrote:
>
> > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn@domos.no> wrote:
> >
> > Are there any plans, work or just comments on the idea of implementing cake in mac80211 as was done with fq_codel?
>
> To consider doing that, there'd have to be a concrete benefit to doing so.

Research is research! :) Everything is worth trying! There's got to be
some better ideas out there, and we have a long list of things we
could have done to keep improving wifi had funding not run out.

We barely scratched the surface of this list.

https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit

> Most of Cake's most useful features, beyond what fq_codel already supports, are actually implied or even done better by the WiFi environment and the mac80211 layer adaptation (particularly airtime fairness).

In my opinion(s)

A) I think ack-filtering will help somewhat on 802.11n, but it's not
worth the added cpu cost on an AP and I'd prefer hosts reduce their
ack load in the tcp stack (IMHO, others may differ, it's worth trying)
B) The underlying wifi scheduler essentially does per host fq better
than cake can (because it's layer 2 vs layer 3), as per jonathan's
comment above
C) Instead of using an 8-way set-associative hash and 1024 queues,
fq_codel for wifi uses 4096 with a disambiguation pointer for
collisions. Seems good enough.
D) "cobalt" is proving out better in several respects than pure codel,
and folding in some of that makes sense, except I don't know which
things are the most valuable considering wifi's other problems
E) I'd like to dynamically increase the quantum size as a function of
load or number of flows.
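
The set-associative lookup mentioned in (C) can be sketched roughly as
follows. This is a simplified illustration under stated assumptions, not
the sch_cake or mac80211 code: the names flow_table, flow_lookup and
demo_table are hypothetical, only the figures 1024 and 8 come from the
text above, and the 4096-queue scheme with a disambiguation pointer is
not shown.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QUEUES 1024u  /* queue count quoted in the thread */
#define SET_WAYS   8u     /* "8-way set associative" */

/* Hypothetical flow table: one tag and one busy flag per queue. */
struct flow_table {
	uint32_t tag[NUM_QUEUES];    /* full hash of the flow owning the queue */
	uint8_t  in_use[NUM_QUEUES]; /* nonzero while a flow owns the queue */
};

/* Zero-initialized table for experimentation. */
static struct flow_table demo_table;

/* Map a flow hash to a queue index, searching the 8 queues of its set. */
static uint32_t flow_lookup(struct flow_table *t, uint32_t flow_hash)
{
	/* First queue of the set that this hash falls into. */
	uint32_t base = (flow_hash % NUM_QUEUES) & ~(SET_WAYS - 1u);
	uint32_t i;

	assert(t);

	/* Pass 1: the flow already owns a queue in this set. */
	for (i = 0; i < SET_WAYS; i++)
		if (t->in_use[base + i] && t->tag[base + i] == flow_hash)
			return base + i;

	/* Pass 2: claim the first free queue in the set. */
	for (i = 0; i < SET_WAYS; i++)
		if (!t->in_use[base + i]) {
			t->in_use[base + i] = 1;
			t->tag[base + i] = flow_hash;
			return base + i;
		}

	/* Pass 3: the set is full, so share the direct-mapped queue. */
	return flow_hash % NUM_QUEUES;
}
```

In this sketch two flows whose hashes land in the same set get separate
queues until all 8 ways are busy; only then do they collide. A real
implementation would also release queues when they drain.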


I'd really like benchmarks of the proprietary versions coming out.
Qualcomm has their own fq_codelish thing baked into their firmware
now... I have no idea what broadcom is doing... fq-pie?

The librerouter is now available. I'd like to try that.

Recently I benchmarked red rock cafe in mountain view, which had the
best bufferbloat and rrul score of any cybercafe I'd ever tried - they
have a mojo networks AP, which arista bought a while back. It was
lovely.... I have no idea what they do,
but whatever it was it was *good*. I'm really happy to see bufferbloat
getting fixes everywhere, but really need to add quic to the benchmark
suite somehow in order to feel better about people not rewriting tcp
headers to do what they want.

more importantly:

Would really like to get cracking on a wifi 6 version. So far, all the
vendors are lying, there is no OFDMA support in anything we've played
with. There are some new outer limits there (1000+ devices), a need to
do gang scheduling, and per-station firmware, and I'm
profoundly unimpressed with proprietary vendors' efforts so far and
wish they'd open up their firmware more so more of us could take a
crack at it....

I'd really like to get the intel (iwl) version, especially the ax200
chips, ported over to the AQL + fq_codel interfaces, at least.  The
first attempt went badly, last quarter. Needs eyeballs and time...
Would like to find some other wifi chip worth fixing - raspi 4? Some
android wifi chip? what?
Don't know how the ath11k effort is going...

In mainline...
I'd like to get the wifi codel target on 5GHz down from 20ms (too
much) to 10ms (or, as I run it here, 8ms) in mainline, or at least
openwrt, but that would require some benchmarking by multiple folk,
and I was waiting for the ath10k ATF code to go upstream first. At
least make it tunable.

Overall, reducing hw retries to sanity would be a nice thing to
attempt in the ath9k, at least. Although the ongoing SCE work (gradual
rate reduction) is interesting, I tend to think reducing hardware
retries (with increased loss) would have a more dramatic effect on
reducing wifi latencies.
Presently with the codel target of 20ms in both directions, I get
60-80ms tcp latencies (still better than most fiber!) over wifi with a
20ms target at 70mbits. What happens at 300+, no idea. Cynically, I
think much of the internet is essentially running at a max rwind or
swind rather than within the sawtooth.

Doing something more sane to rate limit multicast would be good also.
It was quite the long list in that google document; back in the day, we
thought the wifi industry might decide to collaborate in order to meet
the 5G threat.

> a Cake instance to the wifi interface as well, if you have a need to do so.

It certainly is feasible to do that. I do that now on several 802.11ac
devices that don't have the fq_codel for wifi hooks, preferring to
rate limit them well below capacity so as to ensure consistent low
latency. It's really neat to see people able to play world of warcraft
and other games over the wifi here. (I started deploying ubnt's uap
mesh products, reflashed with openwrt, along portions of my wifi
backbone. Looking forward to the AQL backport for those, but I hope
someone else does it.)

>
>  - Jonathan Morton
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Cake in mac80211
  @ 2020-02-04 15:25  0% ` Jonathan Morton
  2020-02-04 17:07  1%   ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: Jonathan Morton @ 2020-02-04 15:25 UTC (permalink / raw)
  To: Bjørn Ivar Teigen; +Cc: cake

> On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn@domos.no> wrote:
> 
> Are there any plans, work or just comments on the idea of implementing cake in mac80211 as was done with fq_codel?

To consider doing that, there'd have to be a concrete benefit to doing so.  Most of Cake's most useful features, beyond what fq_codel already supports, are actually implied or even done better by the WiFi environment and the mac80211 layer adaptation (particularly airtime fairness).

You can of course attach a Cake instance to the wifi interface as well, if you have a need to do so.

 - Jonathan Morton

^ permalink raw reply	[relevance 0%]

* Re: [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
  2020-01-11 21:38  1%     ` Dave Taht
  2020-01-12  9:53  0%       ` Kevin 'ldir' Darbyshire-Bryant
@ 2020-01-16 12:47  0%       ` Sebastian Gottschall
  1 sibling, 0 replies; 34+ results
From: Sebastian Gottschall @ 2020-01-16 12:47 UTC (permalink / raw)
  To: cake

dd-wrt has it already :-)

On 11.01.2020 at 22:38, Dave Taht wrote:
> Thank you for all the gymnastics to keep cake alive in openwrt.
>
> I would still like there to be a sce branch of the out of tree work
> that I could point people at
> in my lca talk this week, but I understand that's increasingly difficult.
>
> On Sat, Jan 11, 2020 at 1:20 PM Kevin 'ldir' Darbyshire-Bryant
> <ldir@darbyshire-bryant.me.uk> wrote:
>>
>>
>>> On 11 Jan 2020, at 20:40, Dave Taht <dave.taht@gmail.com> wrote:
>>>
>>> did this make it into openwrt already?
>> It’s complicated and it depends what you mean by openwrt.
>>
>> First off, the fix relates to auto-bandwidth mode or whatever it’s called and I don’t think many people use it.  Nonetheless:
>>
>> Is the fix in ’net-next’: yes
>> Is the fix in 4.19 stable: In the queue for 4.19.95
>>
>> Is openwrt on 4.19.95: No
>> Does openwrt use the in-tree version of Cake?: No
>>
>> Is the fix in the Out-Of-Tree cake git repo: Yes
>>
>> Has the openwrt CAKE package been bumped to follow cake git repo?: master, yes, as of 2020/01/11 (earlier today)
>>
>> OpenWrt 19.07 has just been released, its concept of cake package has not been bumped.  Neither has 18.06.
>>
>>
>> It is worth noting that until yesterday/recently the out of tree cake repo had residue in it from some experimental stuff (SCE & updating conntrack marks) and did not represent upstream in-tree CAKE anyway.  That situation was corrected AFAIK completely this morning.
>>
>> Ideally I would like openwrt to use the in-tree CAKE, with ‘feature backports’ from later kernels as backport patches.  Unfortunately some targets in openwrt are still on 4.14 kernels so there is no in-tree CAKE to use.  Dropping CAKE from pre 4.19 kernels caused a bit of an outcry when I did it, so the next idea was to have a choice of cake kernel module for K4.19 targets, in-tree & out-of-tree CAKE.  Unfortunately that exposed a weakness in package dependency selection, so that idea hasn’t flown either.  I’m afraid enthusiasm levels then dropped.
>>
>>
>

^ permalink raw reply	[relevance 0%]

* Re: [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
  2020-01-11 21:38  1%     ` Dave Taht
@ 2020-01-12  9:53  0%       ` Kevin 'ldir' Darbyshire-Bryant
  2020-01-16 12:47  0%       ` Sebastian Gottschall
  1 sibling, 0 replies; 34+ results
From: Kevin 'ldir' Darbyshire-Bryant @ 2020-01-12  9:53 UTC (permalink / raw)
  To: Dave Taht; +Cc: Cake List, Toke Høiland-Jørgensen

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]



> On 11 Jan 2020, at 21:38, Dave Taht <dave.taht@gmail.com> wrote:
> 
> Thank you for all the gymnastics to keep cake alive in openwrt.
> 
> I would still like there to be a sce branch of the out of tree work
> that I could point people at
> in my lca talk this week, but I understand that's increasingly difficult.

Jonathan advised me that the version of SCE in CAKE was out of date and we didn’t really want openwrt accidentally using the bonus features, hence the ‘no objection’ to removing all the non upstream kernel stuff from master.  Perhaps creating an ’SCE’ feature branch is the way forward?  Wireguard recently went through a similar change in that now wireguard is in (very recent) kernels, Jason has split development repos into something like ‘Wireguard for upstream’, ‘Wireguard for out-of-tree with loads of compat stubs’ & ‘wireguard userspace tools’.

We have a similar problem in that we have a requirement for ‘cake for upstream’, ‘cake for out-of-tree’ & ‘cake for out-of-tree feature dev’.

Cheers,

Kevin

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[relevance 0%]

* Re: [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
  2020-01-11 21:20  1%   ` Kevin 'ldir' Darbyshire-Bryant
@ 2020-01-11 21:38  1%     ` Dave Taht
  2020-01-12  9:53  0%       ` Kevin 'ldir' Darbyshire-Bryant
  2020-01-16 12:47  0%       ` Sebastian Gottschall
  0 siblings, 2 replies; 34+ results
From: Dave Taht @ 2020-01-11 21:38 UTC (permalink / raw)
  To: Kevin 'ldir' Darbyshire-Bryant
  Cc: Cake List, Toke Høiland-Jørgensen

Thank you for all the gymnastics to keep cake alive in openwrt.

I would still like there to be an SCE branch of the out-of-tree work
that I could point people at in my LCA talk this week, but I understand
that's increasingly difficult.

On Sat, Jan 11, 2020 at 1:20 PM Kevin 'ldir' Darbyshire-Bryant
<ldir@darbyshire-bryant.me.uk> wrote:
>
>
>
> > On 11 Jan 2020, at 20:40, Dave Taht <dave.taht@gmail.com> wrote:
> >
> > did this make it into openwrt already?
>
> It’s complicated and it depends what you mean by openwrt.
>
> First off, the fix relates to auto-bandwidth mode or whatever it’s called and I don’t think many people use it.  Nonetheless:
>
> Is the fix in ’net-next’: yes
> Is the fix in 4.19 stable: In the queue for 4.19.95
>
> Is openwrt on 4.19.95: No
> Does openwrt use the in-tree version of Cake?: No
>
> Is the fix in the Out-Of-Tree cake git repo: Yes
>
> Has the openwrt CAKE package been bumped to follow cake git repo?: master, yes, as of 2020/01/11 (earlier today)
>
> OpenWrt 19.07 has just been released, its concept of cake package has not been bumped.  Neither has 18.06.
>
>
> It is worth noting that until yesterday/recently the out of tree cake repo had residue in it from some experimental stuff (SCE & updating conntrack marks) and did not represent upstream in-tree CAKE anyway.  That situation was corrected AFAIK completely this morning.
>
> Ideally I would like openwrt to use the in-tree CAKE, with ‘feature backports’ from later kernels as backport patches.  Unfortunately some targets in openwrt are still on 4.14 kernels so there is no in-tree CAKE to use.  Dropping CAKE from pre 4.19 kernels caused a bit of an outcry when I did it, so the next idea was to have a choice of cake kernel module for K4.19 targets, in-tree & out-of-tree CAKE.  Unfortunately that exposed a weakness in package dependency selection, so that idea hasn’t flown either.  I’m afraid enthusiasm levels then dropped.
>
>


-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
  2020-01-11 20:40  1% ` Dave Taht
@ 2020-01-11 21:20  1%   ` Kevin 'ldir' Darbyshire-Bryant
  2020-01-11 21:38  1%     ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: Kevin 'ldir' Darbyshire-Bryant @ 2020-01-11 21:20 UTC (permalink / raw)
  To: Dave Taht; +Cc: Cake List, Toke Høiland-Jørgensen

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]



> On 11 Jan 2020, at 20:40, Dave Taht <dave.taht@gmail.com> wrote:
> 
> did this make it into openwrt already?

It’s complicated and it depends what you mean by openwrt.

First off, the fix relates to auto-bandwidth mode or whatever it’s called and I don’t think many people use it.  Nonetheless:

Is the fix in ’net-next’: yes
Is the fix in 4.19 stable: In the queue for 4.19.95

Is openwrt on 4.19.95: No
Does openwrt use the in-tree version of Cake?: No

Is the fix in the Out-Of-Tree cake git repo: Yes

Has the openwrt CAKE package been bumped to follow cake git repo?: master, yes, as of 2020/01/11 (earlier today)

OpenWrt 19.07 has just been released, its concept of cake package has not been bumped.  Neither has 18.06.


It is worth noting that until yesterday/recently the out of tree cake repo had residue in it from some experimental stuff (SCE & updating conntrack marks) and did not represent upstream in-tree CAKE anyway.  That situation was corrected AFAIK completely this morning.

Ideally I would like openwrt to use the in-tree CAKE, with ‘feature backports’ from later kernels as backport patches.  Unfortunately some targets in openwrt are still on 4.14 kernels so there is no in-tree CAKE to use.  Dropping CAKE from pre 4.19 kernels caused a bit of an outcry when I did it, so the next idea was to have a choice of cake kernel module for K4.19 targets, in-tree & out-of-tree CAKE.  Unfortunately that exposed a weakness in package dependency selection, so that idea hasn’t flown either.  I’m afraid enthusiasm levels then dropped.



[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[relevance 1%]

* Re: [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
  2020-01-11  8:18  2% [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree gregkh
@ 2020-01-11 20:40  1% ` Dave Taht
  2020-01-11 21:20  1%   ` Kevin 'ldir' Darbyshire-Bryant
  0 siblings, 1 reply; 34+ results
From: Dave Taht @ 2020-01-11 20:40 UTC (permalink / raw)
  Cc: Cake List, Kevin 'ldir' Darbyshire-Bryant,
	Toke Høiland-Jørgensen

did this make it into openwrt already?

On Sat, Jan 11, 2020 at 12:19 AM <gregkh@linuxfoundation.org> wrote:
>
>
> This is a note to let you know that I've just added the patch titled
>
>     sch_cake: avoid possible divide by zero in cake_enqueue()
>
> to the 4.19-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>
> The filename of the patch is:
>      sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
> and it can be found in the queue-4.19 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@vger.kernel.org> know about it.
>
>
> From foo@baz Sat 11 Jan 2020 09:14:34 AM CET
> From: Wen Yang <wenyang@linux.alibaba.com>
> Date: Thu, 2 Jan 2020 17:21:43 +0800
> Subject: sch_cake: avoid possible divide by zero in cake_enqueue()
>
> From: Wen Yang <wenyang@linux.alibaba.com>
>
> [ Upstream commit 68aab823c223646fab311f8a6581994facee66a0 ]
>
> The variables 'window_interval' is u64 and do_div()
> truncates it to 32 bits, which means it can test
> non-zero and be truncated to zero for division.
> The unit of window_interval is nanoseconds,
> so its lower 32-bit is relatively easy to exceed.
> Fix this issue by using div64_u64() instead.
>
> Fixes: 7298de9cd725 ("sch_cake: Add ingress mode")
> Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
> Cc: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
> Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Cong Wang <xiyou.wangcong@gmail.com>
> Cc: cake@lists.bufferbloat.net
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  net/sched/sch_cake.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/net/sched/sch_cake.c
> +++ b/net/sched/sch_cake.c
> @@ -1758,7 +1758,7 @@ static s32 cake_enqueue(struct sk_buff *
>                                                       q->avg_window_begin));
>                         u64 b = q->avg_window_bytes * (u64)NSEC_PER_SEC;
>
> -                       do_div(b, window_interval);
> +                       b = div64_u64(b, window_interval);
>                         q->avg_peak_bandwidth =
>                                 cake_ewma(q->avg_peak_bandwidth, b,
>                                           b > q->avg_peak_bandwidth ? 2 : 8);
>
>
> Patches currently in stable-queue which might be from wenyang@linux.alibaba.com are
>
> queue-4.19/sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
> queue-4.19/regulator-fix-use-after-free-issue.patch
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



-- 
Make Music, Not War

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729

^ permalink raw reply	[relevance 1%]

* [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 5.4-stable tree
@ 2020-01-11  8:18  2% gregkh
  0 siblings, 0 replies; 34+ results
From: gregkh @ 2020-01-11  8:18 UTC (permalink / raw)
  To: cake, davem, gregkh, ldir, toke, toke, wenyang, xiyou.wangcong
  Cc: stable-commits


This is a note to let you know that I've just added the patch titled

    sch_cake: avoid possible divide by zero in cake_enqueue()

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Sat 11 Jan 2020 09:13:20 AM CET
From: Wen Yang <wenyang@linux.alibaba.com>
Date: Thu, 2 Jan 2020 17:21:43 +0800
Subject: sch_cake: avoid possible divide by zero in cake_enqueue()

From: Wen Yang <wenyang@linux.alibaba.com>

[ Upstream commit 68aab823c223646fab311f8a6581994facee66a0 ]

The variables 'window_interval' is u64 and do_div()
truncates it to 32 bits, which means it can test
non-zero and be truncated to zero for division.
The unit of window_interval is nanoseconds,
so its lower 32-bit is relatively easy to exceed.
Fix this issue by using div64_u64() instead.

Fixes: 7298de9cd725 ("sch_cake: Add ingress mode")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: cake@lists.bufferbloat.net
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_cake.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1769,7 +1769,7 @@ static s32 cake_enqueue(struct sk_buff *
 						      q->avg_window_begin));
 			u64 b = q->avg_window_bytes * (u64)NSEC_PER_SEC;
 
-			do_div(b, window_interval);
+			b = div64_u64(b, window_interval);
 			q->avg_peak_bandwidth =
 				cake_ewma(q->avg_peak_bandwidth, b,
 					  b > q->avg_peak_bandwidth ? 2 : 8);


Patches currently in stable-queue which might be from wenyang@linux.alibaba.com are

queue-5.4/regulator-core-fix-regulator_register-error-paths-to.patch
queue-5.4/sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
queue-5.4/regulator-fix-use-after-free-issue.patch

^ permalink raw reply	[relevance 2%]

* [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree
@ 2020-01-11  8:18  2% gregkh
  2020-01-11 20:40  1% ` Dave Taht
  0 siblings, 1 reply; 34+ results
From: gregkh @ 2020-01-11  8:18 UTC (permalink / raw)
  To: cake, davem, gregkh, ldir, toke, toke, wenyang, xiyou.wangcong
  Cc: stable-commits


This is a note to let you know that I've just added the patch titled

    sch_cake: avoid possible divide by zero in cake_enqueue()

to the 4.19-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
and it can be found in the queue-4.19 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From foo@baz Sat 11 Jan 2020 09:14:34 AM CET
From: Wen Yang <wenyang@linux.alibaba.com>
Date: Thu, 2 Jan 2020 17:21:43 +0800
Subject: sch_cake: avoid possible divide by zero in cake_enqueue()

From: Wen Yang <wenyang@linux.alibaba.com>

[ Upstream commit 68aab823c223646fab311f8a6581994facee66a0 ]

The variables 'window_interval' is u64 and do_div()
truncates it to 32 bits, which means it can test
non-zero and be truncated to zero for division.
The unit of window_interval is nanoseconds,
so its lower 32-bit is relatively easy to exceed.
Fix this issue by using div64_u64() instead.

Fixes: 7298de9cd725 ("sch_cake: Add ingress mode")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: cake@lists.bufferbloat.net
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_cake.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1758,7 +1758,7 @@ static s32 cake_enqueue(struct sk_buff *
 						      q->avg_window_begin));
 			u64 b = q->avg_window_bytes * (u64)NSEC_PER_SEC;
 
-			do_div(b, window_interval);
+			b = div64_u64(b, window_interval);
 			q->avg_peak_bandwidth =
 				cake_ewma(q->avg_peak_bandwidth, b,
 					  b > q->avg_peak_bandwidth ? 2 : 8);


Patches currently in stable-queue which might be from wenyang@linux.alibaba.com are

queue-4.19/sch_cake-avoid-possible-divide-by-zero-in-cake_enqueue.patch
queue-4.19/regulator-fix-use-after-free-issue.patch

^ permalink raw reply	[relevance 2%]

* Re: [Cake] [PATCH] sch_cake: avoid possible divide by zero in cake_enqueue()
    2020-01-02 21:58  0% ` Jonathan Morton
@ 2020-01-03  0:35  1% ` David Miller
  1 sibling, 0 replies; 34+ results
From: David Miller @ 2020-01-03  0:35 UTC (permalink / raw)
  To: wenyang; +Cc: toke, ldir, toke, xiyou.wangcong, cake, netdev, linux-kernel

From: Wen Yang <wenyang@linux.alibaba.com>
Date: Thu,  2 Jan 2020 17:21:43 +0800

> The variable 'window_interval' is u64, but do_div()
> truncates its divisor to 32 bits, which means the value can test
> non-zero and still be truncated to zero for the division.
> The unit of window_interval is nanoseconds,
> so it can easily exceed what fits in the lower 32 bits.
> Fix this issue by using div64_u64() instead.
> 
> Fixes: 7298de9cd725 ("sch_cake: Add ingress mode")
> Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>

Applied and queued up for -stable.

* Re: [Cake] [PATCH] sch_cake: avoid possible divide by zero in cake_enqueue()
  @ 2020-01-02 21:58  0% ` Jonathan Morton
  2020-01-03  0:35  1% ` David Miller
  1 sibling, 0 replies; 34+ results
From: Jonathan Morton @ 2020-01-02 21:58 UTC (permalink / raw)
  To: Wen Yang
  Cc: Toke Høiland-Jørgensen, netdev, linux-kernel, cake,
	Kevin Darbyshire-Bryant, Cong Wang, David S . Miller

> On 2 Jan, 2020, at 11:21 am, Wen Yang <wenyang@linux.alibaba.com> wrote:
> 
> The variable 'window_interval' is u64, but do_div()
> truncates its divisor to 32 bits, which means the value can test
> non-zero and still be truncated to zero for the division.
> The unit of window_interval is nanoseconds,
> so it can easily exceed what fits in the lower 32 bits.
> Fix this issue by using div64_u64() instead.

That might actually explain a few things.  I approve.

Honestly the *correct* fix is for the compiler to implement division in a way that doesn't require substituting it with function calls.  As this shows, it's error-prone to do this manually.

 - Jonathan Morton

2020-01-02  9:21     [Cake] [PATCH] sch_cake: avoid possible divide by zero in cake_enqueue() Wen Yang
2020-01-02 21:58  0% ` Jonathan Morton
2020-01-03  0:35  1% ` David Miller
2020-01-11  8:18  2% [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 4.19-stable tree gregkh
2020-01-11 20:40  1% ` Dave Taht
2020-01-11 21:20  1%   ` Kevin 'ldir' Darbyshire-Bryant
2020-01-11 21:38  1%     ` Dave Taht
2020-01-12  9:53  0%       ` Kevin 'ldir' Darbyshire-Bryant
2020-01-16 12:47  0%       ` Sebastian Gottschall
2020-01-11  8:18  2% [Cake] Patch "sch_cake: avoid possible divide by zero in cake_enqueue()" has been added to the 5.4-stable tree gregkh
2020-02-04 15:20     [Cake] Cake in mac80211 Bjørn Ivar Teigen
2020-02-04 15:25  0% ` Jonathan Morton
2020-02-04 17:07  1%   ` Dave Taht
2020-02-05 11:53  1%     ` Bjørn Ivar Teigen
2020-02-05 16:06  0%       ` Dave Taht
2020-02-05 16:16  0%         ` [Cake] [Make-wifi-fast] " Toke Høiland-Jørgensen
2020-02-05 16:22  0%         ` Jonathan Morton
2020-02-05 19:46  0%           ` Dave Taht
2020-02-17 13:56     [Cake] Large number of Flows Mike
2020-02-17 14:34  1% ` Dave Taht
     [not found]       ` <etPan.5e4ab6c5.653ea685.1b7f@surfglobal.net>
2020-02-17 18:21  1%     ` Dave Taht
     [not found]     <1584524612-24470-1-git-send-email-ilpo.jarvinen@helsinki.fi>
2020-03-19 22:20  1% ` [Cake] Fwd: [RFC PATCH 00/28]: Accurate ECN for TCP Dave Taht
2020-03-27 17:27     [Cake] mo bettah open source multi-party videoconferncing in an age of bloated uplinks? Dave Taht
2020-03-27 19:00  1% ` David P. Reed
2020-03-27 19:12  1%   ` David Lang
2020-03-27 19:36  1%   ` Dave Taht
2020-03-27 20:32  1%   ` Dave Taht
2020-03-28 19:58         ` Toke Høiland-Jørgensen
2020-03-28 23:15  1%       ` David P. Reed
2020-03-28  6:53  1% ` Anthony Minessale II
     [not found]     <CAN_LGv1h8Ut4bGm7ZgYaGV_Tbdy3ABW+epb_p6jeX=TxnAvH1g@mail.gmail.com>
2020-04-03 18:49  1% ` [Cake] tc-cake(8) needs to explain a common mistake Dave Taht
2020-04-03 20:44  0%   ` Sebastian Moeller
2020-04-03 21:37  1%     ` Alexander E. Patrakov
2020-04-04  4:12     [Cake] New board that looks interesting Aaron Wood
2020-04-04 14:47  1% ` David P. Reed
2020-04-04 16:10  1%   ` [Cake] [Bloat] " Dave Taht
2020-04-04 16:27  1%     ` Aaron Wood
2020-04-04 17:36  1%       ` Dave Taht
2020-04-05  4:17     [Cake] cake and nat in openwrt... on by default? Dave Taht
2020-04-05  7:57  0% ` Kevin Darbyshire-Bryant
2020-04-05 15:22  1%   ` Dave Taht
