From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastian Moeller
To: Jonathan Foulkes
Cc: Toke Høiland-Jørgensen, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] CAKE in openwrt high CPU
Date: Tue, 1 Sep 2020 22:04:43 +0200
Message-Id: <1B20AB20-8C92-41E8-A064-212E311ABD57@gmx.de>
In-Reply-To: <1170B9AF-977E-45F4-ABCB-78A801F609C9@jonathanfoulkes.com>
References: <87mu2bjbf8.fsf@toke.dk> <5DBFB383-13E8-4587-BE49-1767471D7D59@jonathanfoulkes.com> <4E67F6C6-5187-4052-B7CA-177245F01CC0@gmx.de> <1170B9AF-977E-45F4-ABCB-78A801F609C9@jonathanfoulkes.com>
List-Id: General list for discussing Bufferbloat
Content-Type: text/plain; charset=utf-8

Hi Jonathan,

> On Sep 1, 2020, at 21:31, Jonathan Foulkes wrote:
>
> Hi Sebastian, Cake functions wonderfully, it’s a marvel in terms of goodput.
>
> My comment was more oriented at the metrics process users use to evaluate results.
> Only those who spend time analyzing just how busy an ‘idle’ network can be know that there are a lot of processes in constant communications with their cloud services.

True, interestingly, quite a number of speedtests seem to err on the side of too high, probably because that way users are happy to see something close to their contracted rates...

> The challenge is the end users, who only understand the silly ‘speed’ metric, and feel anything that lowers that number is a ‘bad’ thing. It takes effort to get even technical users to get it.

I repeatedly fall into that trap...

> But even beyond the basics, the further cuts induced by fairness are the new wrinkle in dealing with widely varying speed test results with isolation enabled on a busy network.

Yes, but one can try to make lemonade out of it, by running speedtests from two devices while observing something like "sudo mtr -ezb4 -i 0.3 8.8.8.8" not budging much even though the tests come and go; demonstrating the quality of the isolation and that low queueing delay can "happen" even on a busy link.

>
> The high density of devices and constant chatter with cloud services means the average home has way more devices and connections than many realize. Keep a note of the number of ‘active connections’ displayed on the OpenWRT overview page, you might be surprised (well, not you Seb ;) )

Count me in, I just switched over to a Turris Omnia (which I had crowd-funded before I realized IQrouters will be delivered to Germany ;) ) and while playing with its pakon feature I was quite baffled by how many addresses are used even in a short amount of time. (All of this is just a hobby to me, so I keep forgetting stuff regularly, because I do approach things a bit casually at times).

>
> As an example, on my network, I average 1,000 active connections all day, it rarely drops below 700.
> And it’s just two WFH professionals and 60+ network devices, not all of which are active at any one time.
> I actually run some custom firewall rules to de-prioritize four IoT devices that generate a LOT of traffic to their services. Two of which are power panel monitors with real-time updates. This is why my bulk tin on egress has such high traffic.

Nice, I think being able to deprioritize stuff is one of the best reasons for using diffserv.

>
> Since you like to see tc output, here’s the one from my system after nearly a week.
> I run four-layer Cake as we do a lot of Zoom calls and our accounts are set up to do the appropriate DSCP marking.

I saw your nice writeup of how to do that on the OpenWrt forum IIRC. Need to talk to our IT guys at work, whether they are willing to actually configure it in the first place.

>
> root@IQrouter:~# tc -s qdisc
> qdisc noqueue 0: dev lo root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
> Sent 51311363856 bytes 86785488 pkt (dropped 53, overlimits 0 requeues 9114)
> backlog 0b 0p requeues 9114
> maxpacket 12112 drop_overlimit 0 new_flow_count 691740 ecn_mark 0
> new_flows_len 0 old_flows_len 0
> qdisc noqueue 0: dev br-lan root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev eth0.1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc cake 8005: dev eth0.2 root refcnt 2 bandwidth 22478Kbit diffserv4 dual-srchost nat nowash ack-filter split-gso rtt 100.0ms raw overhead 0 mpu 64
> Sent 6943407136 bytes 35467722 pkt (dropped 51747, overlimits 3912091 requeues 0)
> backlog 0b 0p requeues 0
> memory used: 843816b of 4Mb
> capacity estimate: 22478Kbit
> min/max network layer size: 42 / 1514
> min/max overhead-adjusted size: 64 / 1514
> average network hdr offset: 14
>
>                    Bulk  Best Effort        Video        Voice
> thresh        1404Kbit    22478Kbit    11239Kbit     5619Kbit
> target          12.9ms        5.0ms        5.0ms        5.0ms
> interval       107.9ms      100.0ms      100.0ms      100.0ms
> pk_delay         5.9ms        6.4ms        3.7ms        1.6ms
> av_delay         426us        445us        124us        188us
> sp_delay          13us         13us         12us          8us
> backlog             0b           0b           0b           0b
> pkts           3984407     30899121       474818       161123
> bytes        789740113   5883832402    246917562     30556915
> way_inds         65175      2580935         1064            5
> way_miss          1427       918529        15960         1120
> way_cols             0            0            0            0
> drops                0         2966          511            7
> marks                0          105            0            0
> ack_drop             0        48263            0            0
> sp_flows             2            4            1            0
> bk_flows             0            0            0            0
> un_flows             0            0            0            0
> max_len           1035        43094         3094          590
> quantum            300          685          342          300
>
> qdisc ingress ffff: dev eth0.2 parent ffff:fff1 ----------------
> Sent 43188461026 bytes 67870269 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev br-guest root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0-1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1-1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc cake 8006: dev ifb4eth0.2 root refcnt 2 bandwidth 289066Kbit diffserv4 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64
> Sent 44692280901 bytes 67864800 pkt (dropped 5472, overlimits 5572964 requeues 0)
> backlog 0b 0p requeues 0
> memory used: 7346339b of 14453300b
> capacity estimate: 289066Kbit
> min/max network layer size: 46 / 1500
> min/max overhead-adjusted size: 64 / 1518
> average network hdr offset: 14
>
>                    Bulk  Best Effort        Video        Voice
> thresh       18066Kbit   289066Kbit   144533Kbit    72266Kbit
> target           5.0ms        5.0ms        5.0ms        5.0ms
> interval       100.0ms      100.0ms      100.0ms      100.0ms
> pk_delay          47us        740us         42us         18us
> av_delay          24us         32us         22us          9us
> sp_delay          14us         11us         12us          5us
> backlog             0b           0b           0b           0b
> pkts              1389     45323600      3704409     18840874
> bytes           136046  43347299847    222296446   1130523693
> way_inds             0      3016679            0            0
> way_miss            17       903215         1053         2318
> way_cols             0            0            0            0
> drops                0         5471            0            1
> marks                0           27            0            0
> ack_drop             0            0            0            0
> sp_flows             1            4            2            1
> bk_flows             0            1            0            0
> un_flows             0            0            0            0
> max_len             98        68338          136          221
> quantum            551         1514         1514         1514
>
> root@IQrouter:~# uptime
> 15:07:37 up 6 days, 3:23, load average: 0.46, 0.21, 0.20
> root@IQrouter:~#
>

Thanks for sharing the stats, good reference.

Best Regards
	Sebastian

>
> Cheers,
>
> Jonathan
>
>> On Sep 1, 2020, at 12:18 PM, Sebastian Moeller wrote:
>>
>> Hi Jonathan,
>>
>>> On Sep 1, 2020, at 17:41, Jonathan Foulkes wrote:
>>>
>>> Toke, that link returns a 404 for me.
>>>
>>> For others, I’ve found that testing cake throughput with isolation options enabled is tricky if there are many competing connections.
>>
>> Are you talking about the fact that with competing connections, you only see the current isolation quantum's equivalent of the actual rate? In that case maybe parse the "tc -s qdisc" output to get an idea how much data/packets cake managed to push through in total in each direction instead of relying on the measured goodput? I am probably barking up the wrong tree here...
>>
>>> Like I keep having to tell my customers, fairness algorithms mean no one device will ever gain 100% of the bandwidth so long as there are other open & active connections from other devices.
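
A minimal sketch of the "parse the tc -s qdisc output" idea quoted above, assuming the output format shown in Jonathan's dump. The names (parse_tc_stats, average_rate_kbit, SAMPLE) are made up for illustration, and the sample text is trimmed from that dump. Entries are keyed by (device, qdisc kind) because eth0.2 carries both a cake and an ingress qdisc.

```python
import re

# Header line of each qdisc, e.g. "qdisc cake 8005: dev eth0.2 root ..."
QDISC_RE = re.compile(r"^qdisc (\S+) (\S+) dev (\S+)")
# Counter line, e.g. "Sent 6943407136 bytes 35467722 pkt (dropped 51747, ..."
SENT_RE = re.compile(r"Sent (\d+) bytes (\d+) pkt \(dropped (\d+)")

# Trimmed from the dump in this thread; real output has many more lines.
SAMPLE = """\
qdisc cake 8005: dev eth0.2 root refcnt 2 bandwidth 22478Kbit diffserv4
 Sent 6943407136 bytes 35467722 pkt (dropped 51747, overlimits 3912091 requeues 0)
qdisc ingress ffff: dev eth0.2 parent ffff:fff1 ----------------
 Sent 43188461026 bytes 67870269 pkt (dropped 0, overlimits 0 requeues 0)
qdisc cake 8006: dev ifb4eth0.2 root refcnt 2 bandwidth 289066Kbit diffserv4
 Sent 44692280901 bytes 67864800 pkt (dropped 5472, overlimits 5572964 requeues 0)
"""


def parse_tc_stats(text):
    """Map (device, qdisc kind) to its byte, packet and drop counters."""
    stats = {}
    key = None
    for line in text.splitlines():
        line = line.strip()
        header = QDISC_RE.match(line)
        if header:
            kind, _handle, dev = header.groups()
            key = (dev, kind)
            stats[key] = {"bytes": 0, "pkts": 0, "dropped": 0}
            continue
        counters = SENT_RE.search(line)
        if counters and key is not None:
            sent, pkts, dropped = (int(x) for x in counters.groups())
            stats[key] = {"bytes": sent, "pkts": pkts, "dropped": dropped}
    return stats


def average_rate_kbit(bytes_before, bytes_after, seconds):
    """Average rate in Kbit/s between two byte counters taken `seconds` apart."""
    return (bytes_after - bytes_before) * 8 / seconds / 1000
```

Taking one snapshot before and one after a speedtest and feeding the two byte counters of a cake instance into average_rate_kbit should give the total rate cake pushed through in that direction, regardless of how much of it the test flow itself got.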
>>
>> That sounds like solid advice ;) Especially in the light of the exceedingly useful "ingress" keyword, which under load will drop depending on a flow's "unresponsiveness" such that more responsive flows end up getting a somewhat bigger share of the post-cake throughput...
>>
>>>
>>> That said, I’d love to find options to increase throughput for single-tin configs.
>>
>> With or without isolation options?
>>
>> Best Regards
>> Sebastian
>>
>>>
>>> Cheers,
>>>
>>> Jonathan
>>>
>>>> On Aug 31, 2020, at 7:35 AM, Toke Høiland-Jørgensen via Bloat wrote:
>>>>
>>>> Mikael Abrahamsson via Bloat writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I migrated to an APU2 (https://www.pcengines.ch/apu2.htm) as residential
>>>>> router, from my previous WRT1200AC (marvell armada 385).
>>>>>
>>>>> I was running OpenWrt 18.06 on that one, now I am running latest 19.07.3
>>>>> on the APU2.
>>>>>
>>>>> Before I had 500/100 and I had to use FQ_CODEL because CAKE took too much
>>>>> CPU to be able to do 500/100 on the WRT1200AC. Now I upgraded to 1000/1000
>>>>> and tried it again, and even the APU2 can only do CAKE up to ~300
>>>>> megabit/s. With FQ_CODEL I get full speed (configure 900/900 in SQM in
>>>>> OpenWrt).
>>>>>
>>>>> Looking in top, I see sirq% sitting at 50% pegged. This is typically what I
>>>>> see when CPU based forwarding is maxed out. From my recollection of
>>>>> running CAKE on earlier versions of openwrt (17.x) I don't remember CAKE
>>>>> using more CPU than FQ_CODEL.
>>>>>
>>>>> Anyone know what's up? I'm fine running FQ_CODEL, it solves any
>>>>> bufferbloat but... I thought CAKE supposedly should use less CPU, not
>>>>> more?
>>>>
>>>> Hmm, you say CAKE and FQ-Codel - so you're not enabling the shaper (that
>>>> would be FQ-CoDel+HTB)? An exact config might be useful (or just the
>>>> output of tc -s qdisc).
>>>>
>>>> If you are indeed not shaping, maybe you're hitting the issue fixed by this commit?
>>>>
>>>> https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6n
>>>>
>>>> -Toke
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> Bloat@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/bloat
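
As a side note on the per-tin tables quoted earlier in this thread: the thresh rows of both cake instances look like fixed fractions of the shaped rate (Bulk 1/16, Best Effort 1/1, Video 1/2, Voice 1/4, rounded down). That mapping is an assumption inferred from the two dumps, not checked against the sch_cake source, and the names below are hypothetical. A small sketch that reproduces the two thresh rows:

```python
# Right-shift per tin, i.e. divide the shaped rate by 2**shift and round down.
# ASSUMPTION: inferred from the thresh rows in the two dumps in this thread.
DIFFSERV4_SHIFT = {"Bulk": 4, "Best Effort": 0, "Video": 1, "Voice": 2}


def tin_thresholds_kbit(bandwidth_kbit):
    """Return the per-tin threshold in Kbit/s for a given shaped rate."""
    return {tin: bandwidth_kbit >> shift for tin, shift in DIFFSERV4_SHIFT.items()}


egress = tin_thresholds_kbit(22478)    # bandwidth of "qdisc cake 8005:" on eth0.2
ingress = tin_thresholds_kbit(289066)  # bandwidth of "qdisc cake 8006:" on ifb4eth0.2
```

If the assumption holds, this would explain why the egress Bulk tin, with a threshold of only about 1.4 Mbit/s, reports a higher target (12.9ms) than the other tins.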