[Bloat] CAKE in openwrt high CPU
Sebastian Moeller
moeller0 at gmx.de
Tue Sep 1 16:04:43 EDT 2020
Hi Jonathan,
> On Sep 1, 2020, at 21:31, Jonathan Foulkes <jf at jonathanfoulkes.com> wrote:
>
> Hi Sebastian, Cake functions wonderfully, it’s a marvel in terms of goodput.
>
> My comment was more about the metrics and process users rely on to evaluate results. Only those who spend time analyzing just how busy an ‘idle’ network can be know that many processes are in constant communication with their cloud services.
True, interestingly, quite a number of speedtests seem to err on the side of too high, probably because that way users are happy to see something close to their contracted rates...
> The challenge is the end users, who only understand the silly ’speed’ metric, and feel anything that lowers that number is a ‘bad’ thing. It takes effort to get even technical users to get it.
I repeatedly fall into that trap...
> But even beyond the basics, the further cuts induced by fairness are the new wrinkle in dealing with widely varying speed test results with isolation enabled on a busy network.
Yes, but one can try to make lemonade out of it by running speedtests from two devices while observing that something like "sudo mtr -ezb4 -i 0.3 8.8.8.8" does not budge much even though the tests come and go; this demonstrates the quality of the isolation and that low queueing delay can "happen" even on a busy link.
>
> The high density of devices and constant chatter with cloud services means the average home has way more devices and connections than many realize. Keep a note of the number of ‘active connections’ displayed on the OpenWRT overview page, you might be surprised (well, not you Seb ;) )
Count me in, I just switched over to a Turris Omnia (which I had crowd-funded before I realized IQrouters would be delivered to Germany ;) ) and while playing with its pakon feature I was quite baffled by how many addresses are used even in a short amount of time. (All of this is just a hobby to me, so I keep forgetting stuff regularly, because I do approach things a bit casually at times).
>
> As an example, on my network, I average 1,000 active connections all day, it rarely drops below 700. And it’s just two WFH professionals and 60+ network devices, not all of which are active at any one time.
> I actually run some custom firewall rules to de-prioritize four IoT devices that generate a LOT of traffic to their services. Two of which power panel monitors with real-time updates. This is why my bulk tin on egress has such high traffic.
Nice, I think being able to deprioritize stuff is one of the best reasons for using diffserv.
>
> Since you like to see tc output, here’s the one from my system after nearly a week.
> I run four-layer Cake as we do a lot of Zoom calls and our accounts are set up to do the appropriate DSCP marking.
I saw your nice writeup of how to do that on the OpenWrt forum IIRC. Need to talk to our IT guys at work, whether they are willing to actually configure it in the first place.
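As a side note on the four-tin setup: the "thresh" rows in the tc output quoted below are consistent with the per-tin fractions that the tc-cake man page gives for diffserv4 (Bulk 1/16, Best Effort 1/1, Video 1/2, Voice 1/4 of the shaped rate). A minimal sketch of that arithmetic, assuming the displayed Kbit values are simply truncated; the helper name is made up for illustration and is not part of any tool:

```python
# Per-tin share of the shaped rate for cake's diffserv4 mode,
# as (numerator, denominator), per the tc-cake man page.
DIFFSERV4_FRACTIONS = {
    "Bulk": (1, 16),
    "Best Effort": (1, 1),
    "Video": (1, 2),
    "Voice": (1, 4),
}


def tin_thresholds_kbit(bandwidth_kbit):
    """Hypothetical helper: expected 'thresh' value per tin, in Kbit.

    Assumes plain integer truncation, which matches the values shown
    in this thread but is not taken from the cake source code.
    """
    return {
        tin: bandwidth_kbit * num // den
        for tin, (num, den) in DIFFSERV4_FRACTIONS.items()
    }


if __name__ == "__main__":
    for rate in (22478, 289066):  # egress and ingress rates from the thread
        print(rate, tin_thresholds_kbit(rate))
```

For the 22478Kbit egress shaper this gives 1404, 22478, 11239 and 5619 Kbit, which matches the quoted "thresh" row.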
>
> root@IQrouter:~# tc -s qdisc
> qdisc noqueue 0: dev lo root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
> Sent 51311363856 bytes 86785488 pkt (dropped 53, overlimits 0 requeues 9114)
> backlog 0b 0p requeues 9114
> maxpacket 12112 drop_overlimit 0 new_flow_count 691740 ecn_mark 0
> new_flows_len 0 old_flows_len 0
> qdisc noqueue 0: dev br-lan root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev eth0.1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc cake 8005: dev eth0.2 root refcnt 2 bandwidth 22478Kbit diffserv4 dual-srchost nat nowash ack-filter split-gso rtt 100.0ms raw overhead 0 mpu 64
> Sent 6943407136 bytes 35467722 pkt (dropped 51747, overlimits 3912091 requeues 0)
> backlog 0b 0p requeues 0
> memory used: 843816b of 4Mb
> capacity estimate: 22478Kbit
> min/max network layer size: 42 / 1514
> min/max overhead-adjusted size: 64 / 1514
> average network hdr offset: 14
>
>                    Bulk  Best Effort        Video        Voice
> thresh         1404Kbit    22478Kbit    11239Kbit     5619Kbit
> target           12.9ms        5.0ms        5.0ms        5.0ms
> interval        107.9ms      100.0ms      100.0ms      100.0ms
> pk_delay          5.9ms        6.4ms        3.7ms        1.6ms
> av_delay          426us        445us        124us        188us
> sp_delay           13us         13us         12us          8us
> backlog              0b           0b           0b           0b
> pkts            3984407     30899121       474818       161123
> bytes         789740113   5883832402    246917562     30556915
> way_inds          65175      2580935         1064            5
> way_miss           1427       918529        15960         1120
> way_cols              0            0            0            0
> drops                 0         2966          511            7
> marks                 0          105            0            0
> ack_drop              0        48263            0            0
> sp_flows              2            4            1            0
> bk_flows              0            0            0            0
> un_flows              0            0            0            0
> max_len            1035        43094         3094          590
> quantum             300          685          342          300
>
> qdisc ingress ffff: dev eth0.2 parent ffff:fff1 ----------------
> Sent 43188461026 bytes 67870269 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev br-guest root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0-1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1-1 root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc cake 8006: dev ifb4eth0.2 root refcnt 2 bandwidth 289066Kbit diffserv4 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64
> Sent 44692280901 bytes 67864800 pkt (dropped 5472, overlimits 5572964 requeues 0)
> backlog 0b 0p requeues 0
> memory used: 7346339b of 14453300b
> capacity estimate: 289066Kbit
> min/max network layer size: 46 / 1500
> min/max overhead-adjusted size: 64 / 1518
> average network hdr offset: 14
>
>                    Bulk  Best Effort        Video        Voice
> thresh        18066Kbit   289066Kbit   144533Kbit    72266Kbit
> target            5.0ms        5.0ms        5.0ms        5.0ms
> interval        100.0ms      100.0ms      100.0ms      100.0ms
> pk_delay           47us        740us         42us         18us
> av_delay           24us         32us         22us          9us
> sp_delay           14us         11us         12us          5us
> backlog              0b           0b           0b           0b
> pkts               1389     45323600      3704409     18840874
> bytes            136046  43347299847    222296446   1130523693
> way_inds              0      3016679            0            0
> way_miss             17       903215         1053         2318
> way_cols              0            0            0            0
> drops                 0         5471            0            1
> marks                 0           27            0            0
> ack_drop              0            0            0            0
> sp_flows              1            4            2            1
> bk_flows              0            1            0            0
> un_flows              0            0            0            0
> max_len              98        68338          136          221
> quantum             551         1514         1514         1514
>
> root@IQrouter:~# uptime
> 15:07:37 up 6 days, 3:23, load average: 0.46, 0.21, 0.20
> root@IQrouter:~#
>
Thanks for sharing the stats, good reference.
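For anyone who wants to use such counters instead of a speedtest's goodput figure, here is a rough sketch of pulling the per-qdisc "Sent" totals out of "tc -s qdisc" output like the one quoted above. It assumes the line layout seen in this thread; the function names are illustrative only and not part of any existing tool:

```python
import re

# "qdisc <kind> <handle> dev <device> ..." header line
QDISC_RE = re.compile(r"^qdisc (\S+) (\S+) dev (\S+)")
# "Sent <bytes> bytes <pkts> pkt (dropped N, overlimits N requeues N)"
SENT_RE = re.compile(
    r"^Sent (\d+) bytes (\d+) pkt "
    r"\(dropped (\d+), overlimits (\d+) requeues (\d+)\)"
)


def parse_tc_totals(text):
    """Return {(device, qdisc_kind): counters} from `tc -s qdisc` text."""
    totals = {}
    current = None
    for raw in text.splitlines():
        # Drop leading '>' and spaces (mail quoting and tc indentation).
        line = raw.lstrip("> ").strip()
        header = QDISC_RE.match(line)
        if header:
            current = (header.group(3), header.group(1))
            continue
        sent = SENT_RE.match(line)
        if sent and current is not None:
            names = ("bytes", "pkts", "dropped", "overlimits", "requeues")
            totals[current] = dict(zip(names, map(int, sent.groups())))
            current = None  # only the first Sent line belongs to this qdisc
    return totals


def rate_mbit(bytes_before, bytes_after, seconds):
    """Average rate in Mbit/s between two byte-counter samples."""
    return (bytes_after - bytes_before) * 8 / seconds / 1e6
```

Sampling the counters right before and right after a test and feeding the two byte counts to rate_mbit() gives the total the shaper pushed through in that direction, including traffic from all the other devices that a single-device speedtest does not see.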
Best Regards
Sebastian
>
> Cheers,
>
> Jonathan
>
>> On Sep 1, 2020, at 12:18 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>
>> Hi Jonathan,
>>
>>> On Sep 1, 2020, at 17:41, Jonathan Foulkes <jf at jonathanfoulkes.com> wrote:
>>>
>>> Toke, that link returns a 404 for me.
>>>
>>> For others, I’ve found that testing cake throughput with isolation options enabled is tricky if there are many competing connections.
>>
>> Are you talking about the fact that, with competing connections, you only see the current isolation quantum's equivalent of the actual rate? In that case, maybe parse the "tc -s qdisc" output to get an idea of how much data/packets cake managed to push through in total in each direction, instead of relying on the measured goodput? I am probably barking up the wrong tree here...
>>
>>> Like I keep having to tell my customers, fairness algorithms mean no one device will ever gain 100% of the bandwidth so long as there are other open & active connections from other devices.
>>
>> That sounds like solid advice ;) Especially in light of the exceedingly useful "ingress" keyword, which, under load, will drop depending on a flow's "unresponsiveness" such that more responsive flows end up getting a somewhat bigger share of the post-cake throughput...
>>
>>>
>>> That said, I’d love to find options to increase throughput for single-tin configs.
>>
>> With or without isolation options?
>>
>> Best Regards
>> Sebastian
>>
>>>
>>> Cheers,
>>>
>>> Jonathan
>>>
>>>> On Aug 31, 2020, at 7:35 AM, Toke Høiland-Jørgensen via Bloat <bloat at lists.bufferbloat.net> wrote:
>>>>
>>>> Mikael Abrahamsson via Bloat <bloat at lists.bufferbloat.net> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I migrated to an APU2 (https://www.pcengines.ch/apu2.htm) as residential
>>>>> router, from my previous WRT1200AC (marvell armada 385).
>>>>>
>>>>> I was running OpenWrt 18.06 on that one, now I am running latest 19.07.3
>>>>> on the APU2.
>>>>>
>>>>> Before I had 500/100 and I had to use FQ_CODEL because CAKE took too much
>>>>> CPU to be able to do 500/100 on the WRT1200AC. Now I upgraded to 1000/1000
>>>>> and tried it again, and even the APU2 can only do CAKE up to ~300
>>>>> megabit/s. With FQ_CODEL I get full speed (configure 900/900 in SQM in
>>>>> OpenWrt).
>>>>>
>>>>> Looking in top, I see sirq% sitting at 50% pegged. This is typically what I
>>>>> see when CPU based forwarding is maxed out. From my recollection of
>>>>> running CAKE on earlier versions of openwrt (17.x) I don't remember CAKE
>>>>> using more CPU than FQ_CODEL.
>>>>>
>>>>> Anyone know what's up? I'm fine running FQ_CODEL, it solves any
>>>>> bufferbloat but... I thought CAKE supposedly should use less CPU, not
>>>>> more?
>>>>
>>>> Hmm, you say CAKE and FQ-Codel - so you're not enabling the shaper (that
>>>> would be FQ-CoDel+HTB)? An exact config might be useful (or just the
>>>> output of tc -s qdisc).
>>>>
>>>> If you are indeed not shaping, maybe you're hitting the issue fixed by this commit?
>>>>
>>>> https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6n
>>>>
>>>> -Toke
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> Bloat at lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/bloat
>>>
>>
>