[Bloat] CAKE in openwrt high CPU

Sebastian Moeller moeller0 at gmx.de
Tue Sep 1 16:04:43 EDT 2020


Hi Jonathan,



> On Sep 1, 2020, at 21:31, Jonathan Foulkes <jf at jonathanfoulkes.com> wrote:
> 
> Hi Sebastian, Cake functions wonderfully, it’s a marvel in terms of goodput.
> 
> My comment was more oriented at the metrics process users use to evaluate results. Only those who spend time analyzing just how busy an ‘idle’ network can be know that there are a lot of processes in constant communications with their cloud services. 

	True, interestingly, quite a number of speedtests seem to err on the high side, probably because that way users are happy to see something close to their contracted rates...

> The challenge is the end users, who only understand the silly ’speed’ metric, and feel anything that lowers that number is a ‘bad’ thing. It takes effort to get even technical users to get it.

	I repeatedly fall into that trap...

> But even beyond the basics, the further cuts induced by fairness are the new wrinkle in dealing with widely varying speed test results with isolation enabled on a busy network.

	Yes, but one can try to make lemonade out of it by running speedtests from two devices while observing that something like "sudo mtr -ezb4 -i 0.3 8.8.8.8" does not budge much even though the tests come and go; that demonstrates the quality of the isolation and that low queueing delay can "happen" even on a busy link.
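
	A possible recipe for that demo (untested sketch off the top of my head; -r/-c are mtr's report mode, and the 600 cycles are just an example):

		# terminal 1 (router or a wired client): watch latency live during the tests
		sudo mtr -ezb4 -i 0.3 8.8.8.8
		# or keep a record instead: ~3 minutes of samples at 0.3 s intervals
		sudo mtr -ezb4 -i 0.3 -r -c 600 8.8.8.8 > mtr_during_tests.txt
		# terminals 2 and 3: start speedtests from two different LAN devices
		# and compare the observed latencies against an idle baseline run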

> 
> The high density of devices and constant chatter with cloud services means the average home has way more devices and connections than many realize. Keep a note of the number of ‘active connections’ displayed on the OpenWrt overview page; you might be surprised (well, not you Seb ;) )

	Count me in, I just switched over to a Turris Omnia (which I had crowd-funded before I realized IQrouters would be delivered to Germany ;) ) and while playing with its pakon feature I was quite baffled by how many addresses are used even in a short amount of time. (All of this is just a hobby to me, so I keep forgetting stuff regularly, because I do approach things a bit casually at times.)

> 
> As an example, on my network, I average 1,000 active connections all day, it rarely drops below 700. And it’s just two WFH professionals and 60+ network devices, not all of which are active at any one time.
> I actually run some custom firewall rules to de-prioritize four IoT devices that generate a LOT of traffic to their services. Two of which power panel monitors with real-time updates. This is why my bulk tin on egress has such high traffic.

	Nice, I think being able to deprioritize stuff is one of the best reasons for using diffserv.
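
	In case anybody wants to replicate that idea, a minimal sketch (placeholder addresses, surely not your actual rules; with diffserv4, CS1 lands in the Bulk tin):

		# mark all traffic from two (example) chatty IoT devices as CS1 so
		# cake's diffserv4 sorts it into the Bulk tin
		iptables -t mangle -A FORWARD -s 192.168.1.50 -j DSCP --set-dscp-class CS1
		iptables -t mangle -A FORWARD -s 192.168.1.51 -j DSCP --set-dscp-class CS1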

> 
> Since you like to see tc output, here’s the one from my system after nearly a week.
> I run four-layer Cake as we do a lot of Zoom calls and our accounts are set up to do the appropriate DSCP marking.

	I saw your nice writeup of how to do that on the OpenWrt forum, IIRC. I need to talk to our IT guys at work about whether they are willing to actually configure it in the first place.
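
	For reference, the egress qdisc in your output below corresponds to roughly this tc invocation (just reconstructed from the stats, not necessarily your actual SQM configuration):

		tc qdisc replace dev eth0.2 root cake bandwidth 22478kbit diffserv4 \
			dual-srchost nat nowash ack-filter split-gso rtt 100ms raw mpu 64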


> 
> root at IQrouter:~# tc -s qdisc
> qdisc noqueue 0: dev lo root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn 
>  Sent 51311363856 bytes 86785488 pkt (dropped 53, overlimits 0 requeues 9114) 
>  backlog 0b 0p requeues 9114
>   maxpacket 12112 drop_overlimit 0 new_flow_count 691740 ecn_mark 0
>   new_flows_len 0 old_flows_len 0
> qdisc noqueue 0: dev br-lan root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev eth0.1 root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc cake 8005: dev eth0.2 root refcnt 2 bandwidth 22478Kbit diffserv4 dual-srchost nat nowash ack-filter split-gso rtt 100.0ms raw overhead 0 mpu 64 
>  Sent 6943407136 bytes 35467722 pkt (dropped 51747, overlimits 3912091 requeues 0) 
>  backlog 0b 0p requeues 0
>  memory used: 843816b of 4Mb
>  capacity estimate: 22478Kbit
>  min/max network layer size:           42 /    1514
>  min/max overhead-adjusted size:       64 /    1514
>  average network hdr offset:           14
> 
>                    Bulk  Best Effort        Video        Voice
>   thresh       1404Kbit    22478Kbit    11239Kbit     5619Kbit
>   target         12.9ms        5.0ms        5.0ms        5.0ms
>   interval      107.9ms      100.0ms      100.0ms      100.0ms
>   pk_delay        5.9ms        6.4ms        3.7ms        1.6ms
>   av_delay        426us        445us        124us        188us
>   sp_delay         13us         13us         12us          8us
>   backlog            0b           0b           0b           0b
>   pkts          3984407     30899121       474818       161123
>   bytes       789740113   5883832402    246917562     30556915
>   way_inds        65175      2580935         1064            5
>   way_miss         1427       918529        15960         1120
>   way_cols            0            0            0            0
>   drops               0         2966          511            7
>   marks               0          105            0            0
>   ack_drop            0        48263            0            0
>   sp_flows            2            4            1            0
>   bk_flows            0            0            0            0
>   un_flows            0            0            0            0
>   max_len          1035        43094         3094          590
>   quantum           300          685          342          300
> 
> qdisc ingress ffff: dev eth0.2 parent ffff:fff1 ---------------- 
>  Sent 43188461026 bytes 67870269 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev br-guest root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1 root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0 root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan0-1 root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc noqueue 0: dev wlan1-1 root refcnt 2 
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0
> qdisc cake 8006: dev ifb4eth0.2 root refcnt 2 bandwidth 289066Kbit diffserv4 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64 
>  Sent 44692280901 bytes 67864800 pkt (dropped 5472, overlimits 5572964 requeues 0) 
>  backlog 0b 0p requeues 0
>  memory used: 7346339b of 14453300b
>  capacity estimate: 289066Kbit
>  min/max network layer size:           46 /    1500
>  min/max overhead-adjusted size:       64 /    1518
>  average network hdr offset:           14
> 
>                    Bulk  Best Effort        Video        Voice
>   thresh      18066Kbit   289066Kbit   144533Kbit    72266Kbit
>   target          5.0ms        5.0ms        5.0ms        5.0ms
>   interval      100.0ms      100.0ms      100.0ms      100.0ms
>   pk_delay         47us        740us         42us         18us
>   av_delay         24us         32us         22us          9us
>   sp_delay         14us         11us         12us          5us
>   backlog            0b           0b           0b           0b
>   pkts             1389     45323600      3704409     18840874
>   bytes          136046  43347299847    222296446   1130523693
>   way_inds            0      3016679            0            0
>   way_miss           17       903215         1053         2318
>   way_cols            0            0            0            0
>   drops               0         5471            0            1
>   marks               0           27            0            0
>   ack_drop            0            0            0            0
>   sp_flows            1            4            2            1
>   bk_flows            0            1            0            0
>   un_flows            0            0            0            0
>   max_len            98        68338          136          221
>   quantum           551         1514         1514         1514
> 
> root at IQrouter:~# uptime
>  15:07:37 up 6 days,  3:23,  load average: 0.46, 0.21, 0.20
> root at IQrouter:~# 
> 

	Thanks for sharing the stats, good reference.
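
	(And to turn those counters into a rough achieved-rate number, something like this quick-and-dirty snippet should do; untested, device name taken from your output, where the first "Sent" line is the egress cake qdisc:)

		# diff the egress byte counter over 30 seconds, convert to kbit/s
		B1=$(tc -s qdisc show dev eth0.2 | awk '/Sent/ {print $2; exit}')
		sleep 30
		B2=$(tc -s qdisc show dev eth0.2 | awk '/Sent/ {print $2; exit}')
		echo "egress rate: $(( (B2 - B1) * 8 / 30 / 1000 )) kbit/s"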

Best Regards
	Sebastian


> 
> Cheers,
> 
> Jonathan
> 
>> On Sep 1, 2020, at 12:18 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> 
>> HI Jonathan,
>> 
>>> On Sep 1, 2020, at 17:41, Jonathan Foulkes <jf at jonathanfoulkes.com> wrote:
>>> 
>>> Toke, that link returns a 404 for me.
>>> 
>>> For others, I’ve found that testing cake throughput with isolation options enabled is tricky if there are many competing connections. 
>> 
>> 	Are you talking about the fact that with competing connections you only see the current isolation quantum's equivalent of the actual rate? In that case, maybe parse the "tc -s qdisc" output to get an idea of how much data/packets cake managed to push through in total in each direction, instead of relying on the measured goodput? I am probably barking up the wrong tree here...
>> 
>>> Like I keep having to tell my customers, fairness algorithms mean no one device will ever gain 100% of the bandwidth so long as there are other open & active connections from other devices.
>> 
>> 	That sounds like solid advice ;) Especially in light of the exceedingly useful "ingress" keyword, which, under load, will drop packets depending on a flow's "unresponsiveness", such that more responsive flows end up getting a somewhat bigger share of the post-cake throughput...
>> 
>>> 
>>> That said, I’d love to find options to increase throughput for single-tin configs.
>> 
>> 	With or without isolation options?
>> 
>> Best Regards
>> 	Sebastian
>> 
>>> 
>>> Cheers,
>>> 
>>> Jonathan
>>> 
>>>> On Aug 31, 2020, at 7:35 AM, Toke Høiland-Jørgensen via Bloat <bloat at lists.bufferbloat.net> wrote:
>>>> 
>>>> Mikael Abrahamsson via Bloat <bloat at lists.bufferbloat.net> writes:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I migrated to an APU2 (https://www.pcengines.ch/apu2.htm) as residential 
>>>>> router, from my previous WRT1200AC (marvell armada 385).
>>>>> 
>>>>> I was running OpenWrt 18.06 on that one, now I am running latest 19.07.3 
>>>>> on the APU2.
>>>>> 
>>>>> Before I had 500/100 and I had to use FQ_CODEL because CAKE took too much 
>>>>> CPU to be able to do 500/100 on the WRT1200AC. Now I upgraded to 1000/1000 
>>>>> and tried it again, and even the APU2 can only do CAKE up to ~300 
>>>>> megabit/s. With FQ_CODEL I get full speed (configure 900/900 in SQM in 
>>>>> OpenWrt).
>>>>> 
>>>>> Looking in top, I see sirq% sitting at 50% pegged. This is typical what I 
>>>>> see when CPU based forwarding is maxed out. From my recollection of 
>>>>> running CAKE on earlier versions of openwrt (17.x) I don't remember CAKE 
>>>>> using more CPU than FQ_CODEL.
>>>>> 
>>>>> Anyone know what's up? I'm fine running FQ_CODEL, it solves any 
>>>>> bufferbloat but... I thought CAKE supposedly should use less CPU, not 
>>>>> more?
>>>> 
>>>> Hmm, you say CAKE and FQ-Codel - so you're not enabling the shaper (that
>>>> would be FQ-CoDel+HTB)? An exact config might be useful (or just the
>>>> output of tc -s qdisc).
>>>> 
>>>> If you are indeed not shaping, maybe you're hitting the issue fixed by this commit?
>>>> 
>>>> https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6
>>>> 
>>>> -Toke
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> Bloat at lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/bloat
>>> 
>>> _______________________________________________
>>> Bloat mailing list
>>> Bloat at lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/bloat
>> 
> 


