* [Cake] cake infinite loop(?) with hfsc on one-armed router
@ 2018-12-27 23:30 Pete Heist
2018-12-28 12:58 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2018-12-27 23:30 UTC (permalink / raw)
To: Cake List
I’m seeing what I think is an infinite loop when cake is used in a one-armed router configuration with hfsc as the rate limiter. Three APUs are connected to the same switch and the “middle” APU (apu1a) routes between the default VLAN and a tagged VLAN.
apu2a <-- default VLAN --> apu1a <-- VLAN 3300 --> apu2b
After qos is set up, ping from apu2a to apu2b still works fine. When iperf3 is run from apu2a to apu2b it works fine, but when it goes in reverse (apu2b to apu2a), all traffic stops flowing from apu1a on the default VLAN. Traffic still flows from apu1a on VLAN 3300 however, with very high RTT (mean 500ms), leading me to believe that the cake instance on the default VLAN is in an infinite loop.
It does not happen with hfsc+fq_codel, or with htb+cake in the same configuration.
Here are the commands that set up qos, and it only locks up when cake is used as the instance at handle 20, not at handle 21:
-----
tc qdisc add dev eth0 root handle 1: hfsc default 10
tc class add dev eth0 parent 1: classid 1:1 hfsc sc rate 200mbit ul rate 200mbit
tc class add dev eth0 parent 1:1 classid 1:10 hfsc sc rate 100mbit ul rate 100mbit
tc class add dev eth0 parent 1:1 classid 1:11 hfsc sc rate 100mbit ul rate 100mbit
tc filter add dev eth0 protocol ip parent 1:0 prio 1 \
basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:11
tc qdisc add dev eth0 parent 1:10 handle 20: fq_codel # using cake here locks up !!!
tc qdisc add dev eth0 parent 1:11 handle 21: cake
-----
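The `0xce4` in the filter above is VLAN ID 3300 written in hex, and the `0xfff` mask keeps only the 12-bit VLAN ID. A quick check of that arithmetic in Python, as a sketch only: the helper name `vlan_id` is made up for illustration, and it assumes the value being matched is the full 16-bit 802.1Q tag control field rather than anything specific to how em_meta evaluates it.

```python
VLAN_ID_MASK = 0xFFF  # low 12 bits of the 802.1Q tag control field hold the VLAN ID


def vlan_id(tci: int) -> int:
    """Return the VLAN ID from a 16-bit tag control field (hypothetical helper)."""
    return tci & VLAN_ID_MASK


# VLAN 3300 is 0xce4, so "eq 0xce4" selects VLAN 3300.
print(hex(3300))  # 0xce4

# A tag carrying priority 5 (top 3 bits) on VLAN 3300 still matches after masking.
tci = (5 << 13) | 3300
print(vlan_id(tci) == 0xCE4)  # True
```

Without the mask, a frame on VLAN 3300 with non-zero priority bits would compare unequal to `0xce4`.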
I’m using sch_cake and tc-adv from the current HEAD, on kernel 3.16.7 (yeah, I know).
root@apu1a:~/qos# uname -a
Linux apu1a 3.16.7-ckt9-voyage #1 SMP Thu Apr 23 11:10:44 HKT 2015 i686 GNU/Linux
Any ideas from just this? Otherwise, I can only think to hook up the serial cable and start with the printk’s…
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2018-12-27 23:30 [Cake] cake infinite loop(?) with hfsc on one-armed router Pete Heist
@ 2018-12-28 12:58 ` Pete Heist
2018-12-28 21:22 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2018-12-28 12:58 UTC (permalink / raw)
To: Cake List
Note that this doesn’t happen when prio is used in place of hfsc and cake is used in the leafs to do the rate limiting, i.e.:
tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
tc qdisc add dev eth0 parent 1:1 handle 10: cake besteffort bandwidth 100mbit ethernet
tc qdisc add dev eth0 parent 1:2 handle 11: cake besteffort bandwidth 100mbit ethernet ether-vlan
tc filter add dev eth0 protocol all parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:2
tc filter add dev eth0 protocol all parent 1:0 prio 2 u32 match u32 0 0 flowid 1:1
But it does happen when drr is used instead of prio:
tc qdisc add dev eth0 root handle 1: drr
tc class add dev eth0 parent 1: classid 1:1
tc class add dev eth0 parent 1: classid 1:2
tc qdisc add dev eth0 parent 1:1 handle 10: cake besteffort bandwidth 100mbit
tc qdisc add dev eth0 parent 1:2 handle 11: cake besteffort bandwidth 100mbit ether-vlan
tc filter add dev eth0 protocol all parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:2
tc filter add dev eth0 protocol all parent 1:0 prio 2 u32 match u32 0 0 flowid 1:1
drr might ultimately be what I want to use for this, so I can use cake to do the rate limiting instead of htb. prio works well but leads to starvation when the rate limit is above what the CPU can handle.
Meanwhile, using htb classes with rate limits way above the actual, then rate limiting in the cake leafs, works as well, but this seems like a hack:
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:1 htb rate 10gbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 5gbit
tc class add dev eth0 parent 1:1 classid 1:11 htb rate 5gbit
tc filter add dev eth0 protocol ip parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:11
tc qdisc add dev eth0 parent 1:10 handle 20: cake besteffort bandwidth 100mbit ethernet
tc qdisc add dev eth0 parent 1:11 handle 21: cake besteffort bandwidth 100mbit ethernet ether-vlan
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2018-12-28 12:58 ` Pete Heist
@ 2018-12-28 21:22 ` Pete Heist
2018-12-28 22:07 ` Jonathan Morton
2019-01-04 21:34 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2018-12-28 21:22 UTC (permalink / raw)
To: Cake List
Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for the default traffic (noted below in the drr and hfsc cases with "!!! must use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
I managed to capture some output from what happens to hfsc:
[ 683.864456] ------------[ cut here ]------------
[ 683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427 0xf9ced4ef()
[ 683.876267] Modules linked in: cls_u32 em_meta cls_basic sch_cake(O) sch_drr xt_ACCOUNT(O) sch_hfsc cls_fw sch_sfq sch_prio ipt_Ra
[ 683.931317] CPU: 1 PID: 11 Comm: ksoftirqd/1 Tainted: G W O 3.16.7-ckt9-voyage #1
[ 683.939595] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 683.945790] 00000000 00000000 f5c8bc9c c13167e9 00000000 f5c8bcb4 c102a7dd f9ced4ef
[ 683.953792] f1907c00 00000000 00000000 f5c8bcc4 c102a803 00000009 00000000 f5c8bce4
[ 683.961791] f9ced4ef f1907fc8 732494ae 00000002 f1907c00 00000000 00000040 f5c8bd00
[ 683.969783] Call Trace:
[ 683.972256] [<c13167e9>] dump_stack+0x41/0x52
[ 683.976729] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 683.982063] [<f9ced4ef>] ? 0xf9ced4ee
[ 683.985834] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 683.990905] [<f9ced4ef>] 0xf9ced4ee
[ 683.994499] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 683.999052] [<c128b655>] __dev_queue_xmit+0x23d/0x35f
[ 684.004216] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 684.008857] [<f89fff29>] register_vlan_dev+0x938/0xe3b [8021q]
[ 684.014799] [<c128b33b>] dev_hard_start_xmit+0x29e/0x37b
[ 684.020223] [<c128b6c0>] __dev_queue_xmit+0x2a8/0x35f
[ 684.025381] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 684.030016] [<c12cf8d3>] arp_xmit+0x1c/0x47
[ 684.034307] [<c12cff27>] arp_send+0x2e/0x33
[ 684.038598] [<c12d01b4>] arp_process+0x288/0x4d8
[ 684.043331] [<c12ad986>] ? ip_forward_finish+0x66/0x6b
[ 684.048581] [<c128170e>] ? __kfree_skb+0x5d/0x5f
[ 684.053303] [<c12d04ce>] arp_rcv+0xca/0x102
[ 684.057597] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 684.063453] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 684.068698] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 684.074557] [<c128a2cc>] napi_gro_receive+0x31/0x6d
[ 684.079549] [<c10065ec>] ? text_poke_bp+0xa0/0xa0
[ 684.084369] [<f808604a>] 0xf8086049
[ 684.087974] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 684.092791] [<c102d689>] __do_softirq+0x91/0x175
[ 684.097523] [<c102d783>] run_ksoftirqd+0x16/0x29
[ 684.102255] [<c1042734>] smpboot_thread_fn+0x108/0x11e
[ 684.107505] [<c104262c>] ? SyS_setgroups+0xa6/0xa6
[ 684.112403] [<c103de80>] kthread+0x9f/0xa4
[ 684.116615] [<c1319e01>] ret_from_kernel_thread+0x21/0x30
[ 684.122126] [<c103dde1>] ? kthread_freezable_should_stop+0x40/0x40
[ 684.128407] ---[ end trace cb7778967851e0ad ]---
[ 684.133646] ------------[ cut here ]------------
[ 684.138337] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427 0xf9ced4ef()
[ 684.145487] Modules linked in: cls_u32 em_meta cls_basic sch_cake(O) sch_drr xt_ACCOUNT(O) sch_hfsc cls_fw sch_sfq sch_prio ipt_Ra
[ 684.200459] CPU: 1 PID: 11 Comm: ksoftirqd/1 Tainted: G W O 3.16.7-ckt9-voyage #1
[ 684.208736] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 684.214933] 00000000 00000000 f5c8be98 c13167e9 00000000 f5c8beb0 c102a7dd f9ced4ef
[ 684.222930] f1907c00 00000000 00000000 f5c8bec0 c102a803 00000009 00000000 f5c8bee0
[ 684.230928] f9ced4ef f1907fc8 7364c482 00000002 f1907c00 00000000 00000040 f5c8befc
[ 684.238926] Call Trace:
[ 684.241399] [<c13167e9>] dump_stack+0x41/0x52
[ 684.245870] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 684.251206] [<f9ced4ef>] ? 0xf9ced4ee
[ 684.254979] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 684.260055] [<f9ced4ef>] 0xf9ced4ee
[ 684.263651] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 684.268203] [<c128744f>] net_tx_action+0x91/0xdd
[ 684.272927] [<c102d689>] __do_softirq+0x91/0x175
[ 684.277659] [<c102d783>] run_ksoftirqd+0x16/0x29
[ 684.282389] [<c1042734>] smpboot_thread_fn+0x108/0x11e
[ 684.287633] [<c104262c>] ? SyS_setgroups+0xa6/0xa6
[ 684.292529] [<c103de80>] kthread+0x9f/0xa4
[ 684.296735] [<c1319e01>] ret_from_kernel_thread+0x21/0x30
[ 684.302246] [<c103dde1>] ? kthread_freezable_should_stop+0x40/0x40
[ 684.308536] ---[ end trace cb7778967851e0ae ]---
> On Dec 28, 2018, at 1:58 PM, Pete Heist <pete@heistp.net> wrote:
>
> Note that this doesn’t happen when prio is used in place of hfsc and cake is used in the leafs to do the rate limiting, i.e.:
>
> tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> tc qdisc add dev eth0 parent 1:1 handle 10: cake besteffort bandwidth 100mbit ethernet # !!! must use no-split-gso here !!!
> tc qdisc add dev eth0 parent 1:2 handle 11: cake besteffort bandwidth 100mbit ethernet ether-vlan
> tc filter add dev eth0 protocol all parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:2
> tc filter add dev eth0 protocol all parent 1:0 prio 2 u32 match u32 0 0 flowid 1:1
>
> But it does happen when drr is used instead of prio:
>
> tc qdisc add dev eth0 root handle 1: drr
> tc class add dev eth0 parent 1: classid 1:1
> tc class add dev eth0 parent 1: classid 1:2
> tc qdisc add dev eth0 parent 1:1 handle 10: cake besteffort bandwidth 100mbit
> tc qdisc add dev eth0 parent 1:2 handle 11: cake besteffort bandwidth 100mbit ether-vlan
> tc filter add dev eth0 protocol all parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:2
> tc filter add dev eth0 protocol all parent 1:0 prio 2 u32 match u32 0 0 flowid 1:1
>
> drr might ultimately be what I want to use for this, so I can use cake to do the rate limiting instead of htb. prio works well but leads to starvation when the rate limit is above what the CPU can handle.
>
> Meanwhile, using htb classes with rate limits way above the actual, then rate limiting in the cake leafs, works as well, but this seems like a hack:
>
> tc qdisc add dev eth0 root handle 1: htb default 10
> tc class add dev eth0 parent 1: classid 1:1 htb rate 10gbit
> tc class add dev eth0 parent 1:1 classid 1:10 htb rate 5gbit
> tc class add dev eth0 parent 1:1 classid 1:11 htb rate 5gbit
> tc filter add dev eth0 protocol ip parent 1:0 prio 1 basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:11
> tc qdisc add dev eth0 parent 1:10 handle 20: cake besteffort bandwidth 100mbit ethernet # !!! must use no-split-gso here !!!
> tc qdisc add dev eth0 parent 1:11 handle 21: cake besteffort bandwidth 100mbit ethernet ether-vlan
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2018-12-28 21:22 ` Pete Heist
@ 2018-12-28 22:07 ` Jonathan Morton
2018-12-28 22:42 ` Pete Heist
2019-01-04 21:34 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 47+ messages in thread
From: Jonathan Morton @ 2018-12-28 22:07 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
> On 28 Dec, 2018, at 11:22 pm, Pete Heist <pete@heistp.net> wrote:
>
> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>
> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
I did wonder if there was something about the old kernel messing things up.
Do you still get acceptable throughput if you disable GRO/GSO at the interfaces?
- Jonathan Morton
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2018-12-28 22:07 ` Jonathan Morton
@ 2018-12-28 22:42 ` Pete Heist
0 siblings, 0 replies; 47+ messages in thread
From: Pete Heist @ 2018-12-28 22:42 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Cake List
> On Dec 28, 2018, at 11:07 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 28 Dec, 2018, at 11:22 pm, Pete Heist <pete@heistp.net> wrote:
>>
>> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>>
>> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
>
> I did wonder if there was something about the old kernel messing things up.
>
> Do you still get acceptable throughput if you disable GRO/GSO at the interfaces?
Turning off GRO does avoid the problem, but unlimited throughput through the one-armed router gets slashed:
GRO on: 920mbit up / 935mbit down
GRO off: 510mbit up / 534mbit down
Incidentally, for those using the v1 PCEngines APU, turning off rx-vlan-offload provides much better throughput for traffic routed from VLANs (presumably a quirk of the Realtek r8169 driver):
ethtool -K eth0 rxvlan on: 494mbit
ethtool -K eth0 rxvlan off: 936mbit
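To put the figures above in relative terms, here is a small Python calculation. It is only arithmetic on the numbers quoted in this message; the function name `percent_change` is illustrative, and the first figure of each GRO pair is read as upstream.

```python
def percent_change(before_mbit: float, after_mbit: float) -> float:
    """Relative change from before_mbit to after_mbit, in percent."""
    return (after_mbit - before_mbit) / before_mbit * 100.0


# GRO on -> GRO off
print(f"GRO off, up:   {percent_change(920, 510):.1f}%")  # -44.6%
print(f"GRO off, down: {percent_change(935, 534):.1f}%")  # -42.9%

# rxvlan on -> rxvlan off
print(f"rxvlan off:    {percent_change(494, 936):.1f}%")  # 89.5%
```

So disabling GRO costs a bit under half of the unshaped throughput, while disabling rx-vlan-offload nearly doubles it on this hardware.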
It would be great if there’s a way to work around this in cake on older kernels, but I’m not holding my breath. It’s also very possible that the current fix in newer kernels takes care of this case, even if the original workaround in cake does not. This was the kernel change:
https://patchwork.ozlabs.org/patch/968734/
So far, I’ll just use prio for this at lower throughputs and htb or drr if the CPU is taxed. This works well enough on 3.16, all things considered…
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2018-12-28 21:22 ` Pete Heist
2018-12-28 22:07 ` Jonathan Morton
@ 2019-01-04 21:34 ` Toke Høiland-Jørgensen
2019-01-04 22:10 ` Pete Heist
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-04 21:34 UTC (permalink / raw)
To: Pete Heist, Cake List
Pete Heist <pete@heistp.net> writes:
> Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for the default traffic (noted below in the drr and hfsc cases with "!!! must use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
>
> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>
> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
>
> I managed to capture some output from what happens to hfsc:
>
> [ 683.864456] ------------[ cut here ]------------
> [ 683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427
> 0xf9ced4ef()
So this seems to be this line:
WARN_ON(next_time == 0);
See https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1427
Which seems to indicate that HFSC can't find the next class to schedule.
Not entirely sure why, nor why this only happens with CAKE as a qdisc.
But I don't think it's actually an infinite loop that's causing it...
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-04 21:34 ` Toke Høiland-Jørgensen
@ 2019-01-04 22:10 ` Pete Heist
2019-01-04 22:12 ` Pete Heist
2019-01-04 22:34 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-04 22:10 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 4, 2019, at 10:34 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>> Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for the default traffic (noted below in the drr and hfsc cases with "!!! must use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
>>
>> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>>
>> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
>>
>> I managed to capture some output from what happens to hfsc:
>>
>> [ 683.864456] ------------[ cut here ]------------
>> [ 683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427
>> 0xf9ced4ef()
>
> So this seems to be this line:
>
> WARN_ON(next_time == 0);
>
> See https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1427
>
> Which seems to indicate that HFSC can't find the next class to schedule.
> Not entirely sure why, nor why this only happens with CAKE as a qdisc.
> But I don't think it's actually an infinite loop that's causing it...
Ok, fwiw one doesn’t actually need a one-armed router or VLANs to reproduce this. Just do this:
tc qdisc add dev $IFACE root handle 1: hfsc default 1
tc class add dev $IFACE parent 1: classid 1:1 hfsc ls rate $RATE ul rate $RATE
tc qdisc add dev $IFACE parent 1:1 cake # add split-gso here, or else…
I’ve tried it as far as 4.9.0-8, but no farther. It’s not much of a priority for me now that I have a workaround for it...
Pete
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-04 22:10 ` Pete Heist
@ 2019-01-04 22:12 ` Pete Heist
2019-01-04 22:34 ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-04 22:12 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 4, 2019, at 11:10 PM, Pete Heist <pete@heistp.net> wrote:
>
> Ok, fwiw one doesn’t actually need a one-armed router or VLANs to reproduce this. Just do this:
>
> tc qdisc add dev $IFACE root handle 1: hfsc default 1
> tc class add dev $IFACE parent 1: classid 1:1 hfsc ls rate $RATE ul rate $RATE
> tc qdisc add dev $IFACE parent 1:1 cake # add no-split-gso here, or else…
>
> I’ve tried it as far as 4.9.0-8, but no farther. It’s not much of a priority for me now that I have a workaround for it...
Correction, add “no-split-gso” to fix it, not “split-gso”. That’s lack of sleep, and I will go correct that now… :)
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-04 22:10 ` Pete Heist
2019-01-04 22:12 ` Pete Heist
@ 2019-01-04 22:34 ` Toke Høiland-Jørgensen
2019-01-05 5:58 ` Pete Heist
2019-01-05 10:44 ` Jonathan Morton
1 sibling, 2 replies; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-04 22:34 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 4, 2019, at 10:34 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>>> Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for the default traffic (noted below in the drr and hfsc cases with "!!! must use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
>>>
>>> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>>>
>>> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
>>>
>>> I managed to capture some output from what happens to hfsc:
>>>
>>> [ 683.864456] ------------[ cut here ]------------
>>> [ 683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427
>>> 0xf9ced4ef()
>>
>> So this seems to be this line:
>>
>> WARN_ON(next_time == 0);
>>
>> See https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1427
>>
>> Which seems to indicate that HFSC can't find the next class to schedule.
>> Not entirely sure why, nor why this only happens with CAKE as a qdisc.
>> But I don't think it's actually an infinite loop that's causing it...
>
>
> Ok, fwiw one doesn’t actually need a one-armed router or VLANs to reproduce this. Just do this:
>
> tc qdisc add dev $IFACE root handle 1: hfsc default 1
> tc class add dev $IFACE parent 1: classid 1:1 hfsc ls rate $RATE ul rate $RATE
> tc qdisc add dev $IFACE parent 1:1 cake # add split-gso here, or else…
>
> I’ve tried it as far as 4.9.0-8, but no farther. It’s not much of a
> priority for me now that I have a workaround for it...
Ah, I think I know what's going on:
On enqueue, HFSC will increase its own internal notion of qlen (q.qlen++
in hfsc_enqueue()), and in dequeue, it will return immediately if this
qlen is 0. Now, with GSO packet splitting, a single packet on enqueue
can turn into several packets on dequeue, which means that HFSC will
think the queue is empty after dequeueing the first one, and refuse to
dequeue any more packets.
This basically means that we can't use CAKE as a leaf qdisc with GSO
splitting as it stands currently. I *think* the solution is for CAKE to
notify its parents; could you try the patch below and see if it helps?
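The accounting mismatch described above can be reproduced in a toy userspace model. This is a sketch in Python, not kernel code: the class names `SplittingLeaf` and `CountingParent` and the `notify_parent` switch are invented for illustration, and the parent only mimics the "count one packet per enqueue, refuse to dequeue at qlen 0" behaviour as described here, not the real HFSC scheduler.

```python
from collections import deque


class SplittingLeaf:
    """Toy leaf qdisc that splits each enqueued packet into `segs` segments."""

    def __init__(self, notify_parent=False):
        self.queue = deque()
        self.parent = None
        self.notify_parent = notify_parent

    def enqueue(self, name, segs):
        for i in range(segs):
            self.queue.append(f"{name}.{i}")
        if self.notify_parent and self.parent is not None:
            # Idea behind the patch: report the extra segments to the parent.
            self.parent.qlen += segs - 1

    def dequeue(self):
        return self.queue.popleft() if self.queue else None


class CountingParent:
    """Toy parent that trusts its own qlen counter."""

    def __init__(self, leaf):
        self.leaf = leaf
        self.qlen = 0
        leaf.parent = self

    def enqueue(self, name, segs):
        self.leaf.enqueue(name, segs)
        self.qlen += 1  # one enqueue call is counted as one packet

    def dequeue(self):
        if self.qlen == 0:
            return None  # parent believes it is empty
        pkt = self.leaf.dequeue()
        if pkt is not None:
            self.qlen -= 1
        return pkt


def run(notify_parent):
    """Enqueue one 3-segment packet, drain the parent, report what got stuck."""
    leaf = SplittingLeaf(notify_parent)
    parent = CountingParent(leaf)
    parent.enqueue("gso", segs=3)
    sent = []
    while (pkt := parent.dequeue()) is not None:
        sent.append(pkt)
    return sent, len(leaf.queue)


print(run(notify_parent=False))  # (['gso.0'], 2)
print(run(notify_parent=True))   # (['gso.0', 'gso.1', 'gso.2'], 0)
```

Without the notification, two of the three segments stay stranded in the leaf because the parent's counter reaches zero after the first dequeue. With it, the parent's count matches the number of segments and everything drains.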
-Toke
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index b910cd5c56f7..77b0ebd673ac 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1617,6 +1617,30 @@ static u32 cake_classify(struct Qdisc *sch, struct cake_tin_data **t,
static void cake_reconfigure(struct Qdisc *sch);
+void adjust_parent_qlen(struct Qdisc *sch, unsigned int n,
+ unsigned int len)
+{
+ u32 parentid;
+ if (n == 0 && len == 0)
+ return;
+ rcu_read_lock();
+ while ((parentid = sch->parent)) {
+ if (TC_H_MAJ(parentid) == TC_H_MAJ(TC_H_INGRESS))
+ break;
+
+ if (sch->flags & TCQ_F_NOPARENT)
+ break;
+ sch = qdisc_lookup(qdisc_dev(sch), TC_H_MAJ(parentid));
+ if (sch == NULL) {
+ WARN_ON_ONCE(parentid != TC_H_ROOT);
+ break;
+ }
+ sch->q.qlen += n;
+ sch->qstats.backlog += len;
+ }
+ rcu_read_unlock();
+}
+
static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
@@ -1667,7 +1691,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
if (skb_is_gso(skb) && q->rate_flags & CAKE_FLAG_SPLIT_GSO) {
struct sk_buff *segs, *nskb;
netdev_features_t features = netif_skb_features(skb);
- unsigned int slen = 0;
+ unsigned int slen = 0, numsegs = 0;
segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
if (IS_ERR_OR_NULL(segs))
@@ -1684,6 +1708,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
sch->q.qlen++;
slen += segs->len;
+ numsegs++;
q->buffer_used += segs->truesize;
b->packets++;
segs = nskb;
@@ -1696,7 +1721,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
sch->qstats.backlog += slen;
q->avg_window_bytes += slen;
- qdisc_tree_reduce_backlog(sch, 1, len);
+ adjust_parent_qlen(sch, numsegs - 1, slen - len);
consume_skb(skb);
} else {
/* not splitting */
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-04 22:34 ` Toke Høiland-Jørgensen
@ 2019-01-05 5:58 ` Pete Heist
2019-01-05 10:06 ` Toke Høiland-Jørgensen
2019-01-05 10:44 ` Jonathan Morton
1 sibling, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 5:58 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 4, 2019, at 11:34 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
> This basically means that we can't use CAKE as a leaf qdisc with GSO
> splitting as it stands currently. I *think* the solution is for CAKE to
> notify its parents; could you try the patch below and see if it helps?
Aha, good news. :)
I’m probably not currently in a position to try it on my old kernels with the out of tree build:
On 3.16.7:
root@apu1a:~/src/sch_cake# make
make[1]: Entering directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
CC [M] /root/src/sch_cake/sch_cake.o
/root/src/sch_cake/sch_cake.c: In function ‘adjust_parent_qlen’:
/root/src/sch_cake/sch_cake.c:1738:20: error: ‘TCQ_F_NOPARENT’ undeclared (first use in this function)
if (sch->flags & TCQ_F_NOPARENT)
^
/root/src/sch_cake/sch_cake.c:1738:20: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:263: recipe for target '/root/src/sch_cake/sch_cake.o' failed
make[2]: *** [/root/src/sch_cake/sch_cake.o] Error 1
Makefile:1350: recipe for target '_module_/root/src/sch_cake' failed
make[1]: *** [_module_/root/src/sch_cake] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
Makefile:7: recipe for target 'default' failed
make: *** [default] Error 2
On 4.9.0-8:
root@apu2a:~/src/sch_cake$ make
make[1]: Entering directory '/usr/src/linux-headers-4.9.0-8-amd64'
CC [M] /home/sysadmin/src/sch_cake/sch_cake.o
Building modules, stage 2.
MODPOST 1 modules
WARNING: "qdisc_lookup" [/home/sysadmin/src/sch_cake/sch_cake.ko] undefined!
CC /home/sysadmin/src/sch_cake/sch_cake.mod.o
LD [M] /home/sysadmin/src/sch_cake/sch_cake.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.9.0-8-amd64'
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 5:58 ` Pete Heist
@ 2019-01-05 10:06 ` Toke Høiland-Jørgensen
2019-01-05 10:59 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 10:06 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 4, 2019, at 11:34 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>> This basically means that we can't use CAKE as a leaf qdisc with GSO
>> splitting as it stands currently. I *think* the solution is for CAKE to
>> notify its parents; could you try the patch below and see if it helps?
>
> Aha, good news. :)
>
> I’m probably not currently in a position to try it on my old kernels with the out of tree build:
>
> On 3.16.7:
Hmm, try this version for 3.16 - probably doesn't work on later kernels.
I'll look into a proper backport once you've confirmed that it works :)
-Toke
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index b910cd5c56f7..ef3acdbb8429 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1617,6 +1617,44 @@ static u32 cake_classify(struct Qdisc *sch, struct cake_tin_data **t,
 static void cake_reconfigure(struct Qdisc *sch);
+
+static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
+{
+	struct Qdisc *q;
+
+	if (!(root->flags & TCQ_F_BUILTIN) &&
+	    root->handle == handle)
+		return root;
+
+	list_for_each_entry(q, &root->list, list) {
+		if (q->handle == handle)
+			return q;
+	}
+	return NULL;
+}
+
+void adjust_parent_qlen(struct Qdisc *sch, unsigned int n,
+			unsigned int len)
+{
+	u32 parentid;
+	if (n == 0 && len == 0)
+		return;
+	rcu_read_lock();
+	while ((parentid = sch->parent)) {
+		if (TC_H_MAJ(parentid) == TC_H_MAJ(TC_H_INGRESS))
+			break;
+
+		sch = qdisc_match_from_root(qdisc_dev(sch), TC_H_MAJ(parentid));
+		if (sch == NULL) {
+			WARN_ON_ONCE(parentid != TC_H_ROOT);
+			break;
+		}
+		sch->q.qlen += n;
+		sch->qstats.backlog += len;
+	}
+	rcu_read_unlock();
+}
+
 static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 			   struct sk_buff **to_free)
 {
@@ -1667,7 +1705,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	if (skb_is_gso(skb) && q->rate_flags & CAKE_FLAG_SPLIT_GSO) {
 		struct sk_buff *segs, *nskb;
 		netdev_features_t features = netif_skb_features(skb);
-		unsigned int slen = 0;
+		unsigned int slen = 0, numsegs = 0;
 		segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
 		if (IS_ERR_OR_NULL(segs))
@@ -1684,6 +1722,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 			sch->q.qlen++;
 			slen += segs->len;
+			numsegs++;
 			q->buffer_used += segs->truesize;
 			b->packets++;
 			segs = nskb;
@@ -1696,7 +1735,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		sch->qstats.backlog += slen;
 		q->avg_window_bytes += slen;
-		qdisc_tree_reduce_backlog(sch, 1, len);
+		adjust_parent_qlen(sch, numsegs - 1, slen - len);
 		consume_skb(skb);
 	} else {
 		/* not splitting */
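To illustrate the accounting that the patch above is trying to repair, here is a minimal user-space sketch. It is not kernel code: `struct model_qdisc` and the `model_*` helpers are hypothetical names invented for this example, and the assumption that each ancestor counts one packet of the original length on a successful enqueue is a reading of the patch, not something verified against sch_hfsc. The point is only that after a GSO packet of `len` bytes is split into `numsegs` segments totalling `slen` bytes, every ancestor needs `numsegs - 1` more packets and `slen - len` more bytes for its counters to agree with the leaf.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for struct Qdisc: only the two
 * counters the patch touches, plus a pointer to the parent qdisc. */
struct model_qdisc {
	struct model_qdisc *parent; /* NULL at the root */
	unsigned int qlen;          /* packets queued at or below this node */
	unsigned int backlog;       /* bytes queued at or below this node */
};

/* Assumed parent behaviour: each ancestor counts the packet it handed
 * down as one packet of 'len' bytes. */
static void model_parents_account(struct model_qdisc *leaf, unsigned int len)
{
	struct model_qdisc *q;

	for (q = leaf->parent; q; q = q->parent) {
		q->qlen += 1;
		q->backlog += len;
	}
}

/* The leaf splits the GSO packet and queues 'numsegs' segments that
 * total 'slen' bytes (more than 'len', since headers are replicated). */
static void model_leaf_enqueue_split(struct model_qdisc *leaf,
				     unsigned int numsegs, unsigned int slen)
{
	leaf->qlen += numsegs;
	leaf->backlog += slen;
}

/* Same idea as adjust_parent_qlen() in the patch: walk up the tree and
 * add the difference between what the leaf holds and what the
 * ancestors will count on their own. */
static void model_adjust_parents(struct model_qdisc *leaf,
				 unsigned int n, unsigned int len)
{
	struct model_qdisc *q;

	if (n == 0 && len == 0)
		return;
	for (q = leaf->parent; q; q = q->parent) {
		q->qlen += n;
		q->backlog += len;
	}
}
```

For example, with one 5858-byte GSO packet split into four 1514-byte segments (6056 bytes in total), the correction is 3 packets and 198 bytes, after which the parent and the leaf both account for 4 packets and 6056 bytes.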
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-04 22:34 ` Toke Høiland-Jørgensen
2019-01-05 5:58 ` Pete Heist
@ 2019-01-05 10:44 ` Jonathan Morton
2019-01-05 11:17 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 47+ messages in thread
From: Jonathan Morton @ 2019-01-05 10:44 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Pete Heist, Cake List
> On 5 Jan, 2019, at 12:34 am, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> This basically means that we can't use CAKE as a leaf qdisc with GSO
> splitting as it stands currently. I *think* the solution is for CAKE to
> notify its parents; could you try the patch below and see if it helps?
Is this also a problem on current kernels, or only older ones?
- Jonathan Morton
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 10:06 ` Toke Høiland-Jørgensen
@ 2019-01-05 10:59 ` Pete Heist
2019-01-05 11:06 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 10:59 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 5, 2019, at 11:06 AM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Hmm, try this version for 3.16 - probably doesn't work on later kernels.
> I'll look into a proper backport once you've confirmed that it works :)
Thanks! Quick reminder, I’ve only seen this happen with hfsc, not when cake is a leaf below htb, for whatever reason, but that aside...
After the patch I was able to do an iperf3 upload through the one-armed router (receive on default VLAN and send on tagged VLAN), but when I ran iperf3 in reverse mode (receive on tagged VLAN and send on default VLAN), this happened right away (also see compile warnings below):
root@apu1a:~# [ 341.268556] BUG: unable to handle kernel NULL pointer dereference at 00000008
[ 341.275801] IP: [<fa0e8834>] adjust_parent_qlen+0x37/0xf1a [sch_cake]
[ 341.282290] *pde = 00000000
[ 341.285203] Oops: 0000 [#1] SMP
[ 341.288496] Modules linked in: em_meta cls_basic sch_hfsc sch_cake(O) xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle iptable_nat nfi
[ 341.339568] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.7-ckt9-voyage #1
[ 341.347576] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 341.353765] task: f5c811d0 ti: f5c86000 task.ti: f5c86000
[ 341.359173] EIP: 0060:[<fa0e8834>] EFLAGS: 00210206 CPU: 1
[ 341.364669] EIP is at adjust_parent_qlen+0x37/0xf1a [sch_cake]
[ 341.370508] EAX: f5d65000 EBX: ffffffe8 ECX: 00000000 EDX: 00000003
[ 341.376774] ESI: 00010000 EDI: f2900000 EBP: f5c99cf0 ESP: f5c99ce8
[ 341.383041] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 341.388447] CR0: 8005003b CR2: 00000008 CR3: 35cde000 CR4: 00000790
[ 341.394713] Stack:
[ 341.396734] 000001e1 f56f0000 f5c99d84 fa0e913c e0b706f9 0209090a f2907840 000017a8
[ 341.404678] 00000000 00000004 000001e0 000001e1 f56f0100 f2a90000 000017a8 c131996d
[ 341.412626] ca4edbd8 0000004d f54fca00 f54fca00 f5c99dbc f5c99d54 0209090a 0209480a
[ 341.420573] Call Trace:
[ 341.423037] [<fa0e913c>] adjust_parent_qlen+0x93f/0xf1a [sch_cake]
[ 341.429322] [<c131996d>] ? _raw_spin_unlock_bh+0x13/0x15
[ 341.434734] [<c129f91c>] ? tc_classify_compat+0x2f/0x5f
[ 341.440054] [<c12a091e>] ? tc_classify+0x1a/0x8b
[ 341.444767] [<fa0f4a8f>] 0xfa0f4a8e
[ 341.448361] [<c128b628>] __dev_queue_xmit+0x210/0x35f
[ 341.453504] [<c12af74e>] ? ip_fragment+0x79f/0x79f
[ 341.458392] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 341.463023] [<c12916f9>] neigh_resolve_output+0x12f/0x145
[ 341.468517] [<c12afaa0>] ip_finish_output+0x352/0x73d
[ 341.473664] [<c12b0e19>] ip_output+0x73/0xaf
[ 341.478033] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 341.483091] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 341.487720] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 341.492607] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 341.496894] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 341.502741] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 341.507975] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 341.513818] [<c1289d6c>] napi_gro_complete+0x87/0x8c
[ 341.518878] [<c128a020>] napi_gro_flush+0x3e/0x53
[ 341.523680] [<c128a04c>] napi_complete+0x17/0x27
[ 341.528394] [<f80361a3>] 0xf80361a2
[ 341.531985] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 341.536785] [<c102d689>] __do_softirq+0x91/0x175
[ 341.541501] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 341.547685] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 341.553002] <IRQ>
[ 341.554933] [<c102d8a9>] irq_exit+0x34/0x75
[ 341.559445] [<c1002f30>] do_IRQ+0x92/0xa6
[ 341.563554] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 341.568531] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 341.573936] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 341.578567] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 341.583802] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 341.588856] Code: 70 24 85 f6 74 5e 66 31 f6 81 fe 00 00 ff ff 74 53 8b 40 34 8b 00 f6 40 08 01 75 05 39 70 20 74 16 8b 58 18 83 eb 18e
[ 341.609292] EIP: [<fa0e8834>] adjust_parent_qlen+0x37/0xf1a [sch_cake] SS:ESP 0068:f5c99ce8
[ 341.617689] CR2: 0000000000000008
[ 341.621012] ---[ end trace db8ecd998020cc49 ]---
[ 341.625639] Kernel panic - not syncing: Fatal exception in interrupt
[ 341.632073] Kernel Offset: 0x0 from 0xc1000000 (relocation range: 0xc0000000-0xf7ffdfff)
[ 341.640215] Rebooting in 30 seconds..
Probably less critically, some compile warnings:
root@apu1a:~/src/sch_cake# make clean
make[1]: Entering directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
CLEAN /root/src/sch_cake/.tmp_versions
CLEAN /root/src/sch_cake/Module.symvers
make[1]: Leaving directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
root@apu1a:~/src/sch_cake# make
make[1]: Entering directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
CC [M] /root/src/sch_cake/sch_cake.o
/root/src/sch_cake/sch_cake.c: In function ‘adjust_parent_qlen’:
/root/src/sch_cake/sch_cake.c:1753:31: warning: passing argument 1 of ‘qdisc_match_from_root’ from incompatible pointer type
sch = qdisc_match_from_root(qdisc_dev(sch), TC_H_MAJ(parentid));
^
/root/src/sch_cake/sch_cake.c:1727:22: note: expected ‘struct Qdisc *’ but argument is of type ‘struct net_device *’
static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
^
Building modules, stage 2.
MODPOST 1 modules
CC /root/src/sch_cake/sch_cake.mod.o
LD [M] /root/src/sch_cake/sch_cake.ko
make[1]: Leaving directory '/usr/src/linux-headers-3.16.7-ckt9-voyage'
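The warning above says the helper expects a `struct Qdisc *` while the call passes what `qdisc_dev()` returns, a `struct net_device *`. A user-space sketch of one way the types could be reconciled follows. It assumes the lookup should start from the device's root qdisc, which is a guess and is not confirmed anywhere in this thread. `struct toy_qdisc`, `struct toy_net_device`, `toy_match_from_root` and `toy_lookup` are hypothetical stand-ins, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a qdisc: a handle plus a link to the next
 * qdisc registered on the same device. */
struct toy_qdisc {
	uint32_t handle;
	struct toy_qdisc *next;
};

/* Hypothetical stand-in for a network device that knows its root qdisc. */
struct toy_net_device {
	struct toy_qdisc *qdisc; /* root qdisc, may be NULL */
};

/* Takes a qdisc, like the helper in the patch: search the root and the
 * qdiscs chained after it for a matching handle. */
static struct toy_qdisc *toy_match_from_root(struct toy_qdisc *root,
					     uint32_t handle)
{
	struct toy_qdisc *q;

	for (q = root; q; q = q->next) {
		if (q->handle == handle)
			return q;
	}
	return NULL;
}

/* Takes a device: resolve the root qdisc first, then delegate, so the
 * argument types line up and a missing root is handled explicitly. */
static struct toy_qdisc *toy_lookup(struct toy_net_device *dev,
				    uint32_t handle)
{
	if (dev == NULL || dev->qdisc == NULL)
		return NULL;
	return toy_match_from_root(dev->qdisc, handle);
}
```

The design choice in this sketch is simply to keep the device-to-root step in one wrapper, so callers that only have a device pointer never pass it where a qdisc is expected.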
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 10:59 ` Pete Heist
@ 2019-01-05 11:06 ` Pete Heist
2019-01-05 11:18 ` Toke Høiland-Jørgensen
2019-01-05 12:38 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-05 11:06 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Quick update to the trace: I had to apply the patch manually and missed one line that should have been removed (qdisc_tree_reduce_backlog...). I've fixed that so it doesn’t throw off the addresses for you, but it still does the same thing:
root@apu1a:~/src/sch_cake# [ 697.089814] BUG: unable to handle kernel NULL pointer dereference at 00000008
[ 697.097009] IP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08 [sch_cake]
[ 697.103491] *pde = 00000000
[ 697.106405] Oops: 0000 [#1] SMP
[ 697.109697] Modules linked in: em_meta cls_basic sch_hfsc sch_cake(O) xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle iptable_nat nfn
[ 697.160768] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.7-ckt9-voyage #1
[ 697.168776] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 697.174957] task: f5c811d0 ti: f5c86000 task.ti: f5c86000
[ 697.180366] EIP: 0060:[<f9f39834>] EFLAGS: 00210206 CPU: 1
[ 697.185862] EIP is at adjust_parent_qlen+0x37/0xf08 [sch_cake]
[ 697.191701] EAX: f5cdd000 EBX: ffffffe8 ECX: 00000000 EDX: 00000003
[ 697.197977] ESI: 00010000 EDI: f2f00000 EBP: f5c99cf0 ESP: f5c99ce8
[ 697.204250] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 697.209648] CR0: 8005003b CR2: 00000008 CR3: 305c6000 CR4: 00000790
[ 697.215913] Stack:
[ 697.217932] 000000ba ef8f0000 f5c99d84 f9f3a12a e0b706f9 0209090a 000011be 000017a8
[ 697.225879] 00000000 00000004 000000b9 f2f02e80 00200246 000000ba ef8f0100 c131996d
[ 697.233828] 93ceb5ff 0000009f f5486e80 f5486e80 f5c99dbc f5c99d54 0209090a 0209480a
[ 697.241776] Call Trace:
[ 697.244239] [<f9f3a12a>] adjust_parent_qlen+0x92d/0xf08 [sch_cake]
[ 697.250524] [<c131996d>] ? _raw_spin_unlock_bh+0x13/0x15
[ 697.255936] [<c129f91c>] ? tc_classify_compat+0x2f/0x5f
[ 697.261254] [<c12a091e>] ? tc_classify+0x1a/0x8b
[ 697.265967] [<f9f45a8f>] 0xf9f45a8e
[ 697.269564] [<c128b628>] __dev_queue_xmit+0x210/0x35f
[ 697.274715] [<c12af74e>] ? ip_fragment+0x79f/0x79f
[ 697.279601] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 697.284231] [<c12916f9>] neigh_resolve_output+0x12f/0x145
[ 697.289727] [<c12afaa0>] ip_finish_output+0x352/0x73d
[ 697.294872] [<c12b0e19>] ip_output+0x73/0xaf
[ 697.299240] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 697.304301] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 697.308930] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 697.313817] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 697.318103] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 697.323944] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 697.329176] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 697.335017] [<c1289d6c>] napi_gro_complete+0x87/0x8c
[ 697.340080] [<c128a020>] napi_gro_flush+0x3e/0x53
[ 697.344880] [<c128a04c>] napi_complete+0x17/0x27
[ 697.349594] [<f81161a3>] 0xf81161a2
[ 697.353186] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 697.357986] [<c102d689>] __do_softirq+0x91/0x175
[ 697.362701] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 697.368886] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 697.374203] <IRQ>
[ 697.376136] [<c102d8a9>] irq_exit+0x34/0x75
[ 697.380646] [<c1002f30>] do_IRQ+0x92/0xa6
[ 697.384755] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 697.389733] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 697.395137] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 697.399768] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 697.405002] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 697.410056] Code: 70 24 85 f6 74 5e 66 31 f6 81 fe 00 00 ff ff 74 53 8b 40 34 8b 00 f6 40 08 01 75 05 39 70 20 74 16 8b 58 18 83 eb 183
[ 697.430491] EIP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08 [sch_cake] SS:ESP 0068:f5c99ce8
[ 697.438891] CR2: 0000000000000008
[ 697.442220] ---[ end trace 4fdb119875d1f11d ]---
[ 697.446847] Kernel panic - not syncing: Fatal exception in interrupt
[ 697.453281] Kernel Offset: 0x0 from 0xc1000000 (relocation range: 0xc0000000-0xf7ffdfff)
[ 697.461426] Rebooting in 30 seconds..
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 10:44 ` Jonathan Morton
@ 2019-01-05 11:17 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 11:17 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Pete Heist, Cake List
On 5 January 2019 11:44:44 CET, Jonathan Morton <chromatix99@gmail.com> wrote:
>> On 5 Jan, 2019, at 12:34 am, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> This basically means that we can't use CAKE as a leaf qdisc with GSO
>> splitting as it stands currently. I *think* the solution is for CAKE to
>> notify its parents; could you try the patch below and see if it helps?
>
>Is this also a problem on current kernels, or only older ones?
Newer ones as well (I assume - haven't tested).
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 11:06 ` Pete Heist
@ 2019-01-05 11:18 ` Toke Høiland-Jørgensen
2019-01-05 11:26 ` Pete Heist
2019-01-05 12:38 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 11:18 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Reverse, is that with an ingress qdisc?
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 11:18 ` Toke Høiland-Jørgensen
@ 2019-01-05 11:26 ` Pete Heist
2019-01-05 11:35 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 11:26 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Nope, egress on both eth0 and eth0.3300.
Dunce question, but I’m applying the patch manually because copying it from the email didn’t seem to work. How do I get patch to accept it?
root@apu1a:~/src/sch_cake# patch sch_cake.c ../hfsc.patch
patching file sch_cake.c
patch: **** malformed patch at line 7: static void cake_reconfigure(struct Qdisc *sch);
root@apu1a:~/src/sch_cake# git apply ../hfsc.patch
fatal: corrupt patch at line 7
> On Jan 5, 2019, at 12:18 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Reverse, is that with an ingress qdisc?
>
> -Toke
>
> On 5 January 2019 12:06:44 CET, Pete Heist <pete@heistp.net> wrote:
>> Quick update to the trace because I had to apply the patch manually and
>> missed one line to remove (qdisc_tree_reduce_backlog...), just so it
>> doesn’t throw off the addresses for you, but it still does the same
>> thing:
>>
>> root@apu1a:~/src/sch_cake# [ 697.089814] BUG: unable to handle kernel
>> NULL pointer dereference at 00000008
>> [ 697.097009] IP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08
>> [sch_cake]
>> [ 697.103491] *pde = 00000000
>> [ 697.106405] Oops: 0000 [#1] SMP
>> [ 697.109697] Modules linked in: em_meta cls_basic sch_hfsc
>> sch_cake(O) xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle
>> iptable_nat nfn
>> [ 697.160768] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O
>> 3.16.7-ckt9-voyage #1
>> [ 697.168776] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
>> [ 697.174957] task: f5c811d0 ti: f5c86000 task.ti: f5c86000
>> [ 697.180366] EIP: 0060:[<f9f39834>] EFLAGS: 00210206 CPU: 1
>> [ 697.185862] EIP is at adjust_parent_qlen+0x37/0xf08 [sch_cake]
>> [ 697.191701] EAX: f5cdd000 EBX: ffffffe8 ECX: 00000000 EDX: 00000003
>> [ 697.197977] ESI: 00010000 EDI: f2f00000 EBP: f5c99cf0 ESP: f5c99ce8
>> [ 697.204250] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [ 697.209648] CR0: 8005003b CR2: 00000008 CR3: 305c6000 CR4: 00000790
>> [ 697.215913] Stack:
>> [ 697.217932] 000000ba ef8f0000 f5c99d84 f9f3a12a e0b706f9 0209090a
>> 000011be 000017a8
>> [ 697.225879] 00000000 00000004 000000b9 f2f02e80 00200246 000000ba
>> ef8f0100 c131996d
>> [ 697.233828] 93ceb5ff 0000009f f5486e80 f5486e80 f5c99dbc f5c99d54
>> 0209090a 0209480a
>> [ 697.241776] Call Trace:
>> [ 697.244239] [<f9f3a12a>] adjust_parent_qlen+0x92d/0xf08 [sch_cake]
>> [ 697.250524] [<c131996d>] ? _raw_spin_unlock_bh+0x13/0x15
>> [ 697.255936] [<c129f91c>] ? tc_classify_compat+0x2f/0x5f
>> [ 697.261254] [<c12a091e>] ? tc_classify+0x1a/0x8b
>> [ 697.265967] [<f9f45a8f>] 0xf9f45a8e
>> [ 697.269564] [<c128b628>] __dev_queue_xmit+0x210/0x35f
>> [ 697.274715] [<c12af74e>] ? ip_fragment+0x79f/0x79f
>> [ 697.279601] [<c128b78b>] dev_queue_xmit+0xa/0xc
>> [ 697.284231] [<c12916f9>] neigh_resolve_output+0x12f/0x145
>> [ 697.289727] [<c12afaa0>] ip_finish_output+0x352/0x73d
>> [ 697.294872] [<c12b0e19>] ip_output+0x73/0xaf
>> [ 697.299240] [<c12ad986>] ip_forward_finish+0x66/0x6b
>> [ 697.304301] [<c12adc3b>] ip_forward+0x2b0/0x36d
>> [ 697.308930] [<c12ac467>] ip_rcv_finish+0x267/0x29a
>> [ 697.313817] [<c12aca4c>] ip_rcv+0x2b4/0x338
>> [ 697.318103] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
>> [ 697.323944] [<c1289674>] __netif_receive_skb+0x48/0x59
>> [ 697.329176] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
>> [ 697.335017] [<c1289d6c>] napi_gro_complete+0x87/0x8c
>> [ 697.340080] [<c128a020>] napi_gro_flush+0x3e/0x53
>> [ 697.344880] [<c128a04c>] napi_complete+0x17/0x27
>> [ 697.349594] [<f81161a3>] 0xf81161a2
>> [ 697.353186] [<c128a0b2>] net_rx_action+0x56/0x10e
>> [ 697.357986] [<c102d689>] __do_softirq+0x91/0x175
>> [ 697.362701] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
>> [ 697.368886] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
>> [ 697.374203] <IRQ>
>> [ 697.376136] [<c102d8a9>] irq_exit+0x34/0x75
>> [ 697.380646] [<c1002f30>] do_IRQ+0x92/0xa6
>> [ 697.384755] [<c131a4ec>] common_interrupt+0x2c/0x40
>> [ 697.389733] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
>> [ 697.395137] [<c126aee8>] cpuidle_enter+0xf/0x12
>> [ 697.399768] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
>> [ 697.405002] [<c101d553>] start_secondary+0x1a6/0x1ab
>> [ 697.410056] Code: 70 24 85 f6 74 5e 66 31 f6 81 fe 00 00 ff ff 74 53
>> 8b 40 34 8b 00 f6 40 08 01 75 05 39 70 20 74 16 8b 58 18 83 eb 183
>> [ 697.430491] EIP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08
>> [sch_cake] SS:ESP 0068:f5c99ce8
>> [ 697.438891] CR2: 0000000000000008
>> [ 697.442220] ---[ end trace 4fdb119875d1f11d ]---
>> [ 697.446847] Kernel panic - not syncing: Fatal exception in interrupt
>> [ 697.453281] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
>> 0xc0000000-0xf7ffdfff)
>> [ 697.461426] Rebooting in 30 seconds..
>>
>>
>>> On Jan 5, 2019, at 11:59 AM, Pete Heist <pete@heistp.net> wrote:
>>>
>>>
>>>> On Jan 5, 2019, at 11:06 AM, Toke Høiland-Jørgensen <toke@toke.dk>
>> wrote:
>>>>
>>>> Hmm, try this version for 3.16 - probably doesn't work on later
>> kernels.
>>>> I'll look into a proper backport once you've confirmed that it works
>> :)
>>>
>>> Thanks! Quick reminder, I’ve only seen this happen with hfsc, not
>> when cake is a leaf below htb, for whatever reason, but that aside...
>>>
>>> After the patch I was able to do an iperf3 upload through the
>> one-armed router (receive on default VLAN and send on tagged VLAN), but
>> when I ran iperf3 in reverse mode (receive on tagged VLAN and send on
>> default VLAN), this happened right away (also see compile warnings
>> below):
>>>
>>>
>>> root@apu1a:~# [ 341.268556] BUG: unable to handle kernel NULL
>> pointer dereference at 00000008
>>> [ 341.275801] IP: [<fa0e8834>] adjust_parent_qlen+0x37/0xf1a
>> [sch_cake]
>>> [ 341.282290] *pde = 00000000
>>> [ 341.285203] Oops: 0000 [#1] SMP
>>> [ 341.288496] Modules linked in: em_meta cls_basic sch_hfsc
>> sch_cake(O) xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle
>> iptable_nat nfi
>>> [ 341.339568] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O
>> 3.16.7-ckt9-voyage #1
>>> [ 341.347576] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
>>> [ 341.353765] task: f5c811d0 ti: f5c86000 task.ti: f5c86000
>>> [ 341.359173] EIP: 0060:[<fa0e8834>] EFLAGS: 00210206 CPU: 1
>>> [ 341.364669] EIP is at adjust_parent_qlen+0x37/0xf1a [sch_cake]
>>> [ 341.370508] EAX: f5d65000 EBX: ffffffe8 ECX: 00000000 EDX:
>> 00000003
>>> [ 341.376774] ESI: 00010000 EDI: f2900000 EBP: f5c99cf0 ESP:
>> f5c99ce8
>>> [ 341.383041] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>>> [ 341.388447] CR0: 8005003b CR2: 00000008 CR3: 35cde000 CR4:
>> 00000790
>>> [ 341.394713] Stack:
>>> [ 341.396734] 000001e1 f56f0000 f5c99d84 fa0e913c e0b706f9 0209090a
>> f2907840 000017a8
>>> [ 341.404678] 00000000 00000004 000001e0 000001e1 f56f0100 f2a90000
>> 000017a8 c131996d
>>> [ 341.412626] ca4edbd8 0000004d f54fca00 f54fca00 f5c99dbc f5c99d54
>> 0209090a 0209480a
>>> [ 341.420573] Call Trace:
>>> [ 341.423037] [<fa0e913c>] adjust_parent_qlen+0x93f/0xf1a
>> [sch_cake]
>>> [ 341.429322] [<c131996d>] ? _raw_spin_unlock_bh+0x13/0x15
>>> [ 341.434734] [<c129f91c>] ? tc_classify_compat+0x2f/0x5f
>>> [ 341.440054] [<c12a091e>] ? tc_classify+0x1a/0x8b
>>> [ 341.444767] [<fa0f4a8f>] 0xfa0f4a8e
>>> [ 341.448361] [<c128b628>] __dev_queue_xmit+0x210/0x35f
>>> [ 341.453504] [<c12af74e>] ? ip_fragment+0x79f/0x79f
>>> [ 341.458392] [<c128b78b>] dev_queue_xmit+0xa/0xc
>>> [ 341.463023] [<c12916f9>] neigh_resolve_output+0x12f/0x145
>>> [ 341.468517] [<c12afaa0>] ip_finish_output+0x352/0x73d
>>> [ 341.473664] [<c12b0e19>] ip_output+0x73/0xaf
>>> [ 341.478033] [<c12ad986>] ip_forward_finish+0x66/0x6b
>>> [ 341.483091] [<c12adc3b>] ip_forward+0x2b0/0x36d
>>> [ 341.487720] [<c12ac467>] ip_rcv_finish+0x267/0x29a
>>> [ 341.492607] [<c12aca4c>] ip_rcv+0x2b4/0x338
>>> [ 341.496894] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
>>> [ 341.502741] [<c1289674>] __netif_receive_skb+0x48/0x59
>>> [ 341.507975] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
>>> [ 341.513818] [<c1289d6c>] napi_gro_complete+0x87/0x8c
>>> [ 341.518878] [<c128a020>] napi_gro_flush+0x3e/0x53
>>> [ 341.523680] [<c128a04c>] napi_complete+0x17/0x27
>>> [ 341.528394] [<f80361a3>] 0xf80361a2
>>> [ 341.531985] [<c128a0b2>] net_rx_action+0x56/0x10e
>>> [ 341.536785] [<c102d689>] __do_softirq+0x91/0x175
>>> [ 341.541501] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
>>> [ 341.547685] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
>>> [ 341.553002] <IRQ>
>>> [ 341.554933] [<c102d8a9>] irq_exit+0x34/0x75
>>> [ 341.559445] [<c1002f30>] do_IRQ+0x92/0xa6
>>> [ 341.563554] [<c131a4ec>] common_interrupt+0x2c/0x40
>>> [ 341.568531] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
>>> [ 341.573936] [<c126aee8>] cpuidle_enter+0xf/0x12
>>> [ 341.578567] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
>>> [ 341.583802] [<c101d553>] start_secondary+0x1a6/0x1ab
>>> [ 341.588856] Code: 70 24 85 f6 74 5e 66 31 f6 81 fe 00 00 ff ff 74
>> 53 8b 40 34 8b 00 f6 40 08 01 75 05 39 70 20 74 16 8b 58 18 83 eb 18e
>>> [ 341.609292] EIP: [<fa0e8834>] adjust_parent_qlen+0x37/0xf1a
>> [sch_cake] SS:ESP 0068:f5c99ce8
>>> [ 341.617689] CR2: 0000000000000008
>>> [ 341.621012] ---[ end trace db8ecd998020cc49 ]---
>>> [ 341.625639] Kernel panic - not syncing: Fatal exception in
>> interrupt
>>> [ 341.632073] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
>> 0xc0000000-0xf7ffdfff)
>>> [ 341.640215] Rebooting in 30 seconds..
>>>
>>>
>>>
>>> Probably less critically, some compile warnings:
>>>
>>> root@apu1a:~/src/sch_cake# make clean
>>> make[1]: Entering directory
>> '/usr/src/linux-headers-3.16.7-ckt9-voyage'
>>> CLEAN /root/src/sch_cake/.tmp_versions
>>> CLEAN /root/src/sch_cake/Module.symvers
>>> make[1]: Leaving directory
>> '/usr/src/linux-headers-3.16.7-ckt9-voyage'
>>> root@apu1a:~/src/sch_cake# make
>>> make[1]: Entering directory
>> '/usr/src/linux-headers-3.16.7-ckt9-voyage'
>>> CC [M] /root/src/sch_cake/sch_cake.o
>>> /root/src/sch_cake/sch_cake.c: In function ‘adjust_parent_qlen’:
>>> /root/src/sch_cake/sch_cake.c:1753:31: warning: passing argument 1 of
>> ‘qdisc_match_from_root’ from incompatible pointer type
>>> sch = qdisc_match_from_root(qdisc_dev(sch), TC_H_MAJ(parentid));
>>> ^
>>> /root/src/sch_cake/sch_cake.c:1727:22: note: expected ‘struct Qdisc
>> *’ but argument is of type ‘struct net_device *’
>>> static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32
>> handle)
>>> ^
>>> Building modules, stage 2.
>>> MODPOST 1 modules
>>> CC /root/src/sch_cake/sch_cake.mod.o
>>> LD [M] /root/src/sch_cake/sch_cake.ko
>>> make[1]: Leaving directory
>> '/usr/src/linux-headers-3.16.7-ckt9-voyage'
>>>
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 11:26 ` Pete Heist
@ 2019-01-05 11:35 ` Pete Heist
0 siblings, 0 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-05 11:35 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Here is how it’s set up, for reference:
IFACE=eth0
RATE=100mbit
tc qdisc add dev $IFACE root handle 1: hfsc default 1
tc class add dev $IFACE parent 1: classid 1:1 hfsc ls rate $RATE ul rate $RATE
tc class add dev $IFACE parent 1: classid 1:2 hfsc ls rate $RATE ul rate $RATE
tc qdisc add dev $IFACE parent 1:1 cake besteffort dual-dsthost # no-split-gso
tc qdisc add dev $IFACE parent 1:2 cake besteffort dual-srchost # no-split-gso
tc filter add dev $IFACE parent 1:0 prio 1 protocol all \
basic match not "meta(vlan mask 0xfff gt 0x0)" flowid 1:1
tc filter add dev $IFACE parent 1:0 prio 2 protocol all \
basic match "meta(vlan mask 0xfff eq 0xce4)" flowid 1:2
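As a rough illustration of how the two filters above sort traffic, here is a minimal C sketch. It assumes the value tested by `meta(vlan ...)` is the 16-bit VLAN tag and that the `0xfff` mask leaves the 12-bit VLAN ID; `classify_vlan` is a hypothetical helper, not part of tc or the kernel. Note that 0xce4 is 3300 decimal, i.e. the tagged VLAN.

```c
#include <stdint.h>

#define VLAN_VID_MASK 0x0fffu /* low 12 bits of the tag carry the VLAN ID */

/*
 * Hypothetical helper: returns the minor number of the hfsc class
 * (1 for 1:1, 2 for 1:2) that the two filters would pick.
 */
static unsigned int classify_vlan(uint16_t vlan_tag)
{
    uint16_t vid = vlan_tag & VLAN_VID_MASK;

    if (!(vid > 0x0))   /* prio 1: not "vlan gt 0x0", i.e. untagged */
        return 1;
    if (vid == 0xce4)   /* prio 2: VLAN 3300 */
        return 2;
    return 1;           /* nothing matched: falls to "hfsc default 1" */
}
```

Under these assumptions, untagged traffic lands in class 1:1 and VLAN 3300 traffic in class 1:2, regardless of the priority bits in the upper part of the tag.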
> On Jan 5, 2019, at 12:26 PM, Pete Heist <pete@heistp.net> wrote:
>
> Nope, egress on both eth0 and eth0.3300.
>
> Dunce question: I’m applying the patch by hand because copying it from the email didn’t seem to work. How do I get patch to accept it?
>
> root@apu1a:~/src/sch_cake# patch sch_cake.c ../hfsc.patch
> patching file sch_cake.c
> patch: **** malformed patch at line 7: static void cake_reconfigure(struct Qdisc *sch);
> root@apu1a:~/src/sch_cake# git apply ../hfsc.patch
> fatal: corrupt patch at line 7
>
>> On Jan 5, 2019, at 12:18 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Reverse, is that with an ingress qdisc?
>>
>> -Toke
>>
>> On 5 January 2019 12:06:44 CET, Pete Heist <pete@heistp.net> wrote:
>>> Quick update to the trace because I had to apply the patch manually and
>>> missed one line to remove (qdisc_tree_reduce_backlog...), just so it
>>> doesn’t throw off the addresses for you, but it still does the same
>>> thing:
>>>
>>> root@apu1a:~/src/sch_cake# [ 697.089814] BUG: unable to handle kernel
>>> NULL pointer dereference at 00000008
>>> [ 697.097009] IP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08
>>> [sch_cake]
>>> [ 697.103491] *pde = 00000000
>>> [ 697.106405] Oops: 0000 [#1] SMP
>>> [ 697.109697] Modules linked in: em_meta cls_basic sch_hfsc
>>> sch_cake(O) xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle
>>> iptable_nat nfn
>>> [ 697.160768] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O
>>> 3.16.7-ckt9-voyage #1
>>> [ 697.168776] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
>>> [ 697.174957] task: f5c811d0 ti: f5c86000 task.ti: f5c86000
>>> [ 697.180366] EIP: 0060:[<f9f39834>] EFLAGS: 00210206 CPU: 1
>>> [ 697.185862] EIP is at adjust_parent_qlen+0x37/0xf08 [sch_cake]
>>> [ 697.191701] EAX: f5cdd000 EBX: ffffffe8 ECX: 00000000 EDX: 00000003
>>> [ 697.197977] ESI: 00010000 EDI: f2f00000 EBP: f5c99cf0 ESP: f5c99ce8
>>> [ 697.204250] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>>> [ 697.209648] CR0: 8005003b CR2: 00000008 CR3: 305c6000 CR4: 00000790
>>> [ 697.215913] Stack:
>>> [ 697.217932] 000000ba ef8f0000 f5c99d84 f9f3a12a e0b706f9 0209090a
>>> 000011be 000017a8
>>> [ 697.225879] 00000000 00000004 000000b9 f2f02e80 00200246 000000ba
>>> ef8f0100 c131996d
>>> [ 697.233828] 93ceb5ff 0000009f f5486e80 f5486e80 f5c99dbc f5c99d54
>>> 0209090a 0209480a
>>> [ 697.241776] Call Trace:
>>> [ 697.244239] [<f9f3a12a>] adjust_parent_qlen+0x92d/0xf08 [sch_cake]
>>> [ 697.250524] [<c131996d>] ? _raw_spin_unlock_bh+0x13/0x15
>>> [ 697.255936] [<c129f91c>] ? tc_classify_compat+0x2f/0x5f
>>> [ 697.261254] [<c12a091e>] ? tc_classify+0x1a/0x8b
>>> [ 697.265967] [<f9f45a8f>] 0xf9f45a8e
>>> [ 697.269564] [<c128b628>] __dev_queue_xmit+0x210/0x35f
>>> [ 697.274715] [<c12af74e>] ? ip_fragment+0x79f/0x79f
>>> [ 697.279601] [<c128b78b>] dev_queue_xmit+0xa/0xc
>>> [ 697.284231] [<c12916f9>] neigh_resolve_output+0x12f/0x145
>>> [ 697.289727] [<c12afaa0>] ip_finish_output+0x352/0x73d
>>> [ 697.294872] [<c12b0e19>] ip_output+0x73/0xaf
>>> [ 697.299240] [<c12ad986>] ip_forward_finish+0x66/0x6b
>>> [ 697.304301] [<c12adc3b>] ip_forward+0x2b0/0x36d
>>> [ 697.308930] [<c12ac467>] ip_rcv_finish+0x267/0x29a
>>> [ 697.313817] [<c12aca4c>] ip_rcv+0x2b4/0x338
>>> [ 697.318103] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
>>> [ 697.323944] [<c1289674>] __netif_receive_skb+0x48/0x59
>>> [ 697.329176] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
>>> [ 697.335017] [<c1289d6c>] napi_gro_complete+0x87/0x8c
>>> [ 697.340080] [<c128a020>] napi_gro_flush+0x3e/0x53
>>> [ 697.344880] [<c128a04c>] napi_complete+0x17/0x27
>>> [ 697.349594] [<f81161a3>] 0xf81161a2
>>> [ 697.353186] [<c128a0b2>] net_rx_action+0x56/0x10e
>>> [ 697.357986] [<c102d689>] __do_softirq+0x91/0x175
>>> [ 697.362701] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
>>> [ 697.368886] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
>>> [ 697.374203] <IRQ>
>>> [ 697.376136] [<c102d8a9>] irq_exit+0x34/0x75
>>> [ 697.380646] [<c1002f30>] do_IRQ+0x92/0xa6
>>> [ 697.384755] [<c131a4ec>] common_interrupt+0x2c/0x40
>>> [ 697.389733] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
>>> [ 697.395137] [<c126aee8>] cpuidle_enter+0xf/0x12
>>> [ 697.399768] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
>>> [ 697.405002] [<c101d553>] start_secondary+0x1a6/0x1ab
>>> [ 697.410056] Code: 70 24 85 f6 74 5e 66 31 f6 81 fe 00 00 ff ff 74 53
>>> 8b 40 34 8b 00 f6 40 08 01 75 05 39 70 20 74 16 8b 58 18 83 eb 183
>>> [ 697.430491] EIP: [<f9f39834>] adjust_parent_qlen+0x37/0xf08
>>> [sch_cake] SS:ESP 0068:f5c99ce8
>>> [ 697.438891] CR2: 0000000000000008
>>> [ 697.442220] ---[ end trace 4fdb119875d1f11d ]---
>>> [ 697.446847] Kernel panic - not syncing: Fatal exception in interrupt
>>> [ 697.453281] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
>>> 0xc0000000-0xf7ffdfff)
>>> [ 697.461426] Rebooting in 30 seconds..
>>>
>>>
>
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 11:06 ` Pete Heist
2019-01-05 11:18 ` Toke Høiland-Jørgensen
@ 2019-01-05 12:38 ` Toke Høiland-Jørgensen
2019-01-05 12:51 ` Pete Heist
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 12:38 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> Quick update to the trace because I had to apply the patch manually
> and missed one line to remove (qdisc_tree_reduce_backlog...), just so
> it doesn’t throw off the addresses for you, but it still does the
> same thing:
Ah, my bad; you need to replace
qdisc_match_from_root(qdisc_dev(sch), TC_H_MAJ(parentid));
with
qdisc_match_from_root(qdisc_dev(sch)->qdisc, TC_H_MAJ(parentid));
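To see why the extra `->qdisc` matters, here is a minimal sketch with simplified stand-in types. The real `struct Qdisc` and `struct net_device` have many more fields, and the list walk below is an assumption for illustration, not the kernel's implementation. `find_parent` is a hypothetical wrapper, and `uint32_t` stands in for the kernel's `u32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types. */
struct Qdisc;

struct net_device {
    struct Qdisc *qdisc;        /* root qdisc attached to the device */
};

struct Qdisc {
    uint32_t handle;
    struct net_device *dev;
    struct Qdisc *next;         /* hypothetical list of qdiscs on the device */
};

static struct net_device *qdisc_dev(const struct Qdisc *q)
{
    return q->dev;
}

/* Same parameter types as in the compiler note: a root Qdisc, not a device. */
static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, uint32_t handle)
{
    for (struct Qdisc *q = root; q != NULL; q = q->next)
        if (q->handle == handle)
            return q;
    return NULL;
}

/* Hypothetical wrapper showing the corrected call. */
static struct Qdisc *find_parent(struct Qdisc *sch, uint32_t parentid)
{
    assert(sch != NULL && qdisc_dev(sch) != NULL);
    /* Start from the device's root qdisc; keep only the major number. */
    return qdisc_match_from_root(qdisc_dev(sch)->qdisc,
                                 parentid & 0xFFFF0000u);
}
```

Passing `qdisc_dev(sch)` directly would hand a `struct net_device *` to a function that reads it as a `struct Qdisc *`, which is what the compiler warning was pointing at.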
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 12:38 ` Toke Høiland-Jørgensen
@ 2019-01-05 12:51 ` Pete Heist
2019-01-05 13:10 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 12:51 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Ok, that fixes the compiler warnings, but now I get the warning below. As before, it repeats until reboot; the stack sometimes changes, but I always see sch_hfsc.c:1427 at the beginning:
root@apu1a:~# [ 5972.967008] ------------[ cut here ]------------
[ 5972.971707] WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1427 0xfa02f4ef()
[ 5972.978812] Modules linked in: sch_cake(O) em_meta cls_basic sch_hfsc xt_ACCOUNT(O) ipt_REJECT xt_recent iptable_mangle iptable_nat nf]
[ 5973.032173] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.7-ckt9-voyage #1
[ 5973.040181] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 5973.046368] 00000000 00000000 f5c99d94 c13167e9 00000000 f5c99dac c102a7dd fa02f4ef
[ 5973.054323] f06b6c00 00000000 00000000 f5c99dbc c102a803 00000009 00000000 f5c99ddc
[ 5973.062279] fa02f4ef f06b6fc8 b48b3e3f 00000015 f06b6c00 00000000 00000040 f5c99df8
[ 5973.070226] Call Trace:
[ 5973.072703] [<c13167e9>] dump_stack+0x41/0x52
[ 5973.077160] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 5973.082485] [<fa02f4ef>] ? 0xfa02f4ee
[ 5973.086249] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 5973.091308] [<fa02f4ef>] 0xfa02f4ee
[ 5973.094899] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 5973.099441] [<c128b655>] __dev_queue_xmit+0x23d/0x35f
[ 5973.104594] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 5973.109224] [<c12afa93>] ip_finish_output+0x345/0x73d
[ 5973.114372] [<c12b0e19>] ip_output+0x73/0xaf
[ 5973.118741] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 5973.123801] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 5973.128429] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 5973.133317] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 5973.137601] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 5973.143450] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 5973.148693] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 5973.154533] [<c1289d6c>] napi_gro_complete+0x87/0x8c
[ 5973.159594] [<c128a020>] napi_gro_flush+0x3e/0x53
[ 5973.164395] [<c128a04c>] napi_complete+0x17/0x27
[ 5973.169114] [<f80841a3>] 0xf80841a2
[ 5973.172709] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 5973.177517] [<c102d689>] __do_softirq+0x91/0x175
[ 5973.182233] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 5973.188421] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 5973.193744] <IRQ> [<c102d8a9>] irq_exit+0x34/0x75
[ 5973.198679] [<c1002f30>] do_IRQ+0x92/0xa6
[ 5973.202789] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 5973.207765] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 5973.213179] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 5973.217810] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 5973.223053] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 5973.228111] ---[ end trace fe3dc3fad8d2493c ]---
> On Jan 5, 2019, at 1:38 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>> Quick update to the trace because I had to apply the patch manually
>> and missed one line to remove (qdisc_tree_reduce_backlog...), just so
>> it doesn’t throw off the addresses for you, but it still does the
>> same thing:
>
> Ah, my bad; you need to replace
>
> qdisc_match_from_root(qdisc_dev(sch), TC_H_MAJ(parentid));
>
> with
>
> qdisc_match_from_root(qdisc_dev(sch)->qdisc, TC_H_MAJ(parentid));
>
> -Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 12:51 ` Pete Heist
@ 2019-01-05 13:10 ` Toke Høiland-Jørgensen
2019-01-05 13:20 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 13:10 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> Ok, that fixes the compiler warnings, but now I get the warning below.
> As before, it repeats until reboot; the stack sometimes changes, but I
> always see sch_hfsc.c:1427 at the beginning:
Hmm, that's odd. Could you try adding this debugging line in
adjust_parent_qlen(), right before the sch->q.qlen += n line:
net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
parentid, n, len);
And see if you actually get any of those lines in your dmesg?
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 13:10 ` Toke Høiland-Jørgensen
@ 2019-01-05 13:20 ` Pete Heist
2019-01-05 13:35 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 13:20 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Hmm, that's odd. Could you try adding this debugging line in
> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>
> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
> parentid, n, len);
>
> And see if you actually get any of those lines in your dmesg?
I do see the message twice, but not again in the rest of the output:
root@apu1a:~# [ 1740.883957] Adjusting parent qdisc 65537 with pkt += 3, len += 0
[ 1740.889856] ------------[ cut here ]------------
[ 1740.894710] WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1427 0xf9fe74ef()
[ 1740.901802] Modules linked in: em_meta cls_basic sch_hfsc sch_cake(O) xt_ACCOUNT(O) ipt_Rn
[ 1740.952881] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.7-ckt9-voyage #1
[ 1740.960891] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 1740.967079] 00000000 00000000 f5c99d94 c13167e9 00000000 f5c99dac c102a7dd f9fe74ef
[ 1740.975032] f547a400 00000000 00000000 f5c99dbc c102a803 00000009 00000000 f5c99ddc
[ 1740.982981] f9fe74ef f547a7c8 52ee0901 00000006 f547a400 00000000 00000040 f5c99df8
[ 1740.990936] Call Trace:
[ 1740.993411] [<c13167e9>] dump_stack+0x41/0x52
[ 1740.997870] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 1741.003196] [<f9fe74ef>] ? 0xf9fe74ee
[ 1741.006960] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 1741.012018] [<f9fe74ef>] 0xf9fe74ee
[ 1741.015608] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 1741.020149] [<c128b655>] __dev_queue_xmit+0x23d/0x35f
[ 1741.025296] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 1741.029927] [<c12afa93>] ip_finish_output+0x345/0x73d
[ 1741.035081] [<c12b0e19>] ip_output+0x73/0xaf
[ 1741.039449] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 1741.044511] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 1741.049138] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 1741.054026] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 1741.058312] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 1741.064159] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 1741.069395] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 1741.075244] [<c1289d6c>] napi_gro_complete+0x87/0x8c
[ 1741.080305] [<c128a020>] napi_gro_flush+0x3e/0x53
[ 1741.085107] [<c128a04c>] napi_complete+0x17/0x27
[ 1741.089823] [<f81161a3>] 0xf81161a2
[ 1741.093420] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 1741.098230] [<c102d689>] __do_softirq+0x91/0x175
[ 1741.102953] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 1741.109149] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 1741.114463] <IRQ> [<c102d8a9>] irq_exit+0x34/0x75
[ 1741.119397] [<c1002f30>] do_IRQ+0x92/0xa6
[ 1741.123509] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 1741.128485] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 1741.133900] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 1741.138528] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 1741.143764] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 1741.148830] ---[ end trace 88c72563cbf4d106 ]---
[ 1741.153570] Adjusting parent qdisc 65537 with pkt += 5, len += 0
[ 1741.159457] ------------[ cut here ]------------
[ 1741.164287] WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1427 0xf9fe74ef()
[ 1741.171342] Modules linked in: em_meta cls_basic sch_hfsc sch_cake(O) xt_ACCOUNT(O) ipt_Rn
[ 1741.222421] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W O 3.16.7-ckt9-voyage #1
[ 1741.230423] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 1741.236611] 00000000 00000000 f5c99d70 c13167e9 00000000 f5c99d88 c102a7dd f9fe74ef
[ 1741.244566] f547a400 00000000 00000000 f5c99d98 c102a803 00000009 00000000 f5c99db8
[ 1741.252512] f9fe74ef f547a7c8 532e51d6 00000006 f547a400 00000000 00000040 f5c99dd4
[ 1741.260462] Call Trace:
[ 1741.262932] [<c13167e9>] dump_stack+0x41/0x52
[ 1741.267394] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 1741.272719] [<f9fe74ef>] ? 0xf9fe74ee
[ 1741.276487] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 1741.281552] [<f9fe74ef>] 0xf9fe74ee
[ 1741.285141] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 1741.289682] [<c128b655>] __dev_queue_xmit+0x23d/0x35f
[ 1741.294829] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 1741.299485] [<c12afa93>] ip_finish_output+0x345/0x73d
[ 1741.304644] [<c12b0e19>] ip_output+0x73/0xaf
[ 1741.309028] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 1741.314097] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 1741.318735] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 1741.323628] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 1741.327922] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 1741.333781] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 1741.339034] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 1741.344891] [<c1289d6c>] napi_gro_complete+0x87/0x8c
[ 1741.349968] [<c1289f42>] dev_gro_receive+0x1d1/0x271
[ 1741.355040] [<c128a2b4>] napi_gro_receive+0x19/0x6d
[ 1741.360033] [<c10065ec>] ? text_poke_bp+0xa0/0xa0
[ 1741.364850] [<f811604a>] 0xf8116049
[ 1741.368460] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 1741.373273] [<c102d689>] __do_softirq+0x91/0x175
[ 1741.377998] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 1741.384203] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 1741.389533] <IRQ> [<c102d8a9>] irq_exit+0x34/0x75
[ 1741.394480] [<c1002f30>] do_IRQ+0x92/0xa6
[ 1741.398598] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 1741.403590] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 1741.409013] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 1741.413652] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 1741.418905] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 1741.423977] ---[ end trace 88c72563cbf4d107 ]---
[ 1741.428715] ------------[ cut here ]------------
[ 1741.433361] WARNING: CPU: 1 PID: 0 at net/sched/sch_hfsc.c:1427 0xf9fe74ef()
[ 1741.440425] Modules linked in: em_meta cls_basic sch_hfsc sch_cake(O) xt_ACCOUNT(O) ipt_Rn
[ 1741.491884] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W O 3.16.7-ckt9-voyage #1
[ 1741.499897] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 1741.506091] 00000000 00000000 f5c99dbc c13167e9 00000000 f5c99dd4 c102a7dd f9fe74ef
[ 1741.514083] f547a400 00000000 00000000 f5c99de4 c102a803 00000009 00000000 f5c99e04
[ 1741.522074] f9fe74ef f547a7c8 536e858e 00000006 f547a400 00000000 00000040 f5c99e20
[ 1741.530073] Call Trace:
[ 1741.532550] [<c13167e9>] dump_stack+0x41/0x52
[ 1741.537022] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 1741.542348] [<f9fe74ef>] ? 0xf9fe74ee
[ 1741.546121] [<c102a803>] warn_slowpath_null+0xf/0x13
[ 1741.551187] [<f9fe74ef>] 0xf9fe74ee
[ 1741.554785] [<c129edf2>] __qdisc_run+0x81/0xf0
[ 1741.559337] [<c128b655>] __dev_queue_xmit+0x23d/0x35f
[ 1741.564495] [<c128b78b>] dev_queue_xmit+0xa/0xc
[ 1741.569137] [<c12afe0c>] ip_finish_output+0x6be/0x73d
[ 1741.574298] [<c12b0e19>] ip_output+0x73/0xaf
[ 1741.578680] [<c12ad986>] ip_forward_finish+0x66/0x6b
[ 1741.583750] [<c12adc3b>] ip_forward+0x2b0/0x36d
[ 1741.588389] [<c12ac467>] ip_rcv_finish+0x267/0x29a
[ 1741.593284] [<c12aca4c>] ip_rcv+0x2b4/0x338
[ 1741.597577] [<c12895dd>] __netif_receive_skb_core+0x467/0x4b6
[ 1741.603435] [<c1289674>] __netif_receive_skb+0x48/0x59
[ 1741.608677] [<c1289cb9>] netif_receive_skb_internal+0x59/0x85
[ 1741.614529] [<c128a2cc>] napi_gro_receive+0x31/0x6d
[ 1741.619520] [<c10065ec>] ? text_poke_bp+0xa0/0xa0
[ 1741.624338] [<f811604a>] 0xf8116049
[ 1741.627935] [<c128a0b2>] net_rx_action+0x56/0x10e
[ 1741.632743] [<c102d689>] __do_softirq+0x91/0x175
[ 1741.637468] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 1741.643664] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 1741.648989] <IRQ> [<c102d8a9>] irq_exit+0x34/0x75
[ 1741.653931] [<c1002f30>] do_IRQ+0x92/0xa6
[ 1741.658052] [<c131a4ec>] common_interrupt+0x2c/0x40
[ 1741.663045] [<c126add1>] ? cpuidle_enter_state+0x37/0x96
[ 1741.668469] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 1741.673105] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 1741.678351] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 1741.683423] ---[ end trace 88c72563cbf4d108 ]---
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 13:20 ` Pete Heist
@ 2019-01-05 13:35 ` Toke Høiland-Jørgensen
2019-01-05 15:34 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 13:35 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Hmm, that's odd. Could you try adding this debugging line in
>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>
>> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>> parentid, n, len);
>>
>> And see if you actually get any of those lines in your dmesg?
>
>
> I do see the messages twice, then not after that in the rest of the
> output...
Right. Looking at the HFSC code some more, I think the bug is actually
caused by another, but related, interaction between HFSC and CAKE.
Specifically, this line:
https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
where HFSC checks whether the child queue len is 1, which it interprets
as the event that activates that queue. However, because CAKE splits the
packet, this check will fail, and the HFSC class will not be activated.
This also explains why you only see the bug with HFSC, and not with HTB
(although I do think that we still need to update the hierarchy).
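
As an illustration of the failure mode described above, here is a toy user-space model, not kernel code. Every name in it (toy_class, toy_enqueue_check_after, toy_enqueue_check_before) is invented for this sketch. It contrasts an activation test keyed on "child qlen is exactly 1 after enqueue" with one keyed on "child was empty before enqueue". The second variant is only one plausible way to make the check robust, and is not necessarily what an actual HFSC fix would look like.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a parent class and its child qdisc; not a kernel type. */
struct toy_class {
    unsigned int child_qlen; /* packets currently queued in the child   */
    bool active;             /* has the parent activated this class yet? */
};

/* Child enqueue: a packet split into `segs` segments adds `segs` packets. */
void toy_child_enqueue(struct toy_class *cl, unsigned int segs)
{
    assert(segs > 0);
    cl->child_qlen += segs;
}

/* Activation keyed on "qlen is exactly 1 after enqueue". */
void toy_enqueue_check_after(struct toy_class *cl, unsigned int segs)
{
    toy_child_enqueue(cl, segs);
    if (cl->child_qlen == 1)
        cl->active = true;
}

/* Activation keyed on "child was empty before enqueue". */
void toy_enqueue_check_before(struct toy_class *cl, unsigned int segs)
{
    bool was_empty = (cl->child_qlen == 0);

    toy_child_enqueue(cl, segs);
    if (was_empty)
        cl->active = true;
}
```

In this model, enqueueing a packet that the child splits into 3 segments takes child_qlen from 0 straight to 3, so the first variant never sets active, while the second one does.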
The good news is that it is fairly simple to fix in HFSC. The bad news
is that it's something that's hard to work around from the out-of-tree
CAKE...
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 13:35 ` Toke Høiland-Jørgensen
@ 2019-01-05 15:34 ` Pete Heist
2019-01-05 15:52 ` Jonathan Morton
2019-01-05 16:32 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-05 15:34 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>>
>>> Hmm, that's odd. Could you try adding this debugging line in
>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>
>>> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>>> parentid, n, len);
>>>
>>> And see if you actually get any of those lines in your dmesg?
>>
>> I do see the messages twice, then not after that in the rest of the
>> output...
>
> Right. Looking at the HFSC code some more, I think the bug is actually
> caused by another, but related, interaction between HFSC and CAKE.
>
> Specifically, this line:
>
> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>
> where HFSC checks whether the child queue len is 1, which it interprets
> as the event that activates that queue. However, because CAKE splits the
> packet, this check will fail, and the HFSC class will not be activated.
> This also explains why you only see the bug with HFSC, and not with HTB
> (although I do think that we still need to update the hierarchy).
>
> The good news is that it is fairly simple to fix in HFSC. The bad news
> is that it's something that's hard to work around from the out-of-tree
> CAKE...
Aha, well, I wonder if we’ll see this problem with other qdiscs- maybe cbq, if I ever get a chance to try it (not hurrying yet). Ideally this interaction between qdiscs would be clarified somewhere, at some point. :)
Thanks a lot for doing the discovery though! We may not have hfsc+cake with GSO splitting on older kernels very soon, but what should we do with this? There’s nobody in MAINTAINERS for hfsc, so we may not get much of a response to any bug submissions...
Pete
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 15:34 ` Pete Heist
@ 2019-01-05 15:52 ` Jonathan Morton
2019-01-05 16:32 ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 47+ messages in thread
From: Jonathan Morton @ 2019-01-05 15:52 UTC (permalink / raw)
To: Pete Heist; +Cc: Toke Høiland-Jørgensen, Cake List
> On 5 Jan, 2019, at 5:34 pm, Pete Heist <pete@heistp.net> wrote:
>
> There’s nobody in MAINTAINERS for hfsc, so we may not get much of a response to any bug submissions...
Most likely it defaults to Eric Dumazet in that case.
- Jonathan Morton
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 15:34 ` Pete Heist
2019-01-05 15:52 ` Jonathan Morton
@ 2019-01-05 16:32 ` Toke Høiland-Jørgensen
2019-01-05 19:27 ` Sebastian Moeller
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 16:32 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>>>
>>>> Hmm, that's odd. Could you try adding this debugging line in
>>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>>
>>>> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>>>> parentid, n, len);
>>>>
>>>> And see if you actually get any of those lines in your dmesg?
>>>
>>> I do see the messages twice, then not after that in the rest of the
>>> output...
>>
>> Right. Looking at the HFSC code some more, I think the bug is actually
>> caused by another, but related, interaction between HFSC and CAKE.
>>
>> Specifically, this line:
>>
>> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>>
>> where HFSC checks whether the child queue len is 1, which it interprets
>> as the event that activates that queue. However, because CAKE splits the
>> packet, this check will fail, and the HFSC class will not be activated.
>> This also explains why you only see the bug with HFSC, and not with HTB
>> (although I do think that we still need to update the hierarchy).
>>
>> The good news is that it is fairly simple to fix in HFSC. The bad news
>> is that it's something that's hard to work around from the out-of-tree
>> CAKE...
>
> Aha, well, I wonder if we’ll see this problem with other qdiscs- maybe
> cbq, if I ever get a chance to try it (not hurrying yet). Ideally this
> interaction between qdiscs would be clarified somewhere, at some
> point. :)
>
> Thanks a lot for doing the discovery though!
You're welcome, and thanks for your help :)
> We may not have hfsc+cake with GSO splitting on older kernels very
> soon, but what should we do with this? There’s nobody in MAINTAINERS
> for hfsc, so we may not get much of a response to any bug
> submissions...
$ ./scripts/get_maintainer.pl net/sched/sch_hfsc.c
Jamal Hadi Salim <jhs@mojatatu.com> (maintainer:TC subsystem)
Cong Wang <xiyou.wangcong@gmail.com> (maintainer:TC subsystem)
Jiri Pirko <jiri@resnulli.us> (maintainer:TC subsystem)
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
netdev@vger.kernel.org (open list:TC subsystem)
I'll submit a patch sometime next week, and also look into the qlen
adjustment for CAKE GSO splitting...
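
For readers unfamiliar with the qlen adjustment mentioned here, the following is a rough user-space sketch of the idea, not the actual patch. The struct and function names (toy_qdisc, toy_adjust_ancestors) are hypothetical. The assumption is that each ancestor counts one packet of the original size when the enqueue succeeds, so a leaf that splits the packet into several segments has to propagate the difference up the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy qdisc node; `parent` is NULL at the root. Not the kernel's struct Qdisc. */
struct toy_qdisc {
    struct toy_qdisc *parent;
    unsigned int qlen;    /* packets accounted to this node */
    unsigned int backlog; /* bytes accounted to this node   */
};

/*
 * The leaf replaced one packet of `orig_bytes` with `segs` segments totalling
 * `seg_bytes`. Each ancestor only counts one packet of `orig_bytes`, so add
 * the missing (segs - 1) packets and (seg_bytes - orig_bytes) bytes to every
 * ancestor. The leaf itself is left untouched.
 */
void toy_adjust_ancestors(struct toy_qdisc *leaf, unsigned int segs,
                          unsigned int orig_bytes, unsigned int seg_bytes)
{
    assert(segs > 0);
    assert(seg_bytes >= orig_bytes); /* assumed: segmenting only adds headers */

    for (struct toy_qdisc *q = leaf->parent; q != NULL; q = q->parent) {
        q->qlen += segs - 1;
        q->backlog += seg_bytes - orig_bytes;
    }
}
```

With a three-level tree where each ancestor has counted 1 packet of 4000 bytes, a call with segs = 3 and seg_bytes = 4200 leaves every ancestor at 3 packets and 4200 bytes, matching the leaf.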
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 16:32 ` Toke Høiland-Jørgensen
@ 2019-01-05 19:27 ` Sebastian Moeller
2019-01-05 20:01 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Sebastian Moeller @ 2019-01-05 19:27 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Pete Heist, Cake List
Dear all,
I am most likely wrong, but did you have a look at https://bugs.openwrt.org/index.php?do=details&task_id=1136 yet?
Especially https://bugzilla.kernel.org/show_bug.cgi?id=109581 and https://www.spinics.net/lists/netdev/msg450655.html might be related to Pete's bug.
Then again, I might be wrong as the whole flurry of emails went past my head quickly.
Best Regards
Sebastian
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 19:27 ` Sebastian Moeller
@ 2019-01-05 20:01 ` Pete Heist
2019-01-05 20:10 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 20:01 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Toke Høiland-Jørgensen, Cake List
That first bug report looks decidedly similar to mine, but Toke would have to comment on the specifics. So far I see the patch to sch_codel.c you mentioned and another two-liner to remove the warning in hfsc.c (https://patchwork.ozlabs.org/patch/933611/). It would be really good to know that that warning is truly bogus, that it wasn’t put there by the author for good reason, as Toke may have been thinking of a different way to fix hfsc.
Thanks for bringing this up! I see that I ought to search OpenWRT/kernel.org next time… :)
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 20:01 ` Pete Heist
@ 2019-01-05 20:10 ` Toke Høiland-Jørgensen
2019-01-05 20:31 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 20:10 UTC (permalink / raw)
To: Pete Heist, Sebastian Moeller; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> That first bug report looks decidedly similar to mine, but Toke would
> have to comment on the specifics. So far I see the patch to
> sch_codel.c you mentioned and another two-liner to remove the warning
> in hfsc.c (https://patchwork.ozlabs.org/patch/933611/). It would be
> really good to know that that warning is truly bogus, that it wasn’t
> put there by the author for good reason, as Toke may have been
> thinking of a different way to fix hfsc.
Well, it's the same WARN_ON(), and if that patch had been applied,
debugging our issue would have been a lot harder, I think. But the
underlying issue is different, and we still need to fix HFSC (and
probably CAKE as well).
> Thanks for bringing this up! I see that I ought to search
> OpenWRT/kernel.org next time… :)
Yeah, nice find. I'll make sure that CAKE doesn't suffer from the same
issue that https://www.spinics.net/lists/netdev/msg450655.html fixes for
CoDel while I'm writing patches... :)
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 20:10 ` Toke Høiland-Jørgensen
@ 2019-01-05 20:31 ` Pete Heist
2019-01-05 22:27 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-05 20:31 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Sebastian Moeller, Cake List
> On Jan 5, 2019, at 9:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Well, it's the same WARN_ON(), and if that patch had been applied,
> debugging our issue would have been a lot harder, I think.
Yikes, this is what I mean. I’d rather suffer the warning than be troubleshooting flaky behavior. That patch is applied in the latest kernel, so hopefully it’s the right thing.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 20:31 ` Pete Heist
@ 2019-01-05 22:27 ` Toke Høiland-Jørgensen
2019-01-05 22:41 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-05 22:27 UTC (permalink / raw)
To: Pete Heist; +Cc: Sebastian Moeller, Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 5, 2019, at 9:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Well, it's the same WARN_ON(), and if that patch had been applied,
>> debugging our issue would have been a lot harder, I think.
>
> Yikes, this is what I mean. I’d rather suffer the warning than be
> troubleshooting flaky behavior. That patch is applied in the latest
> kernel, so hopefully it’s the right thing.
Well, if it causes false positives, getting rid of it is probably worth
it just to avoid spurious bug reports :)
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 22:27 ` Toke Høiland-Jørgensen
@ 2019-01-05 22:41 ` Pete Heist
2019-01-06 9:37 ` Pete Heist
2019-01-06 20:55 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-05 22:41 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Sebastian Moeller, Cake List
> On Jan 5, 2019, at 11:27 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>>> On Jan 5, 2019, at 9:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>>
>>> Well, it's the same WARN_ON(), and if that patch had been applied,
>>> debugging our issue would have been a lot harder, I think.
>>
>> Yikes, this is what I mean. I’d rather suffer the warning than be
>> troubleshooting flaky behavior. That patch is applied in the latest
>> kernel, so hopefully it’s the right thing.
>
> Well, if it causes false positives, getting rid of it is probably worth
> it just to avoid spurious bug reports :)
If it helps find bugs, I’d rather know about it.
But, a warning once in a while might have been better than a repeated one that sometimes makes a hard reboot necessary, causing need for a manual, offline fsck in order to boot again. Just sayin’… ;)
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 22:41 ` Pete Heist
@ 2019-01-06 9:37 ` Pete Heist
2019-01-06 20:56 ` Toke Høiland-Jørgensen
2019-01-06 20:55 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-06 9:37 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Lastly, is using cake as a leaf to htb risky until a fix is made? I’ve been doing that for a while without any apparent issues, though I’m hesitating now to try that in a production environment.
Pete
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-05 22:41 ` Pete Heist
2019-01-06 9:37 ` Pete Heist
@ 2019-01-06 20:55 ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-06 20:55 UTC (permalink / raw)
To: Pete Heist; +Cc: Sebastian Moeller, Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 5, 2019, at 11:27 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>>>> On Jan 5, 2019, at 9:10 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>>>
>>>> Well, it's the same WARN_ON(), and if that patch had been applied,
>>>> debugging our issue would have been a lot harder, I think.
>>>
>>> Yikes, this is what I mean. I’d rather suffer the warning than be
>>> troubleshooting flaky behavior. That patch is applied in the latest
>>> kernel, so hopefully it’s the right thing.
>>
>> Well, if it causes false positives, getting rid of it is probably worth
>> it just to avoid spurious bug reports :)
>
> If it helps find bugs, I’d rather know about it.
>
> But, a warning once in a while might have been better than a repeated
> one that sometimes makes a hard reboot necessary, causing need for a
> manual, offline fsck in order to boot again. Just sayin’… ;)
Yes, that is why WARN_ON tends to be frowned upon ;)
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-06 9:37 ` Pete Heist
@ 2019-01-06 20:56 ` Toke Høiland-Jørgensen
2019-01-07 0:30 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-06 20:56 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> Lastly, is using cake as a leaf to htb risky until a fix is made? I’ve
> been doing that for a while without any apparent issues, though I’m
> hesitating now to try that in a production environment.
Hmm, that's a good question. I would expect so; but I would also expect
the issue to show up pretty much straight away, so if you haven't hit it
yet, I may be wrong. I'll do some more digging... Should probably also
try to replicate all this stuff on my own machine :)
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-06 20:56 ` Toke Høiland-Jørgensen
@ 2019-01-07 0:30 ` Pete Heist
2019-01-07 2:11 ` Dave Taht
2019-01-07 11:30 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-07 0:30 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 6, 2019, at 9:56 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>> Lastly, is using cake as a leaf to htb risky until a fix is made? I’ve
>> been doing that for a while without any apparent issues, though I’m
>> hesitating now to try that in a production environment.
>
> Hmm, that's a good question. I would expect so; but I would also expect
> the issue to show up pretty much straight away, so if you haven't hit it
> yet, I may be wrong. I'll do some more digging... Should probably also
> try to replicate all this stuff on my own machine :)
Ok, after what I’m seeing on my APU1 tests on 3.16.7, I’m definitely not putting split GSO into production. I just turned it on and off three times and here’s what I got:
Split GSO on:
https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso.svg
https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso2.svg
https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso3.svg
Split GSO off:
https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso.svg
https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso2.svg
https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso3.svg
I’ve seen these square waves before with htb and wondered where they came from, and I think we may finally have an answer! What manner of thing causes this I don’t know, but there’s a chance you may end up finding out… :)
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-07 0:30 ` Pete Heist
@ 2019-01-07 2:11 ` Dave Taht
2019-01-07 11:30 ` Toke Høiland-Jørgensen
1 sibling, 0 replies; 47+ messages in thread
From: Dave Taht @ 2019-01-07 2:11 UTC (permalink / raw)
To: Pete Heist; +Cc: Toke Høiland-Jørgensen, Cake List
Another answer is to disable GRO/TSO in the driver via ethtool.
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-07 0:30 ` Pete Heist
2019-01-07 2:11 ` Dave Taht
@ 2019-01-07 11:30 ` Toke Høiland-Jørgensen
2019-01-07 15:07 ` Pete Heist
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-07 11:30 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 6, 2019, at 9:56 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>>> Lastly, is using cake as a leaf to htb risky until a fix is made? I’ve
>>> been doing that for a while without any apparent issues, though I’m
>>> hesitating now to try that in a production environment.
>>
>> Hmm, that's a good question. I would expect so; but I would also expect
>> the issue to show up pretty much straight away, so if you haven't hit it
>> yet, I may be wrong. I'll do some more digging... Should probably also
>> try to replicate all this stuff on my own machine :)
>
>
> Ok, after what I’m seeing on my APU1 tests on 3.16.7, I’m definitely
> not putting split GSO into production. I just turned it on and off
> three times and here’s what I got:
>
> Split GSO on:
>
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso.svg
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso2.svg
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_split_gso3.svg
>
> Split GSO off:
>
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso.svg
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso2.svg
> https://www.heistp.net/downloads/htb_split_gso/htb_cake_no_split_gso3.svg
>
> I’ve seen these square waves before with htb and wondered where they
> came from, and I think we may finally have an answer! What manner of
> thing causes this I don’t know, but there’s a chance you may end up
> finding out… :)
Is this without the patch to CAKE that adjusts the qlen? And have you
tried running with that patch (with HTB)?
-Toke
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-07 11:30 ` Toke Høiland-Jørgensen
@ 2019-01-07 15:07 ` Pete Heist
2019-01-08 20:03 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-07 15:07 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Sorry, that’s without the patch, will give that a try when I have a chance and post the results, probably tomorrow...
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-07 15:07 ` Pete Heist
@ 2019-01-08 20:03 ` Pete Heist
2019-01-08 20:44 ` Dave Taht
2019-01-08 22:27 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-08 20:03 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Here’s the re-test with the patched version and HTB. Looks like success, nice work!
Split GSO on:
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_split1.svg
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_split2.svg
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_split3.svg
Split GSO off:
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_no_split1.svg
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_no_split2.svg
https://www.heistp.net/downloads/htb_split_gso_patched/htb_cakep_no_split3.svg
Your patch in the latest kernels looks simpler. Bringing the patch back to prior kernel versions would be appreciated, but I can understand how 3.16 becomes less and less relevant as time goes on, although, it’s not at end of life yet. :)
Interesting how download rate control in each of the graphs with GSO splitting on looks accurate to the point where flent’s throughput graph scale is at 0.02 Mbit per step, and one can see that values coming back from netperf are probably quantized to 0.01 Mbit...
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 20:03 ` Pete Heist
@ 2019-01-08 20:44 ` Dave Taht
2019-01-08 22:01 ` Pete Heist
2019-01-08 22:27 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 47+ messages in thread
From: Dave Taht @ 2019-01-08 20:44 UTC (permalink / raw)
To: Pete Heist; +Cc: Toke Høiland-Jørgensen, Cake List
On Tue, Jan 8, 2019 at 12:03 PM Pete Heist <pete@heistp.net> wrote:
>
> Here’s the re-test with the patched version and HTB. Looks like success, nice work!
I note that I'm big on having the flent.gz files around also. In this
case, by eyeball, split-gso appears to have about 130us less latency,
but a CDF comparison of split vs no-split would show that more easily.
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 20:44 ` Dave Taht
@ 2019-01-08 22:01 ` Pete Heist
2019-01-08 22:33 ` Dave Taht
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-08 22:01 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List
I should have done that: https://www.heistp.net/downloads/htb_split_gso_patched/
Note that I changed the names in the plots to match the convention of my first email, but it should be clear which is which and I left all plots in. The text output is there too as I sometimes like to open several up in different browser tabs and switch between tabs to compare values.
It looks like about 100 usec to me. Throughput also looks consistently about 0.3 Mbit higher (~1.3%) in the split results.
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 20:03 ` Pete Heist
2019-01-08 20:44 ` Dave Taht
@ 2019-01-08 22:27 ` Toke Høiland-Jørgensen
2019-01-09 5:29 ` Pete Heist
1 sibling, 1 reply; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-08 22:27 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> Your patch in the latest kernels looks simpler. Bringing the patch
> back to prior kernel versions would be appreciated, but I can
> understand how 3.16 becomes less and less relevant as time goes on,
> although, it’s not at end of life yet. :)
Could you try the latest git version of the out-of-tree version of cake;
that has the same patch as upstream, and I *think* it will work on old
kernels as well (for the HTB issue; can't fix HFSC without patching it).
-Toke
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 22:01 ` Pete Heist
@ 2019-01-08 22:33 ` Dave Taht
2019-01-09 6:13 ` Pete Heist
0 siblings, 1 reply; 47+ messages in thread
From: Dave Taht @ 2019-01-08 22:33 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
On Tue, Jan 8, 2019 at 2:01 PM Pete Heist <pete@heistp.net> wrote:
>
> I should have done that: https://www.heistp.net/downloads/htb_split_gso_patched/
>
> Note that I changed the names in the plots to match the convention of my first email, but it should be clear which is which and I left all plots in. The text output is there too as I sometimes like to open several up in different browser tabs and switch between tabs to compare values.
>
> It looks like about 100 usec to me. Throughput also looks consistently about 0.3 Mbit higher (~1.3%) in the split results.
My guess is that with ECN on, it would have the highest latency and the
same throughput?
Since we started this effort in an era when seconds of added latency
were common, a mere 100us improvement seems insignificant, except that
it's a 10% improvement over the present-day baseline, and *that's
worth it*. ;) This is also a function of the number of flows, kernel
scheduling time, etc. Theoretically, were there no other delays in the
system, we seem to actually be at the minimal RTT achievable (4*130us
in each direction for the fat flows = 1040us), but given the fast/slow
queue abstraction the best possible result would be 10s of us for the
measurement flows.
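The estimate above can be checked with a few lines of shell arithmetic. This is only a sketch: reading the 130 as the time in microseconds that one full-size packet from each fat flow occupies the shaped link is an assumption, and the variable names are made up for illustration.

```shell
# Assumed inputs, taken from the estimate in the message above.
per_packet_us=130   # assumed per-packet time on the shaped link, in microseconds
fat_flows=4         # bulk flows per direction
directions=2        # the RTT covers both directions

# A measurement packet waiting behind one packet from each fat flow,
# once in each direction.
rtt_floor_us=$(( fat_flows * per_packet_us * directions ))
echo "${rtt_floor_us}us"
```

With these inputs the result is 1040us, which only works out if the per-flow figure is in microseconds rather than milliseconds.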
I have noticed that BQL's values can get quite large with cake doing
the shaping, btw, much larger than they do with htb.
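For anyone wanting to compare BQL values between the two setups, a minimal sketch follows. It assumes the driver supports BQL and exposes the usual byte_queue_limits files under sysfs; the show_bql helper and its optional second argument (an alternate sysfs root) are hypothetical names used only for this example.

```shell
# Hypothetical helper: print the current BQL limit and in-flight bytes
# for every TX queue of a device. The second argument is optional and
# only exists so the sysfs root can be overridden.
show_bql() {
    dev_dir="${2:-/sys/class/net}/$1"
    for q in "$dev_dir"/queues/tx-*; do
        # Skip queues (or an unmatched glob) without readable BQL files.
        [ -r "$q/byte_queue_limits/limit" ] || continue
        printf '%s limit=%s inflight=%s\n' "${q##*/}" \
            "$(cat "$q/byte_queue_limits/limit")" \
            "$(cat "$q/byte_queue_limits/inflight")"
    done
}

show_bql eth0
```

Running it once with cake doing the shaping and once with htb on top should make any difference in the limits visible.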
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 22:27 ` Toke Høiland-Jørgensen
@ 2019-01-09 5:29 ` Pete Heist
2019-01-09 8:36 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 47+ messages in thread
From: Pete Heist @ 2019-01-09 5:29 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 8, 2019, at 11:27 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>> Your patch in the latest kernels looks simpler. Bringing the patch
>> back to prior kernel versions would be appreciated, but I can
>> understand how 3.16 becomes less and less relevant as time goes on,
>> although, it’s not at end of life yet. :)
>
> Could you try the latest git version of the out-of-tree version of cake;
> that has the same patch as upstream, and I *think* it will work on old
> kernels as well (for the HTB issue; can't fix HFSC without patching it).
Yes, it compiles fine on 3.16, and gives statistically identical results for HTB. :)
https://www.heistp.net/downloads/htb_split_gso_patched2/
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-08 22:33 ` Dave Taht
@ 2019-01-09 6:13 ` Pete Heist
0 siblings, 0 replies; 47+ messages in thread
From: Pete Heist @ 2019-01-09 6:13 UTC (permalink / raw)
To: Dave Taht; +Cc: Cake List
> On Jan 8, 2019, at 11:33 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
> On Tue, Jan 8, 2019 at 2:01 PM Pete Heist <pete@heistp.net> wrote:
>>
>> I should have done that: https://www.heistp.net/downloads/htb_split_gso_patched/
>>
>> Note that I changed the names in the plots to match the convention of my first email, but it should be clear which is which and I left all plots in. The text output is there too as I sometimes like to open several up in different browser tabs and switch between tabs to compare values.
>>
>> It looks like about 100 usec to me. Throughput also looks consistently about 0.3 Mbit higher (~1.3%) in the split results.
>
> My guess is that with ECN on, it would have the highest latency and the
> same throughput?
I see a slight increase in icmp/udp rtt (1.04ms-1.02ms = 20us) and a slight increase in throughput (180.12Mbit - 179.50Mbit = 620kbit).
https://www.heistp.net/downloads/htb_cakep2_ecnon/
https://www.heistp.net/downloads/htb_cakep2_ecnon2/
These results are not five sigma. :)
> Since we started this effort in an era when seconds of added latency
> was common, a mere 100us improvement seems insignificant, except that
> that's a 10% improvement over the present-day baseline, and *that's
> worth it*. ;) This is also a function of the number of flows, kernel
Who knows how many timeouts are avoided in the future, just because of this 100us. :)
> I have noticed that BQL's values can get quite large with cake doing
> the shaping, btw, much larger than they do with htb.
Wonder why that is.
So far it’s harder to use cake’s shaper in the current setup I’m working on, because it appears “better” to have eth0 and eth0.3300 under one link sharing hierarchy for eth0:
tc qdisc add dev eth0 root handle 1: htb default 1
tc class add dev eth0 parent 1: classid 1:1 htb rate $RATE
tc class add dev eth0 parent 1: classid 1:2 htb rate $RATE
tc qdisc add dev eth0 parent 1:1 cake besteffort
tc qdisc add dev eth0 parent 1:2 cake besteffort
tc filter add dev eth0 parent 1:0 prio 1 protocol all \
basic match not "meta(vlan mask 0xfff gt 0x0)" flowid 1:1
tc filter add dev eth0 parent 1:0 prio 2 protocol all \
basic match "meta(vlan mask 0xfff eq `printf \"0x%x\" $VLAN_TAG`)" flowid 1:2
but I can’t do that with cake’s shaper. It’s better in the sense that when you’re out of CPU, latency doesn’t increase suddenly. I _can_ use cake’s shaper by adding cake to the VLAN interface and filtering out vlan traffic from the main interface:
tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
tc qdisc add dev eth0 parent 1:1 handle 10: cake besteffort bandwidth $RATE
tc filter add dev eth0 parent 1:0 prio 1 protocol all \
basic match not "meta(vlan mask 0xfff gt 0x0)" flowid 10:1
tc qdisc add dev eth0.3300 root cake besteffort bandwidth $RATE
but when you’re out of CPU, you get starvation and inter-flow latency increases. So far, hfsc or htb is working better in this case. It might be something to think about for an “ISP Cake”…
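As a side note on the VLAN filter in the htb variant above: the inline printf just turns the decimal VLAN ID into the hex literal the ematch compares against. A small sketch of that step on its own follows; vlan_hex is a hypothetical helper name, and 3300 is the VLAN ID used elsewhere in this thread.

```shell
# Hypothetical helper: format a decimal VLAN ID as a hex literal.
vlan_hex() {
    printf '0x%x' "$1"
}

VLAN_TAG=3300
# Build the ematch expression that would be passed to "tc filter ... basic match".
echo "meta(vlan mask 0xfff eq $(vlan_hex "$VLAN_TAG"))"
```

For VLAN 3300 this prints meta(vlan mask 0xfff eq 0xce4), matching the value in the original hfsc setup.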
* Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
2019-01-09 5:29 ` Pete Heist
@ 2019-01-09 8:36 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 47+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-09 8:36 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 8, 2019, at 11:27 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> Pete Heist <pete@heistp.net> writes:
>>
>>> Your patch in the latest kernels looks simpler. Bringing the patch
>>> back to prior kernel versions would be appreciated, but I can
>>> understand how 3.16 becomes less and less relevant as time goes on,
>>> although, it’s not at end of life yet. :)
>>
>> Could you try the latest git version of the out-of-tree version of cake;
>> that has the same patch as upstream, and I *think* it will work on old
>> kernels as well (for the HTB issue; can't fix HFSC without patching it).
>
> Yes, it compiles fine on 3.16, and gives statistically identical
> results for HTB. :)
Excellent! That's at least one bug squashed :)
-Toke
Thread overview: 47+ messages
2018-12-27 23:30 [Cake] cake infinite loop(?) with hfsc on one-armed router Pete Heist
2018-12-28 12:58 ` Pete Heist
2018-12-28 21:22 ` Pete Heist
2018-12-28 22:07 ` Jonathan Morton
2018-12-28 22:42 ` Pete Heist
2019-01-04 21:34 ` Toke Høiland-Jørgensen
2019-01-04 22:10 ` Pete Heist
2019-01-04 22:12 ` Pete Heist
2019-01-04 22:34 ` Toke Høiland-Jørgensen
2019-01-05 5:58 ` Pete Heist
2019-01-05 10:06 ` Toke Høiland-Jørgensen
2019-01-05 10:59 ` Pete Heist
2019-01-05 11:06 ` Pete Heist
2019-01-05 11:18 ` Toke Høiland-Jørgensen
2019-01-05 11:26 ` Pete Heist
2019-01-05 11:35 ` Pete Heist
2019-01-05 12:38 ` Toke Høiland-Jørgensen
2019-01-05 12:51 ` Pete Heist
2019-01-05 13:10 ` Toke Høiland-Jørgensen
2019-01-05 13:20 ` Pete Heist
2019-01-05 13:35 ` Toke Høiland-Jørgensen
2019-01-05 15:34 ` Pete Heist
2019-01-05 15:52 ` Jonathan Morton
2019-01-05 16:32 ` Toke Høiland-Jørgensen
2019-01-05 19:27 ` Sebastian Moeller
2019-01-05 20:01 ` Pete Heist
2019-01-05 20:10 ` Toke Høiland-Jørgensen
2019-01-05 20:31 ` Pete Heist
2019-01-05 22:27 ` Toke Høiland-Jørgensen
2019-01-05 22:41 ` Pete Heist
2019-01-06 9:37 ` Pete Heist
2019-01-06 20:56 ` Toke Høiland-Jørgensen
2019-01-07 0:30 ` Pete Heist
2019-01-07 2:11 ` Dave Taht
2019-01-07 11:30 ` Toke Høiland-Jørgensen
2019-01-07 15:07 ` Pete Heist
2019-01-08 20:03 ` Pete Heist
2019-01-08 20:44 ` Dave Taht
2019-01-08 22:01 ` Pete Heist
2019-01-08 22:33 ` Dave Taht
2019-01-09 6:13 ` Pete Heist
2019-01-08 22:27 ` Toke Høiland-Jørgensen
2019-01-09 5:29 ` Pete Heist
2019-01-09 8:36 ` Toke Høiland-Jørgensen
2019-01-06 20:55 ` Toke Høiland-Jørgensen
2019-01-05 10:44 ` Jonathan Morton
2019-01-05 11:17 ` Toke Høiland-Jørgensen