* [Cake] lockup with multiple cake instances on 3.16.7
@ 2019-01-30 8:58 Pete Heist
2019-01-30 22:59 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 11+ messages in thread
From: Pete Heist @ 2019-01-30 8:58 UTC (permalink / raw)
To: Cake List
[-- Attachment #1: Type: text/plain, Size: 1097 bytes --]
I’m seeing a lockup with cake on 3.16.7 with 64-bit Intel when adding more than 4 cake instances. I’m using cake and tc-adv from head. Does anyone have an idea what this could be? It can be reproduced as follows:
-----
# Either htb of hfsc can be used, but it doesn’t happen with fq_codel or sfq.
# Also, when magic number “5” below is changed to “4”, no lockup occurs.
tc qdisc add dev eth0 root handle 1: htb default 1
for i in {1..5}; do
tc class add dev eth0 parent 1: classid 1:$i htb rate 100mbit ceil 100mbit
tc qdisc add dev eth0 parent 1:$i cake
done
# lockup occurs on execution of this command
tc qdisc show dev eth0
-----
When the lockup occurs, all network activity stops and the serial console freezes. Eventually the machine either reboots itself or needs to be power cycled. I caught the attached output to the console. I do not see it on kernel 3.10.107 with mipsle, or any other later kernels that I have, which are all 4.x and later.
Otherwise, it looks like we’ll have to upgrade these older kernels at FreeNet to use cake… :)
[-- Attachment #2: cake_lockup.txt --]
[-- Type: text/plain, Size: 10857 bytes --]
[192765.830059] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60002 )
[192765.841407] INFO: Stall ended before state dump start
[192825.841319] INFO: rcu_sched self-detected stall on CPU[192825.842328] INFO: rcu_sched d)
[192825.842345] sending NMI to all CPUs:
[192825.842352] NMI backtrace for cpu 0
[192825.842363] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.16.7-ckt9-voyage #1
[192825.842366] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[192825.842372] task: c1433878 ti: c1426000 task.ti: c1426000
[192825.842378] EIP: 0060:[<c1020035>] EFLAGS: 00200006 CPU: 0
[192825.842396] EIP is at arch_trigger_all_cpu_backtrace+0x66/0x95
[192825.842400] EAX: 00000800 EBX: c143e940 ECX: c13b6e5a EDX: 00000800
[192825.842404] ESI: f616f72c EDI: 000013bb EBP: c1427e4c ESP: c1427e44
[192825.842409] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[192825.842413] CR0: 8005003b CR2: b7764750 CR3: 3544f000 CR4: 00000790
[192825.842414] Stack:
[192825.842417] c13d0cec c143e940 c1427e74 c1064a26 00000001 00200083 c143e940 00000000
[192825.842430] 00000001 c1433878 00000000 00000000 c1427e88 c103203b f616f8e0 f326674a
[192825.842444] 0000af63 c1427e90 c106c2b4 c1427ea8 c106c4a8 c1427f28 f616f8e0 f616f514
[192825.842457] Call Trace:
[192825.842462] [<c1064a26>] rcu_check_callbacks+0x373/0x41c
[192825.842478] [<c103203b>] update_process_times+0x2d/0x50
[192825.842490] [<c106c2b4>] tick_sched_handle+0x30/0x32
[192825.842503] [<c106c4a8>] tick_sched_timer+0x31/0x55
[192825.842511] [<c103fcca>] __run_hrtimer.isra.24+0x3e/0x94
[192825.842524] [<c1040417>] hrtimer_interrupt+0xeb/0x1d9
[192825.842534] [<c101e781>] local_apic_timer_interrupt+0x41/0x46
[192825.842543] [<c101ea9c>] smp_apic_timer_interrupt+0x2e/0x3d
[192825.842552] [<c131a6dd>] apic_timer_interrupt+0x2d/0x40
[192825.842564] [<c1202441>] ? acpi_safe_halt+0x17/0x1a
[192825.842575] [<c120297f>] acpi_idle_enter_c1+0x4e/0x74
[192825.842583] [<c126adc6>] cpuidle_enter_state+0x2c/0x96
[192825.842595] [<c126aee8>] cpuidle_enter+0xf/0x12
[192825.842603] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[192825.842616] [<c1313161>] rest_init+0x5d/0x5f
[192825.842625] [<c1462a40>] start_kernel+0x359/0x360
[192825.842635] [<c14622af>] i386_start_kernel+0x79/0x7d
[192825.842644] Code: ba 54 6e 3b c1 74 05 ba ec 0c 3d c1 52 68 5a 6e 3b c1 e8 98 5a 2f 00 d
[192825.842742] NMI backtrace for cpu 1
[192825.842751] CPU: 1 PID: 25469 Comm: tc Tainted: G O 3.16.7-ckt9-voyage #1
[192825.842753] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[192825.842757] task: f5e20470 ti: f3ea6000 task.ti: f3ea6000
[192825.842762] EIP: 0060:[<c1221236>] EFLAGS: 00000002 CPU: 1
[192825.842773] EIP is at io_serial_in+0xd/0x12
[192825.842777] EAX: c14f5100 EBX: c14f51ec ECX: 00000000 EDX: 000003fd
[192825.842780] ESI: 0000270b EDI: 00000020 EBP: f3ea7a44 ESP: f3ea7a44
[192825.842784] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[192825.842787] CR0: 80050033 CR2: 080c6b74 CR3: 35d89000 CR4: 00000790
[192825.842788] Stack:
[192825.842790] f3ea7a58 c1221650 c14f51ec 00000031 c12216a6 f3ea7a68 c12216b9 c14e25a1
[192825.842802] c14f51ec f3ea7a80 c121e838 c14e25ce c14f51ec 00000001 c122123b f3ea7aa8
[192825.842812] c12224c6 c12216a6 0000003a c14e2594 00000005 00000046 c144dec4 c1222422
[192825.842823] Call Trace:
[192825.842827] [<c1221650>] wait_for_xmitr+0x19/0x6f
[192825.842836] [<c12216a6>] ? wait_for_xmitr+0x6f/0x6f
[192825.842843] [<c12216b9>] serial8250_console_putchar+0x13/0x20
[192825.842851] [<c121e838>] uart_console_write+0x2b/0x36
[192825.842857] [<c122123b>] ? io_serial_in+0x12/0x12
[192825.842863] [<c12224c6>] serial8250_console_write+0xa4/0xe4
[192825.842870] [<c12216a6>] ? wait_for_xmitr+0x6f/0x6f
[192825.842876] [<c1222422>] ? serial8250_modem_status+0x9c/0x9c
[192825.842883] [<c105bb11>] call_console_drivers.constprop.24+0x63/0x6d
[192825.842892] [<c105c769>] console_unlock+0x13a/0x371
[192825.842900] [<c105ce55>] vprintk_emit+0x348/0x354
[192825.842907] [<c1315ad0>] printk+0x17/0x19
[192825.842915] [<c10647ad>] rcu_check_callbacks+0xfa/0x41c
[192825.842924] [<c103203b>] update_process_times+0x2d/0x50
[192825.842931] [<c106c2b4>] tick_sched_handle+0x30/0x32
[192825.842939] [<c106c4a8>] tick_sched_timer+0x31/0x55
[192825.842946] [<c103fcca>] __run_hrtimer.isra.24+0x3e/0x94
[192825.842954] [<c1040417>] hrtimer_interrupt+0xeb/0x1d9
[192825.842962] [<c101e781>] local_apic_timer_interrupt+0x41/0x46
[192825.842970] [<c101ea9c>] smp_apic_timer_interrupt+0x2e/0x3d
[192825.842977] [<c131a6dd>] apic_timer_interrupt+0x2d/0x40
[192825.842985] [<c1319ab5>] ? _raw_spin_lock_bh+0x27/0x2b
[192825.842991] [<c128519e>] gnet_stats_start_copy_compat+0x27/0x8d
[192825.842999] [<c129fd72>] tc_fill_qdisc+0x195/0x225
[192825.843007] [<c129fe91>] tc_dump_qdisc_root+0x8f/0xc4
[192825.843014] [<c129ff45>] tc_dump_qdisc+0x7f/0xd6
[192825.843020] [<c12a51bb>] netlink_dump+0xb4/0x201
[192825.843027] [<c12a5488>] netlink_recvmsg+0x180/0x28b
[192825.843035] [<c1279652>] sock_recvmsg+0x59/0x72
[192825.843042] [<c11a5b1a>] ? _copy_from_user+0x29/0x3f
[192825.843049] [<c128334b>] ? verify_iovec+0x4b/0x80
[192825.843055] [<c12795f9>] ? sock_sendmsg_nosec+0x6a/0x6a
[192825.843060] [<c1279ee6>] ___sys_recvmsg.part.27+0xd1/0x160
[192825.843067] [<c12795f9>] ? sock_sendmsg_nosec+0x6a/0x6a
[192825.843073] [<c119c781>] ? cpumask_any_but+0x21/0x30
[192825.843080] [<c108a655>] ? lru_cache_add+0x8/0xa
[192825.843088] [<c109f2fd>] ? page_add_new_anon_rmap+0x77/0xb5
[192825.843096] [<c109a143>] ? handle_mm_fault+0x2ba/0x674
[192825.843103] [<c11a5b1a>] ? _copy_from_user+0x29/0x3f
[192825.843109] [<c127b2d0>] __sys_recvmsg+0x3b/0x59
[192825.843116] [<c127b68e>] SyS_socketcall+0x186/0x1b2
[192825.843123] [<c1002911>] ? smp_threshold_interrupt+0x5/0x5
[192825.843130] [<c1024ee8>] ? vmalloc_sync_all+0x106/0x106
[192825.843138] [<c1319f5e>] syscall_call+0x7/0x7
[192825.843145] Code: 0f b6 48 35 d3 e2 03 50 08 89 1a 5b 5d c3 55 0f b6 48 35 89 e5 d3 e2 3
[192826.399601] { 1} (t=60561 jiffies g=1443066 c=1443065 q=5051)
[192826.405893] sending NMI to all CPUs:
[192826.409560] NMI backtrace for cpu 1
[192826.413149] CPU: 1 PID: 25469 Comm: tc Tainted: G O 3.16.7-ckt9-voyage #1
[192826.420981] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[192826.427247] task: f5e20470 ti: f3ea6000 task.ti: f3ea6000
[192826.432732] EIP: 0060:[<c102003b>] EFLAGS: 00000006 CPU: 1
[192826.438309] EIP is at arch_trigger_all_cpu_backtrace+0x6c/0x95
[192826.444234] EAX: 00000800 EBX: 00002710 ECX: c13b6e5a EDX: 00000800
[192826.450585] ESI: f617b72c EDI: 000013bb EBP: f3ea7b28 ESP: f3ea7b24
[192826.456940] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[192826.462435] CR0: 80050033 CR2: 080c6b74 CR3: 35d89000 CR4: 00000790
[192826.468794] Stack:
[192826.470899] c143e940 f3ea7b64 c1064832 c13bf23b 0000ec91 001604fa 001604f9 000013bb
[192826.478933] 00000001 f5e20470 c143e940 00000001 0b7911e7 f5e20470 00000000 00000001
[192826.486966] f3ea7b78 c103203b f617b8e0 f31720a2 0000af63 f3ea7b80 c106c2b4 f3ea7b98
[192826.495003] Call Trace:
[192826.497549] [<c1064832>] rcu_check_callbacks+0x17f/0x41c
[192826.503041] [<c103203b>] update_process_times+0x2d/0x50
[192826.508450] [<c106c2b4>] tick_sched_handle+0x30/0x32
[192826.513597] [<c106c4a8>] tick_sched_timer+0x31/0x55
[192826.518659] [<c103fcca>] __run_hrtimer.isra.24+0x3e/0x94
[192826.524152] [<c1040417>] hrtimer_interrupt+0xeb/0x1d9
[192826.529388] [<c101e781>] local_apic_timer_interrupt+0x41/0x46
[192826.535315] [<c101ea9c>] smp_apic_timer_interrupt+0x2e/0x3d
[192826.541070] [<c131a6dd>] apic_timer_interrupt+0x2d/0x40
[192826.546478] [<c1319ab5>] ? _raw_spin_lock_bh+0x27/0x2b
[192826.551800] [<c128519e>] gnet_stats_start_copy_compat+0x27/0x8d
[192826.557902] [<c129fd72>] tc_fill_qdisc+0x195/0x225
[192826.562875] [<c129fe91>] tc_dump_qdisc_root+0x8f/0xc4
[192826.568109] [<c129ff45>] tc_dump_qdisc+0x7f/0xd6
[192826.572912] [<c12a51bb>] netlink_dump+0xb4/0x201
[192826.577713] [<c12a5488>] netlink_recvmsg+0x180/0x28b
[192826.582862] [<c1279652>] sock_recvmsg+0x59/0x72
[192826.587576] [<c11a5b1a>] ? _copy_from_user+0x29/0x3f
[192826.592721] [<c128334b>] ? verify_iovec+0x4b/0x80
[192826.597610] [<c12795f9>] ? sock_sendmsg_nosec+0x6a/0x6a
[192826.603020] [<c1279ee6>] ___sys_recvmsg.part.27+0xd1/0x160
[192826.608686] [<c12795f9>] ? sock_sendmsg_nosec+0x6a/0x6a
[192826.614096] [<c119c781>] ? cpumask_any_but+0x21/0x30
[192826.619247] [<c108a655>] ? lru_cache_add+0x8/0xa
[192826.624046] [<c109f2fd>] ? page_add_new_anon_rmap+0x77/0xb5
[192826.629800] [<c109a143>] ? handle_mm_fault+0x2ba/0x674
[192826.635121] [<c11a5b1a>] ? _copy_from_user+0x29/0x3f
[192826.640270] [<c127b2d0>] __sys_recvmsg+0x3b/0x59
[192826.645069] [<c127b68e>] SyS_socketcall+0x186/0x1b2
[192826.650130] [<c1002911>] ? smp_threshold_interrupt+0x5/0x5
[192826.655801] [<c1024ee8>] ? vmalloc_sync_all+0x106/0x106
[192826.661208] [<c1319f5e>] syscall_call+0x7/0x7
[192826.665743] Code: 05 ba ec 0c 3d c1 52 68 5a 6e 3b c1 e8 98 5a 2f 00 8b 0d dc b4 43 c1 8
[192826.686274] NMI backtrace for cpu 0
[192826.689875] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.16.7-ckt9-voyage #1
[192826.697969] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[192826.704246] task: c1433878 ti: c1426000 task.ti: c1426000
[192826.709741] EIP: 0060:[<c1202441>] EFLAGS: 00200246 CPU: 0
[192826.715328] EIP is at acpi_safe_halt+0x17/0x1a
[192826.719869] EAX: 00000000 EBX: f5cec464 ECX: 00000001 EDX: 00000001
[192826.726232] ESI: 00000001 EDI: f5cec400 EBP: c1427f64 ESP: c1427f64
[192826.732592] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[192826.738087] CR0: 8005003b CR2: b7764750 CR3: 35007000 CR4: 00000790
[192826.744445] Stack:
[192826.746553] c1427f78 c120297f c1449560 f5ef2800 c14495bc c1427f98 c126adc6 00000001
[192826.754587] 14f241e3 0000af64 c1449560 c1427fec 00000001 c1427fa4 c126aee8 f5ef2800
[192826.762618] c1427fcc c1051e54 c1426000 00037191 c1427fec c1427fec c1449560 00037191
[192826.770655] Call Trace:
[192826.773214] [<c120297f>] acpi_idle_enter_c1+0x4e/0x74
[192826.778457] [<c126adc6>] cpuidle_enter_state+0x2c/0x96
[192826.783783] [<c126aee8>] cpuidle_enter+0xf/0x12
[192826.788502] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[192826.793828] [<c1313161>] rest_init+0x5d/0x5f
[192826.798284] [<c1462a40>] start_kernel+0x359/0x360
[192826.803170] [<c14622af>] i386_start_kernel+0x79/0x7d
[192826.808312] Code: f8 43 4f c1 89 e5 b8 0f 00 00 00 e8 4e 1b ff ff 31 c0 5d c3 64 a1 e8 f
[192826.829309] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long tos
[-- Attachment #3: cake_lockup2.txt --]
[-- Type: text/plain, Size: 2004 bytes --]
[ 1035.517370] ------------[ cut here ]------------
[ 1035.522085] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0xf4/0x18)
[ 1035.530493] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[ 1035.536938] Modules linked in: em_meta cls_basic sch_cake(O) sch_hfsc xt_ACCOUNT(O) ipt_n
[ 1035.588357] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.16.7-ckt9-voyage #1
[ 1035.596366] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[ 1035.602555] 00000000 00000000 f5c99f30 c13167e9 c13f2100 f5c99f48 c102a7dd c129e917
[ 1035.610537] f5fe1000 00000000 0009f2e0 f5c99f60 c102a858 00000009 f5c99f58 c13f2100
[ 1035.618527] f5c99f74 f5c99f90 c129e917 c13f2139 00000108 c13f2100 f5fe1000 f80f9a93
[ 1035.626499] Call Trace:
[ 1035.628975] [<c13167e9>] dump_stack+0x41/0x52
[ 1035.633442] [<c102a7dd>] warn_slowpath_common+0x5c/0x73
[ 1035.638770] [<c129e917>] ? dev_watchdog+0xf4/0x186
[ 1035.643666] [<c102a858>] warn_slowpath_fmt+0x26/0x2a
[ 1035.648737] [<c129e917>] dev_watchdog+0xf4/0x186
[ 1035.653465] [<c129e823>] ? pfifo_fast_dequeue+0xaa/0xaa
[ 1035.658799] [<c103160e>] call_timer_fn.isra.30+0x12/0x61
[ 1035.664213] [<c103187f>] run_timer_softirq+0x12b/0x171
[ 1035.669460] [<c102d689>] __do_softirq+0x91/0x175
[ 1035.674184] [<c102d5f8>] ? __hrtimer_tasklet_trampoline+0x1a/0x1a
[ 1035.680378] [<c10033c3>] do_softirq_own_stack+0x1d/0x23
[ 1035.685702] <IRQ> [<c102d8a9>] irq_exit+0x34/0x75
[ 1035.690642] [<c101eaa1>] smp_apic_timer_interrupt+0x33/0x3d
[ 1035.696318] [<c131a6dd>] apic_timer_interrupt+0x2d/0x40
[ 1035.701648] [<c1202441>] ? acpi_safe_halt+0x17/0x1a
[ 1035.706630] [<c120297f>] acpi_idle_enter_c1+0x4e/0x74
[ 1035.711787] [<c126adc6>] cpuidle_enter_state+0x2c/0x96
[ 1035.717029] [<c126aee8>] cpuidle_enter+0xf/0x12
[ 1035.721666] [<c1051e54>] cpu_startup_entry+0x135/0x1e1
[ 1035.726909] [<c101d553>] start_secondary+0x1a6/0x1ab
[ 1035.731978] ---[ end trace 915dc201bbcccd42 ]---
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-30 8:58 [Cake] lockup with multiple cake instances on 3.16.7 Pete Heist
@ 2019-01-30 22:59 ` Toke Høiland-Jørgensen
2019-01-31 7:15 ` Pete Heist
0 siblings, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-30 22:59 UTC (permalink / raw)
To: Pete Heist, Cake List
Pete Heist <pete@heistp.net> writes:
> I’m seeing a lockup with cake on 3.16.7 with 64-bit Intel when adding
> more than 4 cake instances. I’m using cake and tc-adv from head. Does
> anyone have an idea what this could be?
Looks like it's a compatibility issue with old kernels. Could you please
check if the latest commit in the github repo fixes this? :)
The explanation, copied from that commit:
> The qdisc stats dumping was changed in 4.8 (by commit
> edb09eb17ed89eaa82a52dd306beac93e292b485) to be lockless and moving the
> responsibility for taking the qdisc root lock into the qdisc itself. Cake
> inherited this from FQ-CoDel, but it leads to deadlocks on old kernels
> where the core stats dumping code still takes the root qdisc lock.
-Toke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-30 22:59 ` Toke Høiland-Jørgensen
@ 2019-01-31 7:15 ` Pete Heist
2019-01-31 14:53 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 11+ messages in thread
From: Pete Heist @ 2019-01-31 7:15 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
Thanks much for the try, only that didn’t fix it yet. I also tried changing two more calls to sch_tree_lock/unlock in cake_change to the maybe versions (recognizing that those may have still been there on purpose), but I still see the lockup. Even commented all the lock calls out to make sure.
I noticed that tc-adv is without xtables support when I compile it (TC_CONFIG_NO_XT:=y) in case that makes any difference. No libxtables-dev package available on this distro.
Pete
> On Jan 30, 2019, at 11:59 PM, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Pete Heist <pete@heistp.net> writes:
>
>> I’m seeing a lockup with cake on 3.16.7 with 64-bit Intel when adding
>> more than 4 cake instances. I’m using cake and tc-adv from head. Does
>> anyone have an idea what this could be?
>
> Looks like it's a compatibility issue with old kernels. Could you please
> check if the latest commit in the github repo fixes this? :)
>
> The explanation, copied from that commit:
>
>> The qdisc stats dumping was changed in 4.8 (by commit
>> edb09eb17ed89eaa82a52dd306beac93e292b485) to be lockless and moving the
>> responsibility for taking the qdisc root lock into the qdisc itself. Cake
>> inherited this from FQ-CoDel, but it leads to deadlocks on old kernels
>> where the core stats dumping code still takes the root qdisc lock.
>
>
> -Toke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 7:15 ` Pete Heist
@ 2019-01-31 14:53 ` Toke Høiland-Jørgensen
2019-01-31 20:25 ` Pete Heist
0 siblings, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-31 14:53 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
> Thanks much for the try, only that didn’t fix it yet.
Hmm, bugger :/
> I also tried
> changing two more calls to sch_tree_lock/unlock in cake_change to the
> maybe versions (recognizing that those may have still been there on
> purpose), but I still see the lockup. Even commented all the lock
> calls out to make sure.
No, the other calls I left in on purpose; when changing configuration
the lock *should* be held.
Well, the backtrace is definitely hanging on that lock in
gnet_stats_start_copy_compat(). If it's not related to the CAKE logging,
I guess it must be a bug in the upstream kernel; which we probably can't
fix from the cake side anyway.
I don't suppose you can reproduce this on a newer kernel?
-Toke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 14:53 ` Toke Høiland-Jørgensen
@ 2019-01-31 20:25 ` Pete Heist
2019-01-31 23:09 ` Toke Høiland-Jørgensen
2019-01-31 23:10 ` Pete Heist
0 siblings, 2 replies; 11+ messages in thread
From: Pete Heist @ 2019-01-31 20:25 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 31, 2019, at 3:53 PM, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Well, the backtrace is definitely hanging on that lock in
> gnet_stats_start_copy_compat(). If it's not related to the CAKE logging,
> I guess it must be a bug in the upstream kernel; which we probably can't
> fix from the cake side anyway.
>
> I don't suppose you can reproduce this on a newer kernel?
So far it works fine on 3.10.107 (with mipsel EdgeOS build) and 4.9.0-8 amd64, haven’t tried anything in between, but…
printk tells me that it’s not locking up in cake_dump_class_stats, but after a failure in cake_dump_stats. When “tc qdisc” is run after adding the fifth cake instance, line 2974 is failing:
PUT_TSTAT_U32(TARGET_US,
ktime_to_us(ns_to_ktime(b->cparams.target)));
So the call to nla_put_u32 returns nonzero. Then it ends up at nla_put_failure where nla_nest_cancel is called. The function returns, but the lock is not being released by the kernel in the failure case. The following patch “fixes it”:
diff --git a/sch_cake.c b/sch_cake.c
index 3a26db0..ae3e16c 100644
--- a/sch_cake.c
+++ b/sch_cake.c
@@ -3010,6 +3010,7 @@ static int cake_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
nla_put_failure:
nla_nest_cancel(d->skb, stats);
+ sch_tree_unlock(sch);
return -1;
}
Two questions:
1) Why is nla_put_u32 suddenly failing for TARGET_US after adding five cake instances?
2) Is calling sch_tree_unlock the right thing to do in the failure case, or am I working around a kernel bug, and doing something that would fail in other kernels?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 20:25 ` Pete Heist
@ 2019-01-31 23:09 ` Toke Høiland-Jørgensen
2019-01-31 23:18 ` Pete Heist
2019-01-31 23:10 ` Pete Heist
1 sibling, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-31 23:09 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
Pete Heist <pete@heistp.net> writes:
>> On Jan 31, 2019, at 3:53 PM, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Well, the backtrace is definitely hanging on that lock in
>> gnet_stats_start_copy_compat(). If it's not related to the CAKE logging,
>> I guess it must be a bug in the upstream kernel; which we probably can't
>> fix from the cake side anyway.
>>
>> I don't suppose you can reproduce this on a newer kernel?
>
> So far it works fine on 3.10.107 (with mipsel EdgeOS build) and
> 4.9.0-8 amd64, haven’t tried anything in between, but…
>
> printk tells me that it’s not locking up in cake_dump_class_stats, but
> after a failure in cake_dump_stats. When “tc qdisc” is run after
> adding the fifth cake instance, line 2974 is failing:
>
> PUT_TSTAT_U32(TARGET_US,
> ktime_to_us(ns_to_ktime(b->cparams.target)));
>
> So the call to nla_put_u32 returns nonzero. Then it ends up at
> nla_put_failure where nla_nest_cancel is called. The function returns,
> but the lock is not being released by the kernel in the failure case.
> The following patch “fixes it”:
Ah, good find!
> diff --git a/sch_cake.c b/sch_cake.c
> index 3a26db0..ae3e16c 100644
> --- a/sch_cake.c
> +++ b/sch_cake.c
> @@ -3010,6 +3010,7 @@ static int cake_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
>
> nla_put_failure:
> nla_nest_cancel(d->skb, stats);
> + sch_tree_unlock(sch);
> return -1;
> }
>
> Two questions:
>
> 1) Why is nla_put_u32 suddenly failing for TARGET_US after adding five
> cake instances?
Probably because it's running out of kernel memory? How much system
memory do you have on the system you are testing this on?
> 2) Is calling sch_tree_unlock the right thing to do in the failure
> case, or am I working around a kernel bug, and doing something that
> would fail in other kernels?
Yes, I think you are working around a kernel bug. See
https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_api.c#L1330
The lock is taken in gnet_stats_start_copy_compat() and released in
gnet_stats_finish_copy(). The latter is skipped in the failure path. It
seems this bug is present all the way up to Eric's change to remove the
locking entirely (which went into 4.8). So I guess you could get a patch
accepted for the stable trees in 3.16 and 4.4; not that this would help
you much if you are stuck on 3.16.7...
-Toke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 20:25 ` Pete Heist
2019-01-31 23:09 ` Toke Høiland-Jørgensen
@ 2019-01-31 23:10 ` Pete Heist
2019-01-31 23:18 ` Toke Høiland-Jørgensen
1 sibling, 1 reply; 11+ messages in thread
From: Pete Heist @ 2019-01-31 23:10 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Jan 31, 2019, at 9:25 PM, Pete Heist <pete@heistp.net> wrote:
>
> 1) Why is nla_put_u32 suddenly failing for TARGET_US after adding five cake instances?
nla_put_u32 is returning -EMSGSIZE, so the skb space in tailroom isn’t large enough (per nla_put doc).
After it fails, cake_dump_stats is called a second time right away, which succeeds. I _think_ what’s happening here is that after it sees -EMSGSIZE, the kernel allocates more tailroom and calls cake_dump_stats again. This doesn’t happen for kernel 4.9.0, it always succeeds, so presumably the initial size is larger.
> 2) Is calling sch_tree_unlock the right thing to do in the failure case, or am I working around a kernel bug, and doing something that would fail in other kernels?
I don’t think we’d want to do this after that same edb09eb17 commit. :) So at a minimum, to unlock after error:
nla_put_failure:
nla_nest_cancel(d->skb, stats);
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 8, 0)
+ sch_tree_unlock(sch);
+#endif
return -1;
}
The question still is, would that break other kernel versions if an error occurs? If the tree is unlocked again elsewhere in some kernel versions after an error, that would end badly. It looks like in tc_fill_qdisc (sch_api.c) that gnet_stats_finish_copy, which unlocks, is not called if q->ops->dump_stats returns < 0. So I’m not sure how it ever unlocked properly in case of error. Hrm.
Do you think this is a correct patch?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 23:10 ` Pete Heist
@ 2019-01-31 23:18 ` Toke Høiland-Jørgensen
2019-01-31 23:23 ` Pete Heist
0 siblings, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-31 23:18 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
(crossed streams; but yeah, basically the same conclusion)
> Do you think this is a correct patch?
Well, the "correct patch" would be to get upstream fixed :)
But yeah, I don't think it would hurt to carry this as a workaround
patch until that happens...
I guess you could try running it for a while and see if anything else
blows up? You seem to have quite a knack for getting it to fail... ;)
-Toke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 23:09 ` Toke Høiland-Jørgensen
@ 2019-01-31 23:18 ` Pete Heist
0 siblings, 0 replies; 11+ messages in thread
From: Pete Heist @ 2019-01-31 23:18 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 1565 bytes --]
> On Feb 1, 2019, at 12:09 AM, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> 1) Why is nla_put_u32 suddenly failing for TARGET_US after adding five
>> cake instances?
>
> Probably because it's running out of kernel memory? How much system
> memory do you have on the system you are testing this on?
Plenty of memory (used 131308, free 1911900). I’m guessing this was by design where earlier kernels allocated a smaller initial size for tail space, but that’s only a guess as I haven’t found where that’s done.
>> 2) Is calling sch_tree_unlock the right thing to do in the failure
>> case, or am I working around a kernel bug, and doing something that
>> would fail in other kernels?
>
> Yes, I think you are working around a kernel bug. See
> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_api.c#L1330 <https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_api.c#L1330>
>
> The lock is taken in gnet_stats_start_copy_compat() and released in
> gnet_stats_finish_copy(). The latter is skipped in the failure path. It
> seems this bug is present all the way up to Eric's change to remove the
> locking entirely (which went into 4.8). So I guess you could get a patch
> accepted for the stable trees in 3.16 and 4.4; not that this would help
> you much if you are stuck on 3.16.7…
Hehe, “crossing the streams” here. :) That’s what I gathered after looking at that code for a while, but I’m glad to be sure about it.
Would you accept my workaround in cake_dump_stats, or rather not?
[-- Attachment #2: Type: text/html, Size: 10752 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 23:18 ` Toke Høiland-Jørgensen
@ 2019-01-31 23:23 ` Pete Heist
2019-01-31 23:31 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 11+ messages in thread
From: Pete Heist @ 2019-01-31 23:23 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Cake List
> On Feb 1, 2019, at 12:18 AM, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> (crossed streams; but yeah, basically the same conclusion)
Ok, crossed-crossed streams references it is… :)
>> Do you think this is a correct patch?
>
> Well, the "correct patch" would be to get upstream fixed :)
>
> But yeah, I don't think it would hurt to carry this as a workaround
> patch until that happens...
>
> I guess you could try running it for a while and see if anything else
> blows up? You seem to have quite a knack for getting it to fail... ;)
That sounds reasonable, let me see if I can break it in some different cases.
My guess is, past qdiscs didn’t manage to fill the tailroom space and cause this problem, but I’m not sure…
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Cake] lockup with multiple cake instances on 3.16.7
2019-01-31 23:23 ` Pete Heist
@ 2019-01-31 23:31 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-01-31 23:31 UTC (permalink / raw)
To: Pete Heist; +Cc: Cake List
[-- Attachment #1: Type: text/plain, Size: 1076 bytes --]
On Fri, 1 Feb 2019, 00:23 Pete Heist <pete@heistp.net wrote:
>
> > On Feb 1, 2019, at 12:18 AM, Toke Høiland-Jørgensen <toke@redhat.com>
> wrote:
> >
> > (crossed streams; but yeah, basically the same conclusion)
>
> Ok, crossed-crossed streams references it is… :)
>
> >> Do you think this is a correct patch?
> >
> > Well, the "correct patch" would be to get upstream fixed :)
> >
> > But yeah, I don't think it would hurt to carry this as a workaround
> > patch until that happens...
> >
> > I guess you could try running it for a while and see if anything else
> > blows up? You seem to have quite a knack for getting it to fail... ;)
>
> That sounds reasonable, let me see if I can break it in some different
> cases.
>
Cool.
My guess is, past qdiscs didn’t manage to fill the tailroom space and cause
> this problem, but I’m not sure…
>
Wouldn't be the first bug we've exposed in this code:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=d5a672ac9f48f81b20b1cad1d9ed7bbf4e418d4c
-Toke
>
[-- Attachment #2: Type: text/html, Size: 2116 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-01-31 23:31 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-30 8:58 [Cake] lockup with multiple cake instances on 3.16.7 Pete Heist
2019-01-30 22:59 ` Toke Høiland-Jørgensen
2019-01-31 7:15 ` Pete Heist
2019-01-31 14:53 ` Toke Høiland-Jørgensen
2019-01-31 20:25 ` Pete Heist
2019-01-31 23:09 ` Toke Høiland-Jørgensen
2019-01-31 23:18 ` Pete Heist
2019-01-31 23:10 ` Pete Heist
2019-01-31 23:18 ` Toke Høiland-Jørgensen
2019-01-31 23:23 ` Pete Heist
2019-01-31 23:31 ` Toke Høiland-Jørgensen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox