> On Feb 1, 2019, at 12:09 AM, Toke Høiland-Jørgensen wrote:
>>
>> 1) Why is nla_put_u32 suddenly failing for TARGET_US after adding five
>> cake instances?
>
> Probably because it's running out of kernel memory? How much system
> memory do you have on the system you are testing this on?

Plenty of memory (used 131308, free 1911900). I’m guessing this was by design where earlier kernels allocated a smaller initial size for tail space, but that’s only a guess as I haven’t found where that’s done.

>> 2) Is calling sch_tree_unlock the right thing to do in the failure
>> case, or am I working around a kernel bug, and doing something that
>> would fail in other kernels?
>
> Yes, I think you are working around a kernel bug. See
> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_api.c#L1330
>
> The lock is taken in gnet_stats_start_copy_compat() and released in
> gnet_stats_finish_copy(). The latter is skipped in the failure path. It
> seems this bug is present all the way up to Eric's change to remove the
> locking entirely (which went into 4.8). So I guess you could get a patch
> accepted for the stable trees in 3.16 and 4.4; not that this would help
> you much if you are stuck on 3.16.7…

Hehe, “crossing the streams” here. :) That’s what I gathered after looking at that code for a while, but I’m glad to be sure about it.

Would you accept my workaround in cake_dump_stats, or rather not?
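
For reference, here is a small userspace sketch of the locking pattern discussed above. It is not the sch_cake or sch_api.c code: every toy_* name is hypothetical, the lock is a plain flag, and the 4-byte size check is only a stand-in for nla_put_u32 running out of tail space. It just illustrates how a lock taken in a "start copy" step stays held when the failure path skips the "finish copy" step, and how releasing it in the stats callback (roughly what the workaround does) avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the qdisc tree lock; only tracks whether it is held. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* taking it twice would deadlock a real lock */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Hypothetical dump context pairing a lock with a message buffer. */
struct toy_dump {
    struct toy_lock *lock;
    int tailroom;       /* bytes left in the (imaginary) message buffer */
};

/* Models the "start copy" step: takes the lock. */
static void toy_start_copy(struct toy_dump *d, struct toy_lock *l, int tailroom)
{
    d->lock = l;
    d->tailroom = tailroom;
    toy_lock_acquire(l);
}

/* Models the "finish copy" step: releases the lock. */
static void toy_finish_copy(struct toy_dump *d)
{
    toy_lock_release(d->lock);
}

/* Models an attribute put that fails when tail space runs out. */
static int toy_put_u32(struct toy_dump *d)
{
    if (d->tailroom < 4)
        return -1;
    d->tailroom -= 4;
    return 0;
}

/* Models a qdisc stats callback; optionally unlocks on failure. */
static int toy_dump_stats(struct toy_dump *d, bool unlock_on_failure)
{
    if (toy_put_u32(d) < 0) {
        if (unlock_on_failure)
            toy_lock_release(d->lock);  /* the workaround */
        return -1;
    }
    return 0;
}

/* Models the caller: the failure path returns without finish_copy. */
static int toy_fill_qdisc(struct toy_lock *l, int tailroom, bool unlock_on_failure)
{
    struct toy_dump d;

    toy_start_copy(&d, l, tailroom);
    if (toy_dump_stats(&d, unlock_on_failure) < 0)
        return -1;      /* finish_copy skipped here */
    toy_finish_copy(&d);
    return 0;
}
```

Calling toy_fill_qdisc(&l, 0, false) leaves l.held set, which is the leak; with true as the last argument the lock is released before the error return. The catch with doing the unlock in the callback is that it is only correct while the caller really does skip the release on failure, so it would double-unlock on a kernel where the caller has been fixed.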