[Cake] NLA_F_NESTED is missing
Toke Høiland-Jørgensen
toke at toke.dk
Wed Nov 4 06:27:53 EST 2020
Dean Scarff <dos at scarff.id.au> writes:
> On Tue, 03 Nov 2020 12:00:55 +0100, Toke Høiland-Jørgensen wrote:
>> Dean Scarff <dos at scarff.id.au> writes:
>>
>>> On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
>>>> Dean Scarff <dos at scarff.id.au> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I've been happily running the out-of-tree sch_cake on my Raspberry Pi
>>>>> since 2015. However, I recently upgraded my kernel (to 5.4.72 from
>>>>> Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>>>>> sch_cake in mainline. Now, when running:
>>>>>
>>>>> sudo /sbin/tc qdisc add dev ppp0 root cake
>>>>>
>>>>> I get the error:
>>>>>
>>>>> Error: NLA_F_NESTED is missing.
>>>>>
>>>>> I get this error with the sch_cake in mainline, and also with sch_cake
>>>>> built out-of-tree. I also get the error with both Debian's iproute2
>>>>> 5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's tc-adv
>>>>> repo.
>>>>>
>>>>> Any ideas on what this error means and how to fix it?
>>>>
>>>> I just tried building a 5.4.72 kernel and couldn't reproduce this, so it
>>>> seems it's a fault with the raspberry pi kernel; I guess opening a bug
>>>> against that would be the way to go?
>>>>
>>>> As for what's actually causing this, I couldn't find anything obvious
>>>> that touches this code in the qdisc layer; but I suppose it has
>>>> something to do with the core qdisc netlink parsing code?
>>>>
>>>> -Toke
>>>
>>> Thanks for the data point.
>>>
>>> For the record, the relevant kernel source is:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
>>> and the Pi branch:
>>>
>>> https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
>>>
>>> It seems very unlikely that the Pi folks are patching the netlink
>>> stuff, so I don't think I'll get much traction there unless I can call
>>> out something specifically wrong with their patchset.
>>
>> Well, something odd is certainly going on. The error message you're
>> quoting comes from a part of the netlink parsing code (in the kernel)
>> that shouldn't even be hit by the qdisc addition: NLA_F_NESTED parsing
>> is only enabled in 'strict' validation mode, which is not used for
>> qdiscs.
>>
>> So IDK, maybe a compiler issue or a bit that gets set wrong somewhere?
>> Bisecting the kernel may be the only option here, I don't think you're
>> going to find anything in userspace...
>
> Yeah, I came to the same conclusion. I verified the userspace was sane
> via gdb (see earlier post), and I also read through the sch_api.c and
> nlattr.c kernel code and it sure looks impossible for the strict
> validation to be getting hit.
>
> Safe to say this was random corruption: I downgraded the kernel, things
> worked as expected, then I upgraded back to the 5.4.72 and it worked
> too! Interestingly, the problem persisted across reboots (so it wasn't
> just RAM corruption), and all the kernel files also matched their "dpkg"
> MD5s (so it wasn't like the binaries were obviously corrupt on disk).
> I've replaced the Pi's microSD card just to be safe, though... kernel
> corruption is scary.
Ugh, Heisenbugs are the worst! Great to hear you managed to resolve it,
though :)
-Toke
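
For anyone finding this thread later, here is a rough userspace sketch of the check discussed above. It is only a model, not the kernel source: the helper name check_nested is made up for illustration, and it assumes (as the error message implies) that tc sends the TCA_OPTIONS attribute without the NLA_F_NESTED bit set. The flag and attribute values are the ones from the uapi headers.

```python
# Illustrative model only; the real logic lives in the kernel's
# netlink attribute parsing (include/net/netlink.h, lib/nlattr.c).

NLA_F_NESTED = 1 << 15          # flag bit carried in nla_type
NLA_F_NET_BYTEORDER = 1 << 14   # second flag bit carried in nla_type
NLA_TYPE_MASK = ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER) & 0xFFFF

TCA_OPTIONS = 2                 # attribute that holds the qdisc options


def check_nested(nla_type, strict):
    """Hypothetical helper: return an error string, or None if accepted.

    Only strict validation insists on the NLA_F_NESTED bit; liberal
    (deprecated-style) validation accepts the attribute either way.
    """
    if strict and not (nla_type & NLA_F_NESTED):
        return "NLA_F_NESTED is missing"
    return None


if __name__ == "__main__":
    # Liberal mode, which qdisc option parsing is expected to use.
    print(check_nested(TCA_OPTIONS, strict=False))
    # Strict mode, which is the only path that yields the reported error.
    print(check_nested(TCA_OPTIONS, strict=True))
    # With the flag set, strict mode accepts the attribute too.
    print(check_nested(TCA_OPTIONS | NLA_F_NESTED, strict=True))
```

The point of the model is that the reported error is only reachable when the strict path is taken, which is why seeing it on a qdisc add was so surprising.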