[Cake] NLA_F_NESTED is missing

Dean Scarff dos at scarff.id.au
Wed Nov 4 00:48:27 EST 2020


 On Tue, 03 Nov 2020 12:00:55 +0100, Toke Høiland-Jørgensen wrote:
> Dean Scarff <dos at scarff.id.au> writes:
>
>>  On Mon, 02 Nov 2020 13:37:00 +0100, Toke wrote:
>>> Dean Scarff <dos at scarff.id.au> writes:
>>>
>>>>  Hi,
>>>>
>>>>  I've been happily running the out-of-tree sch_cake on my Raspberry Pi
>>>>  since 2015.  However, I recently upgraded my kernel (to 5.4.72 from
>>>>  Raspbian's raspberrypi-kernel 1.20201022-1), which comes with the
>>>>  sch_cake in mainline.  Now, when running:
>>>>
>>>>    sudo /sbin/tc qdisc add dev ppp0 root cake
>>>>
>>>>  I get the error:
>>>>
>>>>    Error: NLA_F_NESTED is missing.
>>>>
>>>>  I get this error with the sch_cake in mainline, and also with sch_cake
>>>>  built out-of-tree.  I also get the error with both Debian's iproute2
>>>>  5.9.0-1 (built myself via debian/rules) and "tc" from dtaht's tc-adv
>>>>  repo.
>>>>
>>>>  Any ideas on what this error means and how to fix it?
>>>
>>> I just tried building a 5.4.72 kernel and couldn't reproduce this, so it
>>> seems it's a fault with the raspberry pi kernel; I guess opening a bug
>>> against that would be the way to go?
>>>
>>> As for what's actually causing this, I couldn't find anything obvious
>>> that touches this code in the qdisc layer; but I suppose it has
>>> something to do with the core qdisc netlink parsing code?
>>>
>>> -Toke
>>
>>  Thanks for the data point.
>>
>>  For the record, the relevant kernel source is:
>>  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/net/netlink.h?h=v5.4.72#n1143
>>  and the Pi branch:
>>  https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20201022-1/include/net/netlink.h#L1143
>>
>>  It seems very unlikely that the Pi folks are patching the netlink
>>  stuff, so I don't think I'll get much traction there unless I can call
>>  out something specifically wrong with their patchset.
>
> Well, something odd is certainly going on. The error message you're
> quoting comes from a part of the netlink parsing code (in the kernel)
> that shouldn't even be hit by the qdisc addition: NLA_F_NESTED parsing
> is only enabled in 'strict' validation mode, which is not used for
> qdiscs.
>
> So IDK, maybe a compiler issue or a bit that gets set wrong somewhere?
> Bisecting the kernel may be the only option here, I don't think you're
> going to find anything in userspace...

 Yeah, I came to the same conclusion.  I verified the userspace was sane 
 via gdb (see earlier post), and I also read through the sch_api.c and 
 nlattr.c kernel code and it sure looks impossible for the strict 
 validation to be getting hit.
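
 To illustrate what the error message refers to: NLA_F_NESTED is a single
 flag bit carried in the type field of a netlink attribute header. The
 Python sketch below is not kernel code. It only decodes a 4-byte
 attribute header and reports whether that bit is set. The constant
 values are taken from the kernel UAPI headers (linux/netlink.h,
 linux/rtnetlink.h); the helper name describe_nlattr_header is
 hypothetical, and host byte order is assumed.

```python
import struct

# Flag bits stored in the upper bits of nla_type (linux/netlink.h).
NLA_F_NESTED = 1 << 15
NLA_F_NET_BYTEORDER = 1 << 14
NLA_TYPE_MASK = ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER) & 0xFFFF

# Attribute type that carries qdisc-specific options (linux/rtnetlink.h).
TCA_OPTIONS = 2


def describe_nlattr_header(header: bytes) -> dict:
    """Hypothetical helper: decode a 4-byte nlattr header (nla_len, nla_type)."""
    nla_len, nla_type = struct.unpack("=HH", header[:4])
    return {
        "len": nla_len,
        "type": nla_type & NLA_TYPE_MASK,
        "nested_flag": bool(nla_type & NLA_F_NESTED),
    }


if __name__ == "__main__":
    # The same attribute type, once without and once with the nested flag.
    plain = struct.pack("=HH", 8, TCA_OPTIONS)
    flagged = struct.pack("=HH", 8, TCA_OPTIONS | NLA_F_NESTED)
    print(describe_nlattr_header(plain))
    print(describe_nlattr_header(flagged))
```

 A parser that insists on the flag would reject the first header and
 accept the second; whether the flag is required at all depends on the
 validation mode the kernel uses for that attribute.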

 Safe to say this was random corruption: I downgraded the kernel, things 
 worked as expected, then I upgraded back to 5.4.72 and that worked 
 too!  Interestingly, the problem persisted across reboots (so it wasn't 
 just RAM corruption), and all the kernel files also matched their "dpkg" 
 MD5s (so it wasn't like the binaries were obviously corrupt on disk).  
 I've replaced the Pi's microSD card just to be safe, though... kernel 
 corruption is scary.
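
 For reference, the MD5 comparison mentioned above can be approximated
 with a short script. This is a minimal sketch, assuming the package
 ships a dpkg-style md5sums list (one "<md5 hex>  <relative path>" entry
 per line, typically under /var/lib/dpkg/info/). The function names and
 the package file name in the usage note are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(md5sums_path, root="/"):
    """Compare files under `root` against a dpkg-style md5sums list.

    Returns a list of (relative_path, reason) tuples for every entry
    that is unreadable or whose checksum differs.
    """
    mismatches = []
    with open(md5sums_path, "r", encoding="utf-8",
              errors="surrogateescape") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line:
                continue
            # Each entry is "<md5 hex>  <path relative to root>".
            expected, rel_path = line.split(None, 1)
            full_path = os.path.join(root, rel_path)
            try:
                actual = md5_of_file(full_path)
            except OSError:
                mismatches.append((rel_path, "unreadable"))
                continue
            if actual != expected:
                mismatches.append((rel_path, "checksum differs"))
    return mismatches
```

 Usage would look something like
 find_mismatches("/var/lib/dpkg/info/<package>.md5sums"), with an empty
 list meaning every listed file matched. A clean result only shows that
 the data as read back matches; it cannot rule out a flaky storage
 device that returns different data on another read.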


