On Jul 2, 2018, at 7:04 PM, Pete Heist <pete@heistp.net> wrote:



On Jul 2, 2018, at 6:14 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:

Aha! I think I figured out what is going on:

The gen_stats facility will add an nlattr header at the beginning of the
qdisc stats, which is the toplevel TLV that contains all stats (and that
we put our stats inside). It stores a reference to this header, and when
all the per-qdisc callbacks have finished adding their stats, it goes
back and fixes up the length of the containing header.

The problem is that on architectures that need padding, the padding TLV
is added *first*, which means that the nlattr pointer that is stored
before the callbacks are performed points to the padding TLV and not the
stats TLV. And so, when the header is fixed up, the result (from the
parser's perspective) is just a very big padding TLV.

The options TLV is before the stats TLV, so the bug only occurs if the
options happen to have a length that means the stats will need padding.
Which is why messing with the number of options "fixes" the bug.
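
A rough sketch of why the options length matters, in Python. It assumes the rule as I understand it from the netlink helpers: the attribute header is 4 bytes, so a stats TLV that would start on an 8-byte boundary ends up with its payload 4 bytes off, and a 4-byte pad TLV is inserted first. The function name and the offsets are hypothetical and only illustrate the arithmetic.

```python
NLA_HDRLEN = 4  # assumed attribute header size


def stats_need_padding(options_start, options_len):
    """True if the TLV following the options would need a pad TLV first.

    options_start is a hypothetical offset of the options TLV within the
    message; options_len is its total length, already rounded up to 4.
    """
    stats_start = options_start + options_len
    # An 8-byte-aligned header start puts the payload at offset % 8 == 4.
    return stats_start % 8 == 0


# Growing the options by 4 bytes at a time flips the outcome.
for options_len in (128, 132, 136, 140):
    print(options_len, stats_need_padding(8, options_len))
```

With these made-up numbers, every second length triggers the padding path, which matches the observation that adding or removing an option makes the bug appear or disappear.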

Could you try applying the patch below (to the kernel) and see if that
resolves the issue, please?

Awesome, Toke! From Kevin’s email it looks like the patch works for him, but it didn’t work for me the first time around. That may be down to how I added the patch, as I’m still not that familiar with OpenWrt’s build system (this is the first kernel patch I’ve tried). For one, I wasn’t sure whether it should go into generic or platform, so I tried generic. Is that right?

Ok, I got it to work after re-flashing with tftp. :) It looks like the OM2P does not always complete sysupgrades successfully, perhaps due to its limited memory (64 MB), but I’m not sure.

I still have my debugging in place and have one remaining question. The pointer in TCA_STATS2 is now valid, but there is still a non-null pointer in TCA_PAD, pointing 4 bytes (32 bits) before TCA_STATS2. Is that expected?

root@OpenWrt:/tmp# tc -s -d qdisc show dev eth0
TCA_STATS2 val=00000007
TCA_PAD val=00000009
tb[TCA_UNSPEC]=00000000
tb[TCA_KIND]=773a73fc
tb[TCA_OPTIONS]=773a7408
tb[TCA_STATS]=773a772c
tb[TCA_XSTATS]=00000000
tb[TCA_RATE]=00000000
tb[TCA_FCNT]=00000000
tb[TCA_STATS2]=773a7494
tb[TCA_STAB]=00000000
tb[TCA_PAD]=773a7490
tb[TCA_DUMP_INVISIBLE]=00000000
tb[TCA_CHAIN]=00000000
tb[TCA_HW_OFFLOAD]=00000000
tb[TCA_INGRESS_BLOCK]=00000000
tb[TCA_EGRESS_BLOCK]=00000000
qdisc cake 8001: root refcnt 2 bandwidth unlimited diffserv3 triple-isolate rtt 100.0ms raw overhead 0 
tca_stats 2000320300 tca_stats2 2000319636 tca_xstats 0
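
For reference, here is the arithmetic on those two addresses as a quick Python sketch. It assumes that the tb[] entries point at the attribute headers and that the header is 4 bytes, and it only restates the numbers from the output above.

```python
NLA_HDRLEN = 4  # assumed size of the attribute header

tb_pad = 0x773A7490     # tb[TCA_PAD] from the output above
tb_stats2 = 0x773A7494  # tb[TCA_STATS2] from the output above

gap = tb_stats2 - tb_pad
stats2_payload = tb_stats2 + NLA_HDRLEN

print("gap between headers:", gap)
print("pad header address % 8:", tb_pad % 8)
print("stats payload address % 8:", stats2_payload % 8)
```

If those assumptions hold, the gap is exactly one header with no payload, and the stats payload lands on an 8-byte boundary, which is what a pad attribute placed directly before TCA_STATS2 would produce.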

Also, if you agree, I’d like to see this tested on at least 32-bit ARM (George, or my Raspi?) and 64-bit Intel. What do you think?