[Cake] Cake on openwrt - falling behind

Toke Høiland-Jørgensen toke at toke.dk
Mon Jul 2 15:33:09 EDT 2018


Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk> writes:

>> On 2 Jul 2018, at 17:59, Kevin Darbyshire-Bryant via Cake <cake at lists.bufferbloat.net> wrote:
>> 
>> 
>> From: Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk>
>> Subject: Re: [Cake] Cake on openwrt - falling behind
>> Date: 2 July 2018 at 17:59:36 BST
>> To: Toke Høiland-Jørgensen <toke at toke.dk>
>> Cc: Pete Heist <pete at heistp.net>, Cake List <cake at lists.bufferbloat.net>
>> 
>> 
>> 
>> 
>>> On 2 Jul 2018, at 17:14, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>> 
>>> Pete Heist <pete at heistp.net> writes:
>>> 
>>>>> On Jul 2, 2018, at 2:03 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>>>> 
>>>>> Pete Heist <pete at heistp.net> writes:
>>>>> 
>>>>>> But, um, I find it curious that tb[TCA_PAD] has valid looking values
>>>>>> in it, and if I just go:
>>>>>> 
>>>>>> tb[TCA_STATS2] = tb[TCA_PAD]
>>>>>> 
>>>>>> right after parse_rtattr is called, I start getting tin stats printed
>>>>>> that look valid. There should be zeroes or invalid values at
>>>>>> tb[TCA_PAD], as that’s just supposed to be padding, right?
>>>>> 
>>>>> Hmm, that's interesting. Sounds like you are on the right track. What
>>>>> are the numerical values of TCA_STATS2 and TCA_PAD in the kernel and
>>>>> iproute2, respectively?
>>>> 
>>>> I never would’ve guessed they could be different when compiled at once
>>>> :) but true, there are at least five different versions of rtnetlink.h
>>>> under build_dir based on their md5sums, and I see one belongs to
>>>> iproute2-full. In all of those, it looks in the source like
>>>> TCA_STATS2=7 and TCA_PAD=9.
>>> 
>>> Aha! I think I figured out what is going on:
>>> 
>>> The gen_stats facility will add an nlattr header at the beginning of the
>>> qdisc stats, which is the toplevel TLV that contains all stats (and that
>>> we put our stats inside). It stores a reference to this header, and when
>>> all the per-qdisc callbacks have finished adding their stats, it goes
>>> back and fixes up the length of the containing header.
>>> 
>>> The problem is that on architectures that need padding, the padding TLV
>>> is added *first*, which means that the nlattr pointer that is stored
>>> before the callbacks are performed points to the padding TLV and not the
>>> stats TLV. And so, when the header is fixed up, the result (from the
>>> parser's perspective) is just a very big padding TLV.
>>> 
>>> The options TLV is before the stats TLV, so the bug only occurs if the
>>> options happen to have a length that means the stats will need padding.
>>> Which is why messing with the number of options "fixes" the bug.
>>> 
>>> Could you try applying the patch below (to the kernel) and see if that
>>> resolves the issue, please?
>>> 
>>> -Toke
>>> 
>>> 
>>> @@ -77,8 +77,12 @@ gnet_stats_start_copy_compat(struct sk_buff *skb, int type, int tc_stats_type,
>>> 		d->lock = lock;
>>> 		spin_lock_bh(lock);
>>> 	}
>>> -	if (d->tail)
>>> -		return gnet_stats_copy(d, type, NULL, 0, padattr);
>>> +	if (d->tail) {
>>> +		int ret = gnet_stats_copy(d, type, NULL, 0, padattr);
>>> +		if (!ret)
>>> +			d->tail = ((struct nlattr *)skb_tail_pointer(skb)) - 1;
>>> +		return ret;
>>> +	}
>>> 
>>> 	return 0;
>>> }
>> o/cake
>
> Would you like me to bash this into openwrt? Plans for this to go
> upstream? Perhaps now take off list or onto IRC?

I'll submit a patch for netdev. If you want to pack it up and submit it
for openwrt, that would be helpful! :)

-Toke


More information about the Cake mailing list