From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.toke.dk (mail.toke.dk [52.28.52.200]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 3DF483BA8E for ; Mon, 2 Jul 2018 12:13:56 -0400 (EDT) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=toke.dk; s=20161023; t=1530548034; bh=pO2Ei+46PTxOitMmUQ5FfBrELmRqItZY8pmcPF6j1yk=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=lF/DmOpnDU2YfnQ884nRDbtRQSRFY0m2e9L4iYufEg27aCSIsp3IVhEU0kix2zUPK FsvYHe3FhkGY0EhEw3ZtZtkIVfRdl3Ns9Gmr6ZEn59FiC2jkeFjw+L17ZezGOrxKEA axT+FbG2yAS6QDMHmObxc5n9CqwEvtcf4yKMrU2azUWmmymJKpBrHtzgmA1sF62cly l/aSgqVwuX55FpazXMLPgmWZcS/MGmFn+mwPqSqCHNFwsrq3ESSFw4rBCs5xS5vk49 asSwGYbEDDDZffZCb3YfkVc+P8X9wv6OR3Lv/pNGU//qNT6Ok6w7VpubNSaiZ2uJrf u/RakGtyifmFA== To: Pete Heist Cc: Kevin Darbyshire-Bryant , Cake List In-Reply-To: References: <6DF9A5E0-EFD5-4519-9889-BC0A7B9BD48E@darbyshire-bryant.me.uk> <1A8BA286-6B31-4581-86C9-6855AC28C245@heistp.net> <673EAD3F-AB09-4B90-88BB-5DCE0BD65534@heistp.net> <6FE8D434-01BE-41A1-BD6B-EFFD67AC8784@heistp.net> <94C9790F-E9BC-4D59-9845-17C305E4B910@darbyshire-bryant.me.uk> <17AF79A0-0213-44E3-95B9-62795A644A47@heistp.net> <87lgatj13k.fsf@toke.dk> Date: Mon, 02 Jul 2018 18:14:04 +0200 X-Clacks-Overhead: GNU Terry Pratchett Message-ID: <87fu11ipir.fsf@toke.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Cake] Cake on openwrt - falling behind X-BeenThere: cake@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: Cake - FQ_codel the next generation List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Jul 2018 16:13:56 -0000 Pete Heist writes: >> On Jul 2, 2018, at 2:03 PM, Toke H=C3=B8iland-J=C3=B8rgensen wrote: >>=20 >> Pete Heist writes: >>=20 >>> But, um, I find it curious that tb[TCA_PAD] has valid looking values >>> in it, and if I just go: >>>=20 >>> tb[TCA_STATS2] =3D tb[TCA_PAD] >>>=20 >>> right after parse_rtattr is called, I start getting tin stats printed >>> that look valid. There should be zeroes or invalid values at >>> tb[TCA_PAD], as that=E2=80=99s just supposed to be padding, right? >>=20 >> Hmm, that's interesting. Sounds like you are on the right track. What >> are the numerical values of TCA_STATS2 and TCA_PAD in the kernel and >> iproute2, respectively? > > I never would=E2=80=99ve guessed they could be different when compiled at= once > :) but true, there are at least five different versions of rtnetlink.h > under build_dir based on their md5sums, and I see one belongs to > iproute2-full. In all of those, it looks in the source like > TCA_STATS2=3D7 and TCA_PAD=3D9. Aha! I think I figured out what is going on: The gen_stats facility will add an nlattr header at the beginning of the qdisc stats, which is the toplevel TLV that contains all stats (and that we put our stats inside). It stores a reference to this header, and when all the per-qdisc callbacks have finished adding their stats, it goes back and fixes up the length of the containing header. The problem is that on architectures that need padding, the padding TLV is added *first*, which means that the nlattr pointer that is stored before the callbacks are performed points to the padding TLV and not the stats TLV. And so, when the header is fixed up, the result (from the parser's perspective) is just a very big padding TLV. The options TLV is before the stats TLV, so the bug only occurs if the options happen to have a length that means the stats will need padding. Which is why messing with the number of options "fixes" the bug. Could you try applying the patch below (to the kernel) and see if that resolves the issue, please? -Toke @@ -77,8 +77,12 @@ gnet_stats_start_copy_compat(struct sk_buff *skb, int ty= pe, int tc_stats_type, d->lock =3D lock; spin_lock_bh(lock); } - if (d->tail) - return gnet_stats_copy(d, type, NULL, 0, padattr); + if (d->tail) { + int ret =3D gnet_stats_copy(d, type, NULL, 0, padattr); + if (!ret) + d->tail =3D ((struct nlattr *)skb_tail_pointer(skb)) - 1; + return ret; + } =20 return 0; }