From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-x241.google.com (mail-wr0-x241.google.com [IPv6:2a00:1450:400c:c0c::241]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 499423BA8E for ; Mon, 2 Jul 2018 14:24:44 -0400 (EDT) Received: by mail-wr0-x241.google.com with SMTP id a12-v6so16471709wro.1 for ; Mon, 02 Jul 2018 11:24:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=heistp.net; s=google; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=f37Y5nhtlMPNnzv6h/6+H0ZQpQFRaW8KphutcFs4D8E=; b=cC3vPYGWXJVXZEoBeTjP1ULUob65zg+eP2cPsKfn3RiOtiDEKmnzs8YYVJ0qa2AS4/ wEmZ/IjYSifNiM0F5wYcMBfcWxlsngKsnu+Sl7xxAv5pKuQPbCUu0o5vsvuPC9A22sMz SnB10zNhr0qRtKJBU91SQ+Gk0UlRmsVhrQvrBsr1UqKHgfGJZWoZCz4dZVIgaRZOYtwZ ZU+R2TpUoRQgp1UgP6/QXm90xTCautIQBI1TWzJ7kIVj3tuqTo9mM7XkO6FQEpEFfyUe s6k7STZh3hIZQUSAK3siZLhaHsMysLuE0rcwaX5/rBEuNj0yZz5zmNCb7IgrVHNZdSsu RJ+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=f37Y5nhtlMPNnzv6h/6+H0ZQpQFRaW8KphutcFs4D8E=; b=UXoAofxEx+hyzd8AoDjSE2QAvfqhT7rpdM0iCPmDi0BLKuHFru0W50Y6LvOUBOA7IZ YYPZdvTqr+NEdcw1/A+l0sXgfjOcjbg1AgeljoJ2A1kPBFm3FpzqUlGiwqgmiBNFkf4O xljb35kNos/+BYd/KmdhFqrRHmS96d0iQ4aEZhH1D+gf8UeDhuL+bqi56pkuwZbR0LkP M6N84n0MAolfHbOA4sKhCz4xb/t/WOO4dp2la2lHAdXdgZ/clgw7J68y5pKFiplpgIAL bzQ5/YsJPMGy0DwWWTPzIoJXQ3B4gaBfdAjSm6tAu6+dI4XpBEa1UoHqpSJ1fAvyi2bP 3ctg== X-Gm-Message-State: APt69E18sLq0b3OQuRI54wqqXIDCW9K0ZP7N6d+VA0PhQgg5D5Lq9V4h e2+515yB9/yfOwxAtXSALePI6ON6RTk= X-Google-Smtp-Source: AAOMgpcfgV9K3CxNhZ0wGggw8t5sFIj8gxGOrTYDOxt90g3/JENqpCmgbNVM8UTwbL5f2DqUX8Ks0Q== X-Received: by 2002:adf:be0f:: with SMTP id n15-v6mr12462242wrh.267.1530555883297; Mon, 02 Jul 2018 11:24:43 -0700 (PDT) Received: from tron.luk.heistp.net (h-1169.lbcfree.net. [185.193.85.130]) by smtp.gmail.com with ESMTPSA id e126-v6sm15715335wmd.43.2018.07.02.11.24.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 02 Jul 2018 11:24:42 -0700 (PDT) From: Pete Heist Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_E2EC8718-C2D9-4915-A151-ECB8D06E4A76" Mime-Version: 1.0 (Mac OS X Mail 11.4 \(3445.8.2\)) Date: Mon, 2 Jul 2018 20:24:41 +0200 In-Reply-To: <8815D90E-DEAB-4211-B4B4-7058178DEA47@heistp.net> Cc: Kevin Darbyshire-Bryant , Cake List To: =?utf-8?Q?Toke_H=C3=B8iland-J=C3=B8rgensen?= References: <6DF9A5E0-EFD5-4519-9889-BC0A7B9BD48E@darbyshire-bryant.me.uk> <1A8BA286-6B31-4581-86C9-6855AC28C245@heistp.net> <673EAD3F-AB09-4B90-88BB-5DCE0BD65534@heistp.net> <6FE8D434-01BE-41A1-BD6B-EFFD67AC8784@heistp.net> <94C9790F-E9BC-4D59-9845-17C305E4B910@darbyshire-bryant.me.uk> <17AF79A0-0213-44E3-95B9-62795A644A47@heistp.net> <87lgatj13k.fsf@toke.dk> <87fu11ipir.fsf@toke.dk> <8815D90E-DEAB-4211-B4B4-7058178DEA47@heistp.net> X-Mailer: Apple Mail (2.3445.8.2) Subject: Re: [Cake] Cake on openwrt - falling behind X-BeenThere: cake@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: Cake - FQ_codel the next generation List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Jul 2018 18:24:44 -0000 --Apple-Mail=_E2EC8718-C2D9-4915-A151-ECB8D06E4A76 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Jul 2, 2018, at 7:04 PM, Pete Heist wrote: >=20 >=20 >=20 >> On Jul 2, 2018, at 6:14 PM, Toke H=C3=B8iland-J=C3=B8rgensen = > wrote: >>=20 >> Aha! I think I figured out what is going on: >>=20 >> The gen_stats facility will add an nlattr header at the beginning of = the >> qdisc stats, which is the toplevel TLV that contains all stats (and = that >> we put our stats inside). It stores a reference to this header, and = when >> all the per-qdisc callbacks have finished adding their stats, it goes >> back and fixes up the length of the containing header. >>=20 >> The problem is that on architectures that need padding, the padding = TLV >> is added *first*, which means that the nlattr pointer that is stored >> before the callbacks are performed points to the padding TLV and not = the >> stats TLV. And so, when the header is fixed up, the result (from the >> parser's perspective) is just a very big padding TLV. >>=20 >> The options TLV is before the stats TLV, so the bug only occurs if = the >> options happen to have a length that means the stats will need = padding. >> Which is why messing with the number of options "fixes" the bug. >>=20 >> Could you try applying the patch below (to the kernel) and see if = that >> resolves the issue, please? >=20 > Awesome Toke! It looks like from Kevin=E2=80=99s email that it works = for him, but it didn=E2=80=99t work for me the first time around. This = may have to do with how I added the patch as I=E2=80=99m still not that = familiar with OpenWRT=E2=80=99s build system (first kernel patch I = tried). I wasn=E2=80=99t sure if it should go into generic or platform, = for one, so I tried generic=E2=80=A6is that right? Ok, I got it to work after re-flashing with tftp. :) It looks like the = OM2P is not always successfully performing sysupgrades, perhaps due to = its limited memory (64M), but I=E2=80=99m not sure. I still have my debugging in place and do still have one question. The = pointer in TCA_STATS2 is now valid, but there is still a pointer value = in TCA_PAD, which is pointing to a place 32 bits before TCA_STATS2. Is = that expected? root@OpenWrt:/tmp# tc -s -d qdisc show dev eth0 TCA_STATS2 val=3D00000007 TCA_PAD val=3D00000009 tb[TCA_UNSPEC]=3D00000000 tb[TCA_KIND]=3D773a73fc tb[TCA_OPTIONS]=3D773a7408 tb[TCA_STATS]=3D773a772c tb[TCA_XSTATS]=3D00000000 tb[TCA_RATE]=3D00000000 tb[TCA_FCNT]=3D00000000 tb[TCA_STATS2]=3D773a7494 tb[TCA_STAB]=3D00000000 tb[TCA_PAD]=3D773a7490 tb[TCA_DUMP_INVISIBLE]=3D00000000 tb[TCA_CHAIN]=3D00000000 tb[TCA_HW_OFFLOAD]=3D00000000 tb[TCA_INGRESS_BLOCK]=3D00000000 tb[TCA_EGRESS_BLOCK]=3D00000000 qdisc cake 8001: root refcnt 2 bandwidth unlimited diffserv3 = triple-isolate rtt 100.0ms raw overhead 0=20 tca_stats 2000320300 tca_stats2 2000319636 tca_xstats 0 =E2=80=A6 Also, if you agree, I=E2=80=99d like to see this tested on 32-bit ARM = (George, or my Raspi?) and 64-bit Intel, at least. What do you think?= --Apple-Mail=_E2EC8718-C2D9-4915-A151-ECB8D06E4A76 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On Jul 2, 2018, at 7:04 PM, Pete Heist <pete@heistp.net> = wrote:



On Jul 2, 2018, at 6:14 PM, Toke H=C3=B8iland-J=C3=B8rgensen = <toke@toke.dk> = wrote:

Aha! I think I figured out what is going on:

The gen_stats facility will add = an nlattr header at the beginning of the
qdisc stats, which is the toplevel TLV that contains all = stats (and that
we put our stats inside). It stores a reference to this = header, and when
all the per-qdisc callbacks have finished adding their stats, = it goes
back and = fixes up the length of the containing header.

The problem is that on = architectures that need padding, the padding TLV
is added *first*, which means = that the nlattr pointer that is stored
before the callbacks are performed points to the padding TLV = and not the
stats TLV. = And so, when the header is fixed up, the result (from the
parser's perspective) is just a = very big padding TLV.

The options TLV is before the stats TLV, so the bug only = occurs if the
options = happen to have a length that means the stats will need = padding.
Which is why = messing with the number of options "fixes" the bug.

Could you try applying the patch = below (to the kernel) and see if that
resolves the issue, please?

Awesome Toke! It looks like from = Kevin=E2=80=99s email that it works for him, but it didn=E2=80=99t work = for me the first time around. This may have to do with how I added the = patch as I=E2=80=99m still not that familiar with OpenWRT=E2=80=99s = build system (first kernel patch I tried). I wasn=E2=80=99t sure if it = should go into generic or platform, for one, so I tried generic=E2=80=A6is= that right?

Ok, I got it to work after re-flashing with tftp. :) It looks = like the OM2P is not always successfully performing sysupgrades, perhaps = due to its limited memory (64M), but I=E2=80=99m not sure.

I still have my = debugging in place and do still have one question. The pointer in = TCA_STATS2 is now valid, but there is still a pointer value in TCA_PAD, = which is pointing to a place 32 bits before TCA_STATS2. Is that = expected?

root@OpenWrt:/tmp# tc -s -d qdisc show dev eth0
TCA_STATS2 val=3D00000007
TCA_PAD = val=3D00000009
tb[TCA_UNSPEC]=3D00000000
tb[TCA_KIND]=3D773a73fc
tb[TCA_OPTIONS]=3D773a7408
tb[TCA_STATS]=3D773a772c
tb[TCA_XSTATS]=3D00000000
tb[TCA_RATE]=3D00000000
tb[TCA_FCNT]=3D00000000tb[TCA_STATS2]=3D773a7494
tb[TCA_STAB]=3D00000000
tb[TCA_PAD]=3D773a7490tb[TCA_DUMP_INVISIBLE]=3D00000000
tb[TCA_CHAIN]=3D00000000
tb[TCA_HW_OFFLOAD]=3D00000000
tb[TCA_INGRESS_BLOCK]=3D00000000
tb[TCA_EGRESS_BLOCK]=3D00000000
qdisc cake = 8001: root refcnt 2 bandwidth unlimited diffserv3 triple-isolate rtt = 100.0ms raw overhead 0 
tca_stats 2000320300 = tca_stats2 2000319636 tca_xstats 0
=E2=80=A6

Also, if you agree, = I=E2=80=99d like to see this tested on 32-bit ARM (George, or my Raspi?) = and 64-bit Intel, at least. What do you think?
= --Apple-Mail=_E2EC8718-C2D9-4915-A151-ECB8D06E4A76--