On 14 Dec 2019, at 11:56, Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> wrote:



On 14 Dec 2019, at 10:35, Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> wrote:



On 14 Dec 2019, at 10:01, Thibaut <hacks@slashdirt.org> wrote:




That's extremely odd.  That commit should only affect traffic carrying the LE DSCP, which is not the default.

Perhaps it was not actually the code change, but triggering a rebuild of the module?

No. I tried with and without the commit multiple times: each time I built, installed, manually unloaded the module, made sure it was unloaded, then loaded the new build. I was being careful because I noticed the module doesn’t print anything in dmesg when it’s loaded (feature request: print the current build version when loading, that would be most helpful in these circumstances).

There is absolutely no doubt that on my router CAKE is broken with this commit, and works without it.

Here’s tc -s output with the broken version:

tc -s qdisc show dev wan
qdisc cake 800f: root refcnt 2 bandwidth 1200Kbit diffserv3 dual-srchost nat nowash ack-filter split-gso rtt 100.0ms atm overhead 48 no-sce
Sent 7711782 bytes 5454 pkt (dropped 144, overlimits 15493 requeues 0)
backlog 1616b 2p requeues 0
memory used: 140864b of 4Mb
capacity estimate: 1200Kbit
min/max network layer size:           40 /    1500
min/max overhead-adjusted size:      106 /    1749
average network hdr offset:           14

                Bulk  Best Effort        Voice
thresh         75Kbit     1200Kbit      300Kbit
target        242.2ms       15.1ms       60.6ms
interval      484.5ms      110.1ms      155.6ms
pk_delay          0us       60.0ms       26.8ms
av_delay          0us       36.7ms        2.0ms
sp_delay          0us       17.8ms        1.7ms
backlog            0b        1514b         102b
pkts                0         5467          133
bytes               0      7913444        17970
way_inds            0            0            0
way_miss            0           44            2
way_cols            0            0            0
sce                 0            0            0
marks               0            0            0
drops               0          144            0
ack_drop            0            0            0
sp_flows            0            0            1
bk_flows            0            1            0
un_flows            0            0            0
max_len             0         3028         1118
quantum           300          300          300

qdisc ingress ffff: parent ffff:fff1 ----------------
Sent 218759 bytes 3710 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Here’s the same output with the unbroken version:

tc -s qdisc show dev wan
qdisc cake 8011: root refcnt 2 bandwidth 1200Kbit diffserv3 dual-srchost nat nowash ack-filter split-gso rtt 100.0ms atm overhead 48 no-sce
Sent 3342962 bytes 2328 pkt (dropped 110, overlimits 6422 requeues 0)
backlog 4542b 3p requeues 0
memory used: 83328b of 4Mb
capacity estimate: 1200Kbit
min/max network layer size:           40 /    1500
min/max overhead-adjusted size:      106 /    1749
average network hdr offset:           14

                Bulk  Best Effort        Voice
thresh         75Kbit     1200Kbit      300Kbit
target        242.2ms       15.1ms       60.6ms
interval      484.5ms      110.1ms      155.6ms
pk_delay          0us       56.8ms        9.9ms
av_delay          0us       36.7ms        854us
sp_delay          0us        9.4ms        680us
backlog            0b        4542b           0b
pkts                0         2403           38
bytes               0      3509764         4280
way_inds            0            0            0
way_miss            0           17            1
way_cols            0            0            0
sce                 0            0            0
marks               0            0            0
drops               0          110            0
ack_drop            0            0            0
sp_flows            0            0            1
bk_flows            0            1            0
un_flows            0            0            0
max_len             0         1514          294
quantum           300          300          300

qdisc ingress ffff: parent ffff:fff1 ----------------
Sent 106781 bytes 1896 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0


HTH
Thibaut

Which shows most traffic going through Best Effort, whereas the LE DSCP would put it in Bulk, so at this point I’m failing to see the connection between that commit (which changes 3 lookup tables) and the behaviour change.
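
For anyone wanting to compare the tin allocation between captures without eyeballing the columns, here is a rough Python sketch. It is not part of CAKE or iproute2; the helper names `parse_tins` and `packet_share` are made up for illustration, and it assumes the per-tin table keeps the layout pasted above (a header row of tin names directly before the `thresh` row, columns separated by two or more spaces). The sample text is a few rows copied from the "broken" capture.

```python
import re

# A few rows copied from the "broken" capture in this thread.
SAMPLE = """\
                Bulk  Best Effort        Voice
thresh         75Kbit     1200Kbit      300Kbit
pkts                0         5467          133
drops               0          144            0

qdisc ingress ffff: parent ffff:fff1 ----------------
"""

def parse_tins(text):
    """Return {tin name: {stat name: raw string value}} from `tc -s` text.

    Assumes the tin names sit on the non-empty line just before `thresh`.
    """
    # Split each non-empty line on runs of 2+ spaces so "Best Effort" stays whole.
    rows = [re.split(r"\s{2,}", ln.strip()) for ln in text.splitlines() if ln.strip()]
    for i, fields in enumerate(rows):
        if fields[0] == "thresh" and i > 0:
            names = rows[i - 1]
            break
    else:
        raise ValueError("no per-tin table found")
    table = {name: {} for name in names}
    for fields in rows[i:]:
        # Stop at the first line that is not "label + one value per tin".
        if len(fields) != len(names) + 1:
            break
        for name, value in zip(names, fields[1:]):
            table[name][fields[0]] = value
    return table

def packet_share(table):
    """Fraction of packets seen by each tin, based on the `pkts` row."""
    counts = {name: int(stats["pkts"]) for name, stats in table.items()}
    total = sum(counts.values())
    return {name: (n / total if total else 0.0) for name, n in counts.items()}

if __name__ == "__main__":
    for tin, share in packet_share(parse_tins(SAMPLE)).items():
        print(f"{tin}: {share:.1%}")
```

Running the same two functions over the "unbroken" capture should give a similarly lopsided split towards Best Effort, which is the point made above: neither capture shows traffic landing in Bulk.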

Can we see output from ’tc -s qdisc’ for the non-broken case please?

Brain fart!  The 2 different versions are there and we see no difference in traffic/tin allocation.  However, could we see the ifb4wan instances of cake for both b0rken and unb0rken cases please?

The plot thickens. I was eventually able to reproduce the same buggy behavior without the HEAD commit, *sigh*

It appears that the bug happens randomly between consecutive module loads/unloads. It also appears that once the module is loaded in a “working state” it keeps working fine.

I’m wondering if this could be a “use of uninitialized data” type of bug.

Still digging.
Thibaut