From: Sebastian Moeller
To: Jonathan Morton
Cc: Ryan Mounce, Cake List
Date: Sat, 6 Jan 2018 23:46:32 +0100
Subject: Re: [Cake] overheads or rate calculation changed?
List-Id: Cake - FQ_codel the next generation

Hi Jonathan,

> On Jan 6, 2018, at 21:44, Jonathan Morton wrote:
>
>> On 23 Dec, 2017, at 11:03 pm, Sebastian Moeller wrote:
>>
>> just had a look for hard_header_len in the linux kernel:
>> linux/include/linux/netdevice.h:
>>  * @hard_header_len: Maximum hardware header length.
>>  * @min_header_len: Minimum hardware header length
>>
>> this seems to corroborate our observation that hard_header_len is not a veridical representation of the actual hardware header length, so I assume the values cake returns are actually true. It also indicates that except for pure ethernet interfaces hard_header_len is _not_ the right parameter to evaluate for what cake is evaluating it for...
>
> Turns out min_header_len is always either zero or 14, and is scarcely used anywhere. It seems to be completely ignored by non-Ethernet interfaces.

Yep, min_header_len does not sound like it offers the guarantees we want either.

>
> However, it appears that the correct value is stored implicitly in each skb, and can be obtained through skb_network_offset(skb) - that's the offset from the beginning of the packet to the IP header (assuming it's an IP packet).

That sounds great.
> This suggests to me that the via-ethernet keyword can be retired, in favour of unconditionally subtracting that value from each packet length before applying overhead compensation, and setting the *default* overhead compensation to hard_header_len (to emulate the current default behaviour).

No, the current behaviour of cake is an outlier; no other qdisc does this:

qdisc cake 8009: dev pppoe-wan root refcnt 2 bandwidth 9545Kbit diffserv3 dual-srchost nat rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 26 mpu 64
 Sent 673788150 bytes 5116677 pkt (dropped 535, overlimits 721725 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 224192b of 4Mb
 capacity estimate: 9545Kbit
                 Bulk  Best Effort        Voice
  thresh    596560bit     9545Kbit     2386Kbit
  target       30.5ms        5.0ms        7.6ms
  interval    125.5ms      100.0ms      102.6ms
  pk_delay        0us        1.0ms        282us
  av_delay        0us         47us         20us
  sp_delay        0us          7us          9us
  pkts              0      5019757        97455
  bytes             0    655197066     19357299
  way_inds          0        81110            7
  way_miss          0       432498         1536
  way_cols          0            0            0
  drops             0          535            0
  marks             0           58            0
  ack_drop          0            0            0
  sp_flows          0            0            0
  bk_flows          0            1            0
  un_flows          0            0            0
  max_len           0         1492          576

hard_header_len is 26, which fits with the fact that the same packet will first see 8 bytes of PPPoE overhead added, then 4 bytes of VLAN and 14 bytes of kernel-ethernet, for a total of 26. But as you can see from max_len, the packet size indicates that the kernel auto-added nothing (MTU 1500 - 8 bytes PPPoE overhead = 1492).

On the ingress side (of the same interface!)
I see:

qdisc cake 800a: dev ifb4pppoe-wan root refcnt 2 bandwidth 46246Kbit diffserv3 dual-dsthost nat ingress rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 14 mpu 64
 Sent 17412911204 bytes 13164414 pkt (dropped 3720, overlimits 19196199 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 490048b of 4Mb
 capacity estimate: 46246Kbit
                 Bulk  Best Effort        Voice
  thresh     2890Kbit    46246Kbit    11561Kbit
  target        6.3ms        5.0ms        5.0ms
  interval    101.3ms      100.0ms      100.0ms
  pk_delay        0us        458us        165us
  av_delay        0us         44us         22us
  sp_delay        0us         15us         14us
  pkts              0     13088522        79612
  bytes             0  17401576669     16506510
  way_inds          0        80004            0
  way_miss          0       416460           17
  way_cols          0           83            0
  drops             0         3720            0
  marks             0         1151            0
  ack_drop          0            0            0
  sp_flows          0            0            0
  bk_flows          0            1            0
  un_flows          0            0            0
  max_len           0         1492         1492

So for the ifb the kernel ignored the PPPoE and VLAN overhead, but again max_len tells us that the kernel did not add anything at all.

Traditional "tc stab" would in both cases have done the correct thing with the requested "overhead 34" (albeit accidentally, as the kernel simply did not "tamper" with skb->len).
However, cake accidentally accounted for only 8 bytes on egress (34 - 26) and 20 bytes on ingress (34 - 14) instead. This is less than ideal, especially since the mis-accounting is different for the two directions of the same interface. (Disclaimer: sqm-scripts currently only allows configuring a single overhead value that is indiscriminately applied to both ingress and egress. This really does not work well with different pre-adjustments performed on the two directions; you could argue that this should be fixed in sqm-scripts ;) )
I think cake should offer one mode in which it behaves as all other qdiscs currently do and performs no auto-correction at all, and one mode in which it corrects for the right amount, but keeping the current cake behaviour will not help anybody... But most likely I misunderstood your proposal in that regard.
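To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes that cake with via-ethernet charges skb->len minus hard_header_len plus the configured overhead, and that tc stab simply adds the overhead to skb->len. The function names are hypothetical, and the MPU and any ATM/PTM framing are ignored.

```python
# Hypothetical model of the two accounting schemes discussed above.
# This is a sketch for illustration, not the kernel's actual code.

def cake_accounted_len(skb_len: int, overhead: int, hard_header_len: int) -> int:
    """Assumed current cake behaviour: subtract hard_header_len, add overhead."""
    return skb_len - hard_header_len + overhead


def stab_accounted_len(skb_len: int, overhead: int) -> int:
    """Assumed tc stab behaviour: add the overhead to skb->len as-is."""
    return skb_len + overhead


SKB_LEN = 1492   # max_len from both tc outputs
OVERHEAD = 34    # "overhead 34" as configured

egress = cake_accounted_len(SKB_LEN, OVERHEAD, hard_header_len=26)
ingress = cake_accounted_len(SKB_LEN, OVERHEAD, hard_header_len=14)
stab = stab_accounted_len(SKB_LEN, OVERHEAD)

# Net bytes added on top of skb->len in each case.
print("cake egress: ", egress - SKB_LEN)
print("cake ingress:", ingress - SKB_LEN)
print("tc stab:     ", stab - SKB_LEN)
```

Under these assumptions the net adjustment is 8 bytes on egress, 20 bytes on ingress, and the full 34 bytes with tc stab.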
>
> What we would lose that way is the present capability to add an overhead to the raw packet length as reported by Linux.

That is what we could aim to keep (at least with the raw keyword or a new rawoverhead keyword); the least we need to do is keep reporting how many bytes were auto-adjusted (only after dumping hard_header_len in the output did it become clear to me that cake's assumptions about hard_header_len only held true for plain ethernet interfaces).
Or put differently: if we can run with the new mode you propose, keep reporting the amount of auto-adjustment (like what is currently reported as hard_header_len, but with a more appropriate name) and see that it does the right thing, we could remove the old way...

Question: if the user does not specify an overhead at all (via keyword or explicitly as overhead NN), will even the proposed new mode do nothing but simply take skb->len? In that case, keeping the "raw" behaviour as cake currently has it might not be required.

> However, since that doesn't reliably correspond to an actual packet length on the wire, that's not really a useful capability to keep, except for direct comparison with other overhead compensation methods.

Yes, we might keep that capability to easily compare the different methods; as long as the kernel has 3 different methods, we might as well use them to "calibrate"/sanity-check each other ;) (but my thinking is more like: if you figure out the correct solution for the accounting in cake, it might make sense to teach stab the same trick).

>
> Comments?
For all the words above, I would really only like to see three things with regard to the auto overhead accounting:
1) report what cake does routinely in the output of tc -s qdisc (including the actual amount of the adjustment)
2) do the right thing, so if at all reliably possible, correct for the non-IP part of skb->len (that is not directly caused by running ATM/PTM 53/48 or 65/64 expansion)
3) allow the user some way to disable that auto-accounting (but I think only doing that if no overhead keyword of any kind was specified is fine with me; that way cake still stays compatible with the generic stab method)

Thanks for doing this great work!

Best Regards

>
> - Jonathan Morton
>
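One possible reading of the three points above, sketched in Python. The names account and Accounting are hypothetical, network_offset stands in for what skb_network_offset(skb) would return in the kernel, and the example values are illustrative only; this is a model under those assumptions, not cake's code.

```python
from typing import NamedTuple, Optional


class Accounting(NamedTuple):
    accounted_len: int    # length the shaper would charge for the packet
    auto_adjustment: int  # bytes removed automatically, for reporting (point 1)


def account(skb_len: int, network_offset: int,
            overhead: Optional[int] = None) -> Accounting:
    """Hypothetical model of points 1-3; not cake's actual code."""
    if overhead is None:
        # Point 3: no overhead keyword given, so leave skb->len untouched.
        return Accounting(skb_len, 0)
    # Point 2: strip everything in front of the IP header, then add the
    # user-requested per-packet overhead.
    return Accounting(skb_len - network_offset + overhead, network_offset)


# Illustrative values only: a 1514-byte Ethernet frame with a 14-byte header.
print(account(1514, 14, overhead=34))
print(account(1514, 14))
```

Returning the adjustment next to the accounted length is what would allow tc -s qdisc to report how many bytes were auto-corrected.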