[Cake] overheads or rate calculation changed?
Sebastian Moeller
moeller0 at gmx.de
Sat Jan 6 17:46:32 EST 2018
Hi Jonathan,
> On Jan 6, 2018, at 21:44, Jonathan Morton <chromatix99 at gmail.com> wrote:
>
>> On 23 Dec, 2017, at 11:03 pm, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>
>> just had a look for hard_header_len in the linux kernel:
>> linux/include/linux/netdevice.h:
>> * @hard_header_len: Maximum hardware header length.
>> * @min_header_len: Minimum hardware header length
>>
>> this seems to corroborate our observation that hard_header_len is not a veridical representation of the actual hardware header length, so I assume the values cake returns are actually true. It also indicates that except for pure ethernet interfaces hard_header_len is _not_ the right parameter to evaluate for what cake is evaluating it for...
>
> Turns out min_header_len is always either zero or 14, and is scarcely used anywhere. It seems to be completely ignored by non-Ethernet interfaces.
Yep, min_header_len does not sound like it offers the guarantees we want either.
>
> However, it appears that the correct value is stored implicitly in each skb, and can be obtained through skb_network_offset(skb) - that's the offset from the beginning of the packet to the IP header (assuming it's an IP packet).
That sounds great.
> This suggests to me that the via-ethernet keyword can be retired, in favour of unconditionally subtracting that value from each packet length before applying overhead compensation, and setting the *default* overhead compensation to hard_header_len (to emulate the current default behaviour).
No, the current behaviour of cake is an outlier; no other qdisc does this. Here is what I see on the egress side of my PPPoE interface:
qdisc cake 8009: dev pppoe-wan root refcnt 2 bandwidth 9545Kbit diffserv3 dual-srchost nat rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 26 mpu 64
Sent 673788150 bytes 5116677 pkt (dropped 535, overlimits 721725 requeues 0)
backlog 0b 0p requeues 0
memory used: 224192b of 4Mb
capacity estimate: 9545Kbit
                   Bulk  Best Effort        Voice
thresh        596560bit     9545Kbit     2386Kbit
target           30.5ms        5.0ms        7.6ms
interval        125.5ms      100.0ms      102.6ms
pk_delay            0us        1.0ms        282us
av_delay            0us         47us         20us
sp_delay            0us          7us          9us
pkts                  0      5019757        97455
bytes                 0    655197066     19357299
way_inds              0        81110            7
way_miss              0       432498         1536
way_cols              0            0            0
drops                 0          535            0
marks                 0           58            0
ack_drop              0            0            0
sp_flows              0            0            0
bk_flows              0            1            0
un_flows              0            0            0
max_len               0         1492          576
hard_header_len is 26, which fits with the fact that the same packet will first see 8 bytes of PPPoE overhead added, then a 4-byte VLAN tag and a 14-byte kernel Ethernet header, for a total of 26. But as you can see from max_len, the packet size indicates that the kernel auto-added nothing (MTU 1500 - 8 bytes PPPoE overhead = 1492).
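To make the arithmetic explicit, here is a small Python sketch (not kernel code; the constant and function names are made up purely for illustration) that adds up the headers on the egress path and compares them with the reported hard_header_len and max_len:

```python
# Illustrative header sizes in bytes, taken from the explanation above.
PPPOE_OVERHEAD = 8      # PPPoE encapsulation overhead
VLAN_TAG = 4            # VLAN tag
ETHERNET_HEADER = 14    # kernel Ethernet header
LINK_MTU = 1500         # MTU of the underlying link


def stacked_header_len():
    """Sum of all headers the egress path will eventually prepend."""
    return PPPOE_OVERHEAD + VLAN_TAG + ETHERNET_HEADER


def largest_ip_packet():
    """Largest IP packet that still fits once the PPPoE overhead is added."""
    return LINK_MTU - PPPOE_OVERHEAD


if __name__ == "__main__":
    print("expected hard_header_len:", stacked_header_len())  # 26
    print("expected max_len:", largest_ip_packet())           # 1492
```

Since max_len matches the bare IP packet size, skb->len evidently does not yet contain any of those 26 bytes at the point where cake sees the packet.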
On the ingress side (of the same interface!) I see:
qdisc cake 800a: dev ifb4pppoe-wan root refcnt 2 bandwidth 46246Kbit diffserv3 dual-dsthost nat ingress rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 14 mpu 64
Sent 17412911204 bytes 13164414 pkt (dropped 3720, overlimits 19196199 requeues 0)
backlog 0b 0p requeues 0
memory used: 490048b of 4Mb
capacity estimate: 46246Kbit
                   Bulk  Best Effort        Voice
thresh         2890Kbit    46246Kbit    11561Kbit
target            6.3ms        5.0ms        5.0ms
interval        101.3ms      100.0ms      100.0ms
pk_delay            0us        458us        165us
av_delay            0us         44us         22us
sp_delay            0us         15us         14us
pkts                  0     13088522        79612
bytes                 0  17401576669     16506510
way_inds              0        80004            0
way_miss              0       416460           17
way_cols              0           83            0
drops                 0         3720            0
marks                 0         1151            0
ack_drop              0            0            0
sp_flows              0            0            0
bk_flows              0            1            0
un_flows              0            0            0
max_len               0         1492         1492
So for the ifb the kernel ignored the PPPoE and VLAN overhead (hard_header_len is just the 14 bytes of Ethernet), but again max_len tells us that the kernel did not add anything at all.
Traditional "tc stab" would in both cases have done the correct thing (albeit accidentally as the kernel simply did not "tamper" with skb->len) with the requested "overhead 34".
However, cake accidentally accounted for 8 bytes on egress and 20 bytes on ingress instead. This is less than ideal, especially since the mis-accounting differs between the two directions of the same interface. (Disclaimer: sqm-scripts currently only allows configuring a single overhead value that is indiscriminately applied to both ingress and egress, which really does not work well with different pre-adjustments being performed in the two directions; you could argue that this should be fixed in sqm-scripts ;) )
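The 8 and 20 bytes are consistent with cake subtracting hard_header_len from skb->len before adding the configured overhead. The following Python sketch models that; it is only my reading of the observed values, not a copy of cake's code, the function names are hypothetical, and mpu is ignored:

```python
def stab_size(skb_len, overhead):
    """Model of plain "tc stab" accounting: add the overhead to skb->len."""
    return skb_len + overhead


def cake_size(skb_len, overhead, hard_header_len):
    """Assumed model of cake's current behaviour:
    subtract hard_header_len first, then add the overhead."""
    return skb_len - hard_header_len + overhead


def effective_overhead(size, skb_len):
    """Bytes actually accounted on top of the packet as seen by the qdisc."""
    return size - skb_len


if __name__ == "__main__":
    skb_len = 1492   # max_len from the statistics above
    overhead = 34    # requested via "overhead 34"
    # hard_header_len as reported for each direction
    for direction, hhl in (("egress", 26), ("ingress", 14)):
        stab = effective_overhead(stab_size(skb_len, overhead), skb_len)
        cake = effective_overhead(cake_size(skb_len, overhead, hhl), skb_len)
        print(direction, "stab:", stab, "cake:", cake)
```

Under these assumptions the script prints "egress stab: 34 cake: 8" and "ingress stab: 34 cake: 20", i.e. the same requested overhead ends up as two different effective values.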
I think cake should offer one mode in which it behaves as all other qdiscs currently do and performs no auto-correction at all, and one mode in which it corrects for the right amount; keeping the current cake behaviour will not help anybody... But most likely I misunderstood your proposal in that regard.
>
> What we would lose that way is the present capability to add an overhead to the raw packet length as reported by Linux.
That is what we could aim to keep (at least with the raw keyword or a new rawoverhead keyword). The least we need to do is keep reporting how many bytes were auto-adjusted (only after dumping hard_header_len in the output did it become clear to me that cake's assumptions about skb->dev->hard_header_len only hold true for plain ethernet interfaces).
Or put differently, if we can run with the new mode you propose and keep reporting the amount of auto adjustment (like what is currently reported as hard_header_len, but with a more appropriate name) and see that it does the right thing, we could remove the old way...
Question: if the user does not specify an overhead at all (via keyword or explicitly as overhead NN) even the proposed new mode will not do anything but simply take skb->len? In that case maybe keeping the "raw" behaviour as cake currently does might not be required.
> However, since that doesn't reliably correspond to an actual packet length on the wire, that's not really a useful capability to keep, except for direct comparison with other overhead compensation methods.
Yes, we might keep that capability to easily compare the different methods; as long as the kernel has 3 different methods, we might as well use them to "calibrate"/sanity-check each other ;) (But my thinking is more that once you figure out the correct solution for the accounting in cake, it might make sense to teach stab the same trick.)
>
> Comments?
For all the words above, I would really only like to see three things with regard to the auto overhead accounting:
1) report what cake does routinely in the output of tc -s qdisc (including the actual amount of the adjustment)
2) do the right thing, so if at all reliably possible correct for the non-IP packet part of skb->len (that is not directly caused by running atm/ptm 53/48 or 65/64 expansion)
3) allow the user some way to disable that auto-accounting (but doing that only if no overhead keyword of any kind was specified is fine with me; that way cake still stays compatible with the generic stab method)
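As a rough illustration of these three points, the following Python sketch models the kind of accounting I have in mind. It is not kernel code: network_offset merely stands in for what skb_network_offset(skb) would return, the function and field names are invented for the example, and the atm/ptm expansion is left out.

```python
from typing import NamedTuple, Optional


class Accounting(NamedTuple):
    size: int          # packet size used for shaping
    auto_adjust: int   # bytes removed automatically (to be reported, point 1)


def account(skb_len: int, network_offset: int,
            overhead: Optional[int] = None) -> Accounting:
    """Hypothetical per-packet size calculation.

    skb_len        -- packet length as reported by the kernel (skb->len)
    network_offset -- bytes in front of the IP header
    overhead       -- user-specified overhead, or None if not configured
    """
    if overhead is None:
        # Point 3: without any overhead keyword, leave skb->len alone,
        # which matches what the generic stab method would see.
        return Accounting(size=skb_len, auto_adjust=0)
    # Point 2: strip the non-IP part, then add the requested overhead.
    ip_len = skb_len - network_offset
    return Accounting(size=ip_len + overhead, auto_adjust=network_offset)


if __name__ == "__main__":
    # Interface where skb->len is already the bare IP packet.
    print(account(1492, 0, overhead=34))
    # Interface where a 14-byte header is already part of skb->len.
    print(account(1506, 14, overhead=34))
    # No overhead configured: nothing is touched.
    print(account(1506, 14))
```

With "overhead 34" both of the first two cases end up at a size of 1526 bytes, regardless of how many link-layer bytes were already included in skb->len, while the last case simply returns skb->len.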
Thanks for doing this great work!
Best Regards
>
> - Jonathan Morton
>