[Cake] overheads or rate calculation changed?

Sebastian Moeller moeller0 at gmx.de
Sat Jan 6 17:46:32 EST 2018


Hi Jonathan,

> On Jan 6, 2018, at 21:44, Jonathan Morton <chromatix99 at gmail.com> wrote:
> 
>> On 23 Dec, 2017, at 11:03 pm, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> 
>> just had a look for hard_header_len in the linux kernel:
>> linux/include/linux/netdevice.h:
>> *      @hard_header_len: Maximum hardware header length.
>> *      @min_header_len:  Minimum hardware header length
>> 
>> this seems to corroborate our observation that hard_header_len is not a veridical representation of the actual hardware header length, so I assume the values cake returns are actually true. It also indicates that except for pure ethernet interfaces hard_header_len is _not_ the right parameter to evaluate for what cake is evaluating it for...
> 
> Turns out min_header_len is always either zero or 14, and is scarcely used anywhere.  It seems to be completely ignored by non-Ethernet interfaces.

	Yep, min_header_len also does not sound like it offers the guarantees we want.

> 
> However, it appears that the correct value is stored implicitly in each skb, and can be obtained through skb_network_offset(skb) - that's the offset from the beginning of the packet to the IP header (assuming it's an IP packet).  

	That sounds great.

> This suggests to me that the via-ethernet keyword can be retired, in favour of unconditionally subtracting that value from each packet length before applying overhead compensation, and setting the *default* overhead compensation to hard_header_len (to emulate the current default behaviour).

	No, the current behaviour of cake is an outlier, no other qdisc does this:

qdisc cake 8009: dev pppoe-wan root refcnt 2 bandwidth 9545Kbit diffserv3 dual-srchost nat rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 26 mpu 64 
 Sent 673788150 bytes 5116677 pkt (dropped 535, overlimits 721725 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 224192b of 4Mb
 capacity estimate: 9545Kbit
                   Bulk  Best Effort        Voice
  thresh      596560bit     9545Kbit     2386Kbit
  target         30.5ms        5.0ms        7.6ms
  interval      125.5ms      100.0ms      102.6ms
  pk_delay          0us        1.0ms        282us
  av_delay          0us         47us         20us
  sp_delay          0us          7us          9us
  pkts                0      5019757        97455
  bytes               0    655197066     19357299
  way_inds            0        81110            7
  way_miss            0       432498         1536
  way_cols            0            0            0
  drops               0          535            0
  marks               0           58            0
  ack_drop            0            0            0
  sp_flows            0            0            0
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0         1492          576

hard_header_len is 26, which fits with the fact that the same packet will first see 8 bytes of PPPoE overhead added, then a 4-byte VLAN tag and 14 bytes of kernel Ethernet header, for a total of 26. But as you can see from max_len, the packet size indicates that the kernel auto-added nothing (MTU 1500 - 8 bytes PPPoE overhead = 1492).
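
To make the arithmetic explicit, here is a small sketch of the two numbers involved. The constant names are purely illustrative and the layer sizes are the ones quoted above; nothing here is taken from cake's source.

```python
# Header sizes as quoted in the text (bytes); names are illustrative only.
PPPOE_OVERHEAD = 8
VLAN_TAG = 4
ETHERNET_HEADER = 14

# What the kernel reports as hard_header_len on pppoe-wan (egress).
hard_header_len_egress = PPPOE_OVERHEAD + VLAN_TAG + ETHERNET_HEADER

# Largest packet expected if nothing has been added to skb->len yet:
# the 1500 byte MTU minus the PPPoE overhead.
MTU = 1500
expected_max_len = MTU - PPPOE_OVERHEAD

print(hard_header_len_egress, expected_max_len)
```

The first value matches the reported hard_header_len, the second matches the reported max_len, which is why the two cannot both describe the same packet as cake sees it.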

On the ingress side (of the same interface!) I see:
qdisc cake 800a: dev ifb4pppoe-wan root refcnt 2 bandwidth 46246Kbit diffserv3 dual-dsthost nat ingress rtt 100.0ms noatm overhead 34 via-ethernet total_overhead 34 hard_header_len 14 mpu 64 
 Sent 17412911204 bytes 13164414 pkt (dropped 3720, overlimits 19196199 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 490048b of 4Mb
 capacity estimate: 46246Kbit
                   Bulk  Best Effort        Voice
  thresh       2890Kbit    46246Kbit    11561Kbit
  target          6.3ms        5.0ms        5.0ms
  interval      101.3ms      100.0ms      100.0ms
  pk_delay          0us        458us        165us
  av_delay          0us         44us         22us
  sp_delay          0us         15us         14us
  pkts                0     13088522        79612
  bytes               0  17401576669     16506510
  way_inds            0        80004            0
  way_miss            0       416460           17
  way_cols            0           83            0
  drops               0         3720            0
  marks               0         1151            0
  ack_drop            0            0            0
  sp_flows            0            0            0
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0         1492         1492

So for the ifb the kernel ignored PPPoE and vlan overhead, but again max_len tells us that the kernel did not add anything at all.

Traditional "tc stab" would in both cases have done the correct thing (albeit accidentally as the kernel simply did not "tamper" with skb->len) with the requested "overhead 34".
However, cake accidentally accounted for 8 bytes on egress and 20 bytes on ingress instead (the requested 34 minus the reported hard_header_len of 26 and 14, respectively). This is less than ideal, especially since the mis-accounting differs between the two directions of the same interface. (Disclaimer: sqm-scripts currently only allows configuring a single overhead value that is applied indiscriminately to both ingress and egress; this really does not work well with different pre-adjustments being performed in the two directions. You could argue that this should be fixed in sqm-scripts ;) )
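
A minimal sketch of how the 8 and 20 bytes come about, assuming (my reading of the observed behaviour, not a quote of cake's code) that cake with via-ethernet first subtracts hard_header_len from skb->len and then adds the configured overhead. The function name is hypothetical.

```python
def net_bytes_added(overhead, hard_header_len):
    """Net per-packet bytes added on top of skb->len, assuming the
    qdisc subtracts hard_header_len and then adds the overhead."""
    return overhead - hard_header_len

REQUESTED_OVERHEAD = 34  # "overhead 34" from the tc output

# hard_header_len values as reported in the two tc -s qdisc dumps.
egress_added = net_bytes_added(REQUESTED_OVERHEAD, 26)   # pppoe-wan
ingress_added = net_bytes_added(REQUESTED_OVERHEAD, 14)  # ifb4pppoe-wan

# tc stab does no pre-adjustment, so it adds the full requested value.
stab_added = REQUESTED_OVERHEAD

print(egress_added, ingress_added, stab_added)
```

Since max_len shows skb->len already excludes those headers in both directions, only the stab figure corresponds to what was actually requested.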

I think cake should offer a mode in which it behaves as all other qdiscs currently do and does no auto-correction at all, and a mode where it corrects for the right amount; keeping the current cake behavior will not help anybody... But most likely I misunderstood your proposal in that regard.


> 
> What we would lose that way is the present capability to add an overhead to the raw packet length as reported by Linux.

	That is what we could aim to keep (at least with the raw keyword or a new rawoverhead keyword); the least we need to do is keep reporting how many bytes were auto-adjusted (only after dumping hard_header_len in the output did it become clear to me that cake's assumptions about hard_header_len only hold true for plain ethernet interfaces).
Or put differently, if we can run with the new mode you propose and keep reporting the amount of auto adjustment (like what is currently reported as hard_header_len, but with a more appropriate name) and see that it does the right thing, we could remove the old way...

Question: if the user does not specify an overhead at all (neither via a keyword nor explicitly as overhead NN), will even the proposed new mode do nothing but simply take skb->len? In that case keeping the "raw" behaviour as cake currently has it might not be required.

>  However, since that doesn't reliably correspond to an actual packet length on the wire, that's not really a useful capability to keep, except for direct comparison with other overhead compensation methods.

	Yes, we might keep that capability to easily compare the different methods; as long as the kernel has 3 different methods we might as well use them to "calibrate"/sanity-check each other ;) (But my thinking is more that, if you figure out the correct solution for the accounting in cake, it might make sense to teach stab the same trick.)

> 
> Comments?


For all the words above, I really only would like to see three things in regards to the auto overhead accounting:
1) report what cake does routinely in the output of tc -s qdisc (including the actual amount of the adjustment)

2) do the right thing, so if at all reliably possible correct for the non-IP packet part of skb->len (that is not directly caused by running atm/ptm 53/48 or 65/64 expansion)

3) allow the user some way to disable that auto-accounting (but I think only doing that if no overhead keyword of any kind was specified is fine with me; that way cake still stays compatible with the generic stab method)
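
The three points above could be modelled roughly as follows. This is only a sketch of the accounting rule under discussion, not cake's implementation: the function and parameter names are hypothetical, network_offset stands in for whatever skb_network_offset(skb) would return, and the example values are illustrative rather than measured.

```python
def accounted_size(skb_len, network_offset, overhead=None):
    """Return (size used for shaping, bytes auto-stripped).

    Points 2 and 3: strip the non-IP part of the packet only when the
    user asked for overhead compensation; otherwise use skb_len as is,
    like a plain qdisc or stab without options would.
    """
    if overhead is None:
        return skb_len, 0
    return skb_len - network_offset + overhead, network_offset

# Illustrative: a 1500 byte IP packet with a 14 byte Ethernet header.
size, stripped = accounted_size(1514, 14, overhead=34)
print(size, stripped)  # point 1: report the adjustment alongside the size

# Illustrative: an skb that already starts at the IP header.
print(accounted_size(1492, 0, overhead=34))

# No overhead configured: no auto-accounting at all.
print(accounted_size(1514, 14))
```

Returning the stripped byte count alongside the size is one way to satisfy the reporting wish, so the adjustment can be shown in tc -s qdisc under a more appropriate name than hard_header_len.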

Thanks for doing this great work!

Best Regards

> 
> - Jonathan Morton
> 


