[Cake] Today's cake is fishcake a la creme. Enjoy your meal.

Thu May 28 17:23:47 EDT 2015

Today I pushed several significant updates to cake.  They compile, and don’t appear to cause kernel panics, but there be dragons here.  Tread carefully, and be ready to cast a Jedi Mind Trick[1] if you have acquired that ability.

Iproute2 now actually has the extra shortcut keywords I talked about previously.  They are more-or-less self-explanatory, but are not yet mentioned in the help text.

The GSO peeling code is more refined:

- Refactored for efficiency, with several things moved out of the loop.
- Unlimited mode unconditionally invokes peeling, just as ATM compensation does.
- Peeling threshold inversely proportional to number of active flows; 1 flow remains at 1ms.

The flow DRR quantum is now dynamically calculated in the same way as the peeling threshold.  It is clamped to limits at 1514 and 300.  At 4 MB/s (the previous threshold, choosing between those two extrema), the quantum is now 976.

The set-associative hash has been tweaked to be more careful about keeping sparse and semi-sparse flows isolated from each other.  A queue has to be evicted from the flowchain before it’s considered empty, not simply to have no packets in it.  This should tend to improve flow isolation even further, but it’s possible that a number of spurious hard collisions might now occur under transient conditions.  Under heavy load, expect to see a larger number of indirect hits and a smaller number of misses than before.

I also tweaked the version of Codel that cake uses:

- Recovery now halves the count rather than subtracting two, to ensure that the signalling frequency doesn’t grow out of hand (since at high signalling frequencies, it’s likely that count will increment many times per RTT).  The extra Newton step there has been replaced by a simple fixed-point multiplication by sqrt(2); this should always get close enough to the true value that a single Newton step (or a cache lookup) is sufficient to complete the correction.

- The trigger threshold has been seriously modified, so that it takes less time to trigger if the queue grew rapidly than if it is growing slowly.  There is roughly a 6:1 dynamic range in this respect.  Given the standard 100/5 parameters, it can now trigger as quickly as 35ms after a large burst arrives all at once, or as slowly as 200ms if the sojourn time persistently remains just barely over 5ms for that long.  The interval parameter still controls the initial marking frequency as before.

The rationale for the latter change is that slow-start needs to be signalled very quickly in order to be brought under control; in fact, the ideal moment for slow-start to be signalled at is precisely the moment the target is breached.  Consider that under conventional slow-start (ie. Reno), cwnd doubles on each RTT, and halves upon receiving a congestion signal, and it takes precisely one RTT for the signal to have an effect on the queue input.  Therefore, the cwnd will initially reduce to the level corresponding to the packet on which the signal was sent.

I note that according to Big-O terminology, there are several grades between Exponential (slow start) and Linear (congestion avoidance) growth: Polynomial (n^k) and Linearithmic (n log n) are the best known.  Exponential is considered a very extreme growth factor, usually beaten only by Factorial.  This explains why slow-start is so difficult to control accurately using algorithms tuned for the congestion-avoidance phase.

The existence of the new trigger mechanism also makes it possible to cause Codel to trigger instantly when the target is breached.  I have not yet added anything which actually does that, but the facility may prove useful for ingress shaping, making cake behave more like a less-than-military sort of policer.

 - Jonathan Morton

[1] - Beware of the Infinite Loops.