[Cake] Proposing COBALT
moeller0 at gmx.de
Fri May 20 07:37:41 EDT 2016
> On May 20, 2016, at 12:04 , Jonathan Morton <chromatix99 at gmail.com> wrote:
> With the recent debate over handling unresponsive flows in fq_codel, I had a brainwave involving constructing a hybrid AQM which preserves Codel’s excellent properties on responsive flows, while also reacting appropriately when faced with a UDP flood. The key difficulty was deciding when to switch over from the Codel behaviour to a PIE or RED like behaviour.
> It turns out that BLUE is a perfect fit for this job, because it activates when the queue is completely full - an unambiguous signal that Codel has lost the plot and is unable to control the queue alone. BLUE was one of the more promising AQMs in the days immediately prior to Codel’s ascendance, so it should be effective outside Codel’s speciality.
> The name COBALT, as well as referring to a nice shade of blue, can read “Codel-BLUE Alternate”.
That is important, always start with a good acronym ;) (seriously though, some EU funding programs actually require you to supply an acronym when applying for a grant).
> It is unnecessary to explicitly “switch over” between Codel and BLUE; they can work in parallel, since their operating characteristics are independent. It may be feasible to simplify the Codel implementation, since it will no longer need to handle overload conditions as robustly. For example, the Codel section should use ECN marking whenever possible, and never drop an ECN-Capable packet; the BLUE section should ignore ECN capability and simply drop packets, since the traffic is evidently not responding to any ECN signals if BLUE is triggered.
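A minimal sketch of how the two sections might combine per packet, as I read the paragraph above. The function name and the three boolean inputs are made up for illustration and are not taken from any existing qdisc code:

```python
def cobalt_verdict(codel_signal: bool, blue_drop: bool, ect: bool) -> str:
    """Return "drop", "mark" or "pass" for a single packet.

    codel_signal: Codel's schedule says this packet should carry a signal.
    blue_drop:    BLUE's probabilistic test selected this packet.
    ect:          the packet is ECN-Capable.
    """
    # BLUE only engages after an overflow, i.e. when the traffic is evidently
    # not responding, so it ignores ECN capability and always drops.
    if blue_drop:
        return "drop"
    # Codel marks whenever it can and only drops packets that are not ECN-Capable.
    if codel_signal:
        return "mark" if ect else "drop"
    return "pass"
```

Both verdicts are computed independently, so there is no explicit mode switch; BLUE simply takes precedence whenever it fires.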
> One of the major reasons why Codel fails on UDP floods is that its drop schedule is time-based. This is the correct behaviour for TCP flows, which respond adequately to one congestion signal per RTT, regardless of the packet rate. However, it means it is easily overwhelmed by high-packet-rate unresponsive (or anti-responsive, as with TCP acks) floods, which an attacker or lab test can easily produce on a high-bandwidth ingress, especially using small packets.
In essence I agree, but I want to point out that the protocol itself does not really matter; what matters is the observed behavior of a flow. Civilized UDP applications (that expect their data to be carried over the best-effort internet) will also react to drops much like decent TCP flows, and crappy TCP implementations might not. I would guess that, given the maturity of TCP stacks, misbehaving TCP flows will be rarer than misbehaving UDP flows (which might, for example, be well-behaved fixed-rate isochronous flows that simply should never have been sent over the internet).
> BLUE, by contrast, uses a drop *probability*, so its effectiveness on floods is independent of the packet rate. If necessary, its drop rate can increase to 100% in a reasonable amount of time.
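To put rough numbers on the time-based versus probability-based difference, here is a back-of-the-envelope sketch. It assumes a simplified Codel that stays in its dropping state for the whole window and spaces signals by interval/sqrt(count), with a 100 ms interval; the BLUE probability of 0.5 and the packet rates are arbitrary illustration values:

```python
import math

def codel_drops_in_window(window_s: float, interval_s: float = 0.1) -> int:
    """Count signals from a Codel-style schedule within window_s seconds.

    Simplification: the first signal is at t=0 and the gap after the
    n-th signal is interval_s / sqrt(n). The packet rate never enters.
    """
    t = 0.0
    count = 0
    while t < window_s:
        count += 1
        t += interval_s / math.sqrt(count)
    return count

def blue_expected_drops(packets: int, p: float) -> float:
    """Expected number of drops when every packet is dropped with probability p."""
    return packets * p

window = 1.0
for pps in (1_000, 100_000):
    packets = int(pps * window)
    codel = min(codel_drops_in_window(window), packets)
    blue = blue_expected_drops(packets, 0.5)
    print(f"{pps} pps: codel drops {codel / packets:.4%}, blue drops {blue / packets:.0%}")
```

Under these assumptions the schedule only gets a few dozen signals out in the first second no matter how many packets arrive, so the dropped fraction shrinks as the flood gets faster, while the probabilistic dropper removes the same fraction at any rate.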
> A couple of details are necessary to integrate BLUE with a flow-isolating qdisc:
> BLUE’s up-trigger should be on a packet drop due to overflow (only) targeting the individual subqueue managed by that particular BLUE instance. It is not correct to trigger BLUE globally when an overall overflow occurs. Note also that BLUE has a timeout between triggers, which should I think be scaled according to the estimated RTT.
That sounds nice in that no additional state is required. But with the current fq_codel, I believe, the packet causing the memory limit overrun is not necessarily from the flow that actually caused the problem to begin with, and doesn’t fq_codel actually search for the fattest flow and drop from there? But I guess that selection procedure could be run with BLUE as well.
> BLUE’s down-trigger is on the subqueue being empty when a packet is requested from it, again on a timeout. To ensure this occurs, it may be necessary to retain subqueues in the DRR list while BLUE’s drop probability is nonzero.
Question: doesn’t this mean the affected flow will be throttled quite harshly? Will BLUE slowly decrease the drop probability p if the flow behaves? If so, could BLUE simply disengage once p drops below a threshold?
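To make the up-trigger, down-trigger and timeout concrete, here is a minimal per-subqueue sketch of how I understand the description. The class name, the step sizes and the fixed freeze time are assumptions for illustration; the proposal suggests scaling the timeout with the estimated RTT, which is not modelled here:

```python
import random
from dataclasses import dataclass

@dataclass
class BlueState:
    """Hypothetical BLUE state for one subqueue of a flow-isolating qdisc."""
    p_drop: float = 0.0
    last_update: float = float("-inf")
    freeze_time: float = 0.1   # assumed timeout between triggers, in seconds
    increment: float = 0.02    # assumed step on overflow
    decrement: float = 0.002   # assumed step on empty dequeue

    def _may_update(self, now: float) -> bool:
        return now - self.last_update >= self.freeze_time

    def on_overflow_drop(self, now: float) -> None:
        """Up-trigger: a packet was dropped from THIS subqueue due to overflow."""
        if self._may_update(now):
            self.p_drop = min(1.0, self.p_drop + self.increment)
            self.last_update = now

    def on_empty_dequeue(self, now: float) -> None:
        """Down-trigger: a packet was requested but this subqueue was empty."""
        if self._may_update(now):
            self.p_drop = max(0.0, self.p_drop - self.decrement)
            self.last_update = now

    def should_drop(self, rng=random.random) -> bool:
        """Probabilistic drop test applied to each packet of this subqueue."""
        return rng() < self.p_drop

    def keep_in_drr_list(self) -> bool:
        """Retain the subqueue while p is nonzero so the down-trigger can fire."""
        return self.p_drop > 0.0
```

With step sizes like these, p falls ten times more slowly than it rises, so a flow that starts behaving is released gradually; whether that counts as too harsh depends on the step sizes and the timeout chosen.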
> Note that this does nothing to improve the situation regarding fragmented packets. I think the correct solution in that case is to divert all fragments (including the first) into a particular queue dependent only on the host pair, by assuming zero for src and dst ports and a “special” protocol number.
I believe the RFC recommends using the (SRC IP, DST IP, Protocol, Identification) tuple, as otherwise all fragmented flows between a host pair will hash into the same bucket…
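The two keying options can be compared side by side in a small sketch. The packet type, the FRAG_PROTO sentinel and both function names are hypothetical; a real qdisc would derive these fields from the IP header:

```python
from typing import NamedTuple

class Pkt(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int         # 0 when the ports are not visible (non-first fragments)
    dport: int
    ip_id: int         # IPv4 Identification field
    is_fragment: bool

FRAG_PROTO = -1  # hypothetical "special" protocol number for fragments

def key_host_pair(p: Pkt) -> tuple:
    """Jonathan's proposal: all fragments of a host pair share one queue."""
    if p.is_fragment:
        return (p.src, p.dst, FRAG_PROTO, 0, 0)
    return (p.src, p.dst, p.proto, p.sport, p.dport)

def key_with_ident(p: Pkt) -> tuple:
    """Alternative: key fragments on (src, dst, protocol, identification)."""
    if p.is_fragment:
        return (p.src, p.dst, p.proto, p.ip_id)
    return (p.src, p.dst, p.proto, p.sport, p.dport)
```

Both variants keep the fragments of one datagram together. The first puts every fragmented datagram between two hosts into a single queue; the second gives each datagram its own key, which spreads them over buckets but also lets heavy fragmentation claim more than one flow's share.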
> This has the distinct advantages of keeping related fragments together, and ensuring they can’t take up a disproportionate share of bandwidth in competition with normal traffic.
> - Jonathan Morton
> Cake mailing list
> Cake at lists.bufferbloat.net