[Bloat] [Codel] [Cake] Control theory and congestion control

Sebastian Moeller moeller0 at gmx.de
Sun May 10 13:00:54 EDT 2015


Hi Jonathan,


On May 10, 2015, at 08:55 , Jonathan Morton <chromatix99 at gmail.com> wrote:

> 
>> On 10 May, 2015, at 06:35, Dave Taht <dave.taht at gmail.com> wrote:
>> 
>> On Sat, May 9, 2015 at 12:02 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>>>> The "right" amount of buffering is *1* packet, all the time (the goal is
>>>> nearly 0 latency with 100% utilization). We are quite far from achieving
>>>> that on anything...
>>> 
>>> And control theory shows, I think, that we never will unless the mechanisms
>>> available to us for signalling congestion improve. ECN is good, but it's not
>>> sufficient to achieve that ultimate goal. I'll try to explain why.
>> 
>> The conex and dctcp work explored using ecn for multi-bit signalling.
> 
> A quick glance at those indicates that they’re focusing on the echo path - getting the data back from the receiver to the sender.  That’s the *easy* part; all you need is a small TCP option, which can be slotted into the padding left by TCP Timestamps and/or SACK, so it doesn’t even take any extra space.
> 
> But they do nothing to address the problem of allowing routers to provide a “hold” signal.  Even a single ECN mark has to be taken to mean “back off”; being able to signal that more than one ECN mark happened in one RTT simply means that you now have a way to say “back off harder”.
> 
> The problem is that we need a three-bit signal (five new-style signalling states, two states indicating legacy ECN support, and one “ECN unsupported” state) at the IP layer to do it properly, and we’re basically out of bits there, at least in IPv4.  The best solution I can think of right now is to use both of the ECT states somehow, but we’d have to make sure that doesn’t conflict too badly with existing uses of ECT(1), such as the “nonce sum”.  Backwards and forwards compatibility here is essential.

	At the risk of sounding like I had a tin of snark for breakfast: what about rededicating 3 of the 6 TOS bits for this? ;) If I understand correctly, Ethernet and MPLS transports only carry 3 bits anyway, so the full 6 bits are a fiction outside of L3 routers. And the BCP is still to re-color the TOS bits on ingress, so I guess 3 bits should be plenty.
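
	A rough Python sketch of what that could look like, purely as illustration. The eight states are the ones Jonathan counts above (five new-style states, two legacy ECN states, one "ECN unsupported"). Their numbering, the names `CongestionState`, `set_signal` and `get_signal`, and the choice of the three low-order bits of the 6-bit field are all assumptions made for this example, not anything that has been specified. Taking the low three bits would leave the class-selector bits (the top three) untouched.

```python
from enum import IntEnum


class CongestionState(IntEnum):
    """Eight hypothetical states; the numbering is an assumption."""
    ECN_UNSUPPORTED = 0
    LEGACY_ECN_A = 1   # the two legacy ECN states, unnamed in the thread
    LEGACY_ECN_B = 2
    FAST_UP = 3
    SLOW_UP = 4
    HOLD = 5
    SLOW_DOWN = 6
    FAST_DOWN = 7


# The TOS byte is 6 DSCP bits (high) followed by 2 ECN bits (low).
# Assumption: borrow the three low-order DSCP bits, i.e. bits 2..4.
SIGNAL_SHIFT = 2
SIGNAL_MASK = 0b111 << SIGNAL_SHIFT  # 0x1C


def set_signal(tos: int, state: CongestionState) -> int:
    """Return the TOS byte with the 3-bit signal field overwritten."""
    return (tos & ~SIGNAL_MASK & 0xFF) | (int(state) << SIGNAL_SHIFT)


def get_signal(tos: int) -> CongestionState:
    """Extract the 3-bit signal field from a TOS byte."""
    return CongestionState((tos & SIGNAL_MASK) >> SIGNAL_SHIFT)


if __name__ == "__main__":
    tos = 0b10100001  # top three bits 101, existing ECN bits 01
    marked = set_signal(tos, CongestionState.HOLD)
    print(f"{marked:08b}", get_signal(marked).name)
```

	Whether re-colouring on ingress makes those three bits safe to borrow in practice is a separate question that the sketch does not address.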

Best Regards
	Sebastian

> 
> I’m thinking about the problem.
> 
>>> Bufferbloat is fundamentally about having insufficient information at the
>>> endpoints about conditions in the network.
>> 
>> Well said.
>> 
>>> We've done a lot to improve that,
>>> by moving from zero information to one bit per RTT. But to achieve that holy
>>> grail, we need more information still.
>> 
>> context being aqm + ecn, fq, fq+aqm, fq+aqm+ecn, dctcp, conex, etc.
>> 
>>> Specifically, we need to know when we're at the correct BDP, not just when
>>> it's too high. And it'd be nice if we also knew if we were close to it. But
>>> there is currently no way to provide that information from the network to
>>> the endpoints.
>> 
>> This is where I was pointing out that FQ and the behavior of multiple
>> flows in their two phases (slow start and congestion avoidance)
>> provides a few pieces of useful information  that could possibly be
>> used to get closer to the ideal.
> 
> There certainly is enough information available in fq_codel and cake to derive a five-state congestion signal, rather than a two-state one, with very little extra effort.
> 
> Flow is sparse -> “Fast up”
> Flow is saturating, but no standing queue -> “Slow up”
> Flow is saturating, with small standing queue -> “Hold”
> Flow is saturating, with large standing queue -> “Slow down”
> Flow is saturating, with large, *persistent* standing queue -> “Fast down”
> 
> In simple terms, “fast” here means “multiplicative” and “slow” means “additive”, in the sense of AIMD being the current standard for TCP behaviour.  AIMD itself is a result of the two-state “bang-bang” control model introduced back in the 1980s.
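
The table and the fast/slow distinction above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not how fq_codel or cake actually classify flows: the function names, the 0.5 ms "no standing queue" cutoff, the reuse of Codel's 5 ms target and 100 ms interval as the "large" and "persistent" thresholds, and the factors of 2 and 1/2 for the multiplicative steps are all chosen for illustration only.

```python
from enum import Enum


class Signal(Enum):
    FAST_UP = "fast up"
    SLOW_UP = "slow up"
    HOLD = "hold"
    SLOW_DOWN = "slow down"
    FAST_DOWN = "fast down"


def classify(is_sparse: bool, standing_ms: float, time_above_ms: float,
             empty_ms: float = 0.5, target_ms: float = 5.0,
             interval_ms: float = 100.0) -> Signal:
    """Map per-flow queue state to one of the five signals.

    standing_ms   -- standing queue delay seen by the flow (assumed input)
    time_above_ms -- how long that delay has stayed above target_ms
    All thresholds are illustrative assumptions.
    """
    if is_sparse:
        return Signal.FAST_UP
    if standing_ms < empty_ms:       # saturating, no standing queue
        return Signal.SLOW_UP
    if standing_ms < target_ms:      # small standing queue
        return Signal.HOLD
    if time_above_ms < interval_ms:  # large, but not yet persistent
        return Signal.SLOW_DOWN
    return Signal.FAST_DOWN          # large and persistent


def adjust_cwnd(cwnd: float, signal: Signal) -> float:
    """'Fast' is multiplicative, 'slow' is additive; cwnd is in packets."""
    if signal is Signal.FAST_UP:
        return cwnd * 2.0
    if signal is Signal.SLOW_UP:
        return cwnd + 1.0
    if signal is Signal.SLOW_DOWN:
        return max(cwnd - 1.0, 1.0)
    if signal is Signal.FAST_DOWN:
        return max(cwnd / 2.0, 1.0)
    return cwnd  # HOLD: leave the window alone


if __name__ == "__main__":
    sig = classify(is_sparse=False, standing_ms=8.0, time_above_ms=150.0)
    print(sig.value, adjust_cwnd(10.0, sig))
```

In this framing, today's AIMD uses only the "slow up" and "fast down" rows; the other three responses need a signal that the network cannot currently deliver.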
> 
> It’s worth remembering that the Great Internet Congestion Collapse Event was 30 years ago, and ECN was specified 15 years ago.
> 
>> A control theory-ish issue with codel is that it depends on an arbitrary ideal (5ms) as a definition for "good queue", where "a
>> gooder queue” is, in my definition at the moment, "1 packet outstanding ever closer to 100% of the time while there is 100% utilization”.
> 
> As the above table shows, Codel reacts (by design) only to the most extreme situation that we would want to plug into an improved congestion-control model.  It’s really quite remarkable, in that context, that it works as well as it does.  I don’t think we can hope to do significantly better until a better signalling mechanism is available.
> 
> But it does highlight that the correct meaning of an ECN mark is “back off hard, now”.  That’s how it’s currently interpreted by TCPs, in accordance with the ECN RFCs, and Codel relies on that behaviour too.  We have to use some other, deliberately softer signal to give a “hold” or even a “slow down” indication.
> 
> - Jonathan Morton
> 



