[Codel] [Cake] Control theory and congestion control

Sat May 9 23:35:55 EDT 2015

On Sat, May 9, 2015 at 12:02 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>> The "right" amount of buffering is *1* packet, all the time (the goal is
>> nearly 0 latency with 100% utilization). We are quite far from achieving
>> that on anything...
>
> And control theory shows, I think, that we never will unless the mechanisms
> available to us for signalling congestion improve. ECN is good, but it's not
> sufficient to achieve that ultimate goal. I'll try to explain why.

The ConEx and DCTCP work explored using ECN for multi-bit signalling.

While the set of analogies below is great (and is why I am broadening
the cc), there are two things missing from it.

>
> Aside from computer networking, I also dabble in computer simulated trains.
> Some of my bigger projects involve detailed simulations of what goes on
> inside them, especially the older ones which are relatively simple. These
> were built at a time when the idea of putting anything as delicate as a
> transistor inside what was effectively a megawatt-class power station was
> unthinkable, so the control gear tended to be electromechanical or even
> electropneumatic. The control laws therefore tended to be the simplest ones
> they could get away with.
>
> The bulk of the generated power went into the main traction circuit, where a
> dedicated main generator is connected rather directly to the traction motors
> through a small amount of switchgear (mainly to reverse the fields on the
> motors at either end of the line). Control of the megawatts of power
> surging through this circuit was effected by varying the excitation of the
> main generator. Excitation is in turn provided by shunting the auxiliary
> voltage through an automatic rheostat known as the Load Regulator before it
> reaches the field winding of the generator. Without field current, the
> generator produces no power.
>
> The load regulator is what I want to focus on here. Its job was to adjust
> the output of the generator to match the power - more precisely the torque -
> that the engine was capable of producing (or, in English Electric locos at
> least, the torque set by the driver's controls, which wasn't always the
> maximum). The load regulator had a little electric motor to move it up and
> down. A good proxy for engine torque was available in the form of the fuel
> rack position; the torque output of a diesel engine is closely related to
> the amount of fuel injected per cycle. The fuel rack, of course, was
> controlled by the governor which was set to maintain a particular engine
> speed; a straightforward PI control problem solved by a reasonably simple
> mechanical device.
>
> So it looks like a simple control problem; if the torque is too low,
> increase the excitation, and vice versa.
>
> Congestion control looks like a simple problem too. If there is no
> congestion, increase the amount of data in flight; if there is, reduce it.
> We even have Explicit Congestion Notification now to tell us that crucial
> data point, but we could always infer it from dropped packets before.
>
> So what does the load regulator's control system look like? It has as many
> as five states: fast down, slow down, hold, slow up, fast up. It turns out
> that trains really like changes in tractive effort to be slow and smooth,
> and as infrequent as possible. So while a very simple "bang bang" control
> scheme would be possible, it would inevitably oscillate around the set point
> instead of settling on it. Introducing a central hold state allows it to
> settle when cruising at constant speed, and the two slow states allow the
> sort of fine adjustments needed as a train gradually accelerates or slows,
> putting the generator only slightly out of balance with the engine. The fast
> states remain to allow for quick response to large changes - the driver
> moves the throttle, or the motors abruptly reconfigure for a different speed
> range (the electrical equivalent of changing gear).
>
> On the Internet, we're firmly stuck with bang-bang control. As big an
> improvement as ECN is, it still provides only one bit of information to the
> sender: whether or not there was congestion reported during the last RTT.
> Thus we can only use the "slow up" and "fast down" states of our virtual
> load regulator (except for slow start, which ironically uses the "fast up"
> state), and we are doomed to oscillate around the ideal congestion window,
> never actually settling on it.
>
> Bufferbloat is fundamentally about having insufficient information at the
> endpoints about conditions in the network.

Well said.
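
To make the contrast quoted above concrete, here is a minimal sketch, not
a model of any real TCP. It compares a one-bit loop (slow up, fast down)
with a five-state loop that is handed a signed error against the ideal
window. The function names, gains and thresholds are all illustrative
assumptions, and the signed error is exactly the information the network
does not give us today.

```python
def simulate(controller, capacity=100.0, start=10.0, steps=400):
    """Run a window-control loop and return the window after each step."""
    window = start
    history = []
    for _ in range(steps):
        window = max(1.0, controller(window, capacity))
        history.append(window)
    return history


def one_bit(window, capacity):
    # The only signal is "congested" or "not congested".
    if window > capacity:
        return window / 2       # fast down
    return window + 1.0         # slow up


def five_state(window, capacity):
    # Hypothetical richer signal: signed error relative to the ideal window.
    error = (capacity - window) / capacity
    if error > 0.25:
        return window * 1.25    # fast up
    if error > 0.02:
        return window + 1.0     # slow up
    if error < -0.25:
        return window * 0.75    # fast down
    if error < -0.02:
        return window - 1.0     # slow down
    return window               # hold


def swing(history, tail=100):
    """Peak-to-peak variation of the window over the last `tail` steps."""
    recent = history[-tail:]
    return max(recent) - min(recent)
```

With these assumed numbers, swing(simulate(one_bit)) stays at roughly half
the capacity forever, while swing(simulate(five_state)) goes to zero once
the window lands inside the hold band.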

> We've done a lot to improve that,
> by moving from zero information to one bit per RTT. But to achieve that holy
> grail, we need more information still.

context being aqm + ecn, fq, fq+aqm, fq+aqm+ecn, dctcp, conex, etc.

> Specifically, we need to know when we're at the correct BDP, not just when
> it's too high. And it'd be nice if we also knew if we were close to it. But
> there is currently no way to provide that information from the network to
> the endpoints.

This is where I was pointing out that FQ, and the behavior of multiple
flows in their two phases (slow start and congestion avoidance),
provide a few pieces of useful information that could possibly be
used to get closer to the ideal.

We know total service times for all active flows. We also have a
separate calculable service time for "sparse flows" in two algorithms
we understand deeply.

We could have some grip on the history for flows that are not currently queued.

We know that the way we currently seek new set points tends to be
bursty ("chasing the inchworm" - I still gotta use that title on a
paper!).

New flows tend to be extremely bursty - and new flows in the real
world also tend to be pretty short, with 95% of all web traffic
fitting into a single IW10.

If e2e we know we are being FQ'd, and yet are bursting to find new
setpoints, we can infer from the packet spacing at the other endpoint
what the contention really is.
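
A rough sketch of that inference, under assumptions that are mine: the
bottleneck does per-packet round-robin FQ, every flow is backlogged with
equal-size packets, and the receiver knows the link rate. Under those
conditions the packets of one burst arrive spaced by one round of the
scheduler, so the gap gives the fair share and the share gives the number
of contenders. The function names are hypothetical.

```python
from statistics import median


def fair_share_bps(arrival_times_s, packet_bytes):
    """Estimate this flow's share from the spacing of one burst's arrivals."""
    gaps = [b - a for a, b in zip(arrival_times_s, arrival_times_s[1:])]
    # The median is less sensitive to a single delayed packet than the mean.
    return packet_bytes * 8 / median(gaps)


def estimated_contenders(arrival_times_s, packet_bytes, link_bps):
    """Estimate how many backlogged flows share the link (assumes link_bps is known)."""
    return link_bps / fair_share_bps(arrival_times_s, packet_bytes)
```

For example, 1500-byte packets arriving about 1.2 ms apart on a 100 Mbit/s
link suggest a share near 10 Mbit/s, or about ten backlogged flows.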

There was a Stanford result for tens of thousands of flows that found
an ideal setpoint much lower than we are achieving for dozens, at much
higher rates.

A control theory-ish issue with codel is that it depends on an
arbitrary ideal (5ms) as a definition for "good queue", where "a
gooder queue" is, in my definition at the moment, "1 packet
outstanding ever closer to 100% of the time while there is 100%
utilization".

We could continue to bang on things (reducing the target or other
methods) and aim for a lower ideal setpoint until utilization dropped
below 100%.

Which becomes easier the more flows we know are in progress.
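
The "keep lowering the target until utilization drops" idea can be sketched
as a simple search. This is only an illustration: measure_utilization is a
hypothetical callback standing in for whatever measurement one would
actually run at a given target, and the step, floor and threshold values
are assumptions, not codel parameters (only the 5ms start comes from
codel's target).

```python
def lowest_safe_target(measure_utilization, start_ms=5.0, floor_ms=0.1,
                       step=0.8, full=0.999):
    """Shrink the AQM target while measured utilization stays at ~100%.

    measure_utilization(target_ms) is a hypothetical callback returning the
    link utilization (0.0 to 1.0) observed when running with that target.
    """
    target = start_ms
    while target * step >= floor_ms:
        candidate = target * step
        if measure_utilization(candidate) < full:
            break               # the link started to go idle: keep the last good target
        target = candidate
    return target
```

With a toy model in which the link only stays full down to a 1 ms target,
the search stops at the last tried value at or above 1 ms.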

> - Jonathan Morton

-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67