[Bloat] [Codel] [Cake] Control theory and congestion control

Sebastian Moeller moeller0 at gmx.de
Sun May 10 13:04:20 EDT 2015


On May 10, 2015, at 05:35, Dave Taht <dave.taht at gmail.com> wrote:

> On Sat, May 9, 2015 at 12:02 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>>> The "right" amount of buffering is *1* packet, all the time (the goal is
>>> nearly 0 latency with 100% utilization). We are quite far from achieving
>>> that on anything...
>> 
>> And control theory shows, I think, that we never will unless the mechanisms
>> available to us for signalling congestion improve. ECN is good, but it's not
>> sufficient to achieve that ultimate goal. I'll try to explain why.
> 
> The conex and dctcp work explored using ecn for multi-bit signalling.
> 
> While this is a great set of analogies below (and why I am broadening
> the cc), there are two things missing from it.
> 
>> 
>> Aside from computer networking, I also dabble in computer-simulated trains.
>> Some of my bigger projects involve detailed simulations of what goes on
>> inside them, especially the older ones which are relatively simple. These
>> were built at a time when the idea of putting anything as delicate as a
>> transistor inside what was effectively a megawatt-class power station was
>> unthinkable, so the control gear tended to be electromechanical or even
>> electropneumatic. The control laws therefore tended to be the simplest ones
>> they could get away with.
>> 
>> The bulk of the generated power went into the main traction circuit, where a
>> dedicated main generator was connected rather directly to the traction motors
>> through a small amount of switchgear (mainly to reverse the fields on the
>> motors at either end of the line). Control of the megawatts of power
>> surging through this circuit was effected by varying the excitation of the
>> main generator. Excitation is in turn provided by shunting the auxiliary
>> voltage through an automatic rheostat known as the Load Regulator before it
>> reaches the field winding of the generator. Without field current, the
>> generator produces no power.
>> 
>> The load regulator is what I want to focus on here. Its job was to adjust
>> the output of the generator to match the power - more precisely the torque -
>> that the engine was capable of producing (or, in English Electric locos at
>> least, the torque set by the driver's controls, which wasn't always the
>> maximum). The load regulator had a little electric motor to move it up and
>> down. A good proxy for engine torque was available in the form of the fuel
>> rack position; the torque output of a diesel engine is closely related to
>> the amount of fuel injected per cycle. The fuel rack, of course, was
>> controlled by the governor which was set to maintain a particular engine
>> speed; a straightforward PI control problem solved by a reasonably simple
>> mechanical device.
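
	As an aside, that governor is a nice concrete PI example. A toy sketch in Python, with made-up gains and a normalized speed error, nothing like real locomotive constants:

    # Toy PI speed governor: hold engine speed at a set point by
    # moving the fuel rack. Gains and limits are illustrative only.
    def make_pi_governor(kp=0.8, ki=0.2, dt=0.1):
        integral = 0.0
        def step(target_rpm, actual_rpm):
            nonlocal integral
            error = (target_rpm - actual_rpm) / target_rpm  # normalized
            integral += error * dt
            # fuel rack position, clamped to its travel [0, 1]
            return max(0.0, min(1.0, kp * error + ki * integral))
        return step

    governor = make_pi_governor()
    rack = governor(750.0, 720.0)   # sagging below 750 rpm -> more fuel

	The integral term is what lets it hold the set point with zero steady-state error.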
>> 
>> So it looks like a simple control problem; if the torque is too low,
>> increase the excitation, and vice versa.
>> 
>> Congestion control looks like a simple problem too. If there is no
>> congestion, increase the amount of data in flight; if there is, reduce it.
>> We even have Explicit Congestion Notification now to tell us that crucial
>> data point, but we could always infer it from dropped packets before.
>> 
>> So what does the load regulator's control system look like? It has as many
>> as five states: fast down, slow down, hold, slow up, fast up. It turns out
>> that trains really like changes in tractive effort to be slow and smooth,
>> and as infrequent as possible. So while a very simple "bang-bang" control
>> scheme would be possible, it would inevitably oscillate around the set point
>> instead of settling on it. Introducing a central hold state allows it to
>> settle when cruising at constant speed, and the two slow states allow the
>> sort of fine adjustments needed as a train gradually accelerates or slows,
>> putting the generator only slightly out of balance with the engine. The fast
>> states remain to allow for quick response to large changes - the driver
>> moves the throttle, or the motors abruptly reconfigure for a different speed
>> range (the electrical equivalent of changing gear).
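
	In code, the five states are just a dead band plus two thresholds; a sketch with invented numbers:

    # Pick a load-regulator movement state from the torque error.
    # The dead band around zero is what lets it settle ("hold").
    # Thresholds are illustrative, not from any real locomotive.
    def regulator_state(torque_error, small=0.02, large=0.15):
        # torque_error = (demanded - actual) / rated torque
        if torque_error > large:
            return "fast up"
        if torque_error > small:
            return "slow up"
        if torque_error < -large:
            return "fast down"
        if torque_error < -small:
            return "slow down"
        return "hold"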
>> 
>> On the Internet, we're firmly stuck with bang-bang control. As big an
>> improvement as ECN is, it still provides only one bit of information to the
>> sender: whether or not there was congestion reported during the last RTT.
>> Thus we can only use the "slow up" and "fast down" states of our virtual
>> load regulator (except for slow start, which ironically uses the "fast up"
>> state), and we are doomed to oscillate around the ideal congestion window,
>> never actually settling on it.
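
	Which in code is just Reno's two usable states, applied once per RTT with cwnd counted in segments; simplified to the point of caricature:

    # Bang-bang congestion control: "slow up" is +1 segment per RTT,
    # "fast down" is halving. There is no "hold", so cwnd oscillates
    # around the ideal forever.
    def on_rtt(cwnd, congestion_seen):
        if congestion_seen:                # ECN mark or loss this RTT
            return max(1.0, cwnd / 2.0)    # fast down
        return cwnd + 1.0                  # slow up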
>> 
>> Bufferbloat is fundamentally about having insufficient information at the
>> endpoints about conditions in the network.
> 
> Well said.
> 
>> We've done a lot to improve that,
>> by moving from zero information to one bit per RTT. But to achieve that holy
>> grail, we need more information still.
> 
> context being aqm + ecn, fq, fq+aqm, fq+aqm+ecn, dctcp, conex, etc.
> 
>> Specifically, we need to know when we're at the correct BDP, not just when
>> it's too high. And it'd be nice if we also knew if we were close to it. But
>> there is currently no way to provide that information from the network to
>> the endpoints.
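
	For scale, that "correct BDP" is just rate times RTT; example numbers, nothing more:

    # Bandwidth-delay product: in-flight bytes that exactly fill a path.
    rate_bps = 100e6                  # say, a 100 Mbit/s bottleneck
    rtt_s    = 0.020                  # and a 20 ms round trip
    bdp      = rate_bps / 8 * rtt_s   # 250,000 bytes
    segments = bdp / 1460             # ~171 full-size segments in flight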
> 
> This is where I was pointing out that FQ and the behavior of multiple
> flows in their two phases (slow start and congestion avoidance)
> provide a few pieces of useful information that could possibly be
> used to get closer to the ideal.
> 
> We know total service times for all active flows. We also have a
> separate calculable service time for "sparse flows" in two algorithms
> we understand deeply.
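
	Roughly: a sparse flow's packet waits at most about one quantum per busy bulk flow, so its service time is calculable if the drain rate is known. Back-of-envelope only:

    # Approximate worst-case wait for a sparse flow's packet under
    # DRR++-style FQ: each busy bulk flow gets one quantum ahead of it.
    def sparse_service_time(n_bulk_flows, quantum_bytes, link_bps):
        return n_bulk_flows * quantum_bytes * 8 / link_bps   # seconds

    # e.g. 10 bulk flows, 1514-byte quantum, 20 Mbit/s -> ~6 ms
    print(sparse_service_time(10, 1514, 20e6))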
> 
> We could have some grip on the history for flows that are not currently queued.
> 
> We know that the way we currently seek new set points tends to be
> bursty ("chasing the inchworm" - I still gotta use that title on a
> paper!).
> 
> New flows tend to be extremely bursty - and new flows in the real
> world also tend to be pretty short, with 95% of all web traffic
> fitting into a single IW10.
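
	For reference, what "a single IW10" holds:

    # Ten full-size segments in the initial window:
    iw_bytes = 10 * 1460   # 14,600 bytes with a typical Ethernet MSS

	So a transfer of ~14 kB or less never even probes for the set point.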
> 
> If e2e we know we are being FQ'd, and yet are bursting to find new
> setpoints, we can infer from the spacing on the other endpoint what
> the contention really is.
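
	A sketch of what that inference could look like, assuming we sent the burst back-to-back and learned the bottleneck rate earlier; purely illustrative:

    # Under per-flow FQ, a back-to-back burst comes out spaced roughly
    # (busy flows) x (per-packet serialization time), so the receive
    # gaps hint at the contention.
    def estimate_competing_flows(arrival_times, pkt_bytes, link_bps):
        serialization = pkt_bytes * 8 / link_bps
        gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
        mean_gap = sum(gaps) / len(gaps)
        return max(1, round(mean_gap / serialization))  # 1 => alone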
> 
> There was a Stanford result for tens of thousands of flows that found
> an ideal setpoint much lower than we are achieving for dozens, at much
> higher rates.
> 
> A control theory-ish issue with codel is that it depends on an
> arbitrary ideal (5ms) as a definition for "good queue", where "a
> gooder queue"

	I thought that our set point really is 5% of the estimated RTT, and we just default to 5ms since we guesstimate the RTT to be 100ms. Not that I complain; these two numbers seem to work decently over a relatively broad range of true RTTs…
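
	In numbers:

    # CoDel's defaults encode exactly that ratio: target ~= 5% of interval.
    interval_ms = 100                  # default worst-case RTT estimate
    target_ms   = 0.05 * interval_ms   # -> the familiar 5 ms target

    # For genuinely long paths one would scale both together, keeping
    # the ratio; illustrative numbers, not a tested recommendation:
    target_sat  = 0.05 * 600           # 30 ms target for a ~600 ms RTT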


Best Regards
	Sebastian

> is, in my definition at the moment, "1 packet outstanding ever closer
> to 100% of the time while there is 100% utilization".
> 
> We could continue to bang on things (reducing the target, among other
> methods) and aim for a lower ideal setpoint until utilization dropped
> below 100%.
> 
> Which becomes easier the more flows we know are in progress.
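
	The search itself could be as dumb as this; hypothetical pseudo-logic, not an implemented qdisc knob:

    # Probe the AQM target downward while the link stays saturated,
    # ease back up as soon as utilization slips.
    def adjust_target(target_ms, utilization, floor_ms=0.5, step=0.9):
        if utilization >= 0.999:       # still ~100% utilized: probe lower
            return max(floor_ms, target_ms * step)
        return min(5.0, target_ms / step)   # lost utilization: back off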
> 
>> - Jonathan Morton
>> 
> 
> -- 
> Dave Täht
> Open Networking needs **Open Source Hardware**
> 
> https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67