[Cake] [Codel] Control theory and congestion control
Dave Taht
dave.taht at gmail.com
Sun May 10 13:48:08 EDT 2015
On Sun, May 10, 2015 at 10:04 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>
> On May 10, 2015, at 05:35 , Dave Taht <dave.taht at gmail.com> wrote:
>
>> On Sat, May 9, 2015 at 12:02 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>>>> The "right" amount of buffering is *1* packet, all the time (the goal is
>>>> nearly 0 latency with 100% utilization). We are quite far from achieving
>>>> that on anything...
>>>
>>> And control theory shows, I think, that we never will unless the mechanisms
>>> available to us for signalling congestion improve. ECN is good, but it's not
>>> sufficient to achieve that ultimate goal. I'll try to explain why.
>>
>> The ConEx and DCTCP work explored using ECN for multi-bit signalling.
>>
>> While this is a great set of analogies below (and why I am broadening
>> the cc) there are two things missing from it.
>>
>>>
>>> Aside from computer networking, I also dabble in computer simulated trains.
>>> Some of my bigger projects involve detailed simulations of what goes on
>>> inside them, especially the older ones which are relatively simple. These
>>> were built at a time when the idea of putting anything as delicate as a
>>> transistor inside what was effectively a megawatt-class power station was
>>> unthinkable, so the control gear tended to be electromechanical or even
>>> electropneumatic. The control laws therefore tended to be the simplest ones
>>> they could get away with.
>>>
>>> The bulk of the generated power went into the main traction circuit, where a
>>> dedicated main generator is connected rather directly to the traction motors
>>> through a small amount of switchgear (mainly to reverse the fields on the
>>> motors at either end of the line). Control of the megawatts of power
>>> surging through this circuit was effected by varying the excitation of the
>>> main generator. Excitation is in turn provided by shunting the auxiliary
>>> voltage through an automatic rheostat known as the Load Regulator before it
>>> reaches the field winding of the generator. Without field current, the
>>> generator produces no power.
>>>
>>> The load regulator is what I want to focus on here. Its job was to adjust
>>> the output of the generator to match the power - more precisely the torque -
>>> that the engine was capable of producing (or, in English Electric locos at
>>> least, the torque set by the driver's controls, which wasn't always the
>>> maximum). The load regulator had a little electric motor to move it up and
>>> down. A good proxy for engine torque was available in the form of the fuel
>>> rack position; the torque output of a diesel engine is closely related to
>>> the amount of fuel injected per cycle. The fuel rack, of course, was
>>> controlled by the governor which was set to maintain a particular engine
>>> speed; a straightforward PI control problem solved by a reasonably simple
>>> mechanical device.
>>>
>>> So it looks like a simple control problem; if the torque is too low,
>>> increase the excitation, and vice versa.
>>>
>>> Congestion control looks like a simple problem too. If there is no
>>> congestion, increase the amount of data in flight; if there is, reduce it.
>>> We even have Explicit Congestion Notification now to tell us that crucial
>>> data point, but we could always infer it from dropped packets before.
>>>
>>> So what does the load regulator's control system look like? It has as many
>>> as five states: fast down, slow down, hold, slow up, fast up. It turns out
>>> that trains really like changes in tractive effort to be slow and smooth,
>>> and as infrequent as possible. So while a very simple "bang bang" control
>>> scheme would be possible, it would inevitably oscillate around the set point
>>> instead of settling on it. Introducing a central hold state allows it to
>>> settle when cruising at constant speed, and the two slow states allow the
>>> sort of fine adjustments needed as a train gradually accelerates or slows,
>>> putting the generator only slightly out of balance with the engine. The fast
>>> states remain to allow for quick response to large changes - the driver
>>> moves the throttle, or the motors abruptly reconfigure for a different speed
>>> range (the electrical equivalent of changing gear).
>>>
>>> On the Internet, we're firmly stuck with bang-bang control. As big an
>>> improvement as ECN is, it still provides only one bit of information to the
>>> sender: whether or not there was congestion reported during the last RTT.
>>> Thus we can only use the "slow up" and "fast down" states of our virtual
>>> load regulator (except for slow start, which ironically uses the "fast up"
>>> state), and we are doomed to oscillate around the ideal congestion window,
>>> never actually settling on it.
>>>
>>> Bufferbloat is fundamentally about having insufficient information at the
>>> endpoints about conditions in the network.
>>
>> Well said.
>>
>>> We've done a lot to improve that,
>>> by moving from zero information to one bit per RTT. But to achieve that holy
>>> grail, we need more information still.
>>
>> context being AQM + ECN, FQ, FQ+AQM, FQ+AQM+ECN, DCTCP, ConEx, etc.
>>
>>> Specifically, we need to know when we're at the correct BDP, not just when
>>> it's too high. And it'd be nice if we also knew if we were close to it. But
>>> there is currently no way to provide that information from the network to
>>> the endpoints.
>>
>> This is where I was pointing out that FQ and the behavior of multiple
>> flows in their two phases (slow start and congestion avoidance)
>> provide a few pieces of useful information that could possibly be
>> used to get closer to the ideal.
>>
>> We know total service times for all active flows. We also have a
>> separate calculable service time for "sparse flows" in two algorithms
>> we understand deeply.
>>
>> We could have some grip on the history for flows that are not currently queued.
>>
>> We know that the way we currently seek new set points tends to be
>> bursty ("chasing the inchworm" - I still gotta use that title on a
>> paper!).
>>
>> New flows tend to be extremely bursty - and new flows in the real
>> world also tend to be pretty short, with 95% of all web traffic
>> fitting into a single IW10.
>>
>> If e2e we know we are being FQ'd, and yet are bursting to find new
>> setpoints, we can infer from the spacing at the other endpoint what the
>> contention really is.
>>
>> There was a Stanford result for tens of thousands of flows that found
>> an ideal setpoint much lower than we are achieving for dozens, at much
>> higher rates.
>>
>> A control theory-ish issue with codel is that it depends on an
>> arbitrary ideal (5ms) as a definition for "good queue", where "a
>> gooder queue"
>
> I thought that our set point really is 5% of the estimated RTT, and we just default to 5ms since we guesstimate our RTT to be 100ms. Not that I complain; these two numbers seem to work decently over a relatively broad range of true RTTs…
Yes, I should have talked about it as estimated RTT (interval) and a
seemingly desirable percentage (target). It is very helpful to think of
it that way if (as in my current testing) you are trying to see how
much better you can do at very short (sub 1ms) RTTs, where it really
is the interval you want to be modifying...
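To make the relationship concrete, roughly, in standalone C (the function name is mine, not anything in the actual cake or codel sources):

#include <stdint.h>

/* CoDel rule of thumb: the target is a small, fixed fraction of the
 * interval (the estimated RTT). interval >> 4 is ~6%, in the same
 * ballpark as the 5% that yields the familiar 5ms target for the
 * default 100ms interval. */
static uint64_t target_from_interval_ns(uint64_t interval_ns)
{
        return interval_ns >> 4;
}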
As a proof of concept - not an actual algorithm - I have been fiddling
with how much shorter you can make the queues at short RTTs. What I did
was gradually (per packet) subtract 10ns from the cake target while at
100% utilization, until the target hit 1ms (or bytes outstanding dropped
below 3k). Had the cake code still used a target calculated from the
interval (interval >> 4), I would have fiddled with the interval instead.
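The per-packet adjustment amounts to something like this, sketched as standalone C (the struct and names are invented for illustration, not the actual cake code):

#include <stdbool.h>
#include <stdint.h>

#define POC_NSEC_PER_MSEC 1000000ULL

/* Invented per-queue state, just for illustration. */
struct poc_queue {
        uint64_t target_ns;      /* current codel target */
        uint32_t backlog_bytes;  /* bytes currently queued */
};

/* Called once per dequeued packet. While the link stays 100% utilized,
 * shave 10ns off the target until it hits 1ms or the backlog drops
 * below ~3k bytes, then stop tightening. */
static void poc_shrink_target(struct poc_queue *q, bool link_saturated)
{
        if (!link_saturated)
                return;
        if (q->target_ns <= 1 * POC_NSEC_PER_MSEC)
                return;
        if (q->backlog_bytes < 3000)
                return;
        q->target_ns -= 10;
}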
Using the netperf-wrapper tcp_upload test, there were two significant
results (I really should just start a blog so I can do images inline):
1) At 100Mbit, TSO offloads (bulking) add significant latency to
competing streams:
http://snapon.lab.bufferbloat.net/~d/cake_reduced_target/offload_damage_100mbit.png
This gets much worse as you add TCP flows. I figure day traders would
take notice. TSO packets have much more mass.
2) You CAN get fewer packets outstanding at this RTT and still keep the
link 100% utilized.
The default codel algo stayed steady at 30-31 packets outstanding, with
no losses or marks evident (TSQ?), while the shrinking dynamic target
ECN-marked fairly heavily and ultimately reduced the packets
outstanding to 7-17, with a slight improvement in actual
throughput. (This stuff is so totally inside the noise floor that it
is hard to discern a difference at all - and you can see the Linux
de-optimization for handing ping packets off to hardware in some of
the tests, after the TCP flows end, which skews the latency figures.)
http://snapon.lab.bufferbloat.net/~d/cake_reduced_target/dynamic_target_vs_static.png
I think it is back to ns3 to get a better grip on some of this.
>
>
> Best Regards
> Sebastian
>
>> is, in my definition at the moment, "1 packet outstanding ever closer
>> to 100% of the time while there is 100% utilization".
>>
>> We could continue to bang on things (reducing the target or other
>> methods) and aim for a lower ideal setpoint until utilization dropped
>> below 100%.
>>
>> Which becomes easier the more flows we know are in progress.
>>
>>> - Jonathan Morton
>>>
>>>
--
Dave Täht
Open Networking needs **Open Source Hardware**
https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67