[Bloat] Computer generated congestion control
David Lang
david at lang.hm
Fri Apr 3 05:44:50 EDT 2015
On Fri, 3 Apr 2015, Jonathan Morton wrote:
>>> I'd like them to put some sane upper bound on the RTT - one compatible
>>> with satellite links, but which would avoid flooding unmanaged buffers to
>>> multi-minute delays.
>
>> The problem is that there aren't any numbers that meet these two criteria.
>> Even if you ignore 10G and faster interfaces, a 1Gb/s interface with
>> satellite sized latencies is a LOT of data, far more than is needed to
>> flood a 'normal' link.
>
> I very deliberately said "RTT", not "BDP". TCP stacks already track an
> estimate of RTT for various reasons, so in principle they could stop
> increasing the congestion window when that RTT reaches some critical value
> (1 second, say). The fact that they do not already do so is evidenced by
> the observations of multi-minute induced delays in certain circumstances.
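The idea Morton describes - freezing congestion-window growth once the stack's own RTT estimate crosses a critical value - can be sketched in a few lines. This is a purely illustrative sketch, not any real TCP stack's code; the function and constant names are invented for this example.

```python
# Hypothetical sketch: stop inflating the congestion window once the
# smoothed RTT estimate exceeds a critical value.  RTT_CEILING and
# next_cwnd are invented names, not part of any real TCP implementation.

RTT_CEILING = 1.0  # seconds - the "critical value" from the discussion


def next_cwnd(cwnd, srtt, acked_bytes, mss):
    """Return the new congestion window in bytes.

    Grows roughly like TCP congestion avoidance (about one MSS per
    window's worth of acknowledged data), but freezes growth once the
    smoothed RTT estimate crosses RTT_CEILING.
    """
    if srtt >= RTT_CEILING:
        return cwnd  # hold the window: stop feeding the queue
    return cwnd + mss * acked_bytes / cwnd  # standard additive increase
```

With a 50 ms RTT the window keeps growing; once the measured RTT passes the ceiling, the same call simply returns the window unchanged.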
I think the huge delays aren't because the RTT estimates are that long, but
rather that, early on, the available bandwidth estimates were wildly high
because there was no feedback to indicate otherwise (the buffers were hiding
it all).
Once you get into the collapse mode of operation, where you are sending
multiple packets for every one that gets through, it's _really_ hard to
recover short of just stopping for a while to let the junk clear.
If it were gradual degradation all the way down, then backing off a little bit
would show clear improvement, and feedback loops would clear things up fairly
quickly. But when there is a cliff in the performance curve, and you go way
beyond the cliff before you notice it (think Wile E. Coyote missing a turn in
the road), you can't just step back to recover. When a whole group of people do
the same thing, the total backoff needed for the network to recover is
frequently significantly more than any one system's contribution to the
problem. They all need to back off a lot.
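The arithmetic behind "they all need to back off a lot" is easy to see with a toy example. This is deliberately simplistic back-of-the-envelope arithmetic, not a network model: it just compares offered load to bottleneck capacity.

```python
# Toy arithmetic (not a real network model): 10 senders each offering
# 20 Mb/s into a 100 Mb/s bottleneck.  Offered load is double capacity,
# so one sender backing off "a little" barely dents the overload;
# everyone has to back off a lot before the excess clears.

capacity = 100.0       # bottleneck capacity, Mb/s
senders = [20.0] * 10  # offered load per sender, Mb/s


def overload(loads):
    """Fraction by which total offered load exceeds capacity."""
    return max(0.0, sum(loads) - capacity) / capacity


print(overload(senders))  # 1.0 -> offering twice what fits

senders[0] -= 5.0         # one sender backs off by 25%
print(overload(senders))  # 0.95 -> still massively over capacity

everyone = [load * 0.5 for load in [20.0] * 10]  # everyone halves
print(overload(everyone))  # 0.0 -> only then does the excess clear
```

One system's entire contribution (20 Mb/s here) is only a fifth of the 100 Mb/s of excess, which is why no individual backoff can fix the collapse on its own.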
> And this is not a complete solution by any means. Vegas proved that an
> altruistic limit on RTT by an endpoint, with no other measures within the
> network, leads to poor fairness between flows. But if the major OSes did
> that, more networks would be able to survive overload conditions while
> providing some usable service to their users.
But we don't need to take such a risk: we have active queue management
algorithms that we know will work if they are deployed on the chokepoint
machines (for everything except wifi hops, right now).
Best of all, these don't require any knowledge or guesswork about the overall
network, and no knowledge of the RTT or bandwidth-delay product. All they need
is information about the data flows going through the device and when the
local link can accept more data.
Making decisions based on local data scales really well; making estimates of
the state of the network overall, not so much.
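The local-only principle can be sketched concretely. The sketch below is loosely modelled on CoDel's approach (dropping when queueing delay stays above a small target for a sustained interval); the class and constant names are invented, and this is a simplification of the real algorithm. Note that its only inputs are measured at this one box - how long a packet sat in this queue and the current clock - with no RTT or bandwidth estimates of the wider network.

```python
# Minimal sketch of local-data-only AQM, loosely modelled on CoDel.
# LocalAQM, TARGET and INTERVAL are illustrative names; this is a
# simplification, not the real CoDel algorithm.

TARGET = 0.005    # 5 ms of standing-queue delay is acceptable
INTERVAL = 0.100  # delay must dip below TARGET within 100 ms


class LocalAQM:
    def __init__(self):
        self.first_above = None  # deadline set when delay first exceeds TARGET

    def should_drop(self, sojourn, now):
        """Decide purely from local data.

        sojourn: seconds this packet just spent queued in this device.
        now:     current time in seconds.
        """
        if sojourn < TARGET:
            self.first_above = None  # queue drained below target: reset
            return False
        if self.first_above is None:
            self.first_above = now + INTERVAL  # start the grace interval
            return False
        return now >= self.first_above  # persistently over target: drop
```

A brief spike in queueing delay is tolerated; only when the local queue stays above the target for a whole interval does the device start dropping, which is exactly the kind of decision that needs no picture of the rest of the network.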
David Lang