[Bloat] [iccrg] Musings on the future of Internet Congestion Control

Sun Jul 10 16:01:17 EDT 2022

Hi !

> On Jul 10, 2022, at 7:27 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
> 
> Hi Michael,
> 
> so I reread your paper and stewed a bit on it.

Many thanks for doing that!  :)

> I believe that I do not buy some of your premises.

you say so, but I don’t really see much disagreement here. Let’s see:

> e.g. you write:
> 
> "We will now examine two factors that make the the present situation particularly worrisome. First, the way the infrastructure has been evolving gives TCP an increasingly large operational space in which it does not see any feedback at all. Second, most TCP connections are extremely short. As a result, it is quite rare for a TCP connection to even see a single congestion notification during its lifetime."
> 
> And seem to see a problem that flows might be able to finish their data transfer business while still in slow start. I see the same data, but see no problem. Unless we have an oracle that tells each sender (over a shared bottleneck) exactly how much to send at any given time point, different control loops will interact on those intermediary nodes.

You really say that you don’t see the solution. The problem is that capacities are underutilized, which means that flows take longer (sometimes, much longer!) to finish than they theoretically could, if we had a better solution.

> I might be limited in my depth of thought here, but having each flow probing for capacity seems exactly the right approach... and doubling CWND or rate every RTT is pretty aggressive already (making slow start shorter by reaching capacity faster within the slow-start framework requires either to start with a higher initial value (what increasing IW tries to achieve?) or use a larger increase factor than 2 per RTT). I consider increased IW a milder approach than the alternative. And once one accepts that a gradual rate increasing is the way forward it falls out logically that some flows will finish before they reach steady state capacity especially if that flows available capacity is large. So what exactly is the problem with short flows not reaching capacity and what alternative exists that does not lead to carnage if more-aggressive start-up phases drive the bottleneck load into emergency drop territory?

There are various ways to do this; one is to cache information and re-use it, assuming that - at least sometimes - new flows will see the same path again. Another is to let parallel flows share information. Yet another is to just be blindly more aggressive. Yet another, chirping.

> And as an aside, a PEP (performance enhancing proxy) that does not enhance performance is useless at best and likely harmful (rather a PDP, performance degrading proxy).

You’ve made it sound worse by changing the term, for whatever that’s worth. If they never help, why has anyone ever called them PEPs in the first place? Why do people buy these boxes?

> The network so far has been doing reasonably well with putting more protocol smarts at the ends than in the parts in between.

Truth is, PEPs are used a lot: at cellular edges, at satellite links…   because the network is *not* always doing reasonably well without them.

> I have witnessed the arguments in the "L4S wars" about how little processing one can ask the more central network nodes perform, e.g. flow queueing which would solve a lot of the issues (e.g. a hyper aggressive slow-start flow would mostly hurt itself if it overshoots its capacity) seems to be a complete no-go.

That’s to do with scalability, which depends on how close to the network’s edge one is.

> I personally think what we should do is have the network supply more information to the end points to control their behavior better. E.g. if we would mandate a max_queue-fill-percentage field in a protocol header and have each node write max(current_value_of_the_field, queue-filling_percentage_of_the_current_node) in every packet, end points could estimate how close to congestion the path is (e.g. by looking at the rate of %queueing changes) and tailor their growth/shrinkage rates accordingly, both during slow-start and during congestion avoidance.

That could well be one way to go. Nice if we provoked you to think!

> But alas we seem to go the path of a relative dumb 1 bit signal giving us an under-defined queue filling state instead and to estimate relative queue filling dynamics from that we need many samples (so literally too little too late, or L3T2), but I digress.

Yeah you do  :-)

Cheers,
Michael