[Bloat] number of home routers with ingress AQM

Wed Apr 3 06:09:42 EDT 2019

>> Fact is... ingress shaping is a hack.
> 
> That is harsh, but I give you sub-optimal and approximate

As the designer of this particular hack, I would characterise it as a workaround.  It *does* work, within certain assumptions which are narrower than for a conventional before-the-bottleneck deployment.  Cake includes a specific shaper mode to help expand those constraints slightly.

One of the assumptions is indeed that the flows respond promptly to AQM, or are non-queue-building in the first place.  Without that, the ingress shaper cannot exercise control over the contents of the dumb bottleneck queue upstream of it.  There is no clear relationship between the fill levels of the two queues; the dumb one can be full but only trickling into the smart one downstream.

Conventional TCP traffic responds to a CE mark after one RTT and should clear its excess traffic from the queue well within two (assuming congestion-avoidance mode); if the AQM also has a response delay built in (as Codel does), that may result in 3 RTTs between congestion onset and establishment of control.  Slow start is another matter entirely, but one I haven't analysed very thoroughly in this context; it's safe to say, however, that draining a 2x overshoot will take longer than draining a few-segment overshoot.

SCE would potentially remove some of that delay, as it signals as soon as a queue appears, but we're still talking about a delay of one RTT or more before the dumb queue is drained.  SCE also pulls the TCP out of slow-start soon enough to avoid a 2x overshoot (there's an interaction with sender-side pacing here; without pacing, slow-start is exited very early).  This is the best-case scenario until the true bottleneck learns AQM.

L4S' modification of the response to CE goes in the opposite direction, unless the AQM is modified to suit.  DCTCP stabilises at 2 CEs per RTT (each CE removes half a segment from the cwnd, while the Reno-linear growth function adds one segment per RTT).  Codel grows its signalling frequency linearly over time; if its interval is set close to the actual path RTT (as is recommended), it will take 3 intervals/RTTs to reach the stability criterion for DCTCP, and a further 3 RTTs to control it all the way down to the true BDP.
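
A rough sketch of the Codel side of that arithmetic, assuming the control law described in RFC 8289 (the next mark follows the previous one by interval/sqrt(count)) and ignoring everything else Codel does.  The function names are made up for illustration, and time is measured in units of the interval from the moment Codel starts marking, so the initial response delay comes on top.

```python
import math

def codel_mark_times(n_marks, interval=1.0):
    """Times of the first n_marks marks, counted from entry into the marking state.

    Simplified model: mark number `count` is followed by the next mark
    after interval / sqrt(count).
    """
    times = []
    t = 0.0
    for count in range(1, n_marks + 1):
        times.append(t)
        t += interval / math.sqrt(count)
    return times

def count_for_rate(marks_per_interval):
    """Smallest count at which the mark spacing implies the given rate."""
    count = 1
    while math.sqrt(count) < marks_per_interval:
        count += 1
    return count

if __name__ == "__main__":
    needed = count_for_rate(2.0)           # DCTCP stability criterion: 2 CEs per RTT
    times = codel_mark_times(needed + 1)
    print("count needed:", needed)
    print("mark times (intervals):", [round(t, 2) for t in times])
```

In this toy model the spacing drops to half an interval at the fourth mark, a little over two intervals after marking starts; adding Codel's initial response delay puts that in the same region as the 3 intervals estimated above.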

This is not too bad in an FQ context, but how long would it take to correct a 2x overshoot from slow-start?  DCTCP doesn't get a halving of cwnd per RTT until every single packet is CE-marked!
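
To put numbers on that, here is a toy model using only the simplification from the previous paragraphs: each CE mark removes half a segment from cwnd, and Reno-linear growth adds one segment per RTT.  It is not the real DCTCP algorithm (which works from a smoothed marking fraction), and the function name and the example figures (cwnd of 200 segments, true BDP of 100) are assumptions chosen purely for illustration.

```python
def rtts_to_drain(cwnd, target, marks_per_rtt):
    """RTTs needed to bring cwnd down to target under the toy model.

    Per RTT: cwnd += 1 - 0.5 * marks, where marks cannot exceed the
    number of packets in flight.  Returns None if cwnd never shrinks.
    """
    rtts = 0
    while cwnd > target:
        marks = min(marks_per_rtt, cwnd)   # cannot mark more packets than are sent
        step = 1.0 - 0.5 * marks
        if step >= 0:
            return None                    # at or below 2 CEs per RTT: no reduction
        cwnd += step
        rtts += 1
    return rtts

if __name__ == "__main__":
    # Hypothetical 2x overshoot: cwnd of 200 segments against a BDP of 100.
    for marks in (2, 4, 10, float("inf")):
        print(marks, "marks/RTT ->", rtts_to_drain(200, 100, marks))
```

At exactly 2 CEs per RTT the window just holds steady, and a modest marking rate above that takes many tens of RTTs to undo the overshoot; only when every packet is marked does the window come down at roughly a halving per RTT.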

I think we need to set up a testbed with the new TCP Prague implementation to demonstrate these effects.

 - Jonathan Morton