[Bloat] Bufferbloat on 4G Connexion

Wed Oct 23 05:54:52 EDT 2019

> On 23 Oct, 2019, at 10:28 am, <erik.taraldsen at telenor.com> wrote:
> 
> If you could influence the 4G vendors to de-bloat their equipment, would you recommend BQL, L4S or codel/cake?

BQL is a good start, but needs to go with something else to actually fix the problem.  All it does is move (most of) the queue out of hardware and into software.  The software side then needs to capitalise on having control of the bottleneck queue, by implementing FQ, AQM, or both.

L4S is a complete waste of time.  It requires basically replacing the entire Internet, including endpoints, to work properly.  If the endpoints are not replaced, then the middleboxes behave like a conventional AQM (see below).

Codel would be a HUGE help.  It's a well-tested example of an AQM that's specifically designed to work with TCP, and should work equally well with other protocols designed to be TCP-friendly.  There are other AQMs, such as PIE, BLUE, and even the old standby WRED, that would also be major improvements over the status quo, though they don't perform as well on common TCP traffic as Codel.  The basic requirement here is to bring the maximum *standing* queue down from hundreds or thousands of milliseconds to tens of milliseconds or single digits; Codel achieves 5ms with its default Internet-tuned settings.
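
To make the "standing queue" idea concrete, here is a minimal Python sketch of a Codel-style drop scheduler.  It is a simplification for illustration, not the reference Codel code: the class and method names are invented here, times are plain milliseconds, and the 100ms interval is Codel's usual default alongside the 5ms target.  Real Codel also carries some state over between closely spaced drop episodes, which this sketch omits.

```python
import math


class CodelSketch:
    """Simplified Codel-style AQM decision logic (illustrative only)."""

    def __init__(self, target_ms=5.0, interval_ms=100.0):
        self.target = target_ms        # acceptable standing queue delay
        self.interval = interval_ms    # roughly one worst-case RTT
        self.first_above = None        # deadline for delay to fall below target
        self.dropping = False          # True while in the drop-scheduling state
        self.count = 0                 # drops/marks in the current episode
        self.drop_next = 0.0           # time of the next scheduled drop/mark

    def should_drop(self, sojourn_ms, now_ms):
        """Decide at dequeue time whether to drop (or ECN-mark) this packet."""
        if sojourn_ms < self.target:
            # Queue delay is fine: leave the dropping state entirely.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # First packet above target: give the flow one interval to react.
            self.first_above = now_ms + self.interval
            return False
        if not self.dropping:
            if now_ms >= self.first_above:
                # Delay stayed above target for a whole interval: start signalling.
                self.dropping = True
                self.count = 1
                self.drop_next = now_ms + self.interval / math.sqrt(self.count)
                return True
            return False
        if now_ms >= self.drop_next:
            # Still above target: signal again, at a gradually rising rate.
            self.count += 1
            self.drop_next += self.interval / math.sqrt(self.count)
            return True
        return False
```

With a packet dequeued every 10ms and a constant 50ms sojourn time, this signals at 100ms, 200ms and 280ms, with the gaps shrinking as interval/sqrt(count); as soon as a packet is seen with sojourn time below the target, it stops signalling.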

I use a slight variant of Codel in Cake, which I call COBALT.  The main algorithm is just a refactoring of Codel, but I've also addressed two blind spots in Codel's design.  One is the behaviour upon reactivating very shortly after the standing queue was initially brought under control; COBALT keeps better track of previous state in that case.  The other is addressing heavy load from non-TCP-friendly traffic, which requires a probabilistic drop function rather than a scheduled drop or mark.  To take care of the latter, I've overlaid a version of the BLUE algorithm.
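
For comparison with a scheduled drop, a probabilistic drop function in the style of BLUE can be sketched as follows.  This is a generic illustration of the BLUE idea, not the COBALT code in Cake; the class name, the step sizes and the hold time are all assumptions picked for the example.

```python
import random


class BlueSketch:
    """Generic BLUE-style drop probability (illustrative parameters)."""

    def __init__(self, increment=1 / 256, decrement=1 / 4096, hold_s=0.1,
                 rng=random.random):
        self.p = 0.0                       # current drop probability
        self.increment = increment         # step up on queue overflow (assumed)
        self.decrement = decrement         # step down on empty queue (assumed)
        self.hold = hold_s                 # minimum time between adjustments
        self.last_update = float("-inf")
        self.rng = rng                     # injectable for deterministic tests

    def on_overflow(self, now_s):
        """Queue overflowed: traffic is not responding, so drop more often."""
        if now_s - self.last_update >= self.hold:
            self.p = min(1.0, self.p + self.increment)
            self.last_update = now_s

    def on_empty(self, now_s):
        """Queue drained: we are dropping too much, so back off."""
        if now_s - self.last_update >= self.hold:
            self.p = max(0.0, self.p - self.decrement)
            self.last_update = now_s

    def should_drop(self):
        """Drop each arriving packet independently with probability p."""
        return self.rng() < self.p
```

The point of the design is that the drop rate scales with the offered load instead of following a schedule, so unresponsive traffic cannot simply outrun it.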

I believe Codel and COBALT should be reasonably straightforward to implement in hardware.  I could help to explain the algorithmic details and rationale to hardware engineers if need be, and help them determine the right design tradeoff.

FQ's main benefit is allowing "sparse" flows of traffic, which tend to be more sensitive to latency, to bypass queues of "bulk" traffic, which tends to be more throughput-sensitive.  This amounts to reducing "inter-flow induced latency", whereas AQM focuses on reducing "intra-flow induced latency".  The main difficulty is allegedly in identifying flows and maintaining the per-flow state and individual queues; these should not be insurmountable obstacles, but I will admit the complexity of implementing this in hardware is higher than for a plain AQM.
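
The sparse-flow bypass can be sketched in Python along the lines of the new-flows/old-flows scheme used by fq_codel.  This is a simplified illustration under stated assumptions: flows are identified by an arbitrary key rather than a hashed 5-tuple, no AQM is applied to the per-flow queues, and the class name and the 1514-byte quantum are chosen here for the example.

```python
from collections import deque


class FqSketch:
    """Deficit round-robin with priority for newly active flows."""

    def __init__(self, quantum=1514):
        self.quantum = quantum     # bytes of credit per round (assumed value)
        self.queues = {}           # flow key -> deque of (size, payload)
        self.deficit = {}          # flow key -> remaining byte credit
        self.new_flows = deque()   # flows that just became active
        self.old_flows = deque()   # flows that have used up a quantum
        self.listed = set()        # flows currently on either list

    def enqueue(self, flow, size, payload):
        if flow not in self.listed:
            # A flow with no backlog starts on the priority list.
            self.deficit[flow] = self.quantum
            self.new_flows.append(flow)
            self.listed.add(flow)
        self.queues.setdefault(flow, deque()).append((size, payload))

    def dequeue(self):
        while self.new_flows or self.old_flows:
            active = self.new_flows if self.new_flows else self.old_flows
            flow = active[0]
            if self.deficit[flow] <= 0:
                # Credit exhausted: refill and send to the back of old_flows.
                self.deficit[flow] += self.quantum
                active.popleft()
                self.old_flows.append(flow)
                continue
            queue = self.queues[flow]
            if not queue:
                active.popleft()
                if active is self.new_flows and self.old_flows:
                    # Pass once through old_flows so it cannot starve them.
                    self.old_flows.append(flow)
                else:
                    self.listed.discard(flow)
                continue
            size, payload = queue.popleft()
            self.deficit[flow] -= size
            return flow, payload
        return None
```

If a bulk flow has a backlog of 1500-byte packets and a sparse flow then enqueues a single small packet, the next dequeue returns the sparse flow's packet ahead of the whole bulk backlog.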

I would note, however, that 4G already incurs some of this complexity through the need to individually queue, aggregate, schedule, and transmit a stream of traffic to each mobile station.  It should be relatively easy to add an AQM instance per station, thereby ensuring that congestion signals incurred by one subscriber saturating their own capacity don't spill over into other subscribers' traffic.

Combining AQM with FQ is the gold standard, minimising both inter-flow and intra-flow induced latency.  That's what fq_codel and Cake do.  If that were deployed in the right places in a 4G network, you could consider the problem solved.  For an estimate of implementation complexity, take an FQ implementation and add an AQM for each flow's queue.

As a middle ground I could suggest one of my latest projects, CNQ (Cheap Nasty Queuing).  This aims to provide some of the benefits of FQ+AQM, but with only slightly higher implementation complexity than a plain AQM, at the cost of poorer flow isolation than true FQ.  It should therefore be more suitable for high-volume nodes than an FQ-based algorithm.  A software implementation for demo purposes is presently in the works, and we should have some initial performance data fairly soon.  I would seriously consider deploying CNQ on a per-station basis in 4G.

 - Jonathan Morton