[Codel] codel "oversteer"

Dave Taht dave.taht at gmail.com
Tue Jun 19 21:32:00 EDT 2012

I've been forming a theory regarding codel behavior in some
pathological conditions. For the sake of developing the theory I'm
going to return to the original car analogy published here, and add a
new one - "oversteer".


If the underlying interface device driver is overbuffered, when the
packet backlog finally makes it into the qdisc layer, it builds up
rapidly and codel rapidly ramps up its drop strategy, which corrects
the problem. But we are then back in a state where we are, as in the
case of an auto on ice, or a very loose connection to the steering
wheel, "oversteering", because codel is actually not measuring the
entire time-width of the queue and is unable to control it well, even
if it

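To put rough numbers on the part of the queue that codel cannot see,
here is a minimal sketch. The ring size, packet size and link rate are
hypothetical values picked for illustration, not measurements, and the
function names are made up for this example.

```python
def hidden_delay_ms(driver_buffer_packets, packet_bytes, link_rate_bps):
    """Time to drain a full driver buffer, in milliseconds.

    This is delay that sits below the qdisc, so a sojourn-time
    measurement taken in the qdisc never includes it.
    """
    bits = driver_buffer_packets * packet_bytes * 8
    return bits / link_rate_bps * 1000.0


def total_delay_ms(qdisc_sojourn_ms, driver_buffer_packets,
                   packet_bytes, link_rate_bps):
    """Delay a packet actually experiences: qdisc time plus driver time."""
    return qdisc_sojourn_ms + hidden_delay_ms(
        driver_buffer_packets, packet_bytes, link_rate_bps)


# Hypothetical example: 256-packet ring, 1500-byte packets, 100 Mbit/s link.
hidden = hidden_delay_ms(256, 1500, 100_000_000)
seen_by_codel = 5.0  # ms, assuming the qdisc is sitting right at a 5 ms target
print(round(hidden, 2), round(total_delay_ms(seen_by_codel, 256, 1500, 100_000_000), 2))
```

With these assumed numbers the driver alone can hold several times
more delay than the qdisc is trying to hold, which is the "loose
steering wheel" in the analogy.
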
What I observe on wireless now with fq_codel under heavy load is
oscillation in the qdisc layer between a 0 length queue and 70 or more
packets backlogged, a burst of drops when that happens, and far more
drops relative to ecn marks than I expected (the new, arbitrary "drop
ecn packets if > 2 * target" idea I was fiddling with now illustrates
the point better). It's difficult to gain further direct insight
without time and packet traces, and maybe exporting more data to
userspace, but this kind of explains a report I got privately on x86
(no ecn drop enabled), and the behavior of fq_codel on wireless on the
present version of cerowrt.
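
The "drop ecn packets if > 2 * target" idea can be sketched roughly as
below. This is only an illustration, not the actual patch: it assumes
the quantity compared against 2 * target is the packet's sojourn time,
it assumes a 5 ms target, and the function name is hypothetical.

```python
def congestion_signal(ecn_capable, sojourn_ms, target_ms=5.0):
    """Choose how to signal congestion for one packet.

    Meant to be called only once codel has already decided that this
    packet should carry a congestion signal.
    Assumption: the "> 2 * target" test is applied to sojourn time.
    """
    if ecn_capable and sojourn_ms <= 2 * target_ms:
        return "mark"   # mild overload: ECN mark instead of dropping
    return "drop"       # not ECN-capable, or delay is far past target


print(congestion_signal(True, 8.0))    # ECN flow, under 2 * target
print(congestion_signal(True, 12.0))   # ECN flow, over 2 * target
print(congestion_signal(False, 8.0))   # non-ECN flow
```

Under this rule, a queue that swings well past target in bursts would
turn most would-be marks into drops, which would be consistent with
seeing many more drops than marks.
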

(I could always have inserted a bug, too. If it weren't for the private
report and having to get on a plane shortly, I wouldn't be posting this

Further testing ideas that others(!) could try:

- Increase BQL's setting to over-large values on a BQL-enabled
  interface and see what happens
- Test with an overbuffered ethernet interface in the first place
- Improve the ns3 model to have an emulated network interface with
  user-settable buffering
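
For the first idea, a minimal sketch of poking BQL from userspace,
assuming the usual sysfs layout under
/sys/class/net/<dev>/queues/tx-<n>/byte_queue_limits/. Which knob to
raise is my assumption: since BQL picks its limit between limit_min and
limit_max, raising limit_min is one way to force an over-large value.
The helper names, "eth0" and the byte count are hypothetical, and
writing needs root.

```python
import os


def bql_knob_path(dev, knob="limit_min", txq=0, sysfs_root="/sys/class/net"):
    """Build the sysfs path of one BQL knob for one transmit queue."""
    return os.path.join(sysfs_root, dev, "queues", "tx-%d" % txq,
                        "byte_queue_limits", knob)


def set_bql_knob(dev, value_bytes, knob="limit_min", txq=0,
                 sysfs_root="/sys/class/net"):
    """Write a byte count into a BQL knob and return the path written."""
    path = bql_knob_path(dev, knob, txq, sysfs_root)
    with open(path, "w") as f:
        f.write(str(int(value_bytes)))
    return path


# Example (not run here; needs root and a BQL-enabled driver):
#   set_bql_knob("eth0", 1000000)   # force at least ~1 MB below the qdisc
print(bql_knob_path("eth0"))
```

The sysfs_root parameter is there only so the helper can be pointed at
a scratch directory for a dry run before touching a real interface.
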

Assuming I'm right and others can reproduce this, it implies that we
need to focus much harder on BQL and overbuffering related issues in
the dozens? hundreds? of non-BQL-enabled ethernet drivers at this
point. And we already know that much more hard work on fixing wifi is
needed.

Despite this, I'm generally pleased with the fq_codel results over
wireless I'm currently getting from today's build of cerowrt, and
certainly the BQL-enabled ethernet drivers I've worked with (ar71xx,
e1000) don't display this behavior, and neither does soft rate limiting
using htb. Instead they achieve a steady state for the packet backlog,
accepting bursts, and otherwise being "nice".

