[Cerowrt-devel] codel "oversteer"
dpreed at reed.com
Tue Jun 19 23:01:43 EDT 2012
One further thought - the time constants may well be adjusted so that there is what we in radio call an impedance mismatch (I mean it literally, not metaphorically) such that there is an SWR (standing wave ratio) much greater than 1. In an antenna feedline this results in oscillations of power that travel back and forth between transmitter and antenna, resulting in pulsating power delivered to the antenna.
Do you know the time constants of the TCP source congestion control algorithm (frequency of the "sawtooth" that arrives when there is a short queue) and the codel control loop? If they are really different, you get a significant "beat frequency" between the two oscillators. This would directly create the observed phenomenon.
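To make the "beat frequency" idea concrete, here is a minimal sketch of the arithmetic: two oscillators with frequencies f1 and f2 beat at |f1 - f2|. The two periods below are hypothetical placeholders chosen only for illustration; they are not measured values for TCP or codel.

```python
def beat_frequency_hz(period_a_s: float, period_b_s: float) -> float:
    """Beat frequency |f_a - f_b| of two oscillators, given their periods in seconds."""
    return abs(1.0 / period_a_s - 1.0 / period_b_s)


# Hypothetical periods (assumptions for illustration, not measurements):
tcp_sawtooth_period_s = 0.25  # assumed time between TCP window cuts
codel_loop_period_s = 0.10    # assumed period of the codel control loop

beat_hz = beat_frequency_hz(tcp_sawtooth_period_s, codel_loop_period_s)
print(f"beat frequency: {beat_hz:.2f} Hz")
```

When the two periods are equal the beat frequency is zero, which corresponds to the matched case described above.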
I think you can probably tune one natural frequency or the other until there is an impedance match, and then the codel damping should work.
Maybe this could be simulated in a simple simulator that models the situation to see if this is "normal" given the parameters, or whether it is a logic bug in one implementation or the other.
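As a starting point for such a simulator, here is a rough toy model, not real codel or TCP code. It assumes a single AIMD sender, a qdisc with a simplified codel-like drop rule (drop once the visible delay has exceeded the target for a full interval, then again after interval/sqrt(count)), and a driver FIFO that the qdisc cannot see. All names and parameter values are hypothetical.

```python
import math


def simulate(driver_buf_pkts, steps=20000, dt=0.001, link_pps=1000.0,
             target_s=0.005, interval_s=0.100, rtt_s=0.050):
    """Toy model: AIMD sender -> qdisc (codel-like drops) -> driver FIFO -> link.

    driver_buf_pkts must be at least link_pps * dt so the link can stay busy.
    Returns the qdisc backlog in packets, sampled once per time step.
    """
    rate_pps = link_pps / 2   # sender rate: additive increase, halves per drop
    qdisc = 0.0               # packets the AQM can see
    driver = 0.0              # packets hidden in the driver FIFO
    above = False             # is the visible delay currently at or over target?
    count = 0                 # drops in the current dropping episode
    next_drop = 0.0           # earliest time of the next drop
    pending = []              # times at which the sender learns of a drop
    trace = []
    for i in range(steps):
        now = i * dt
        # Arrivals land in the qdisc; the driver pulls while it has room.
        qdisc += rate_pps * dt
        moved = min(qdisc, driver_buf_pkts - driver)
        qdisc -= moved
        driver += moved
        # The link drains the driver FIFO at line rate.
        driver -= min(driver, link_pps * dt)
        # Codel-like rule, using only the delay visible in the qdisc.
        if qdisc / link_pps < target_s:
            above, count = False, 0
        else:
            if not above:
                above = True
                next_drop = now + interval_s
            if now >= next_drop and qdisc >= 1.0:
                qdisc -= 1.0
                count += 1
                next_drop = now + interval_s / math.sqrt(count)
                # The sender only notices after the hidden driver delay plus an RTT.
                pending.append(now + driver / link_pps + rtt_s)
        # Apply drop signals that have reached the sender by now.
        due = [t for t in pending if t <= now]
        pending = [t for t in pending if t > now]
        rate_pps /= 2 ** len(due)
        # Additive increase: one packet per RTT, every RTT.
        rate_pps += dt / (rtt_s * rtt_s)
        trace.append(qdisc)
    return trace


for buf in (2, 200):  # small versus large hidden driver buffer (hypothetical sizes)
    backlog = simulate(buf)
    print(f"driver buffer {buf:>3} pkts: peak qdisc backlog {max(backlog):.1f} pkts")
```

Comparing the backlog traces for different driver buffer sizes, or for different ratios of interval_s to rtt_s, is one way to check whether oscillation falls out of the parameters alone. A model this crude can only suggest where to look; it cannot rule out an implementation bug.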
From: "Dave Taht" <dave.taht at gmail.com>
Sent: Tuesday, June 19, 2012 9:32pm
To: codel at lists.bufferbloat.net, cerowrt-devel at lists.bufferbloat.net
Subject: [Cerowrt-devel] codel "oversteer"
I've been forming a theory regarding codel behavior in some
pathological conditions. For the sake of developing the theory I'm
going to return to the original car analogy published here, and add a
new one - "oversteer".
If the underlying interface device driver is overbuffered, then when the
packet backlog finally makes it into the qdisc layer, the backlog there
bursts up rapidly and codel rapidly ramps up its drop strategy. That
corrects the problem, but we are back in a state where we are, as in the
case of an auto on ice, or a very loose connection to the steering wheel,
"oversteering", because codel is not actually measuring the entire
time-width of the queue and is unable to control it well, even if it
What I observe on wireless now with fq_codel under heavy load is
oscillation in the qdisc layer between a 0 length queue and 70 or more
packets backlogged, a burst of drops when that happens, and far more
drops relative to ecn marks than I expected (the new (arbitrary) "drop
ecn packets if > 2 * target" idea I was fiddling with now illustrates the
point better). It's difficult to gain further direct insight
without time and packet traces, and maybe exporting more data to
userspace, but this kind of explains a report I got privately on x86
(no ecn drop enabled), and the behavior of fq_codel on wireless on the
present version of cerowrt.
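The oversteer argument rests on delay that sits below the qdisc, where codel cannot measure it. A back-of-the-envelope sketch of that hidden delay follows; the ring size, packet size, and link rate are hypothetical numbers for illustration, not values taken from any particular driver.

```python
def hidden_delay_s(buffered_pkts: int, pkt_bytes: int, link_bps: float) -> float:
    """Time needed to drain packets queued in the driver, below the qdisc."""
    return buffered_pkts * pkt_bytes * 8 / link_bps


# Hypothetical example: a 256-packet ring of 1500-byte packets on a 100 Mbit/s link.
delay_s = hidden_delay_s(256, 1500, 100e6)
print(f"delay invisible to codel: {delay_s * 1000:.2f} ms")
```

Any such delay adds to the sojourn time codel measures in the qdisc, so the control loop acts on an underestimate of the real queue.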
(I could always have inserted a bug, too. If it weren't for the private
report and having to get on a plane shortly, I wouldn't be posting this
Further testing ideas that (others!) could try would be:
- Increase BQL's setting to over-large values on a BQL enabled interface
  and see what happens
- Test with an overbuffered ethernet interface in the first place
- Improve the ns3 model to have an emulated network interface with
Assuming I'm right and others can reproduce this, this implies that
focusing much harder on BQL and overbuffering related issues on the
dozens? hundreds? of non-BQL enabled ethernet drivers is needed at
this point. And we already know that much more hard work on fixing
wifi is needed.
Despite this I'm generally pleased with the fq_codel results over
wireless I'm currently getting from today's build of cerowrt, and
certainly the BQL-enabled ethernet drivers I've worked with (ar71xx,
e1000) don't display this behavior, and neither does soft rate limiting
using htb. Those instead achieve a steady state for the packet backlog,
accepting bursts, and otherwise being "nice".