CoDel AQM discussions
* [Codel] codel "oversteer"
@ 2012-06-20  1:32 Dave Taht
  2012-06-20  3:01 ` [Codel] [Cerowrt-devel] " dpreed
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Dave Taht @ 2012-06-20  1:32 UTC (permalink / raw)
  To: codel, cerowrt-devel

I've been forming a theory regarding codel behavior in some
pathological conditions. For the sake of developing the theory I'm
going to return to the original car analogy published here, and add a
new one - "oversteer".

Briefly:

If the underlying interface device driver is overbuffered, the packet
backlog only becomes visible when it finally makes it into the qdisc
layer. There it bursts up rapidly and codel rapidly ramps up its drop
strategy, which corrects the problem. But we are then back in a state
where, as with a car on ice or one with a very loose connection to the
steering wheel, we are "oversteering": codel is not measuring the
entire time-width of the queue, and so cannot control it well.
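
For readers less familiar with how codel "ramps up", here is a minimal
sketch of the drop-scheduling rule from the published CoDel description:
while in the dropping state, each next drop is scheduled
interval / sqrt(count) after the previous one. The 100 ms interval is the
commonly cited default, and the function names are illustrative, not taken
from the kernel source.

```python
import math

INTERVAL_MS = 100.0  # assumed default CoDel interval


def next_drop_gap_ms(count: int, interval_ms: float = INTERVAL_MS) -> float:
    """Gap to the next drop once `count` drops have occurred in this dropping state."""
    if count < 1:
        raise ValueError("count must be >= 1")
    return interval_ms / math.sqrt(count)


def drop_times_ms(n_drops: int, interval_ms: float = INTERVAL_MS) -> list:
    """Times of the first n_drops drops, measured from the first drop."""
    if n_drops < 1:
        return []
    times = [0.0]
    for count in range(1, n_drops):
        times.append(times[-1] + next_drop_gap_ms(count, interval_ms))
    return times


if __name__ == "__main__":
    # The gaps shrink as count grows, which is the "ramp up" in drop rate.
    print([round(t, 1) for t in drop_times_ms(6)])
```

If the delay codel reacts to lags behind the true queue (because part of
the queue is hidden in the driver), this schedule can keep accelerating
after the real backlog has already started to fall.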

What I observe on wireless now with fq_codel under heavy load is
oscillation in the qdisc layer between a 0 length queue and 70 or more
packets backlogged, a burst of drops when that happens, and far more
drops relative to ecn marks than I expected. (The new, arbitrary
"drop ecn packets if > 2 * target" idea I was fiddling with
illustrates the point better now.) It's difficult to gain further
direct insight without time and packet traces, and maybe exporting
more data to userspace, but this kind of explains a report I got
privately on x86 (no ecn drop enabled), and the behavior of fq_codel
on wireless on the present version of cerowrt.
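
One possible reading of the "drop ecn packets if > 2 * target" idea, as a
hedged sketch: once codel has decided to signal congestion on a packet,
an ECN-capable packet is marked unless its sojourn time exceeds twice the
target, in which case it is dropped anyway. The 5 ms target and all names
below are assumptions for illustration; this is not the actual patch.

```python
TARGET_MS = 5.0  # assumed CoDel target delay


def codel_action(sojourn_ms: float, ecn_capable: bool,
                 target_ms: float = TARGET_MS) -> str:
    """Action for a packet that codel has already chosen to signal on."""
    if not ecn_capable:
        return "drop"            # non-ECN traffic can only be dropped
    if sojourn_ms > 2 * target_ms:
        return "drop"            # too far over target: marking is not enough
    return "mark"                # mildly over target: set the CE mark instead
```

Under this rule, a queue that jumps straight from empty to tens of
milliseconds of delay would produce mostly drops and few marks.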

(I could always have inserted a bug, too. If it wasn't for the private
report and having to get on a plane shortly, I wouldn't be posting this
now.)

Further testing ideas that others could try:

Increase BQL's setting to over-large values on a BQL enabled interface
  and see what happens
Test with an overbuffered ethernet interface in the first place
Improve the ns3 model to have an emulated network interface with
  user-settable buffering
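
For the first idea, a minimal sketch of raising the BQL ceiling from a
script. It assumes the standard Linux sysfs layout for BQL
(byte_queue_limits/limit_max under each tx queue); the helper names, the
interface name and the value are placeholders, and writing the file needs
root on a BQL-enabled driver.

```python
from pathlib import Path


def bql_limit_path(iface: str, txq: int = 0, sysfs_root: str = "/sys") -> Path:
    """Path of the BQL limit_max knob for one transmit queue."""
    return (Path(sysfs_root) / "class" / "net" / iface / "queues"
            / f"tx-{txq}" / "byte_queue_limits" / "limit_max")


def set_bql_limit_max(iface: str, limit_bytes: int, txq: int = 0,
                      sysfs_root: str = "/sys") -> Path:
    """Write a new limit_max (in bytes) and return the path written."""
    path = bql_limit_path(iface, txq, sysfs_root)
    path.write_text(f"{limit_bytes}\n")
    return path


if __name__ == "__main__":
    # Hypothetical example: allow up to ~1 MB in flight on eth0, queue 0.
    print(set_bql_limit_max("eth0", 1_000_000))
```

Multi-queue devices have one such file per tx queue, so a real test would
loop over all of them.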

Assuming I'm right and others can reproduce this, it implies that we
need to focus much harder on BQL and overbuffering-related issues in
the dozens (hundreds?) of non-BQL-enabled ethernet drivers at this
point. And we already know that much more hard work on fixing wifi is
needed.

Despite this, I'm generally pleased with the fq_codel results over
wireless I'm currently getting from today's build of cerowrt. The
BQL-enabled ethernet drivers I've worked with (ar71xx, e1000)
certainly don't display this behavior, and neither does soft rate
limiting using htb. Both instead achieve a steady state for the packet
backlog, accept bursts, and are otherwise "nice".

-- 
Dave Täht
SKYPE: davetaht
http://ronsravings.blogspot.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] [Cerowrt-devel] codel "oversteer"
  2012-06-20  1:32 [Codel] codel "oversteer" Dave Taht
@ 2012-06-20  3:01 ` dpreed
  2012-06-20 10:08 ` [Codel] " Jonathan Morton
  2012-06-20 20:07 ` Kathleen Nichols
  2 siblings, 0 replies; 7+ messages in thread
From: dpreed @ 2012-06-20  3:01 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel, cerowrt-devel



One further thought - the time constants may well be adjusted so that there is what we in radio call an impedance mismatch (I mean it literally, not metaphorically) such that there is an SWR much greater than 1.  In an antenna feedline this results in oscillations of power that travel back and forth between transmitter and antenna, resulting in pulsating power delivered to the antenna.
 
Do you know the time constants of the TCP source congestion control algorithm (frequency of the "sawtooth" that arrives when there is a short queue) and the codel control loop?  If they are really different, you get a significant "beat frequency" between the two oscillators.   This would directly create the observed phenomenon.
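
To put a number on the "beat frequency" idea: for two oscillators with
frequencies f1 and f2, the beat (difference) frequency is |f1 - f2|. The
sketch below treats each control loop as a simple oscillator with a fixed
period, which is a strong simplification, and the periods used are
made-up illustrative values, not measurements from this thread.

```python
def beat_frequency_hz(period_a_s: float, period_b_s: float) -> float:
    """Difference frequency of two oscillators given their periods in seconds."""
    if period_a_s <= 0 or period_b_s <= 0:
        raise ValueError("periods must be positive")
    return abs(1.0 / period_a_s - 1.0 / period_b_s)


if __name__ == "__main__":
    # Hypothetical: a TCP sawtooth every 0.5 s against a 0.1 s control interval.
    print(beat_frequency_hz(0.5, 0.1))
```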
 
I think you can probably tune one natural frequency or the other until there is an impedance match,  and then the codel damping should work.
 
Maybe this could be simulated in a simple simulator that models the situation to see if this is "normal" given the parameters, or whether it is a logic bug in one implementation or the other.
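
As a starting point, here is a deliberately tiny simulation of just the
visibility problem, under stated assumptions: a fixed-size driver buffer
sits below the qdisc, arrivals exceed the service rate, and there is no
TCP feedback and no codel at all. All names and numbers are hypothetical.

```python
def simulate(ticks: int, arrivals_per_tick: int, service_per_tick: int,
             driver_capacity: int) -> list:
    """Return (qdisc_backlog, driver_backlog) in packets after each tick."""
    qdisc = 0
    driver = 0
    history = []
    for _ in range(ticks):
        qdisc += arrivals_per_tick                   # new packets hit the qdisc
        moved = min(qdisc, driver_capacity - driver)
        qdisc -= moved                               # driver pulls what it has room for
        driver += moved
        driver -= min(driver, service_per_tick)      # link transmits
        history.append((qdisc, driver))
    return history


if __name__ == "__main__":
    for tick, (q, d) in enumerate(simulate(60, 3, 2, 50), start=1):
        if tick % 10 == 0:
            print(f"tick {tick}: qdisc={q} driver={d}")
```

In this toy model the qdisc backlog reads zero until the driver buffer is
nearly full, so anything measuring only the qdisc sees no delay while a
standing queue builds below it. A useful next step would be adding a
codel model and a window-based sender on top.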
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] codel "oversteer"
  2012-06-20  1:32 [Codel] codel "oversteer" Dave Taht
  2012-06-20  3:01 ` [Codel] [Cerowrt-devel] " dpreed
@ 2012-06-20 10:08 ` Jonathan Morton
  2012-06-20 15:52   ` Jim Gettys
  2012-06-20 20:07 ` Kathleen Nichols
  2 siblings, 1 reply; 7+ messages in thread
From: Jonathan Morton @ 2012-06-20 10:08 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel, cerowrt-devel

Is the cwnd also oscillating wildly or is it just an artefact of the visible part of the queue only being a fraction of the real queue?

Are ACK packets being aggregated by wireless? That would be a good explanation for large bursts that flood the buffer, if the rwnd opens a lot suddenly. This would also be an argument that 2*n is too small for the ECN drop threshold. 

The key to knowledge is not to rely on others to teach you it. 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] codel "oversteer"
  2012-06-20 10:08 ` [Codel] " Jonathan Morton
@ 2012-06-20 15:52   ` Jim Gettys
  2012-06-20 19:03     ` [Codel] [Cerowrt-devel] " dpreed
  2012-06-20 20:14     ` [Codel] " Kathleen Nichols
  0 siblings, 2 replies; 7+ messages in thread
From: Jim Gettys @ 2012-06-20 15:52 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: codel, cerowrt-devel

On 06/20/2012 06:08 AM, Jonathan Morton wrote:
> Is the cwnd also oscillating wildly or is it just an artefact of the visible part of the queue only being a fraction of the real queue?
>
> Are ACK packets being aggregated by wireless? That would be a good explanation for large bursts that flood the buffer, if the rwnd opens a lot suddenly. This would also be an argument that 2*n is too small for the ECN drop threshold. 

Yeah, I've been worrying about ack compression...  Not sure exactly what
we should be doing about it, as I don't fully understand it.
                    - Jim



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] [Cerowrt-devel]  codel "oversteer"
  2012-06-20 15:52   ` Jim Gettys
@ 2012-06-20 19:03     ` dpreed
  2012-06-20 20:14     ` [Codel] " Kathleen Nichols
  1 sibling, 0 replies; 7+ messages in thread
From: dpreed @ 2012-06-20 19:03 UTC (permalink / raw)
  To: Jim Gettys; +Cc: codel, cerowrt-devel



Simulate what you think is going on, or create a closed-form model.   If the phenomenon appears in the simulation, it will help you experiment with how to eliminate it.  If it does not, you need to understand why what you *think* is going on is not what is actually going on.
 
As I noted, 70 packet queues should not appear due to a simple overload.  What TCP does, from the 75,000 foot perspective, is try to aggressively move any queues that would build up inside the network back to the source buffer, by managing the window down whenever it sees a queue building.
 
That's why bufferbloat is so evil - it masks any signal about the buildup of queues until all the queues are full, and large queues take a *long* time to drain down to "empty".
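
A quick back-of-the-envelope for "a long time to drain": drain time is
queued bits divided by link rate. The 70-packet figure comes from the
observation earlier in this thread; the 1500-byte packet size and the
link rates are assumptions for illustration.

```python
def drain_time_ms(packets: int, packet_bytes: int, link_mbps: float) -> float:
    """Time to transmit a standing queue, in milliseconds."""
    bits = packets * packet_bytes * 8
    return bits / (link_mbps * 1e6) * 1e3


if __name__ == "__main__":
    for rate in (1.0, 10.0, 100.0):  # hypothetical link rates in Mbit/s
        print(f"{rate:6.1f} Mbit/s: {drain_time_ms(70, 1500, rate):8.1f} ms")
```

At 10 Mbit/s, 70 full-size packets are roughly 84 ms of queue, far above
a delay target of a few milliseconds.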
 
The steady state of a low-latency network under *any* load (even overload) should be one where there is at most one packet queued on each outgoing link internal to the network.
 
[if you need to know why, imagine the opposite were true - then the internal queues make all the control loops very, very long, which makes the network oscillate unstably, with very large variance of latency.]
 
The purpose of queues is *only* to smooth short random bursts, such as might happen on a shared internal link due to occasional "collisions" of traffic from uncorrelated sources.
 
Unfortunately, a vast percentage of designers don't understand that.  Hence, we get bufferbloat - making the queues bigger and bigger, and eliminating any queue buildup signalling back to the source that is overloading the network.
 
I assume codel is supposed to fix that.  If it is letting queues internal to the net fill up, it is doing the wrong thing.
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] codel "oversteer"
  2012-06-20  1:32 [Codel] codel "oversteer" Dave Taht
  2012-06-20  3:01 ` [Codel] [Cerowrt-devel] " dpreed
  2012-06-20 10:08 ` [Codel] " Jonathan Morton
@ 2012-06-20 20:07 ` Kathleen Nichols
  2 siblings, 0 replies; 7+ messages in thread
From: Kathleen Nichols @ 2012-06-20 20:07 UTC (permalink / raw)
  To: codel


If most of the buffering is at the device driver level then fq_codel isn't
the answer.

When you get your drop burst, is that codel drops or tail drops? If the
driver just has enough buffering/delay in order to properly service
the link, then you don't really want to involve that in the queue
management.

If there are bugs then who knows? But it would be good to be able to
instrument the drops and get some trace information.
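
One low-effort way to start separating the two kinds of drop is to read
the qdisc counters. The sketch below pulls three counters out of text
shaped like `tc -s qdisc show` output for fq_codel. The SAMPLE text is
hand-written for illustration, not a real capture, and treating
"dropped minus drop_overlimit" as the codel share is an assumption about
how the counters are accounted, so check it against the kernel source
before relying on it.

```python
import re

# Hand-written sample in the general shape of fq_codel statistics output.
SAMPLE = """\
qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 target 5.0ms interval 100.0ms ecn
 Sent 123456 bytes 900 pkt (dropped 120, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 1514 drop_overlimit 20 new_flow_count 15 ecn_mark 7
"""


def parse_drop_counters(text: str) -> dict:
    """Extract the dropped, drop_overlimit and ecn_mark counters if present."""
    counters = {}
    for name in ("dropped", "drop_overlimit", "ecn_mark"):
        match = re.search(rf"\b{name} (\d+)", text)
        if match:
            counters[name] = int(match.group(1))
    return counters


if __name__ == "__main__":
    c = parse_drop_counters(SAMPLE)
    print(c, "estimated AQM drops:", c["dropped"] - c["drop_overlimit"])
```

Sampling these counters once a second during a test run would at least
show whether the drop bursts line up with the queue-limit counter or not.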

	Kathie



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Codel] codel "oversteer"
  2012-06-20 15:52   ` Jim Gettys
  2012-06-20 19:03     ` [Codel] [Cerowrt-devel] " dpreed
@ 2012-06-20 20:14     ` Kathleen Nichols
  1 sibling, 0 replies; 7+ messages in thread
From: Kathleen Nichols @ 2012-06-20 20:14 UTC (permalink / raw)
  To: codel


Good traffic mixing seems to be the answer to ack compression and
fqcodel should provide that. I've just started to rerun the reverse
traffic scenarios I ran before we wrote the paper, now using (a slight
variant of Eric's) fqcodel and it looks better at controlling the
queue. It's not that ack compression foils codel; it still works but
not as well. Ack compression foils tcp.

	Kathie

On 6/20/12 8:52 AM, Jim Gettys wrote:
> On 06/20/2012 06:08 AM, Jonathan Morton wrote:
>> Is the cwnd also oscillating wildly or is it just an artefact of
>> the visible part of the queue only being a fraction of the real
>> queue?
>> 
>> Are ACK packets being aggregated by wireless? That would be a good
>> explanation for large bursts that flood the buffer, if the rwnd
>> opens a lot suddenly. This would also be an argument that 2*n is
>> too small for the ECN drop threshold.
> 
> Yeah, I've been worrying about ack compression...  Not sure exactly
> what we should be doing about it, as I don't fully understand it. -
> Jim


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-06-20 20:14 UTC | newest]
