[Codel] [RFC PATCH] codel: ecn mark at target

Mon Aug 6 13:50:55 EDT 2012

On Mon, Aug 6, 2012 at 9:46 AM, Eric Dumazet <eric.dumazet at gmail.com> wrote:

> Every time we suspect sender has a small cwnd, we enter the quickack
> mode.
>
> So when receiving CE segments, we should do the same. Thats a flaw in
> current linux code to do nothing and expect more packets to come.
>
> It just works because RTO triggers. A lot of bugs are hidden because of
> various timers that take some emergency actions.
>

This discussion is getting mildly off-track. My intent in posting this patch
was to prove by example how wrong the "ecn mark at target" idea was,
and in doing so, to shed light, for those new to codel, on how the algorithm
actually works, and to encourage those who didn't grok it to read and
run the code in whatever scenarios would help more people
grok it in fullness.

I hadn't expected to twiddle a bug!

In a previous mail Eric had written:

>link with HTB limit of 100Mbit/s
>One netperf TCP_STREAM to a non ECN enabled local target, duration 100s
>One netperf TCP_STREAM to an ECN enabled local target, duration 100s

>1) Current Codel implementation

>non ECN target A : 50.55 Mbit/s   (4336 dropped packets)
>ECN target B     : 45.86 Mbits/s (3239 ecn marks)

My assumption is that, with the ecn quickack fix in place, these two
streams are now more closely equivalent in throughput.

>2) Patch to mark all ecn-enabled packets above 'target'

>non ECN target A : 95.43 Mbit/s (2957 drops)
>ECN target B    :  0.97 Mbit/s (2500 ecn marks)

And this was the result I'd expected from the patch regardless. While
I would expect the ecn quickack fix to help matters, with the
"ietf patch" (not mine! I just decided that a negative proof was the quickest
way to move forward!) I would still expect non-ecn streams dropped
via the codel drop scheduler to seriously outcompete ecn streams
being marked at target.

But as I'm on the road and away from my test rig, I lack numbers
to work with. Eric, could you repost the results from these two
scenarios?

>It seems few people really understood CoDel 'target'. When a link
>is used, almost all packets are above target. Thats OK. If it was not
>OK, why would we use the sqrt(count) at all, I wonder.

Two of the really brilliant parts of codel are that A) it aims to *eventually*
hit an optimum target, while absorbing bursts, and B) it does so gently,
using the invsqrt trick to schedule drops at gradually shrinking intervals
until an optimum is found, and continually re-adjusts under varying loads to
ensure a minimal queue length.
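
As a rough sketch of the invsqrt scheduling (Python, not the kernel code):
while the queue stays above target, the gap to the next drop is
interval / sqrt(count), so successive drops come closer together. The
function name and the 100 ms interval below are assumptions made for
illustration only.

```python
import math

INTERVAL_MS = 100.0  # assumed interval, for illustration only


def drop_schedule(first_drop_ms, n_drops, interval_ms=INTERVAL_MS):
    """Return the times (ms) of successive drops while the queue stays above target.

    Hypothetical helper: after the count-th drop, the next one is scheduled
    interval / sqrt(count) later.
    """
    times = []
    t = first_drop_ms
    for count in range(1, n_drops + 1):
        times.append(t)
        # The gap shrinks as count grows: 100, 70.7, 57.7, ... ms
        t += interval_ms / math.sqrt(count)
    return times


if __name__ == "__main__":
    for i, t in enumerate(drop_schedule(0.0, 5), start=1):
        print(f"drop {i} at {t:.1f} ms")
```

The pressure on the sender ramps up slowly, which is what lets a burst
drain on its own before the drop rate becomes aggressive.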

The "finding the optimum" portion of codel is still a matter
of research, with something like 6 different variants existing so far,
and the version published in the original paper being thoroughly
obsolete. The current ns2 code has some promising ideas in it,
and I went through the trouble of getting that, and what
variants I could find for ns2, up on github last week.

https://github.com/dtaht/ns2

The ns3 code is also on github; it has an fq_codel implementation
too, one that doesn't quite match the linux code.

If anyone would like commit access, let me know. The ns2 version now ALSO
contains a re-implementation of fq_codel, made lighter-weight
because it is packet oriented rather than byte-fair oriented.

NOTE:

I have an email about reconciling each of the variants of codel/fq_codel
in ns2, ns3, and linux stacked up behind this email, but I'd like to keep
trying to nail this ECN issue to the ground first.

My own concern with ECN is finding a way to use it effectively while making
it difficult to abuse in the real world.

>Codel intent is not to mark/drop _all_ packets above 'target'.

>ECN intent is to replace a drop by a mark, not marking packets that
>would not have been dropped at all if they were not ECN enabled.
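
The difference between the two policies can be sketched like this (Python,
purely illustrative; the function names, the string verdicts, and the 5 ms
target are my assumptions, not the kernel code or the actual patch):

```python
TARGET_MS = 5.0  # assumed target, for illustration only


def mark_replaces_drop(drop_scheduled, ecn_capable):
    """A mark only stands in for a drop the scheduler had already decided on."""
    if not drop_scheduled:
        return "pass"
    return "mark" if ecn_capable else "drop"


def mark_at_target(sojourn_ms, drop_scheduled, ecn_capable, target_ms=TARGET_MS):
    """'Mark at target': every ECN-capable packet above target gets marked."""
    if ecn_capable and sojourn_ms > target_ms:
        return "mark"
    # Everything else falls back to the ordinary scheduled drop/mark.
    return mark_replaces_drop(drop_scheduled, ecn_capable)


if __name__ == "__main__":
    # A packet 8 ms in the queue, with no drop scheduled right now.
    print(mark_replaces_drop(False, True))     # pass
    print(mark_at_target(8.0, False, True))    # mark
    print(mark_at_target(8.0, False, False))   # pass
```

On a busy link nearly every packet sits above target, so under the second
policy the ECN flow is signalled on almost every packet while the non-ECN
flow is only hit when a drop is actually scheduled.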

I really, really, really would like more folk to run themselves through
the behavior of codel in various scenarios, particularly
under contention from multiple streams, and in the fq_codel case as well.

In part it's a shame that we removed some of the scaffolding that existed
in earlier, less optimized versions of the Linux implementation, which lent
more insight as to what happened, when.

The ns3 code has remnants of that, tracking the major states in codel
in a way that's mildly easier to understand.

I've had to look at a lot of traces and a lot of captures to get a grip
on what actually happens in each variant.

-- 
Dave Täht