[Cake] peeling at sub-1ms

Mon May 25 16:09:39 EDT 2015

On Sun, May 24, 2015 at 9:56 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>
>> On 23 May, 2015, at 20:47, Dave Taht <dave.taht at gmail.com> wrote:
>
>> I think I would be happier if we peeled at 250usec rather than 1ms. At
>> 110Mbit that is 2 packets, but at a gig quite a lot.
>>
>> (selfishly, that is the speed I am running at while watching
>> dslreports misbehave)
>>
>> The second problem is that the default peeling behavior somehow should
>> work while at line rate, whether the rate be 10mbit or a gbit.
>>
>> another option is when we know we have lots of flows, to peel more.
>
> Since it doesn’t seem to have completely blown up in your face, I can assume that I was on basically the right track, and do a bit more refinement on the peeling code.  There probably is some room for efficiency improvements.
>

I said the heck with it and peeled superpackets apart universally in
my most recent tests of cake. The original (1960s) definition of a
packet was 1000 bits. Everything since then has fudged the issue.
Packets occupy mass, transistors, etc.
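
For reference, here is a rough sketch of the threshold arithmetic from the quoted message (250usec vs 1ms at various rates). The 1514 byte frame size and the should_peel() rule are my assumptions for illustration, not what the cake code actually does:

```python
# Assumed full-size Ethernet frame (no FCS/preamble); illustrative only.
FRAME_BYTES = 1514


def bytes_in_window(rate_bps, window_s):
    """Bytes that can be serialized at rate_bps within window_s seconds."""
    return rate_bps * window_s / 8


def packets_in_window(rate_bps, window_s, frame_bytes=FRAME_BYTES):
    """How many full-size frames fit in the window at this rate."""
    return bytes_in_window(rate_bps, window_s) / frame_bytes


def should_peel(superpacket_bytes, rate_bps, threshold_s):
    """Hypothetical rule: peel a superpacket whose serialization time
    at the shaped rate exceeds the threshold."""
    return superpacket_bytes * 8 / rate_bps > threshold_s


if __name__ == "__main__":
    for label, rate in (("10mbit", 10e6), ("110mbit", 110e6), ("1gbit", 1e9)):
        for window in (250e-6, 1e-3):
            print("%8s %5.0fus: %6.2f full-size packets"
                  % (label, window * 1e6, packets_in_window(rate, window)))
```

At 110mbit a 250usec window holds a bit over 2 full-size packets, and at a gbit roughly 20, which matches the numbers quoted above.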

A core trade-off I am observing is that having one queue per flow
impacts overall delay - I see fq_codel achieving a 15ms induced delay
typically (with 100 flows extant and 4 measurement flows), cake 30ms,
pie about 40ms... on a 100/10mbit link on these crazy bidir tests.

Now the goal of course is to fill the pipe - and another goal (which
is becoming more apparent to me) is to provide an SRTT estimate to the
tcps that is real. We are giving a consistent over-estimate of the
baseline RTT to the tcps... the cwnds stay flat... the single data
point the "fast/slow queue" distinction gives is not enough (I keep
thinking we can seriously improve a tcp here, and also certainly
improve the ecn behavior of a tcp if it based its rate reductions on
the minimum observed rtt).
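
To illustrate the over-estimate, here is a minimal sketch comparing a smoothed RTT (an EWMA with alpha = 1/8, in the style of RFC 6298) against the minimum observed RTT. The 20ms baseline and the sample series are made-up numbers for illustration; only the 15ms induced delay comes from the fq_codel figure above:

```python
def smoothed_rtt(samples, alpha=0.125):
    """EWMA of RTT samples, RFC 6298 style (alpha = 1/8).
    The first sample initializes the estimate."""
    srtt = None
    for r in samples:
        srtt = r if srtt is None else (1 - alpha) * srtt + alpha * r
    return srtt


def min_rtt(samples):
    """Minimum observed RTT, a closer proxy for the unloaded path RTT."""
    return min(samples)


if __name__ == "__main__":
    # Hypothetical: one unloaded sample at 20ms, then 50 samples carrying
    # 15ms of induced queue delay.
    samples_ms = [20.0] + [35.0] * 50
    print("srtt    = %.2f ms" % smoothed_rtt(samples_ms))
    print("min rtt = %.2f ms" % min_rtt(samples_ms))
```

Under sustained load the smoothed estimate converges on baseline plus queue delay, while the minimum stays at the baseline, which is the number I would want a tcp to base its reductions on.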

I also reinstated the 300 byte quantum for speeds up to 400mbit,
instead of the previous cutoff of 40mbit. I don't like how much extra
cpu that uses, but it led to much saner ack clocking overall at these
extreme loads.

The above also kills the fast/slow distinction.

Sigh. I really need a forum for writing these down, with graphs, and
turning the results around more quickly to share. And a testbed
running. And....

sigh.

Do you want me to push branches for this sort of exercise?

> If I do some pre-computation at shaper configuration time, I think I can efficiently handle the cases you mention.

Well, one idea from bobbie (dynamically dropping the rate) is pretty
computationally expensive in cake: changing the rate on the command
line (tc qdisc change dev eth2 root cake bandwidth 50mbit) hits
several paths that it needn't. If we were to start exploring a
gargoyle-router-project-esque self-tuning option, those paths should
get tightened.

I am definitely seeing heisenbugs with watch tc -s qdisc show on
a 2 sec interval. Want to throw drops/marks to userspace....

>
>  - Jonathan Morton
>

-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67