[Cake] Cake3 - source code and some questions

Dave Taht dave.taht at gmail.com
Thu Apr 16 15:26:02 EDT 2015


On Thu, Apr 16, 2015 at 6:48 AM, Adrian Popescu
<adriannnpopescu at gmail.com> wrote:
> I've discovered there are other problems in the Linux networking stack
> which don't seem to be related to fq_codel, qdiscs, AQM and HTB.
>
> There are latency inducing issues bugs in the Ethernet network drivers
> of many network adapters, including e1000e, or in the kernel itself.
> Some kernels are better. The newest ones have severe regressions in
> this area.

There is a multiplicity of problems in doing real-time packet processing
on processors that generally cannot context switch in under 1000 cycles
anymore, and on virtual machines, oh, my!

One of the biggest fixes for Linux networking was the BQL (Byte Queue
Limits) infrastructure, which is in something like 24 drivers now
(including the e1000e) - but things like TSO and GRO offloads remain a PITA.
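
For anyone wanting to check whether a driver has BQL and what limits it
has settled on, the per-queue state is exposed in sysfs under
/sys/class/net/<iface>/queues/tx-<n>/byte_queue_limits/. A minimal Python
sketch for dumping it follows. The function name and the sysfs_root
parameter are made up for illustration, and the list of attribute files
read is an assumption that may differ between kernel versions.

```python
from pathlib import Path


def read_bql_limits(iface, sysfs_root="/sys/class/net"):
    """Return {tx_queue_name: {attribute: int}} for each TX queue of iface.

    Hypothetical helper: it only reads whichever of the attribute files
    below exist, and returns an empty dict if the interface has no
    queues directory at all.
    """
    result = {}
    queues_dir = Path(sysfs_root) / iface / "queues"
    for bql_dir in sorted(queues_dir.glob("tx-*/byte_queue_limits")):
        attrs = {}
        # Assumed attribute names; skip any that this kernel lacks.
        for name in ("limit", "limit_min", "limit_max", "inflight"):
            attr_file = bql_dir / name
            if attr_file.is_file():
                attrs[name] = int(attr_file.read_text().strip())
        result[bql_dir.parent.name] = attrs
    return result
```

Calling read_bql_limits("eth0") (the interface name is just an example)
and printing the result before and during a load test gives a rough view
of how many bytes each hardware queue is being allowed to hold.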

There are still many drivers left to fix. One (mvneta) is really bugging me
of late....

> I was under the impression there's a problem in codel or fq_codel that
> lead to very frequent latency micro-spikes of between 1 and 3
> milliseconds.

In my world it is sometimes hard to worry about stuff down at this latency
noise level. It is so far below what these algorithms solve in the first
place that I have generally treated it as noise.

That said, I remain sad that the linux-rt folk are so underfunded, as they
have the tools and expertise to try and get more consistent low latency
with full throughput.

>It also seemed to produce bigger latency spikes under
> moderate load. No amount of tuning and disabling of offloads helped
> with this.

Regrettably you are not providing enough details. Repeatable
tests and actual measurements are always helpful.

There is presently what is viewed as a regression by the Xen folk
involving the TCP small queues subsystem, which is being discussed
heavily on lkml.

What are you referring to specifically?

>
> Imagine having 2 milliseconds of latency to your ISP and having your
> router induce between 3 to 5 milliseconds of latency for every flow.
> It's not particularly helpful for low latency paths on high bandwidth
> links. Adding more latency in both directions to high latency paths is
> even worse.

Imagine having your router induce seconds or tens of seconds of latency
- the state of affairs on most edge devices today.

I DO care about latency and jitter to this level, but it is very hard to
isolate and measure.

>
> These problems are the reason behind starting this thread. I believed
> these problems to be related to fq_codel or to the codel algorithm
> itself.

Not enough detail: what exactly are you measuring, on what hardware,
using what tools?

> My question about porting these improvements to codel and fq_codel was
> strictly about the tighter recovery, better invsqrt and other codel

Multiple variants of the algorithms have been under test, individually,
in the tree and in cerowrt for 2 years. Cake rolls up the best of these
attempts thus far, and each of those separate models remains in-tree
for further testing against all the other variables.

In no case have I cared one whit about sub-3ms worth of jitter. I was
mostly looking to get faster convergence, better utilization at longer
RTTs, better behavior at > 100Mbit, and more filling of the pipe in
general.

I happen to not agree with Jonathan that the better invsqrt (cache) now in cake
accomplishes anything, but plan to test.

I do think that the better resumption stuff (which has one part that
corrects an error in Newton's method going in reverse) helps at >
100Mbit, a speed we were not able to test effectively in our prior
attempts, before we had all these nice test tools.
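
For readers who have not looked at the invsqrt part: the codel control
law spaces drops at interval/sqrt(count), and 1/sqrt(count) can be
tracked with one Newton step each time count goes up. The floating point
Python sketch below only illustrates that iteration and how far a single
step lags the exact value at small counts. It is not the kernel's
fixed-point code, and the function names are made up for illustration.

```python
import math


def newton_invsqrt_step(x, count):
    """One Newton iteration toward 1/sqrt(count), from estimate x."""
    return x * (3.0 - count * x * x) / 2.0


def tracked_invsqrt(max_count):
    """Track 1/sqrt(count) with a single Newton step per increment.

    Starts from x = 1.0, which is exact for count == 1, and returns a
    list of (count, estimate, exact) tuples for count = 2..max_count.
    """
    x = 1.0
    rows = []
    for count in range(2, max_count + 1):
        x = newton_invsqrt_step(x, count)
        rows.append((count, x, 1.0 / math.sqrt(count)))
    return rows


def relative_error(estimate, exact):
    """Absolute relative error of estimate against exact."""
    return abs(estimate - exact) / exact
```

Run over count = 2..10, the estimate is far off at first (0.5 against
roughly 0.707 for count = 2) and closes to well under 1% by count = 10.
Whether that early error matters in practice is exactly the kind of
thing that needs testing.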

> enhancements mentioned on the wiki page.

Well, at the sub-3ms level it is almost always about the device driver,
BQL, TCP small queues, and kernel context switch time.

There are two features of BQL I dislike: it uses a MIAD (rather than
AIMD) controller, and its buffering is additive across hardware
multiqueues (and devices have sprouted a lot of those of late, which
exhibit birthday problems).
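
To make both complaints concrete, here is a generic Python sketch of the
two controller shapes and of the birthday-style collision odds when
flows are hashed onto a small number of hardware queues. The step sizes
are arbitrary illustrative values, not BQL's actual constants, and all
function names are hypothetical.

```python
def aimd_step(limit, congested, add=1.0, mult=0.5):
    """Additive increase, multiplicative decrease."""
    return limit * mult if congested else limit + add


def miad_step(limit, congested, mult=2.0, sub=1.0, floor=1.0):
    """Multiplicative increase, additive decrease (never below floor)."""
    return max(floor, limit - sub) if congested else limit * mult


def run(step, limit, signals):
    """Apply step for each congestion signal; return every limit seen."""
    history = [limit]
    for congested in signals:
        limit = step(limit, congested)
        history.append(limit)
    return history


def collision_probability(flows, queues):
    """Probability that at least two of `flows` land in the same queue,
    assuming each flow is hashed uniformly and independently."""
    if flows > queues:
        return 1.0
    p_distinct = 1.0
    for i in range(flows):
        p_distinct *= (queues - i) / queues
    return 1.0 - p_distinct
```

Starting from a limit of 10 with three uncongested rounds followed by
three congested ones, the AIMD limit goes 11, 12, 13, 6.5, 3.25, 1.625,
while the MIAD limit goes 20, 40, 80, 79, 78, 77: quick to grow and slow
to back off. On the multiqueue side, with only 4 flows spread over 8
queues the odds that two share a queue are already about 59%, and since
the limit is kept per queue, the worst-case backlog scales with the
number of queues.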

I dislike TCP IW10 intensely. I feel the entire GRO/TSO/GSO concept
is basically broken (and that we need to develop better hardware that
can deal with packets as packets again).

There is quite a large list of *hard* things to solve to get latencies
lower than 2ms. Perhaps userspace networking is an answer.

> The solution to this unstable latency will necessitate migration to
> another platform without Linux. I'm aware cake and fq_codel won't fix
> this problem on Linux.

Well, a huge goal here comes from the dual BSD/GPL licensing of cake. We
hope that someone will show up to port this version of the algorithms to
another OS, and that by escaping the monoculture, we will learn more about
how to do it more right.

There is already a project starting to do a DPDK version.

Codel is in click, codel in ns2, fq_codel in ns3 - we do not have anyone
committed to doing cake in anything else at present. I keep hoping a BSD
expert will show up to do a pfsense version.

> _______________________________________________
> Cake mailing list
> Cake at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67


