[Cake] Cake3 - source code and some questions

Wed Apr 22 17:02:35 EDT 2015

Hello Dave,

On Thu, Apr 16, 2015 at 10:26 PM, Dave Taht <dave.taht at gmail.com> wrote:
>
>> I was under the impression there's a problem in codel or fq_codel that
>> leads to very frequent latency micro-spikes of between 1 and 3
>> milliseconds.
>
> In my world it is sometimes hard to worry about stuff down in this latency
> noise level. This is so far less than what these algorithms generally
> solve in the first place, that I generally have treated it as noise.
>

These algorithms have been doing well on 1 gigabit links. However,
latency depends on which kernel is in use.

>
>> It also seemed to produce bigger latency spikes under
>> moderate load. No amount of tuning and disabling of offloads helped
>> with this.
>
> Regrettably you are not providing enough details. Repeatable
> tests and actual measurements are always helpful.
>
> There is presently what is viewed as a regression by the Xen folk
> involving the TCP small queues subsystem, which is being discussed
> heavily on lkml.
>
> What are you referring to specifically?
>

e1000e (82574, dual port and quad port server e1000e adapters with
various Intel chips) shows latency that varies with the kernel in
use.

Local network ping latencies on Ubuntu 14.04 with its LTS 3.13 kernel
are always below 0.5 milliseconds when idle and lightly loaded. The
latest kernels from kernel.org are unable to match those latencies:
they consistently show latencies of 2-3 milliseconds.

Turning off all offloads on the e1000e network interfaces involved
doesn't help. This is all physical Intel hardware.

Two idle hosts on a local network should be able to ping each other
and get latencies which are always less than one millisecond. FreeBSD
doesn't have this problem.

>
> Imagine having your router induce seconds or 10s of seconds latency
> - the state of affairs on most edge devices today.
>
> I DO care about latency and jitter to this level, but it is very hard to
> isolate and measure.
>

2-3 milliseconds of jitter or extra latency each way wouldn't be a big
deal if that were the only problem a network had to deal with. Induced
latency coupled with stupid RED causing packet loss at the ISP leads to
lower throughput. Induced latency coupled with wireless packet loss and
packet loss caused by stupid RED is worse.

Adding 5 more milliseconds of latency (or more) during higher load can
be very bad if you're playing a game and the latency is at 95
milliseconds already.

Two people having a conversation over two such edge devices would get
about 10 milliseconds of induced RTT latency for no good reason, and
would probably see up to 12 milliseconds of latency caused purely by
their own network. That is in addition to the ordinary network RTT.
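One way to read the figures above, as a rough sketch: assume (my assumption, for illustration) that each of the two edge devices adds the 2-3 milliseconds mentioned earlier in each direction. The function name and the device count are hypothetical, chosen only to make the arithmetic explicit.

```python
def induced_rtt_ms(per_direction_ms, devices=2):
    """Extra round-trip latency when every device adds delay in both directions."""
    directions = 2  # outbound and return path
    return directions * devices * per_direction_ms


low = induced_rtt_ms(2.0)   # 2 directions * 2 devices * 2 ms
high = induced_rtt_ms(3.0)  # 2 directions * 2 devices * 3 ms
print(f"induced RTT: {low:.0f}-{high:.0f} ms on top of the base RTT")
```

Under that assumption the induced round-trip cost lands between 8 and 12 milliseconds, which matches the "about 10" and "up to 12" estimates.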

This works properly on older kernels and there's no fq_codel problem.
cake is also affected on the new kernels. Maybe it's better to say the
AQM or shaper makes no difference. Disabling offloads merely reduces
the latency slightly while idling.

>>
>> These problems are the reason behind starting this thread. I believed
>> these problems to be related to fq_codel or to the codel algorithm
>> itself.
>
> Not enough detail. What exactly are you measuring, on what hardware,
> using what tools?

testing method:
- take two modern machines with e1000e NICs (sandy bridge, ivy bridge,
  haswell)
- one machine should run Ubuntu 14.04 with its LTS 3.13 kernel
- for the control test, the other machine should also run Ubuntu 14.04
  with the LTS 3.13 kernel
- for the second test, the other machine should run kernel
  3.18/3.19/4.0
- the two hosts should be very lightly loaded
- connect the machines through a gigabit switch

control test:
ping 172.16.0.1
result: response below 0.5 milliseconds

second test:
ping 172.16.0.1
result: response above 1 millisecond, sometimes 2 milliseconds or worse

Seeing worse latency under load (20-100 milliseconds) isn't uncommon.
I believe this to be a regression in the kernel or in the network
drivers.
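To make the two ping runs comparable as numbers rather than impressions, something like the following could summarize saved ping output. It is a minimal sketch: it assumes the usual Linux `time=... ms` reply format, the function names and the 1 ms threshold are mine, and the sample lines are made up purely to show the input format (they are not measurements).

```python
import re
import statistics

# Matches the per-reply RTT in ping output, e.g. "time=0.412 ms".
RTT_PATTERN = re.compile(r"time[=<]([0-9.]+)\s*ms")


def parse_rtts(ping_output):
    """Return every per-reply RTT (in milliseconds) found in ping output."""
    return [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]


def summarize(rtts_ms, threshold_ms=1.0):
    """Summarize RTT samples; threshold_ms is an illustrative cut-off."""
    if not rtts_ms:
        raise ValueError("no RTT samples found")
    return {
        "count": len(rtts_ms),
        "min": min(rtts_ms),
        "median": statistics.median(rtts_ms),
        "max": max(rtts_ms),
        "over_threshold": sum(1 for r in rtts_ms if r > threshold_ms),
    }


# Made-up sample lines, only to show the expected input format.
SAMPLE = """\
64 bytes from 172.16.0.1: icmp_seq=1 ttl=64 time=0.312 ms
64 bytes from 172.16.0.1: icmp_seq=2 ttl=64 time=2.140 ms
64 bytes from 172.16.0.1: icmp_seq=3 ttl=64 time=0.298 ms
"""

if __name__ == "__main__":
    print(summarize(parse_rtts(SAMPLE)))
```

Running the control test and the second test for the same number of pings and comparing the median, the maximum and the count of replies over 1 ms would show whether the difference is in the typical case or only in the spikes.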

>
>> My question about porting these improvements to codel and fq_codel was
>> strictly about the tighter recovery, better invsqrt and other codel
>
> For 2 years the tree and cerowrt have carried multiple variants of the
> algorithms under test, individually. Cake rolls up the best of these
> attempts thus far, and each of those separate models remains in-tree
> for further testing against all the other variables.
>
> In no case have I cared one whit about sub 3ms worth of jitter. I was
> mostly looking to get faster convergence, better
> utilization at longer RTTs, better behavior at > 100Mbit, and more
> filling of the pipe in general.
>
> I happen to not agree with Jonathan that the better invsqrt (cache) now in cake
> accomplishes anything, but plan to test.
>
> I do think that the better resumption stuff (which has one part that
> corrects an error in Newton's method going in reverse) helps at >
> 100Mbit, which was a speed we were not able to test effectively at in
> our prior attempts before we had all these nice test tools.
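For context on the invsqrt being discussed: codel spaces its drops by interval / sqrt(count), and the estimate of 1/sqrt(count) can be refined with one Newton step each time count changes. The following is only a floating-point sketch of that idea, not the kernel's fixed-point code and not cake's cached variant; the function names and the 100 ms interval are chosen for illustration.

```python
import math


def newton_step(inv_sqrt, count):
    """One Newton-Raphson refinement of an estimate of 1/sqrt(count)."""
    return inv_sqrt * (3.0 - count * inv_sqrt * inv_sqrt) / 2.0


def next_drop_delay_ms(interval_ms, inv_sqrt):
    """Control law interval / sqrt(count), using the running estimate."""
    return interval_ms * inv_sqrt


if __name__ == "__main__":
    inv_sqrt = 1.0  # exact for count == 1
    for count in range(2, 6):
        # Reuse the previous estimate as the starting guess for the new count.
        inv_sqrt = newton_step(inv_sqrt, count)
        exact = 1.0 / math.sqrt(count)
        delay = next_drop_delay_ms(100.0, inv_sqrt)
        print(count, round(inv_sqrt, 4), round(exact, 4), round(delay, 1))
```

With a single step per change in count the estimate lags the exact value a little, which is the kind of approximation error the discussion above is about.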
>
>> enhancements mentioned on the wiki page.
>
> Well, at the sub 3ms level it is almost always about the device driver,
> BQL, TCP small queues, and kernel context switch time.
>
> There are two features of BQL I dislike: it uses a MIAD
> (rather than AIMD) controller, and its buffering is additive
> across hardware multiqueues (and devices have sprouted a lot of those
> of late, which exhibit birthday problems).
>
> I dislike TCP IW10 intensely. I feel the entire GRO/TSO/GSO concepts
> are basically broken (and that we need to develop better hardware that
> can deal with packets as packets again).
>
> There is quite a large list of things to solve to get latencies lower
> than 2ms that are *hard*. Perhaps userspace networking is an answer.
>
>> The solution to this unstable latency will necessitate migration to
>> another platform without Linux. I'm aware cake and fq_codel won't fix
>> this problem on Linux.
>
> Well, a huge goal here comes from the dual BSD/GPL licensing of cake. We hope
> that someone will show up to port this version of the algorithms to another OS,
> and by escaping the monoculture, we will learn more about how to do it
> more right.
>
> There is already a project starting to do a DPDK version.
>
> Codel is in click, codel in ns2, fq_codel in ns3 - we do not have anyone
> committed to doing cake in anything else at present. Keep hoping a BSD
> expert will show up to do a pfsense version.
>
> --
> Dave Täht
> Open Networking needs **Open Source Hardware**
>
> https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67