[Cake] Cake upstream Planning

Dave Taht dave at taht.net
Wed Nov 15 14:44:45 EST 2017

(note changed topic thread)

I dearly would like to try and submit cake to mainline Linux in
December. Getting it done is going to take a group effort.

And trying to cover all the corner cases is going to take co-ordination
and scripting, and perhaps we should switch to google docs to pull together.

Also, it might be fun to schedule a dramatic reading of the source code
via videoconference because there's a lot in cake that not enough people
(except maybe jonathan) understand.

Pete Heist <peteheist at gmail.com> writes:

>     On Nov 14, 2017, at 9:10 PM, Dave Taht <dave at taht.net> wrote:
>     Pete Heist <peteheist at gmail.com> writes:
>         By the way, what or how much is needed to get Cake mainlined?
>     I'd like us to give it a go when net-next reopens in two weeks,
>     we'd then have 6 weeks or so to get it right.
>     We need:
>     * Someone to do the heavy lifting. Which I suspect would be me.
>     * Someones with various hardware platforms that current kernels can be
>     run on. qemu?
>     * I'd like to see the ack filtering work get tested on lede at low
>     bandwidths on dsl especially.
>     * A whole lotta tests at various RTTs
> I can offer some testing time, and can script or batch a range of RTTs. netns
> would be useful here. For completeness, I suggest a product of rrul_be runs:
> Rates: 128 / 256 / 512Kbit, 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 / 256 / 512Mbit,
> 1Gbit
> RTTs: 150 / 300 / 600us, 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 / 256 / 512 / 1024ms
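
As a rough illustration of the size of that product, here is a minimal Python sketch that enumerates the rate/RTT grid from the two lists above. The function name and the label format are made up for illustration; nothing here invokes flent itself.

```python
from itertools import product

# Rates and RTTs copied from the lists above.
RATES = (["128Kbit", "256Kbit", "512Kbit"]
         + [f"{2 ** i}Mbit" for i in range(10)]   # 1 .. 512 Mbit
         + ["1Gbit"])
RTTS = (["150us", "300us", "600us"]
        + [f"{2 ** i}ms" for i in range(11)])     # 1 .. 1024 ms


def test_matrix(rates=RATES, rtts=RTTS):
    """Return every (rate, rtt) pair, one per rrul_be run."""
    return list(product(rates, rtts))


if __name__ == "__main__":
    matrix = test_matrix()
    print(len(matrix), "runs")
    for rate, rtt in matrix:
        # Hypothetical label format, only used to name each run.
        print(f"rrul_be rate={rate} rtt={rtt}")
```

That is 14 rates x 14 RTTs = 196 runs per qdisc configuration, so at an assumed 60 seconds per run the full grid takes a bit over three hours before any other parameter is varied.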

Well, we also need simple, basic single tcp download tests. I would love
to reuse the http and voip tests toke used in the first paper, too.

> Opinions? Some of those might be rough (I’m looking at you 128Kbit / 1024ms),
> but it would be good to know what happens. For hardware, I could turn my Mac
> Mini into a qemu box. I guess this list is about right:

Doing a few qemu setups would be good. In particular it helps with
letting us test a net-next kernel. If we could make qemu images
available, all the better.

> https://www.debian.org/releases/stable/i386/ch02s01.html.en. I don’t know if all
> tests need to be tried on all platforms.

My principal requirement for multi-arch testing is that it "not crash"
and "compile". More direct testing - like with the mvneta and other odd
ethernet devices, kind of requires real hardware.

> Testing could go much further, with host fairness, diffserv keywords, rtt
> settings (more on that later), overheads, nat, etc. We could also test
> underpowered hardware with rate limiting to see if it degrades gracefully. For
> sanity, we could just test a smattering of these things.

This is a case where flent's batch facility would help. And we can divvy
up the load among servers using the new netns technique. Assuming I get
a bit of funding we can also grab some servers in the cloud, but I'm not
expecting that, so...
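
For anyone who has not used the netns approach, a minimal Python sketch of the idea follows. It only builds the shell commands as strings and does not run them. The namespace names, interface names and addresses are invented for illustration, and putting the whole RTT as netem delay on one side is a simplifying assumption (a middle namespace acting as the router would be closer to a real path).

```python
def netns_commands(rate="10mbit", rtt="50ms"):
    """Build (not run) commands for a two-namespace veth test link.

    All names and addresses are hypothetical. The server side adds the
    delay with netem; the client side shapes its egress with cake.
    """
    c, s = "cake-client", "cake-server"
    return [
        f"ip netns add {c}",
        f"ip netns add {s}",
        "ip link add veth-c type veth peer name veth-s",
        f"ip link set veth-c netns {c}",
        f"ip link set veth-s netns {s}",
        f"ip netns exec {c} ip addr add 10.99.0.1/24 dev veth-c",
        f"ip netns exec {s} ip addr add 10.99.0.2/24 dev veth-s",
        f"ip netns exec {c} ip link set veth-c up",
        f"ip netns exec {s} ip link set veth-s up",
        # Simulated path delay, applied to server -> client traffic.
        f"ip netns exec {s} tc qdisc add dev veth-s root netem delay {rtt}",
        # Qdisc under test, shaping client -> server traffic.
        f"ip netns exec {c} tc qdisc add dev veth-c root cake bandwidth {rate}",
    ]


if __name__ == "__main__":
    for cmd in netns_commands("1mbit", "100ms"):
        print(cmd)
```

The commands need root and a kernel with sch_cake and netem available. Generating them from one function makes it easy to loop over a grid of rates and RTTs in a batch.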

I do plan on getting a box to replace snapon also in this timeframe.

>     Blockers:
>     * Ripping out all the backward compatibility cruft for submission to
>     mainline and following netdev formatting conventions for comments and
>     indentation. I'd like any new features in the backport to get
>     backported, though (sigh), as lede looks to be shipping a 4.9 based
>     kernel.
> Argh, but probably has to be done.

That turned out to not be hard. I'm about to test that result today.

Folding the result sanely back into the main repo did turn out to be
hard. I also have no idea how to fold together the cobalt and regular
cake branches at the moment, so I'm sticking with cobalt.

>     * tc-cake man page needs to be updated.
>     * tc-adv related code updated to latest iproute2

I will start a repo for this.

>     * There is some work going on here to add ack filtering to cake, which
>     looks VERY promising: https://github.com/dtaht/sch_cake/pull/63
>     I'm going to add something like this to netem also. It may be that
>     merely leveraging the hash would be enough in cake's case.
>     * Testing against the net-next kernel on x86, x86_64, arm, mips, and
>     aarch architectures. (I just got bit by not testing 32 bit arches, sigh)
> Regarding the target and interval settings Cake uses, here are the current
> keywords available and their settings:
> datacentre: 19us / 114us (us yanks might like ‘datacenter’ as a synonym)
> lan: 50us / 1ms
> metro: 500us / 10ms
> regional: 1.5ms / 30ms
> internet: 5ms / 100ms
> oceanic: 15ms / 300ms
> satellite: 50ms / 1s
> interplanetary: 5ms / 3600s
> About a year ago I raised a concern that these values were outside what the
> CoDel authors intended. The counter-argument at the time was that
> experimentally, we can show that TCP RTT can be reduced on a Gbit LAN with the
> ‘lan’ keyword. And that argument seems to hold, so far. On two BQLd systems (2x
> PCEngines APU2s) connected with GigE, I can run the same experiment now and show
> that:
> TCP RTT ~= 8ms with default qdisc, throughput ~= 940 Mbit
> TCP RTT ~= 4.5ms with ‘cake unlimited’, throughput ~= 920 Mbit
> TCP RTT ~= 1ms with ‘cake unlimited lan’, throughput ~= 920 Mbit
> So yes, we can lower TCP RTT with these more aggressive settings. But just to
> make sure, we’re confident that there are no other side effects from these lower
> targets and intervals? Is there anything else I should test for to be sure? For
> example, when I rate limit to 950 Mbit and try the same test above, ‘lan’ causes
> a 20% drop in throughput vs the defaults. That may be from an overtaxed CPU, but
> I don’t know. I also wonder how this affects routed vs local traffic. I’ll try
> to test this at some point, as I want to understand it better anyway to know how
> backhaul links should be configured...
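
One way to look at the keyword table above is to compare each target with its interval and with the time a full-size packet spends on the wire. The Python sketch below is only arithmetic on the values as listed in this message, not a reading of the cake source; the helper names are made up, and the 1500-byte packet size is an assumption.

```python
# (target, interval) in microseconds, as listed in the table above.
PRESETS = {
    "datacentre": (19, 114),
    "lan": (50, 1_000),
    "metro": (500, 10_000),
    "regional": (1_500, 30_000),
    "internet": (5_000, 100_000),
    "oceanic": (15_000, 300_000),
    "satellite": (50_000, 1_000_000),
    "interplanetary": (5_000, 3_600_000_000),
}


def target_fraction(keyword):
    """Target as a fraction of interval for one keyword."""
    target_us, interval_us = PRESETS[keyword]
    return target_us / interval_us


def serialization_us(packet_bytes, rate_bps):
    """Time in microseconds to put one packet on the wire."""
    return packet_bytes * 8 / rate_bps * 1e6


if __name__ == "__main__":
    for name in PRESETS:
        print(f"{name:15s} target/interval = {target_fraction(name):.6f}")
    # Assumed 1500-byte packets on a 1 Gbit/s link.
    per_packet = serialization_us(1500, 1_000_000_000)
    print(f"lan target is about {PRESETS['lan'][0] / per_packet:.1f} packets at GigE")
```

From lan through satellite the target is 5% of the interval, the same ratio as the 5ms / 100ms 'internet' row. At GigE a 1500-byte packet takes 12us to serialise, so the 50us 'lan' target is only about four packets of standing queue, which may be worth keeping in mind when looking at the throughput drop under rate limiting.
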
>     Non-Blockers:
>     * I don't believe in cobalt, or rather, I won't believe in it until we
>     have data at many RTTs. That said, what I'd propose would be a
>     monolithic cobalt.h file rather than codel5.h.
>     The netns stuff will make simulating RTTs and bandwidths much easier.
>     * I think the fq_codel batch drop facility is better than what cake uses
>     in case of floods. Partially due to the need to handle backports the
>     mechanism fq_codel uses is hard to use - but going mainline we could add
>     this.
>     * The autorate_ingress code should be marked experimental. I keep hoping
>     it can be improved by better looking for "smoothness" inbound, but
>     algorithms escape me. This doesn't bother me much, as tcp has continued
>     to be improved over the past 50 years; perhaps we can find ways to
>     improve this with more users.
>     * It is possible to tune the quantum and peeling functions to not peel
>     to the extent they do. Particularly there is usually no need (aside from
>     wanting accurate statistics) to peel below 1500 bytes (except perhaps
>     with the new ack filter mode). We experimented a lot with this in the
>     early days but could never come to a resolution.
>     * I don't have any use for precedence mode and would like to remove it.
> Regarding non-blockers, for FreeNet’s purposes, I wanted to see if I could add
> the option to use packet marks as one of the identifiers for host isolation, but
> I’ve not had time to explore it yet. This would be helpful for ISPs that want to
> ensure fairness when there isn’t a one-to-one mapping between IP address and
> customer. I’ll see if I can at least try it.
