[Cake] Cake on OpenWRT community builds and my observations

Frits Riep riep at riepnet.com
Mon Dec 7 21:02:44 EST 2015

First of all, I would like to thank all of the contributors for their hard
work and good progress in the challenge of combating bufferbloat!

Most IT professionals and users still have no idea of the issue or the potential
fixes, so awareness in the larger community is still a major problem.

I had in the past used Cerowrt builds and other openwrt builds with fq_codel
and also have been very happy with the results.

I wanted to let you know that I found a build which incorporates sqm-scripts
with Cake.  The build that I found is accessible in the OpenWRT Community
Releases, under "Optimized and feature rich trunk build for select routers"
by arokh.


I'd like to provide some feedback on my experiences with Cake so far.  I
just updated my TP-Link Archer C7 to an OpenWRT image which was pre built
with sqm-scripts and I was very pleasantly surprised to find that it
supported Cake, and I set it up and tested using queue discipline as "cake"
and queue setup script "layer_cake.qos".  I have been following the progress
over time in the bufferbloat email lists, but was not aware that I could
access a build supporting cake without doing my own build and I was not
really knowledgeable enough to take that on.

I've so far been extremely pleased with the results of my initial testing,
and even when I set the bandwidth limit at very low speeds (like 1 Mbps up
and down), applications like Netflix worked perfectly, pings were great, and
dslreports.com/speedtest reported A+ on bufferbloat.

As I read about the qos part, I am wondering however is it only IP Layer-3
DSCP which goes into the queue management system or is Layer-2 traffic like
802.1p traffic also being managed?

It would be great to manage that as well if it is not because there are many
applications where there is bufferbloat which could be managed by OpenWRT.
For example, there are wireless links between buildings, situations where
priority packets are not marked.  Voip phones often are set by default to
prioritize voice packets via 802.1p.
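
To make the distinction in the question concrete, here is a minimal sketch of where the two markings live in a frame. It only illustrates the field layouts (DSCP in the IP header, 802.1p priority in the VLAN tag); the function names are hypothetical and nothing here describes what cake or sqm-scripts actually inspect.

```python
def dscp_from_tos(tos):
    """DSCP is the upper six bits of the IPv4 TOS / IPv6 Traffic Class byte."""
    return (tos & 0xFF) >> 2


def pcp_from_tci(tci):
    """802.1p priority (PCP) is the top three bits of the 16-bit VLAN TCI."""
    return (tci & 0xFFFF) >> 13
```

A frame can carry either marking without the other, which is why a phone that only sets 802.1p would be invisible to a classifier that only reads DSCP.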

Please let me know your thoughts on this.

Frits Riep

-----Original Message-----
From: cake-bounces at lists.bufferbloat.net
[mailto:cake-bounces at lists.bufferbloat.net] On Behalf Of
cake-request at lists.bufferbloat.net
Sent: Monday, December 07, 2015 3:00 PM
To: cake at lists.bufferbloat.net
Subject: Cake Digest, Vol 9, Issue 4

Today's Topics:

   1. Re: second system syndrome (Kevin Darbyshire-Bryant)


Message: 1
Date: Mon, 7 Dec 2015 12:24:34 +0000
From: Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk>
To: <cake at lists.bufferbloat.net>
Subject: Re: [Cake] second system syndrome
Message-ID: <56657A82.2080601 at darbyshire-bryant.me.uk>
Content-Type: text/plain; charset="windows-1252"

Here are some further thoughts, intermingled with Sebastian's comments as
well.  This is more of a brain dump/thinking out loud so is in complete
disorder and blunt.

On 06/12/15 16:08, Sebastian Moeller wrote:
> Hi Dave,
> since I am not really involved in cake development make out of my comments
what you will?
> Even though I added comments below, IMHO, the way to proceed is discuss
the statistics to pass back to tc, and define a set we agree to stick to
(at least as a base set, potentially copied from the best of fq_codel and
HTB) and then ask the kernel and iproute2 folks to merge what we have.
Changes that improve performance will most likely be possible in the future
even if upstreamed already? 
I'm going to get shouted at for this, but I don't care.  There is one more
feature (ok 2 more features) I'd like to see.

1) Extend DSCP washing to include a 'washed DSCP value' per tin.  It's a
simple extension and allows washing from the default 'best effort' to
something else we can choose.  I've written about 25% of this, what I'm
struggling with is a reasonable way of passing 8 bytes (the DSCP code
for each tin) from userspace to kernel land.   The tc interface would be
simple (and horrible) in the sense of it'll be a hex string of the required
dscp codes for each tin.  Unless someone wishes to write a better interface
of course.  There will of course be additional cpu cost
- picking up the 'washed to' from memory.

2) dual flow isolation.
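
For item 1, a minimal userspace-side sketch of the "hex string of DSCP codes, one per tin" idea. The string format, the function names and the default of 8 tins are assumptions taken from the description above, not an existing tc or cake interface.

```python
def parse_wash_string(hex_string, tins=8):
    """Parse a hex string into one DSCP code per tin (hypothetical format)."""
    raw = bytes.fromhex(hex_string)  # raises ValueError on non-hex input
    if len(raw) != tins:
        raise ValueError(f"expected {tins} bytes, got {len(raw)}")
    for tin, code in enumerate(raw):
        if code > 0x3F:  # DSCP is a 6-bit field
            raise ValueError(f"tin {tin}: DSCP {code:#x} exceeds 6 bits")
    return list(raw)


def dscp_to_tos(dscp, old_tos):
    """Write a DSCP code into a TOS byte, preserving the two ECN bits."""
    return ((dscp & 0x3F) << 2) | (old_tos & 0x03)
```

For example, "00082e0000000000" would mean best effort for tin 0, CS1 for tin 1, EF for tin 2 and best effort for the rest.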

> On Dec 6, 2015, at 15:53 , Dave Taht <dave.taht at gmail.com> wrote:
>> I find myself torn by 3 things.
>> 1) The number of huge wins in fixing wifi far outweigh what we have 
>> thus far achieved, or not achieved, in cake.
The distractions of crappy wifi and then the FCC débâcle haven't helped the
focus on Cake one bit.

Personally speaking, there's a lack of clear project 'lead' - I persist in
my assertion that I'm not a coder, certainly not a confident one. 
I've been reluctant to push commits to the repo with ideas/lunacy quite
frankly because I don't want to piss either you or Jonathan off by messing
with what I perceive to be 'his' code.  On the other hand, some things have
happened (and stuck!) because I just blew a raspberry in the general
direction, said 'stuff it' and pushed - anything that went into a feature
branch got ignored, anything in 'master' got at least compiled
and possibly even reviewed.   We should make better use of git &
(mis)feature branches.   But there's need for a 'Linus' here.  And a lot
more collaboration.

> 	As we say at home "der Spatz in der Hand ist besser als die Taube
auf dem Dach" (a sparrow in the hand is better than a pigeon on the roof);
so given that cake is almost baked and wifi needs a lot more than simple
go-faster-stripes maybe finishing cake while getting wifi improved is
achievable?
>> 2) Science - Cake is like wet paint! There are knobs to fiddle, endless 
>> tests to run, new ideas to try... measurements to take! papers to 
>> write!
>> 3) Engineering - I just want it to be *done*. It's been too long. It 
>> was demonstrably faster than htb + fq_codel on weak hardware last 
>> june, and handled GRO peeling, which were the two biggest "bugs" in 
>> sqm I viewed we had.
> 	Two questions:
> 1) was it really faster for long enough tests (and has anybody
accidentally looked at cpu temperatures once cake starts throttling)?
> 2) Bugs in sqm? I thought that cake's raison d'être was not to improve
sqm-scripts' performance, but to make it simple for my mom to set up a
decent latency conserving internet access. So performance gains are sugar on
top, but lack thereof is not necessarily a merge stopper?
1) I suspect but don't *know* that the glitchy performance issue was a
manifestation of the incorrectly sized buffer bug (now well & truly
squished).  There were other odd symptoms associated with that bug too:
apparent changes in ECN marking, now behaving better.  I've run 40/10 rrul
tests (wired) for 10 minutes and not seen any evidence of nasties. 
WiFi on the other hand......yes, let's leave that.

2) 'Apparent simplicity' is the phrase used on the bufferbloat cake page and
I like it!  Yes cake is 'complicated', but it's a darn sight easier than a
myriad of iptables rules & other carp that I don't understand.  I also
wonder if the focus on cpu usage is losing sight of offsets by not having
all those rules around.
>> In wearing these 3 hats, I would
>> 3A) like to drop cake, personally, from something I needed to care about.
>> 3B) But, can't, because the profusion of features need to be fully
>> In this test series: http://snapon.cs.kau.se/~d/bcake_tests/ neither 
>> cake or bcake were "better" than the existing codel in any measurable 
>> way, and in most cases, worse. bcake did mildly better at a short
>> (10ms) RTT... which was interesting.
> 	But since codel/fq_codel are hard to set up, especially in
combination with a shaper; rough performance parity with HTB+fq_codel might
be sufficient justification for a merge.
>> If you want to take apart this batch with "flent", looking for 
>> enlightenment, also, please go ahead.
>> Were I to short circuit the science here, I'd rip out the sqrt cache 
>> and fold back in mainline codel into cake. This would also have the 
>> added benefit of also moving us back to 32bitland for various values 
>> (tho "now" becomes a bit trickier) and hopefully improving cpu 
>> efficiency a bit further (but this has to get done carefully unless 
>> your head is good at 32 bit overflow math)
>> Next up, a series testing the fq portions...
>> If someone (else) would like to fork cake again and do the two things 
>> above, I'd appreciate it.
Feature branch.  Beyond my cut'n'paste C I'm afraid.
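
On the "32 bit overflow math" mentioned above: a minimal sketch of the usual signed-difference idiom for comparing timestamps that wrap at 32 bits. The names are hypothetical and this is an illustration of the arithmetic, not code from cake or codel. It is only valid while the two timestamps are less than 2^31 ticks apart.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def time_after(a, b):
    """True if 32-bit timestamp a is later than b, even across a wrap."""
    return to_signed32(a - b) > 0
```

A plain `a > b` would get the wrap case wrong: a timestamp of 5 taken just after the counter rolls over is later than 0xFFFFFFF0, even though it is numerically smaller.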
>> 3C) Most of the new statistics are pretty useless IMHO. Interesting, 
>> but in the end I mostly care about drops and marks only.
> 	I do care about packet size (and max packet size). The kernel's
complicated rules when and which overhead to add or not to add are so
under-documented that one needs a way to figure out what information reaches
the qdiscs/shapers otherwise meaningful per-packet-overhead accounting is not
going to work. Max_packet size I see as the only way to check whether
Meta-packets hit the qdiscs or not. Even if cake does not peel or always
peel this is informative in my opinion.
I put 'last packet' there simply because it was available nearby in the code
:-)  I agree with Sebastian that max_packet has been extremely useful in
*KNOWING* what overheads the kernel is passing into the qdisc.  Compromise:
Drop 'last packet', maintain 'max_packet'.
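
A minimal sketch of why max_packet is diagnostic: if the largest length a queue has ever been handed exceeds what a single frame can carry, aggregated meta-packets must be reaching it. The class, its names and the default MTU/overhead values are assumptions for illustration, not cake's actual statistics code.

```python
class PacketStats:
    """Track the largest packet length a queue has been handed."""

    def __init__(self, mtu=1500, overhead=14):
        # Assumed limit for a single wire frame: IP MTU plus Ethernet header.
        self.single_frame_limit = mtu + overhead
        self.max_packet = 0

    def observe(self, length):
        """Record one packet length as seen by the queue."""
        self.max_packet = max(self.max_packet, length)

    def saw_meta_packets(self):
        """True if some packet was larger than one frame could carry."""
        return self.max_packet > self.single_frame_limit
```

The same counter also shows which overhead the kernel has already included, since the largest ordinary packet will sit at the MTU plus whatever was added before the qdisc saw it.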
>> 3D) Don't have a use for the rate estimator either, and so far the 
>> dual queue idea has not materialized. I understand how it might be 
>> done now - using the 8 way set associative thing per DEST hash, but I 
>> don't really see the benefit of that vs just using a DEST hash in the 
>> first place.
>> 3E) Want cake to run as fast as possible on cheap hardware and be a 
>> demonstrable win over htb + fq_codel - and upstream it and be done 
>> with it.
> 	Being able to set up a decent shaper/codel combination in one line
> tc is already a win (but I repeat myself)
You do.  And I agree for a 2nd (or is it 3rd) time.
>> 3F) At the moment I'm favoring peeling at the current quantum rather 
>> than anything more elaborate.
> 	Why quantum, why not simply at MTU boundaries? I seem to recall that
aggregates already carry information how many MTU segments they consist
of which could be re-used?
>> 3G) really want the thing to work correctly down to 64k and up to at 
>> least a gbit.
>> which needs testing... but probably after we pick a codel....
>> 2A) As a science vehicle, there are many other things we could be 
>> trying in cake, and I happen to like the idea of the (currently sqrt) 
>> cache in for example, trying a faster curve at startup - or, as in 
>> the
>> ns2 code - a harder curve at say count + 128 or even earlier, as the 
>> speed up in drops gets pretty tiny even at count + 16. (see attached)
>> (it doesn't make much sense to calculate the sqrt at run time - you 
>> can just calculate the constants elsewhere and plug them in, btw.
>> attached is a teeny proggie that does that and also explores a harder 
>> initial curve (you'd jump count up to where it matched the curve when 
>> you reverted to the invsqrt calculation) - and no, I haven't tried 
>> plugging this in either... DANGER! Wet Paint!
>> I also like keeping all the core values 64 bits, from a science
>> There are also things like reducing the number of flows, and 
>> exercising the 8 way associative cache more - to say 256, 128, or 
>> even 32? Or relative to the bandwidth... or target setting...
>> and I do keep wishing we could at the very least resolve the target > 
>> mtu issue. std codel turns off with a single mtu outstanding. That 
>> arguably should have been all that was needed...
>> and then there's ecn...
>> 1A) Boy do we have major problems with wifi, and major gains to be 
>> had
>> 1B) All the new platforms have bugs everywhere, including in the 
>> ethernet drivers
>> 0)
>> So I guess it does come down to - what are the "musts" for cake 
>> before it goes upstream?
> 	Get the feature set defined (potentially strip contentious features
for the time being and merge them piecewise into the kernel proper) as well
as a statistics set so the communication with tc is future proof enough for
the near future. Then try to get it merged...
>> How much more work is required, by everybody, on every topic, til 
>> that happens? Can we just fork off what is known to work reasonably 
>> well, and let the rest evolve separately in a cake2?
> 	I was under the impression, that you and Toke are currently
measuring the performance costs of the additional features so decisions
which features to include could be made based on their cost?

It is difficult for the rest of us to help with the measuring if we don't
know how & what you're measuring.  Judging from recent comments I'd say some
code profiling is going on too.  I've no idea how to do that...let alone on
my router...but if there were some instruction I'm willing to try on a non
X86_64 platform.

I'd also like to see a lot more algorithm documentation.  I can read
(some) code but there are quite a few places in cake that are opaque to me:

1) The DRR soft shaper algorithm, how it relates to 'now'
2) Interaction with bandwidth thresholds
3) Difference between tin_quantum_band & tin_quantum_prio.

The code even contains the comment 'this is the priority soft-shaper magic'!

