[Cake] second system syndrome

Mon Dec 7 07:24:34 EST 2015

Here are some further thoughts, intermingled with Sebastian's comments.
This is more of a brain dump/thinking out loud, so it is blunt and in
no particular order.

On 06/12/15 16:08, Sebastian Moeller wrote:
> Hi Dave,
>
> since I am not really involved in cake development make out of my comments what you will…
>
> Even though I added comments below, IMHO, the way to proceed is to discuss the statistics to pass back to tc, and define a set we agree to stick to (at least as a base set, potentially copied from the best of fq_codel and HTB) and then ask the kernel and iproute2 folks to merge what we have. Changes that improve performance will most likely be possible in the future even if upstreamed already… 
I'm going to get shouted at for this, but I don't care.  There is one
more feature (ok 2 more features) I'd like to see.

1) Extend DSCP washing to include a 'washed DSCP value' per tin.  It's a
simple extension and allows washing from the default 'best effort' to
something else we can choose.  I've written about 25% of this, what I'm
struggling with is a reasonable way of passing 8 bytes (the DSCP code
for each tin) from userspace to kernel land.   The tc interface would be
simple (and horrible) in the sense that it'll be a hex string of the
required DSCP codes for each tin.  Unless someone wishes to write a
better interface of course.  There will of course be additional cpu cost
- picking up the 'washed to' from memory.

2) dual flow isolation.
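
To make the 'hex string of DSCP codes' idea in (1) concrete, here is a
minimal userspace sketch, not the actual tc or cake code.  It assumes 8
tins, one byte per tin, and a string of exactly 16 hex digits; the names
parse_washed_dscp, hex_val and MAX_TINS are hypothetical and exist only
for illustration.

```c
#include <stdint.h>
#include <string.h>

#define MAX_TINS 8   /* assumption: one washed DSCP byte per tin */

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: parse exactly 16 hex digits into 8 DSCP codes.
 * Returns 0 on success, -1 on malformed input or a value above 63. */
static int parse_washed_dscp(const char *s, uint8_t out[MAX_TINS])
{
    if (strlen(s) != 2 * MAX_TINS)
        return -1;

    for (int i = 0; i < MAX_TINS; i++) {
        int hi = hex_val(s[2 * i]);
        int lo = hex_val(s[2 * i + 1]);
        int v;

        if (hi < 0 || lo < 0)
            return -1;
        v = hi * 16 + lo;
        if (v > 63)              /* DSCP is a 6-bit field */
            return -1;
        out[i] = (uint8_t)v;
    }
    return 0;
}
```

With this sketch, the string "00080A0C12202E30" would yield the codes
0, 8, 10, 12, 18, 32, 46 and 48 for tins 0 to 7.  Rejecting values above
63 in userspace keeps the kernel side down to copying 8 bytes.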

>
>
> On Dec 6, 2015, at 15:53 , Dave Taht <dave.taht at gmail.com> wrote:
>
>> I find myself torn by 3 things.
>>
>> 1) The number of huge wins in fixing wifi far outweigh what we have
>> thus far achieved, or not achieved, in cake.
The distractions of crappy wifi and then the FCC débâcle haven't helped
the focus on Cake one bit.

Personally speaking, there's a lack of clear project 'lead' - I persist
in my assertion that I'm not a coder, certainly not a confident one. 
I've been reluctant to push commits to the repo with ideas/lunacy quite
frankly because I don't want to piss either you or Jonathan off by
messing with what I perceive to be 'his' code.  On the other hand, some
things have happened (and stuck!) because I just blew a raspberry in the
general direction, said 'stuff it' and pushed - anything that went into
a feature branch got ignored, anything in 'master' got at least compiled
and possibly even reviewed.   We should make better use of git &
(mis)feature branches.   But there's need for a 'Linus' here.  And a lot
more collaboration.

> 	As we say at home “der Spatz in der Hand ist besser als die Taube auf dem Dach” (the sparrow in the hand is better than the pigeon on the roof); so given that cake is almost baked and wifi needs a lot more than simple go-faster-stripes maybe finishing cake while getting wifi improved is achievable?
>
>> 2) Science - Cake is like wet paint! There are knobs to fiddle, endless
>> tests to run, new ideas to try... measurements to take! papers to
>> write!
>>
>> 3) Engineering - I just want it to be *done*. It's been too long. It
>> was demonstrably faster than htb + fq_codel on weak hardware last
>> june, and handled GRO peeling, which were the two biggest "bugs" in
>> sqm I viewed we had.
> 	Two questions:
> 1) was it really faster for long enough tests (and has anybody accidentally looked at cpu temperatures once cake starts throttling)?
> 2) Bugs in sqm? I thought that cake’s raison d’être was not to improve sqm-scripts’ performance, but to make it simple for my mom to set up decent latency-conserving internet access. So performance gains are sugar on top, but lack thereof is not necessarily a merge stopper?
1) I suspect but don't *know* that the glitchy performance issue was a
manifestation of the incorrectly sized buffer bug (now well & truly
squished).  There were other odd symptoms associated with that bug too:
apparent changes in ECN marking, now behaving better.  I've run 40/10
rrul tests (wired) for 10 minutes and not seen any evidence of nasties. 
WiFi on the other hand......yes, let's leave that.

2) 'Apparent simplicity' is the phrase used on the bufferbloat cake page
and I like it!  Yes cake is 'complicated', but it's a darn sight easier
than a myriad of iptables rules & other carp that I don't understand.  I
also wonder if the focus on cpu usage is losing sight of the savings
from not having all those rules around.
>
>> In wearing these 3 hats, I would
>>
>> 3A) like to drop cake, personally, from something I needed to care about.
>> 3B) But, can't, because the profusion of features need to be fully evaluated.
>> In this test series: http://snapon.cs.kau.se/~d/bcake_tests/ neither
>> cake nor bcake were "better" than the existing codel in any measurable
>> way, and in most cases, worse. bcake did mildly better at a short
>> (10ms) RTT... which was interesting.
> 	But since codel/fq_codel are hard to set up, especially in combination with a shaper, rough performance parity with HTB+fq_codel might be sufficient justification for a merge.
Agreed
>
>> If you want to take apart this batch with "flent", looking for
>> enlightenment, also, please go ahead.
>>
>> Were I to short circuit the science here, I'd rip out the sqrt cache
>> and fold mainline codel back into cake. This would have the
>> added benefit of moving us back to 32bitland for various values
>> (tho "now" becomes a bit trickier) and hopefully improving cpu
>> efficiency a bit further (but this has to be done carefully unless
>> your head is good at 32 bit overflow math)
>>
>> Next up, a series testing the fq portions...
>>
>> If someone (else) would like to fork cake again and do the two things
>> above, I'd appreciate it.
Feature branch.  Beyond my cut'n'paste C I'm afraid.
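
For anyone else whose head is not good at 32 bit overflow math, the
usual trick for comparing wrapping timestamps is to subtract first and
look at the sign of the difference.  A minimal standalone sketch (the
name time32_after is made up here; it is the same signed-difference
idea the kernel's jiffies comparison macros use, and it only holds
while the two timestamps are less than 2^31 ticks apart):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if timestamp a is later than timestamp b.
 *
 * The subtraction is done on unsigned values, so it wraps modulo 2^32
 * with well-defined behaviour.  The result is then reinterpreted as a
 * signed value (two's complement assumed, as on every Linux target),
 * so a small positive difference means "a is after b" even when the
 * counter wrapped through zero in between. */
static bool time32_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}
```

A naive 'a > b' gets the answer wrong across the wrap: with a = 5 and
b = 0xFFFFFFF0, a is 21 ticks later than b, but 'a > b' is false while
time32_after(a, b) is true.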
>>
>> 3C) Most of the new statistics are pretty useless IMHO. Interesting,
>> but in the end I mostly care about drops and marks only.
> 	I do care about packet size (and max packet size). The kernel’s complicated rules on when and which overhead to add or not to add are so under-documented that one needs a way to figure out what information reaches the qdiscs/shapers; otherwise meaningful per-packet overhead accounting is not going to work. Max_packet size I see as the only way to check whether meta-packets hit the qdiscs or not. Even if cake does not peel or always peels, this is informative in my opinion.
I put 'last packet' there simply because it was available nearby in the
code :-)  I agree with Sebastian that max_packet has been extremely
useful in *KNOWING* what overheads the kernel is passing into the
qdisc.  Compromise:  Drop 'last packet', maintain 'max_packet'.
>
>> 3D) Don't have a use for the rate estimator either, and so far the
>> dual queue idea has not materialized. I understand how it might be
>> done now - using the 8 way set associative thing per DEST hash, but I
>> don't really see the benefit of that vs just using a DEST hash in the
>> first place.
>>
>> 3E) Want cake to run as fast as possible on cheap hardware and be a
>> demonstrable win over htb + fq_codel - and upstream it and be done
>> with it.
> 	Being able to set up a decent shaper/codel combination in one line of tc is already a win (but I repeat myself)
You do.  And I agree for a 2nd (or is it 3rd) time.
>
>> 3F) At the moment I'm favoring peeling at the current quantum rather
>> than anything more elaborate.
> 	Why quantum, why not simply at MTU boundaries? I seem to recall that aggregates already carry information on how many MTU segments they consist of, which could be re-used?
>
>> 3G) really want the thing to work correctly down to 64k and up to at
>> least a gbit.
>> which needs testing... but probably after we pick a codel....
>>
>> 2A) As a science vehicle, there are many other things we could be
>> trying in cake, and I happen to like the idea of the (currently sqrt)
>> cache in, for example, trying a faster curve at startup - or, as in the
>> ns2 code - a harder curve at say count + 128 or even earlier, as the
>> speed up in drops gets pretty tiny even at count + 16. (see attached)
>>
>> (it doesn't make much sense to calculate the sqrt at run time - you
>> can just calculate the constants elsewhere and plug them in, btw.
>> attached is a teeny proggie that does that and also explores a harder
>> initial curve (you'd jump count up to where it matched the curve when
>> you reverted to the invsqrt calculation) - and no, I haven't tried
>> plugging this in either... DANGER! Wet Paint!
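
The attached program is not reproduced here, so the following is only a
sketch of the general idea of computing the constants ahead of time.
It assumes the standard codel control law, where the gap to the next
drop is interval / sqrt(count), and builds a small table of those gaps
using integer math only.  TABLE_SIZE, isqrt64 and fill_interval_table
are hypothetical names chosen for illustration.

```c
#include <stdint.h>

#define TABLE_SIZE 16   /* assumption: arbitrary size for illustration */

/* Floor of the square root of n, using the bit-by-bit method. */
static uint64_t isqrt64(uint64_t n)
{
    uint64_t res = 0;
    uint64_t bit = (uint64_t)1 << 62;   /* highest power of 4 in 64 bits */

    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= res + bit) {
            n -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* Fill table[count - 1] with floor(interval_us / sqrt(count)) for
 * count = 1..TABLE_SIZE, computed as sqrt(interval_us^2 / count).
 * interval_us must be below 2^32 so that its square fits in 64 bits. */
static void fill_interval_table(uint64_t interval_us,
                                uint32_t table[TABLE_SIZE])
{
    for (uint32_t count = 1; count <= TABLE_SIZE; count++)
        table[count - 1] =
            (uint32_t)isqrt64(interval_us * interval_us / count);
}
```

For a 100 ms interval (100000 us) this gives 100000 at count 1, 70710 at
count 2, 50000 at count 4 and 25000 at count 16.  The table could be
generated offline and compiled in as constants, which is the "calculate
elsewhere and plug them in" suggestion above.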
>>
>> I also like keeping all the core values 64 bits, from a science perspective.
>>
>> There are also things like reducing the number of flows, and
>> exercising the 8 way associative cache more - to say 256, 128, or even
>> 32? Or relative to the bandwidth... or target setting...
>>
>> and I do keep wishing we could at the very least resolve the target >
>> mtu issue. std codel turns off with a single mtu outstanding. That
>> arguably should have been all that was needed...
>>
>> and then there's ecn...
>>
>> 1A) Boy do we have major problems with wifi, and major gains to be had
>> 1B) All the new platforms have bugs everywhere, including in the
>> ethernet drivers
>>
>> 0)
>>
>> So I guess it does come down to - what are the "musts" for cake before
>> it goes upstream?
> 	Get the feature set defined (potentially strip contentious features for the time being and merge them piecewise into the kernel proper) as well as a statistics set so the communication with tc is future proof enough for the near future. Then try to get it merged...
>
>> How much more work is required, by everybody, on
>> every topic, til that happens? Can we just fork off what is known to
>> work reasonably well, and let the rest evolve separately in a cake2?
> 	I was under the impression, that you and Toke are currently measuring the performance costs of the additional features so decisions which features to include could be made based on their cost?

It is difficult for the rest of us to help with the measuring if we
don't know how & what you're measuring.  Judging from recent comments
I'd say some code profiling is going on too.  I've no idea how to do
that...let alone on my router...but if there were some instructions I'd
be willing to try on a non-x86_64 platform.

I'd also like to see a lot more algorithm documentation.  I can read
(some) code but there are quite a few places in cake that are opaque to me:

1) The DRR soft shaper algorithm, how it relates to 'now'
2) Interaction with bandwidth thresholds
3) Difference between tin_quantum_band & tin_quantum_prio.

The code even contains the comment 'this is the priority soft-shaper magic'!

Kevin
