[Cake] upstreaming cake in 2017?

Thu Dec 22 20:43:49 EST 2016

On Thu, 22 Dec 2016 21:02:28 +0100
Sebastian Moeller <moeller0 at gmx.de> wrote:

> Hi Dave,
> 
> > On Dec 22, 2016, at 20:43, Dave Taht <dave.taht at gmail.com> wrote:
> > 
> > I think most of the reasons why cake could not be upstreamed are now
> > on their way towards being resolved, and after lede ships, I can't
> > think of any left to stop an upstreaming push.
> > 
> > Some reasons for not upstreaming were:
> > 
> > * Because the algorithms weren't stable enough
> > * Because it wasn't feature complete until last month (denatting,
> > triple-isolate, and a 3 tier sqm)
> > * Because it had to work on embedded products going back to 3.12 or so
> > * Because I was busy with make-wifi-fast - which we got upstream as
> > soon as humanly possible.
> > * Because it was gated on having the large tester base we have with
> > lede (4.4 based)
> > * Because it rather abuses the tc statistics tool to generate tons of stats
> > * Because DSCP markings remain in flux at the ietf  
> 
> 	But does that matter? Is there really a hope that DSCPs will ever work outside of a well-controlled DSCP domain? Because inside one, you can make any DSCP mean anything you want. Trusting ingress DSCPs to do the right thing and/or be well enough conserved is a lottery ticket. And also trusting that the right applications use the right IETF-compatible markings while no app tries to abuse those seems optimistic. And finally, to end users the problem is not so much which DSCP-to-priority-band/tier scheme is used, but rather how to convince their important applications to actually mark their packets accordingly.
> 
> > * We ignore the packet priority fields entirely
> > * We don't know what diffserv models and ratios truly make sense  
> 
> 	Well, IMHO that is a good indicator that making it configurable, in addition to offering a few well-reasoned configurations, seems not the worst thing to do, no?
> 
> > 
> > Anyone got more reasons not to upstream? Any more desirable features?
> > 
> > In looking over the sources today I see a couple issues:
> > 
> > * usage of  // comments and overlong lines
> > * could just use constants for the diffserv lookup tables (I just pushed the
> >   revised gen_cake_const.c file for the sqm mode, but didn't rip out the
> >   relevant code in sch_cake). I note that several of my boxes have 64
> > hw queues now
> > * I would rather like to retire "precedence" entirely  
> 
> 	Why? At least it is a scheme that can be reasonably well described, even if it will rarely be a good match for what people want. What it does get right, IIRC, is sticking to half of the DSCP bits...
> 
> > * cake cannot shape above 40Gbit (32 bit setting). Someday +40Gbit is possible
> > * we could split gso segments at quantum rather than always
> > * could use some profiling on x86, arm, and mips arches
> > * Need long RTT tests and stuff that abuses cobalt features
> > * Are we convinced the atm and overhead compensators are correct?  
> 
> 	The ATM compensation itself is quite nice; the PTM compensation IMHO is not doing the right thing (less precise and more computationally intensive than required, though probably only by a little). I still have not become a friend of the keywords (it does not help that at least one of them seems not to be in accordance with the relevant ITU documents). Then again, I am sure the keywords do not need me as a friend. But all of this is optional and hence no showstopper for merging (as long as none of them become default options, changing them later seems doable to me).
>  
> 
> > * ipv6 nat?  
> 
> 	The current belief seems to be that whoever does IPv6 NAT with a /128 and port remapping can keep the pieces. I have no idea how widespread such a configuration actually is, and adding an option for that after upstreaming also seems not unreasonable?
> 
> > * ipsec recognition and prioritization?  
> 
> 	Why?
> 
> Best Regards
> 	Sebastian
> 
> P.S.: The only part where I can claim some level of expertise (for a low value of expertise) is the overhead accounting stuff, so take the rest with a smile.
> 
> > * I liked deprioritizing ping in sqm-scripts
> > 
> > Hardware mq is bugging me - a single queued version of cake on the
> > root qdisc has much lower latency than a bql'd mq with cake on each
> > queue and *almost* the same throughput.
> > 

It would also help to have a description of which use case cake is trying to solve:
 - how much configuration: lots (HTB) or zero (fq_codel)?
 - AP, CPE, backbone router, host system?
Also, what assumptions about the network are being made?

Ideally this could end up in both iproute2 and kernel documentation. Don't worry
if it is too much effort right away; LWN might help out.
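
Three items in the quoted list are plain arithmetic, so a rough sketch may help
anyone writing that documentation. None of this is taken from sch_cake; the
function names are made up for illustration. Assumptions: the "32 bit setting"
is a rate stored as an unsigned 32-bit count of bytes per second (the quoted
mail only says "32 bit" and "40Gbit"), the ATM compensator follows the usual
AAL5 rule of padding to whole 48-byte payloads carried in 53-byte cells, and
"precedence" means the top three of the six DSCP bits.

```python
ATM_CELL_BYTES = 53      # bytes on the wire per ATM cell
ATM_PAYLOAD_BYTES = 48   # payload bytes carried in each cell
U32_MAX = 2**32 - 1      # largest value an unsigned 32-bit field can hold


def atm_wire_bytes(packet_len, overhead=0):
    """Bytes a packet occupies on an ATM link.

    packet_len and overhead (per-packet encapsulation bytes, a
    hypothetical parameter) are added, rounded up to whole cells,
    and each cell costs 53 bytes on the wire.
    """
    payload = packet_len + overhead
    cells = -(-payload // ATM_PAYLOAD_BYTES)  # integer ceiling division
    return cells * ATM_CELL_BYTES


def max_rate_gbit(max_bytes_per_sec=U32_MAX):
    """Highest rate in Gbit/s expressible as a u32 of bytes per second.

    Assumption: the rate field counts bytes per second.
    """
    return max_bytes_per_sec * 8 / 1e9


def precedence_class(dscp):
    """Map a 6-bit DSCP to its 3-bit precedence (the upper half)."""
    return (dscp & 0x3F) >> 3


if __name__ == "__main__":
    print(atm_wire_bytes(1500))
    print(atm_wire_bytes(1500, overhead=40))
    print(round(max_rate_gbit(), 2))
    print(precedence_class(46))
```

Under the bytes-per-second assumption the ceiling works out to roughly 34.4
Gbit/s, which is in the same neighbourhood as the 40Gbit figure quoted above;
a different unit for the field would move that number.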