[Cake] [Make-wifi-fast] Cake in mac80211
dave at taht.net
Wed Feb 5 14:46:31 EST 2020
Jonathan Morton <chromatix99 at gmail.com> writes:
>> On 5 Feb, 2020, at 6:06 pm, Dave Taht <dave at taht.net> wrote:
>>> D) "cobalt" is proving out better in several respects than pure
>>> and folding in some of that makes sense, except I don't know which
>>> things are the most valuable considering wifi's other problems
>>> Reading paper now. Thanks for the pointer.
>> I tend to think that fq_codel is "good enough" in most
>> circumstances. The edge cases that cake handles better are a matter of a
>> few percentage points, vs the orders of magnitude that we get with fq_codel
>> alone vs a FIFO, and my focus of late has been to make things that
>> ate less cpu or were better offloadable, rather than things that networked
>> better. Others differ.
> I think COBALT might be worth putting in, as it should have
> essentially no net cost and does behave a little better than stock
> Codel. It's better at handling unresponsive traffic, in particular.
Cake, as a whole, benchmarks out at 2x+ more cpu than htb + fq_codel
does, while admittedly doing more stuff.
There are 3 interrelated algorithms in cobalt
1) saturating arithmetic. I have no idea if current compilers do
saturating arith on mips or arm boxes any better than they used to, but
intel still doesn't. I hate wasting the cpu on it, and don't mind that the
counter overflows after 4 billion iterations on some workloads.
(I did upstream a mild improvement to the bulk dropper a few months back)
2) Blue - to me - unproven as yet - as I'd like to try saturating
3) I *LIKE* the more graduated drop off in cobalt... in theory.
Also, in the case of wifi, we never implemented the bulk dropper that
the mainline code has, and should definitely do that.
4) Increasingly I feel the need to drop unresponsive ecn flows more
robustly. I like what you stuck in your current SCE tree to make blue
kick in earlier. Needs benchmarks...
5) As for things like the invsqrt cache, meh, don't feel like that much
accuracy is required, costs an expensive memory access, wanted to see
how well pie and dualq worked. (Really wish P4 and BPF had an invsqrt
6) Same goes for set associativity.
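For anyone following along, items 1 and 5 can be sketched in userspace C. This is an illustration only, not the sch_cake code: the names (sat_inc_u32, sat_add_u32, invsqrt_step, invsqrt_cache, INVSQRT_CACHE) are made up for this sketch, and it uses floating point where kernel code would use fixed point. It shows a counter that sticks at its maximum instead of wrapping, and a 1/sqrt(count) estimate that is looked up for small counts and advanced by one Newton step otherwise.

```c
#include <assert.h>
#include <stdint.h>

#define INVSQRT_CACHE 16 /* hypothetical cache size, chosen for this sketch */

/* Add one, but stick at UINT32_MAX instead of wrapping to 0. */
static uint32_t sat_inc_u32(uint32_t x)
{
    return x == UINT32_MAX ? x : x + 1;
}

/* Saturating add: an unsigned overflow shows up as the sum wrapping below a. */
static uint32_t sat_add_u32(uint32_t a, uint32_t b)
{
    uint32_t s = a + b;
    return s < a ? UINT32_MAX : s;
}

/* One Newton-Raphson step towards 1/sqrt(count):
 *     y' = y * (3 - count * y * y) / 2
 * y should already be a decent guess, e.g. the estimate for count - 1. */
static double invsqrt_step(double y, uint32_t count)
{
    return y * (3.0 - (double)count * y * y) / 2.0;
}

static double invsqrt_cache[INVSQRT_CACHE];

/* Precompute 1/sqrt(n) for small n. Each entry is seeded from the previous
 * one and refined with a few Newton steps. Slot 0 is padding (set to 1.0). */
static void invsqrt_cache_init(void)
{
    invsqrt_cache[0] = 1.0;
    invsqrt_cache[1] = 1.0;
    for (uint32_t n = 2; n < INVSQRT_CACHE; n++) {
        double y = invsqrt_cache[n - 1];
        for (int i = 0; i < 6; i++)
            y = invsqrt_step(y, n);
        invsqrt_cache[n] = y;
    }
}

/* Small counts come from the table (one memory access); larger counts
 * advance the running estimate by a single Newton step. */
static double invsqrt_next(double prev, uint32_t count)
{
    assert(count > 0);
    if (count < INVSQRT_CACHE)
        return invsqrt_cache[count];
    return invsqrt_step(prev, count);
}
```

Call invsqrt_cache_init() once, then on each drop do count = sat_inc_u32(count) followed by y = invsqrt_next(y, count). The tradeoff is the one argued over above: the table is exact where a single Newton step is least accurate (small counts), at the price of a memory access, and the saturating counter costs a compare on every increment.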
I LIKE competition! The more folk we have hacking on this stuff the
better it gets. :) I've helped get fq-pie mainlined to have another
reference for comparison, with some hope for seeing more stuff offloaded
on more devices....
But in the scheme of debloating things, and sticking to just wifi for
this paragraph, I tend to feel that txop clamping, reducing hw retries,
and doing saner things with multicast are a bigger win than
improvements to fq_codel itself or cake.
I haven't done much work on fq_codel_fast of late, but I threw out
everything people didn't use, and put in new things that were needed
like gso splitting and an early version of SCE, but few have tried
it... and my original goal for it was to have a multi-core shaper
facility in it and more limited queues automatically when used as
a default qdisc - 64000 fq_codel'd (or cake!) queues seems like quite a
lot when you have 64 hw mqs. I'd be more comfortable if it autotuned...
(see also rss++)
... in terms of fantasizing ...
I'd like cake to be able to use RSS and shape across
multiple cores. My basic dream has generally been that a single
line for inbound shaping that worked with RSS would work miracles.
tc qdisc add dev eth0 ingress cake 100Mbit.
without needing to use tc mirred.
A lot of good things have happened over the last few years to make that
more feasible - listification as one example. For all I know it's easy
to do now....
Would love to see a hardware offload. Am looking forward to google's
preso on their ebpf+edt solution at netdevconf. Might be a game changer,
that... it feeds back into my old concept for the "bobbie" policer
much better if only timestamps worked from hw ingress to hw egress.
e2e: I'd really like to see BBRv1 gain RFC3168 and BBRv2 get SCE for
comparison purposes. I'm looking forward to the preliminary experiments
with mmwave radio (paper upcoming) because I think we're all thinking
about how that's going to work, wrong...
And I'd like new grant money derived from a penny per user voluntary
donation from the billion+ machines running fq_codel...
And a pony.
It's my hope more people show up to go and explore all these options,
and collaborate and make a better, bufferbloat-free internet, somehow,
in my lifetime.
> - Jonathan Morton