[Make-wifi-fast] [Cake] Cake in mac80211

Dave Taht dave at taht.net
Wed Feb 5 11:06:13 EST 2020


Bjørn Ivar Teigen <bjorn at domos.no> writes:

> Thanks for the feedback!
>
> Some comments and questions added inline.
>
> On Tue, 4 Feb 2020 at 18:07, Dave Taht <dave.taht at gmail.com> wrote:
>
>     On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton
>     <chromatix99 at gmail.com> wrote:
>     >
>     > > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen <bjorn at domos.no> wrote:
>     > >
>     > > Are there any plans, work or just comments on the idea of
>     > > implementing cake in mac80211 as was done with fq_codel?
>     >
>     > To consider doing that, there'd have to be a concrete benefit to
>     > doing so.
>     
>     Research is research! :) Everything is worth trying! There's got
>     to be some better ideas out there, and we have a long list of
>     things we could have done to keep improving wifi had funding not
>     run out.
>     
>     We barely scratched the surface of this list.
>     
>     https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
>    
>     
>     > Most of Cake's most useful features, beyond what fq_codel
>     > already supports, are actually implied or even done better by
>     > the WiFi environment and the mac80211 layer adaptation
>     > (particularly airtime fairness).
>     
>     In my opinion(s):
>     
>     A) I think ack-filtering will help somewhat on 802.11n, but it's
>     not worth the added cpu cost on an AP and I'd prefer hosts reduce
>     their ack load in the tcp stack (IMHO, others may differ, it's
>     worth trying)
>     B) The underlying wifi scheduler essentially does per host fq
>     better than cake can (because it's layer 2 vs layer 3), as per
>     Jonathan's comment above
>
>     C) Instead of using an 8-way set-associative hash and 1024 queues,
>     fq_codel for wifi uses 4096 queues with a disambiguation pointer
>     for collisions. Seems good enough.
>     
>
> Didn't catch that before. Are the extra queues there because of the
> different access categories on Wi-Fi? Seems like that would mean most
> of them are not in use considering how little traffic is marked with
> DSCP.

I wasn't counting those. There's one set of 4k queues per access class.

While I agree that access classes are rarely used, and am of the opinion
that they shouldn't actually be used on an n or ac AP, as better
scheduling of the BE class suffices, 802.11e is useful on well-behaved
clients for a few things.

The number of queues was kind of picked as a function of the absolute
maximum number of stations wireless-n can take and a SWAG.

Our original conception was that we'd have one fairly small fq_codel
instance per station, dynamically arriving and departing as the station
did, which proved really problematic to implement. We were stuck on how
stateful it was, and on all kinds of locking issues, for nearly 2 years
before Michal Kazior came up with the simpler "lots of queues +
disambiguation pointer" idea.
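
For readers who want the "lots of queues + disambiguation pointer" idea
in concrete form, here is a minimal sketch. It illustrates the idea as
described in this thread and is not the mac80211 code: the class names,
the crc32 hash, the per-owner fallback queue and the rule for releasing
ownership are all assumptions made for the example.

```python
import zlib
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Hashable, List, Optional


@dataclass
class Queue:
    # The "disambiguation pointer": which station (or station/TID pair)
    # currently owns this queue. None means the queue is free.
    owner: Optional[Hashable] = None
    # True for queues in the shared pool, False for per-owner fallbacks.
    shared: bool = True
    packets: Deque[bytes] = field(default_factory=deque)


class SharedFq:
    """One shared pool of queues instead of one fq instance per station."""

    def __init__(self, num_queues: int = 4096) -> None:
        self.queues: List[Queue] = [Queue() for _ in range(num_queues)]
        self.fallback: Dict[Hashable, Queue] = {}  # hypothetical overflow queues
        self.collisions = 0

    def classify(self, owner: Hashable, flow_key: bytes) -> Queue:
        # Hash the flow into the shared pool.
        q = self.queues[zlib.crc32(flow_key) % len(self.queues)]
        if q.owner is not None and q.owner != owner:
            # Another station already owns this queue: count the collision
            # and use this owner's own fallback queue instead.
            self.collisions += 1
            return self.fallback.setdefault(owner, Queue(owner=owner, shared=False))
        q.owner = owner
        return q

    def enqueue(self, owner: Hashable, flow_key: bytes, packet: bytes) -> Queue:
        q = self.classify(owner, flow_key)
        q.packets.append(packet)
        return q

    def dequeue(self, q: Queue) -> Optional[bytes]:
        packet = q.packets.popleft() if q.packets else None
        if q.shared and not q.packets:
            q.owner = None  # an empty shared queue goes back to the pool
        return packet
```

The point of the design is that no per-station state has to be created
or torn down as stations come and go; a queue just records who is
using it at the moment.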

Another unexplored idea is dynamically clamping the used and advertised
(in the beacon) txop size when under higher contention. I certainly
get better latency with a 2-3ms txop, but I never got around to
publishing those results in a coherent form. It also increases the
opportunities for an effective MU-MIMO burst.
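
A back-of-the-envelope sketch of why a smaller txop helps under
contention, assuming an idealised round-robin in which every active
station takes exactly one full txop per round (no backoff, retries or
collisions). The function names, the 1ms floor and the 4ms ceiling are
assumptions for illustration, not values from any driver or standard.

```python
def worst_case_wait_ms(active_stations: int, txop_ms: float) -> float:
    """Longest wait between two turns of one station in the idealised model."""
    if active_stations < 1:
        raise ValueError("need at least one active station")
    return (active_stations - 1) * txop_ms


def clamp_txop_ms(
    active_stations: int,
    budget_ms: float,
    min_txop_ms: float = 1.0,  # assumed floor
    max_txop_ms: float = 4.0,  # assumed ceiling
) -> float:
    """Pick a txop so that one round fits the latency budget, within limits."""
    if active_stations <= 1:
        return max_txop_ms  # no contention, use the largest txop
    fair_share = budget_ms / (active_stations - 1)
    return max(min_txop_ms, min(max_txop_ms, fair_share))
```

With 10 busy stations, a 4ms txop gives a worst-case wait of 36ms in
this model, while a 2.5ms txop gives 22.5ms.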

This to me is way better than explicitly choosing access classes.

My take on things for wifi 6 was that firmware needed to expose a per
station abstraction, and we needed to go back to the fq_codel instance
per station idea.

>
>     D) "cobalt" is proving out better in several respects than pure
>     codel, and folding in some of that makes sense, except I don't
>     know which things are the most valuable considering wifi's other
>     problems
>     
>
> Reading paper now. Thanks for the pointer.

I tend to think that fq_codel is "good enough" in most
circumstances. The edge cases that cake handles better are a matter of a
few percentage points, vs the orders of magnitude that we get with
fq_codel alone vs a FIFO, and my focus of late has been on making things
that eat less cpu or are better offloadable, rather than things that
network better. Others differ.


>     E) I'd like to dynamically increase the quantum size as a function
>     of load or number of flows.
>     
>
>     I'd really like benchmarks of the proprietary versions coming out.
>     Qualcomm has their own fq_codelish thing baked into their firmware
>     now... I have no idea what broadcom is doing... fq-pie?
>     
>
> I've started looking at benchmarking proprietary drivers with emphasis
> on queueing performance. If you have any tips,

I've been after eero in particular to publish results.

> or if you would like to
> co-author a paper (I'm working on a PhD), I am very interested.

I have been without a voice since toke graduated, so yes.

>
>     The librerouter is now available. I'd like to try that.
>     
>     Recently I benchmarked red rock cafe in mountain view, which had
>     the best bufferbloat and rrul score of any cybercafe I'd ever
>     tried - they have a mojo networks AP, which arista bought a while
>     back. It was lovely.... I have no idea what they do, but whatever
>     it was it was *good*. I'm really happy to see bufferbloat getting
>     fixed everywhere, but really need to add quic to the benchmark
>     suite somehow in order to feel better about people not rewriting
>     tcp headers to do what they want.
>     
>     more importantly:
>     
>     Would really like to get cracking on a wifi 6 version. So far, all
>     the vendors are lying, there is no OFDMA support in anything we've
>     played with. There are some new outer limits there (1000+
>     devices), a need to do gang scheduling, and per-station firmware,
>     and I'm profoundly unimpressed with proprietary vendors' efforts
>     so far and wish they'd open up their firmware more so more of us
>     could take a crack at it....
>     
>
> I agree, there are some interesting problems arising there. Interested
> to follow the work if and when this happens. Any luck finding a
> company willing to work on open-source drivers for Wi-Fi 6?

Nope. Feel free to try harder! I keep thinking that with various parties
struggling so hard they might actually try to open things up...

>     I'd really like to get the intel (iwl) version, especially the
>     ax200 chips, ported over to the AQL + fq_codel interfaces, at
>     least. The first attempt went badly, last quarter. Needs eyeballs
>     and time... Would like to find some other wifi chip worth fixing -
>     raspi 4? Some android wifi chip? what?
>     Don't know how the ath11k effort is going...
>     
>     In mainline...
>     I'd like to get the wifi codel target on 5ghz down from 20ms (too
>     much) to 10ms (or, as I run it here, to 8ms) in mainline, or at
>     least openwrt, but that would require some benchmarking by
>     multiple folk, and I was waiting for the ath10k ATF code to go
>     upstream first. At least make it tunable.
>     
>
> Have done some testing myself and 10ms looks like the correct limit on
> 5GHz.

Yea! Put results somewhere... I kind of made a mistake in that I've
run my own patched kernels and openwrt instances for years now and
didn't really notice what hadn't got done until some testing at the last
battlemesh. Getting AQL for ath10k upstream is one piece of fallout from that.
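
To make the 20ms vs 10ms target discussion concrete, here is a
simplified sketch of the two pieces of CoDel that the target feeds
into: the "above target for a whole interval" test, and the
interval/sqrt(count) control law that spaces out later drops. It is an
illustration only and leaves out parts of the real algorithm (for
example the small-backlog exemption and the count carry-over between
dropping episodes). The 100ms interval is CoDel's usual default, used
here as an assumption; the sample data is made up.

```python
import math
from typing import Iterable, Optional, Tuple


def first_drop_time_ms(
    samples: Iterable[Tuple[float, float]],  # (time_ms, sojourn_ms) pairs
    target_ms: float,
    interval_ms: float = 100.0,
) -> Optional[float]:
    """Time of the first drop, or None if sojourn time never stays
    above target for a full interval."""
    first_above: Optional[float] = None
    for now, sojourn in samples:
        if sojourn < target_ms:
            first_above = None  # queue delay is acceptable again
        elif first_above is None:
            first_above = now + interval_ms  # start the clock
        elif now >= first_above:
            return now  # above target for a whole interval
    return None


def next_drop_time_ms(now_ms: float, interval_ms: float, count: int) -> float:
    """CoDel control law: drops get closer together as count grows."""
    return now_ms + interval_ms / math.sqrt(count)
```

With a steady 15ms of queueing delay, a 20ms target never triggers a
drop in this model, while a 10ms target starts dropping once one
interval has passed.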


>     Overall, reducing hw retries to sanity would be a nice thing to
>     attempt in the ath9k, at least. Although the ongoing SCE work
>     (gradual rate reduction) is interesting, I tend to think reducing
>     hardware retries (with increased loss) would have a more dramatic
>     effect on reducing wifi latencies.
>     Presently with the codel target of 20ms in both directions, I get
>     60-80ms tcp latencies (still better than most fiber!) over wifi
>     at 70mbits. What happens at 300+, no idea. Cynically, I think
>     much of the internet is essentially running at a max rwind or
>     swind rather than within the sawtooth.
>     
>
> Also interesting
>
>     Doing something more sane to rate limit multicast would be good
>     also. It was quite the long list in that google document; back in
>     the day we thought the wifi industry might decide to collaborate
>     in order to meet the 5G threat.
>     
>     > a Cake instance to the wifi interface as well, if you have a
>     > need to do so.
>     
>     It certainly is feasible to do that. I do that now on several
>     802.11ac devices that don't have the fq_codel for wifi hooks,
>     preferring to rate limit them well below capacity so as to ensure
>     consistent low latency. It's really neat to see people able to
>     play world of warcraft and other games over the wifi here. (I
>     started deploying ubnt's uap mesh products, reflashed with
>     openwrt, along portions of my wifi backbone. Looking forward to
>     the AQL backport for those, but I hope someone else does it)
>     
>
> Have this setup at home and it really does make a difference, even
> with just normal browsing. Has bigger impact than I would have
> guessed!

Normal browsing rocks on fq_codel derived solutions.

See fig 14 and fig 24 in this cablelabs study:

https://www-res.cablelabs.com/wp-content/uploads/2019/02/28094118/Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf

Why pie "won" to this day bothers me, as at the time it seemed feasible
to implement fq_codel decently on this class of devices. 

(and they weren't even benchmarking the final fq_codel version, but a
quite crippled sfq based one)

>
>     >
>     > - Jonathan Morton
>     > _______________________________________________
>     > Cake mailing list
>     > Cake at lists.bufferbloat.net
>     > https://lists.bufferbloat.net/listinfo/cake
>     
>     
>     
>     -- 
>     Make Music, Not War
>     
>     Dave Täht
>     CTO, TekLibre, LLC
>     http://www.teklibre.com
>     Tel: 1-831-435-0729

