From: Dave Taht
To: Bjørn Ivar Teigen
Cc: Dave Taht, Cake List, Make-Wifi-fast
Subject: Re: [Cake] Cake in mac80211
Date: Wed, 05 Feb 2020 08:06:13 -0800
Message-ID: <87v9oluih6.fsf@taht.net>
References: <07250850-5FAF-4AB7-9551-0B26D648AF3D@gmail.com>
In-Reply-To: ("Bjørn Ivar Teigen"'s message of "Wed, 5 Feb 2020 12:53:14 +0100")
List-Id: Cake - FQ_codel the next generation

Bjørn Ivar Teigen writes:

> Thanks for the feedback!
>
> Some comments and questions added inline.
>
> On Tue, 4 Feb 2020 at 18:07, Dave Taht wrote:
>
> > On Tue, Feb 4, 2020 at 7:25 AM Jonathan Morton wrote:
> > >
> > > > On 4 Feb, 2020, at 5:20 pm, Bjørn Ivar Teigen wrote:
> > > >
> > > > Are there any plans, work or just comments on the idea of
> > > > implementing cake in mac80211 as was done with fq_codel?
> > >
> > > To consider doing that, there'd have to be a concrete benefit to
> > > doing so.
> >
> > Research is research! :) Everything is worth trying!
> > There's got to be some better ideas out there, and we have a long
> > list of things we could have done to keep improving wifi had
> > funding not run out.
> >
> > We barely scratched the surface of this list.
> >
> > https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
> >
> > > Most of Cake's most useful features, beyond what fq_codel
> > > already supports, are actually implied or even done better by the
> > > WiFi environment and the mac80211 layer adaptation (particularly
> > > airtime fairness).
> >
> > In my opinion(s)
> >
> > A) I think ack-filtering will help somewhat on 802.11n, but it's
> > not worth the added cpu cost on an AP and I'd prefer hosts reduce
> > their ack load in the tcp stack (IMHO, others may differ, it's
> > worth trying)
> > B) The underlying wifi scheduler essentially does per host fq
> > better than cake can (because it's layer 2 vs layer 3), as per
> > jonathan's comment above
> > C) Instead of using an 8-way set associative hash and 1024 queues,
> > fq_codel for wifi uses 4096 with a disambiguation pointer for
> > collisions. Seems good enough.
>
> Didn't catch that before. Are the extra queues there because of the
> different access categories on Wi-Fi? Seems like that would mean most
> of them are not in use considering how little traffic is marked with
> DSCP.

I wasn't counting those. There's one set of 4k queues per access class.

While I agree that access classes are rarely used, and am of the
opinion that they shouldn't actually be used on an n or ac AP as better
scheduling of the BE class suffices, 802.11e is useful on well behaved
clients for a few things.

The number of queues was kind of picked as a function of the absolute
maximum number of stations wireless-n can take and a swag.
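To make the "disambiguation pointer" from point C a bit more concrete, here is a minimal Python sketch of one way such a scheme could work. It is an illustration under stated assumptions, not the mac80211 code: the names `SharedQueuePool`, `FlowQueue`, `owner` and `fallback` are hypothetical, CRC32 stands in for the real flow hash, and sending a colliding flow to a per-station fallback queue is an assumed policy.

```python
import zlib
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional

NUM_QUEUES = 4096  # the figure quoted in the thread


@dataclass(eq=False)
class FlowQueue:
    shared: bool                 # True for pool queues, False for fallbacks
    owner: Optional[str] = None  # hypothetical "disambiguation pointer"
    packets: Deque[bytes] = field(default_factory=deque)


class SharedQueuePool:
    """One shared pool of flow queues used by every station."""

    def __init__(self, num_queues: int = NUM_QUEUES) -> None:
        self.queues: List[FlowQueue] = [
            FlowQueue(shared=True) for _ in range(num_queues)
        ]
        self.fallback: Dict[str, FlowQueue] = {}  # one per station, on demand
        self.collisions = 0

    def classify(self, station: str, flow_key: str) -> FlowQueue:
        # CRC32 is a stand-in for the real flow hash (assumption).
        idx = zlib.crc32(flow_key.encode()) % len(self.queues)
        queue = self.queues[idx]
        if queue.owner is not None and queue.owner != station:
            # Another station already owns this slot: count the collision
            # and divert to this station's own fallback queue.
            self.collisions += 1
            return self.fallback.setdefault(
                station, FlowQueue(shared=False, owner=station)
            )
        queue.owner = station
        return queue

    def enqueue(self, station: str, flow_key: str, packet: bytes) -> FlowQueue:
        queue = self.classify(station, flow_key)
        queue.packets.append(packet)
        return queue

    def dequeue(self, queue: FlowQueue) -> bytes:
        packet = queue.packets.popleft()
        if queue.shared and not queue.packets:
            queue.owner = None  # an empty pool queue is free for any station
        return packet
```

Building the pool with a tiny `num_queues` (say 1) forces collisions and makes the fallback path easy to watch. The point of the design is that per-station state shrinks to a single ownership field per queue, so nothing has to be created or torn down as stations come and go.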
Our original conception was that we'd have one fairly small fq_codel
instance per station, dynamically arriving and departing as the station
did, which proved really problematic to implement. We were stuck on how
stateful it was and all kinds of locking issues for nearly 2 years
before Michal Kazior came up with the simpler "lots of queues +
disambiguation pointer" idea.

Another unexplored idea is dynamically clamping the used and advertised
(in the beacon) txop size when under higher contention. I certainly get
better latency with a 2-3ms txop, but I never got around to publishing
those results in a coherent form. A shorter txop also increases the
opportunities for an effective mu-mimo burst. This to me is way better
than explicitly choosing access classes.

My take on things for wifi 6 was that firmware needed to expose a per
station abstraction, and we needed to go back to the fq_codel instance
per station idea.

> > D) "cobalt" is proving out better in several respects than pure
> > codel, and folding in some of that makes sense, except I don't know
> > which things are the most valuable considering wifi's other problems
>
> Reading paper now. Thanks for the pointer.

I tend to think that fq_codel is "good enough" in most circumstances.
The edge cases that cake handles better are a matter of a few
percentage points, vs the orders of magnitude that we get with fq_codel
alone vs a FIFO, and my focus of late has been to make things that ate
less cpu or were better offloadable rather than things that networked
better. Others differ.

> > E) I'd like to dynamically increase the quantum size as a function
> > of load or number of flows.
> >
> > I'd really like benchmarks of the proprietary versions coming out.
> > Qualcomm has their own fq_codelish thing baked into their firmware
> > now... I have no idea what broadcom is doing... fq-pie?
>
> I've started looking at benchmarking proprietary drivers with emphasis
> on queueing performance.
> If you have any tips,

I've been after eero in particular to publish results.

> or if you would like to
> co-author a paper (I'm working on a PhD), I am very interested.

I have been without a voice since toke graduated, so yes.

> > The librerouter is now available. I'd like to try that.
> >
> > Recently I benchmarked red rock cafe in mountain view, which had
> > the best bufferbloat and rrul score of any cybercafe I'd ever tried
> > - they have a mojo networks AP, which arista bought a while back.
> > It was lovely.... I have no idea what they do, but whatever it was
> > it was *good*. I'm really happy to see bufferbloat getting fixes
> > everywhere, but really need to add quic to the benchmark suite
> > somehow in order to feel better about people not rewriting tcp
> > headers to do what they want.
> >
> > more importantly:
> >
> > Would really like to get cracking on a wifi 6 version. So far, all
> > the vendors are lying, there is no OFDMA support in anything we've
> > played with. There are some new outer limits there (1000+ devices),
> > a need to do gang scheduling, and per-station firmware, and I'm
> > profoundly unimpressed with proprietary vendors' efforts so far and
> > wish they'd open up their firmware more so more of us could take a
> > crack at it....
>
> I agree, there are some interesting problems arising there. Interested
> to follow the work if and when this happens. Any luck finding a
> company willing to work on open-source drivers for Wi-Fi 6?

Nope. Feel free to try harder! I keep thinking that with various
parties struggling so hard they might actually try to open things up...

> > I'd really like to get the intel (iwl) version, especially the
> > ax200 chips, ported over to the AQL + fq_codel interfaces, at
> > least. The first attempt went badly, last quarter. Needs eyeballs
> > and time...
> > Would like to find some other wifi chip worth fixing - raspi 4?
> > Some android wifi chip? what?
> > Don't know how the ath11k effort is going...
> >
> > In mainline...
> > I'd like to get the wifi codel target on 5ghz down from 20ms (too
> > much) to 10ms, (or as I run it here to 8ms) in mainline, or at
> > least openwrt, but that would require some benchmarking by multiple
> > folk, and I was waiting for the ath10k ATF code to go upstream
> > first. At least make it tunable.
>
> Have done some testing myself and 10ms looks like the correct limit on
> 5GHz.

Yea! Put results somewhere...

I've kind of made a mistake in that I ran my own patched kernels and
openwrt instances for years now and didn't really notice what hadn't
got done until some testing at the last battlemesh. Getting AQL for
ath10k upstream is one piece of fallout from that.

> > Overall, reducing hw retries to sanity would be a nice thing to
> > attempt in the ath9k, at least. Although the ongoing SCE work
> > (gradual rate reduction) is interesting, I tend to think reducing
> > hardware retries (with increased loss) would have a more dramatic
> > effect on reducing wifi latencies.
> > Presently with the codel target of 20ms in both directions, I get
> > 60-80ms tcp latencies (still better than most fiber!) over wifi
> > with a 20ms target at 70mbits. What happens at 300+, no idea.
> > cynically I think much of the internet is essentially running at a
> > max rwind or swind rather than within the sawtooth.
>
> Also interesting
>
> > doing something more sane to rate limit multicast would be good
> > also. It was quite the long list in that google document, back in
> > the day we thought the wifi industry might decide to collaborate in
> > order to meet the 5G threat.
> >
> > > a Cake instance to the wifi interface as well, if you have a
> > > need to do so.
> >
> > It certainly is feasible to do that.
> > I do that now on several 802.11ac devices that don't have the
> > fq_codel for wifi hooks, preferring to rate limit them well below
> > capacity so as to ensure consistent low latency. It's really neat to
> > see people able to play world of warcraft and other games over the
> > wifi here. (started deploying ubnt's uap mesh products, reflashed
> > with openwrt, along portions of my wifi backbone. Looking forward
> > to the AQL backport for those, but I hope someone else does it)
>
> Have this setup at home and it really does make a difference, even
> with just normal browsing. Has bigger impact than I would have
> guessed!

Normal browsing rocks on fq_codel derived solutions. See fig 14 and fig
24 on this cablelabs study

https://www-res.cablelabs.com/wp-content/uploads/2019/02/28094118/Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf

Why pie "won" to this day bothers me, as at the time it seemed feasible
to implement fq_codel decently on this class of devices.

(and they weren't even benchmarking the final fq_codel version, but a
quite crippled sfq based one)

> > > - Jonathan Morton
> > > _______________________________________________
> > > Cake mailing list
> > > Cake@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/cake
> >
> > --
> > Make Music, Not War
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-435-0729