[Make-wifi-fast] Fwd: [RFC/RFT] mac80211: implement fq_codel for software queuing

Thu Mar 3 22:23:36 EST 2016

This appears to have bounced.

---------- Forwarded message ----------
From: Michal Kazior <michal.kazior at tieto.com>
Date: Mon, Feb 29, 2016 at 4:35 AM
Subject: Re: [RFC/RFT] mac80211: implement fq_codel for software queuing
To: Dave Taht <dave.taht at gmail.com>
Cc: make-wifi-fast at lists.bufferbloat.net,
"codel at lists.bufferbloat.net" <codel at lists.bufferbloat.net>

On 26 February 2016 at 23:20, Dave Taht <dave.taht at gmail.com> wrote:
> Dear Michal:
>
> Can you take a picture of your setup?

I guess a diagram must do for now:

                      .---------[G0]
                      |
        [L0]        [AP]        [L4]

        [L1]        [L2]        [L3]

 * diagram skips testbed control plane
 * G0 is traffic generator
   - connected via ethernet to AP (AP bridges traffic)
   - running 3.16
 * AP runs QCA99X0 (4 antenna) non-encrypted network
 * L0..L4 are laptops
   - running 4.3.0
 * each has up to 3 QCA9337 (1 antenna) chips
 * total 10 clients
   - all connected to the AP
 * some of the chips are mounted on Express Card adapters
 * some of the chips are mounted inside with mPCI-E -> M.2 adapters
   - antennas are put rogue-style through gaps in the laptops' exteriors
 * each client antenna is placed ~0.5m away from the AP
 * client antennae are not uniformly placed with regard to each other
   (limited by pigtail lengths)
 * each client chip is run inside a QEMU VM with PCI-passthrough

Let me know if you want to know more details.

> Our intent is to continue to improve the flent test suite to be able
> to generate repeatable tests, track relevant wifi behaviors and pull
> relevant data back, graphed over time (of test) and time (over test
> runs). A problem with udp flood tests is that tcp traffic is always
> bidirectional (data vs acks), so a naive thought would be, that yes,
> you should get half the bandwidth you get with a udp flood test.

I don't see why you'd be doomed to get only half the bandwidth because
of that? Sure, Wi-Fi is half-duplex but transmit time for ACKs is a
lot smaller than transmit time for the data.
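
A rough sketch of that arithmetic, counting bytes only. The packet sizes (1500-byte data segments, 52-byte ACKs with TCP timestamps, one ACK per two segments) are assumptions for illustration, not measurements from this testbed, and the function name is made up. Real airtime also includes preamble, contention and block-ack overhead, so treat the result as a lower bound that aggregation helps approach.

```shell
# ack_share_permille DATA_BYTES ACK_BYTES SEGMENTS_PER_ACK
# Prints how many ACK bytes flow back per 1000 data bytes sent
# (integer arithmetic, rounded down). All inputs are assumed values.
ack_share_permille() {
    echo $(( 1000 * $2 / ($1 * $3) ))
}

ack_share_permille 1500 52 2   # prints 17, i.e. under 2% of the data volume
```

Even with an ACK for every segment the share stays in the low single-digit percent range, which is far from halving the available bandwidth.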

Moreover you also have stuff like satellite links which have
inherently long latency/pipes and large Bandwidth-Delay Product. You
could think of Wi-Fi in a similar fashion (albeit it's more dynamic so
it's not directly comparable). I'm not saying it should be the default
though.
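
To put numbers on the comparison, here is a minimal bandwidth-delay product calculation. The link figures (10 Mbit/s at 600 ms for a satellite-like path, 300 Mbit/s at 10 ms for a Wi-Fi-like path) are illustrative assumptions, and `bdp_bytes` is a hypothetical helper.

```shell
# bdp_bytes RATE_MBIT RTT_MS
# Bandwidth-delay product in bytes: (rate in bit/s / 8) * (RTT in s),
# which simplifies to RATE_MBIT * RTT_MS * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

bdp_bytes 10 600    # assumed satellite-like link: prints 750000
bdp_bytes 300 10    # assumed Wi-Fi-like link:     prints 375000
```

Under these assumptions both paths need several hundred kilobytes in flight to stay busy, even though one gets there through delay and the other through rate.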

> But in the age of aggregation that is not correct.
>
> It is my hope for us to join you on testing/evaluating the various
> bits, but with so many patches (wonderfully, but suddenly) flying
> around in loose formation ( can we start a lowlatency-wifi kernel tree
> somewhere? - oy, there are so many other moving parts!), that's going
> to take a bit. While we have some ath10k gear, the biggest testbeds
> (karlstad, san francisco, yurtlab) are all ath9k based.
>
> Some things you could do for us whilst we try to catch up.
>
> Take packet captures! - there are plenty of tcp experts on the codel list.
>
> For single station tests: run a repeatable test series: rrul, rrul_be,
> tcp_upload, tcp_download. Provide those flent.gz files.
> rrul exercises 3 of the 4 802.11e queues on most systems.
> rrul_be one queue
>
> Example:
> #!/bin/sh
> T="some meaningful title like fq_codel_target_30ms_10meters-crazynewpatch-1"
> S=some.netperf.server.nearby
> F="flent -x -l 60 "
> TESTS="rrul rrul_be tcp_upload tcp_download"
>
> for i in $TESTS
> do
> $F -H $S -t "$T" $i
> done
>
> flent-gui *.gz
>
> If you are running tests overnight (recommended, wifi data is noisy so
> are office environments), iterate on the $T-test number...
>
> You can also track remote queue lengths and stats with other flent
> options.
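
One way to handle the "iterate on the test number" part for overnight runs is sketched below. It reuses only the flent options from the quoted script (`-x`, `-l`, `-H`, `-t`) and passes the test name as the positional argument. The `run_series` function and the `FLENT` override are hypothetical additions, and the server name is the same placeholder as above.

```shell
#!/bin/sh
# Placeholder server name from the quoted script; replace with a real netperf host.
S=some.netperf.server.nearby
TESTS="rrul rrul_be tcp_upload tcp_download"

# run_series TITLE RUN_NUMBER
# Runs every test once and tags the title with the run number, so that
# repeated runs stay distinguishable in the resulting data files.
# Set FLENT="echo flent" for a dry run that only prints the commands.
run_series() {
    for i in $TESTS
    do
        ${FLENT:-flent} -x -l 60 -H "$S" -t "$1-$2" "$i"
    done
}
```

For example, `n=1; while [ "$n" -le 20 ]; do run_series fq_codel_target_30ms "$n"; n=$((n + 1)); done` would produce twenty tagged series.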

> My assumption however is that you are almost entirely
> bypassing the qdisc queue now(?) and injecting things into a queue
> that cannot be seen by userland?

Yes. The patch uses IFF_NO_QUEUE (it would be dev->tx_queue_len=0 in
pre-4.2 I think) so there are no qdiscs. Hence there's also no tx
queue wake/stop logic performed.

Userspace shouldn't see much of a difference because sockets still
keep track of sk_buffs (and hence block on write/sendmsg when socket
buffer limit is reached). Since the fq_drop() looks for elephant flows
and head-drops them even if txq_limit is reached, it should
work fine even without subqueue_stop/wake.
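
The victim selection can be illustrated with a toy model. This is not the patch's code: it assumes "elephant" means the flow with the most queued bytes (the criterion the fq_codel qdisc uses when it is over limit), and the flow names, byte counts and `fattest_flow` helper are all made up.

```shell
# fattest_flow NAME:BYTES...
# Prints the name of the flow with the largest backlog. In the model, that
# flow is the one whose oldest packet (the head) would be dropped.
# On a tie the first flow listed wins.
fattest_flow() {
    best_name=""
    best_bytes=-1
    for pair in "$@"
    do
        name=${pair%%:*}      # text before the first colon
        bytes=${pair##*:}     # text after the last colon
        if [ "$bytes" -gt "$best_bytes" ]; then
            best_bytes=$bytes
            best_name=$name
        fi
    done
    printf '%s\n' "$best_name"
}

fattest_flow voip:200 bulk:90000 web:3000   # prints bulk
```

Because the drop lands on the heaviest flow, sparse flows such as the `voip` one here keep their packets even while the queue is at its limit.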

> For playing with MU-mimo, the various rtt_fair tests in flent were our
> starting point, which test anywhere from 1 to 4 stations.  example
> testing 2 stations with two tcp streams.
>
> rtt_fair4be -H station1 -H station2 -H station1 -H station2
>
> The packet captures should be *fascinating* on that.
>
> Aircaps interesting also.
>
> Other variables to tweak:
>
> 0) Use the same driver on server and client. Then a reference driver.
> 1) Disable codel entirely or give it a really big target/interval
> (30ms, 300ms) to just look at the fq portion of the algorithm.
> 2) enabling ECN on the tcps on server and client will give you a clear
> idea as to when codel was kicking in vs packets being dropped
> elsewhere on the packet captures.

My current patch doesn't handle ECN.

> 3) One of my biggest ongoing concerns with adapting codel in wifi has
> been the impact of multicast on it - mdns-scan (along with any of the
> above tests), or some other heavy mcast program in the background
> (uftp is not bad). mu-mimo introduces new issues with sounding that I
> don't think anyone understands at any detail yet.

> Can wireshark or
> some other tool "see" a sounding?

Hmm.. NDP (null-data-packets) don't have any MAC payload to my
knowledge which makes it kind of pointless to even report to the host.
Even if it does it'd need some low-level RF data that is derived from
receiving such packets. Radiotap isn't sufficient for that, I'm sure.
Vendor radiotap could be used but I still don't know what info
could/should be exposed for TxBF sounding.

Otherwise there are also sounding management frames for
starting/controlling sounding (if I'm remembering right) so you should
be - at least - able to see that sounding is being *attempted*.

> 4) Distance and rate control. MCS4 was my basic rate for transmits
> from stations for the longest time as that appeared to be the median
> rate I'd got in various coffee shops... while I realize you have to
> achieve peak throughput under ideal conditions, it's achieving good
> overall performance in more abusive conditions...
>
> ... and ...
>
> 5) come to battlemesh with what you got.

Sounds tempting but I can't promise anything.

Anyway, thanks for all the tips! I'll play with flent and get back to
you later. I've been preempted by other things for the time being..

Michał