Lets make wifi fast again!
 help / color / mirror / Atom feed
From: Michal Kazior <michal.kazior@tieto.com>
To: Avery Pennarun <apenwarr@gmail.com>
Cc: Tim Shepard <shep@alum.mit.edu>,
	linux-wireless <linux-wireless@vger.kernel.org>,
	 Johannes Berg <johannes@sipsolutions.net>,
	Dave Taht <dave.taht@gmail.com>,
	 make-wifi-fast@lists.bufferbloat.net,
	codel@lists.bufferbloat.net
Subject: Re: [Make-wifi-fast] [PATCH 1/2] mac80211: implement fair queuing per txq
Date: Mon, 11 Apr 2016 09:25:44 +0200	[thread overview]
Message-ID: <CA+BoTQ=H+jh_wiSyo4sSWvdx0cpTiqxT5=HXqBEsRJP8R6NSfQ@mail.gmail.com> (raw)
In-Reply-To: <CAHqTa-0yEFYAFpSKyko1ET33GsN==_D8TpBeq6AtwLj4XMTW0A@mail.gmail.com>

On 8 April 2016 at 06:37, Avery Pennarun <apenwarr@gmail.com> wrote:
> On Fri, Mar 25, 2016 at 5:27 AM, Michal Kazior <michal.kazior@tieto.com> wrote:
>> mac80211's software queues were designed to work
>> very closely with device tx queues. They are
>> required to make use of 802.11 packet aggregation
>> easily and efficiently.
>>
>> However the logic imposed a per-AC queue limit.
>> With the limit too small mac80211 wasn't be able
>> to guarantee fairness across TIDs nor stations
>> because single burst to a slow station could
>> monopolize queues and reach per-AC limit
>> preventing traffic from other stations being
>> queued into mac80211's software queues. Having the
>> limit too large would make smart qdiscs, e.g.
>> fq_codel, a lot less efficient as they are
>> designed on the premise that they are very close
>> to the actualy device tx queues.
>
> As usual, I'm way behind on everything, but I have been testing this
> patch series in the background (no clear results to report yet) and
> wanted to comment at a very high level.  I think you are actually
> doing several stages of improvements all at once here:
>
> [0. Baseline: one big queue going into the driver]
> 1. Switch ath10k to mac80211 per-station queues.
> 2. Change per-station queues to use NO_QUEUE qdisc and *not* ever stop
> the kernel netdev queue (since there no longer is one).
> 3. Actively manage per-station queues with fq_codel.
> 4. DQL-like control system for managing hardware queues.

The #4 is not really part of this patch series (unless you add the
older RFCv2 with txop scheduling or ath10k RFC which tests global DQL
to the mix).

Otherwise you're correct. I would argue #1 doesn't really matter
because without it all other changes are no-op though.


> Just to clarify what I mean by #2, if I understand correctly, before
> this patch, the driver+mac80211 keeps track of the total number of
> packets in all the mac80211 queues.

It keeps track of number of packets per AC..


>  When the total exceeds a fixed
> amount (or when one of the per-station queues gets full?) mac80211
> tells the kernel to stop sending in new packets, so they sit around in
> the qdisc instead.

..and stops netdev subqueues (0..3) when this per-AC limit is reached.

That is all assuming you have a wake_tx_queue() based driver.
Otherwise mac80211 doesn't really do anything special. It stops
subqueues when driver asks it to (when internal driver/device queues
become full).


> The problem with this behaviour is we probably
> have a lot of packets for one station, and not many packets for other
> stations, even if the netdev qdisc has plenty of packets still waiting
> for those other stations.  When you then go to drain the mac80211
> queues in a round-robin fashion, only the fullest queue (corresponding
> to the busiest stream to the fastest station) can get optimal results.
> The driver can then either send out from the fullest queue (unfair but
> fast) or round robin using the non-full queues (fair but non-optimal
> speed).

Roughly, yes.


> Upon implementing #2, we would essentially never tell the kernel to
> stop sending packets; instead, it just always forwards them to
> mac80211, which needs to learn how to drop them instead of providing
> backpressure.  This moves the entire qdisc functionality into
> mac80211, hence the use of NO_QUEUE.

Correct.


> It's then obvious that if you just did the obvious thing (tail drop),
> you'll end up with high latency, so you added fq_codel to the mix.

Yes. Also the mere taildrop doesn't split flows so you end up bulky
TCP traffic delaying/starving, e.g. your ICMP (and other short/bursty
traffic) which might be considered a regression if someone used
fq_codel before.


> However, as people on this thread have noticed, fq_codel is
> complicated.  I'd like to be able to evaluate the performance impact
> of each of the above steps separately.  In particular, my theory is
> that if we implement #2 with just a simple FIFO queue per station,
> then if we have two stations competing (one slow, one fast), and
> dequeue aggregates using round robin, then we should get all of:
>
> a) Full airtime utilization and max-length aggregates
> and
> b) High latency only on busy stations, but near-zero latency on idle
> stations (because of round-robin servicing of the per-station queues).

Roughly, yes.


> Using just a tail drop implementation, it should be very easy for me
> to test that (a) and (b) are true.  It should also be strictly equal
> (one station) or better (multiple stations) than using mac80211 soft
> queues with the pfifo_fast qdisc.  If that isn't what happens, then
> we'll know something went wrong with that part of the code, and we can
> debug that before moving on to a wifi-aware fq_codel.
>
> So my request: do you mind splitting your patch into two patches, one
> that implements just NO_QUEUE and per-station fifo tail drop, with a
> second patch that converts the tail drop to fq_codel?

Hmm.. Actually it might be worth splitting it up into 3 patches:
 1. noqueue (taildrop)
 2. fq (headdrop)
 3. codel

In which case (2) should introduce additional intra-station flow
fairness (compared to (1)) and (3) should make tcp behave better,
right?

Moreover I'm thinking of making "fq" part reusable the same way
"codel" is (or attempts to be today at least), i.e. header trickery
for easy gluing. This should make it easier to fix non-mac80211
drivers as well with possibly little effort in the future. Thoughts?


> Another advantage of the split is that we could then test NO_QUEUE +
> tail_drop + DQL.  Again, that should be strictly better than the
> NO_QUEUE + tail_drop + fixed_driver_queue.  Then it might be easier to
> debug the (much more fiddly) fq_codel on top.
>
> Thoughts?

It does make sense to split it up.


Michał

  parent reply	other threads:[~2016-04-11  7:25 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-16 10:17 [Make-wifi-fast] [RFCv2 0/3] mac80211: implement fq codel Michal Kazior
2016-03-16 10:17 ` [Make-wifi-fast] [RFCv2 1/3] mac80211: implement fq_codel for software queuing Michal Kazior
2016-03-22  1:35   ` David Lang
2016-03-22  6:51     ` Michal Kazior
2016-03-16 10:17 ` [Make-wifi-fast] [RFCv2 2/3] ath10k: report per-station tx/rate rates to mac80211 Michal Kazior
2016-03-24  7:19   ` Mohammed Shafi Shajakhan
2016-03-24  7:49     ` Michal Kazior
2016-03-24 12:23       ` Mohammed Shafi Shajakhan
2016-03-24 12:31         ` Michal Kazior
2016-03-16 10:17 ` [Make-wifi-fast] [RFCv2 3/3] ath10k: use ieee80211_tx_schedule() Michal Kazior
2016-03-16 10:26 ` [Make-wifi-fast] [RFCv2 0/3] mac80211: implement fq codel Michal Kazior
2016-03-16 15:37   ` Dave Taht
2016-03-16 18:36     ` Dave Taht
2016-03-16 18:55       ` Bob Copeland
2016-03-16 19:48         ` Jasmine Strong
2016-03-17  8:55           ` Michal Kazior
2016-03-17 11:12             ` Bob Copeland
2016-03-17 17:00             ` Dave Taht
2016-03-17 17:24               ` [Make-wifi-fast] [Codel] " Rick Jones
2016-03-21 11:57               ` [Make-wifi-fast] " Michal Kazior
2016-03-17  9:43       ` Michal Kazior
2016-03-17  9:03     ` Michal Kazior
2016-03-25  9:27 ` [Make-wifi-fast] [PATCH 0/2] mac80211: implement fq_codel Michal Kazior
2016-03-25  9:27   ` [Make-wifi-fast] [PATCH 1/2] mac80211: implement fair queuing per txq Michal Kazior
     [not found]     ` <CAHqTa-0yEFYAFpSKyko1ET33GsN==_D8TpBeq6AtwLj4XMTW0A@mail.gmail.com>
2016-04-11  7:25       ` Michal Kazior [this message]
2016-03-25  9:27   ` [Make-wifi-fast] [PATCH 2/2] mac80211: expose some txq/fq internals and knobs via debugfs Michal Kazior
2016-03-31 10:28   ` [Make-wifi-fast] [PATCHv2 0/2] mac80211: implement fq_codel Michal Kazior
2016-03-31 10:28     ` [Make-wifi-fast] [PATCHv2 1/2] mac80211: implement fair queuing per txq Michal Kazior
2016-04-05 13:57       ` Johannes Berg
2016-04-05 14:32         ` Dave Taht
     [not found]           ` <1459927276.17504.6.camel@sipsolutions.net>
2016-04-06 17:39             ` Dave Taht
2016-04-07  8:53               ` Johannes Berg
2016-04-06  5:35         ` Michal Kazior
2016-04-06  6:03           ` Jonathan Morton
2016-04-06  7:16             ` Michal Kazior
2016-04-06 16:46               ` Jonathan Morton
2016-04-06  7:19           ` Johannes Berg
2016-03-31 10:28     ` [Make-wifi-fast] [PATCHv2 2/2] mac80211: expose some txq/fq internals and knobs via debugfs Michal Kazior
2016-04-14 12:18     ` [Make-wifi-fast] [PATCHv3 0/5] mac80211: implement fq_codel Michal Kazior
2016-04-14 12:18       ` [Make-wifi-fast] [PATCHv3 1/5] mac80211: skip netdev queue control with software queuing Michal Kazior
2016-04-16 22:21         ` Johannes Berg
2016-04-18  5:39           ` Michal Kazior
2016-04-14 12:18       ` [Make-wifi-fast] [PATCHv3 2/5] mac80211: implement fair queueing per txq Michal Kazior
2016-04-16 22:23         ` Johannes Berg
2016-04-16 22:25           ` Johannes Berg
2016-04-18  5:16             ` Michal Kazior
2016-04-18 12:31               ` [Make-wifi-fast] [Codel] " Eric Dumazet
2016-04-18 13:36                 ` Michal Kazior
2016-04-19  9:10                   ` Johannes Berg
2016-04-14 12:18       ` [Make-wifi-fast] [PATCHv3 3/5] mac80211: add debug knobs for fair queuing Michal Kazior
2016-04-14 12:18       ` [Make-wifi-fast] [PATCHv3 4/5] mac80211: implement codel on fair queuing flows Michal Kazior
2016-04-16 22:29         ` Johannes Berg
2016-04-18  5:31           ` Michal Kazior
2016-04-18 12:38             ` Michal Kazior
2016-04-19  9:06               ` Johannes Berg
2016-04-19  9:31                 ` Michal Kazior
2016-04-19  9:57                   ` Johannes Berg
2016-04-14 12:18       ` [Make-wifi-fast] [PATCHv3 5/5] mac80211: add debug knobs for codel Michal Kazior
2016-05-05 11:00       ` [Make-wifi-fast] [PATCHv4 0/5] mac80211: implement fq_codel Michal Kazior
2016-05-05 11:00         ` [Make-wifi-fast] [PATCHv4 1/5] mac80211: skip netdev queue control with software queuing Michal Kazior
2016-05-09 12:28           ` Michal Kazior
2016-05-05 11:00         ` [Make-wifi-fast] [PATCHv4 2/5] mac80211: implement fair queueing per txq Michal Kazior
2016-05-05 11:00         ` [Make-wifi-fast] [PATCHv4 3/5] mac80211: add debug knobs for fair queuing Michal Kazior
2016-06-09  9:48           ` Johannes Berg
2016-05-05 11:00         ` [Make-wifi-fast] [PATCHv4 4/5] mac80211: implement codel on fair queuing flows Michal Kazior
2016-05-05 15:30           ` Dave Taht
2016-05-05 11:00         ` [Make-wifi-fast] [PATCHv4 5/5] mac80211: add debug knobs for codel Michal Kazior
2016-05-05 15:21           ` Dave Taht
2016-05-06  5:27             ` Michal Kazior
2016-05-06  5:51               ` Dave Taht
2016-05-06  6:33                 ` Michal Kazior
2016-05-06  7:23                   ` Dave Taht
2016-05-19  8:37         ` [Make-wifi-fast] [PATCHv5 0/5] mac80211: implement fq_codel Michal Kazior
2016-05-19  8:37           ` [Make-wifi-fast] [PATCHv5 1/5] mac80211: skip netdev queue control with software queuing Michal Kazior
2016-05-19  8:37           ` [Make-wifi-fast] [PATCHv5 2/5] mac80211: implement fair queueing per txq Michal Kazior
2016-05-19  8:37           ` [Make-wifi-fast] [PATCHv5 3/5] mac80211: add debug knobs for fair queuing Michal Kazior
2016-05-19  8:37           ` [Make-wifi-fast] [PATCHv5 4/5] mac80211: implement codel on fair queuing flows Michal Kazior
2016-05-19  8:37           ` [Make-wifi-fast] [PATCHv5 5/5] mac80211: add debug knobs for codel Michal Kazior
2016-06-09  9:47             ` Johannes Berg
2016-05-31 12:12           ` [Make-wifi-fast] [PATCHv5 0/5] mac80211: implement fq_codel Toke Høiland-Jørgensen
2016-05-31 12:31             ` Michal Kazior
2016-06-09  9:49           ` Johannes Berg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.bufferbloat.net/postorius/lists/make-wifi-fast.lists.bufferbloat.net/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+BoTQ=H+jh_wiSyo4sSWvdx0cpTiqxT5=HXqBEsRJP8R6NSfQ@mail.gmail.com' \
    --to=michal.kazior@tieto.com \
    --cc=apenwarr@gmail.com \
    --cc=codel@lists.bufferbloat.net \
    --cc=dave.taht@gmail.com \
    --cc=johannes@sipsolutions.net \
    --cc=linux-wireless@vger.kernel.org \
    --cc=make-wifi-fast@lists.bufferbloat.net \
    --cc=shep@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox