[Make-wifi-fast] [RFC] mac80211: Add airtime fairness accounting

Mon Oct 9 08:38:37 EDT 2017

Johannes Berg <johannes at sipsolutions.net> writes:

> On Mon, 2017-10-09 at 11:42 +0200, Toke Høiland-Jørgensen wrote:
>
>> Well, the padding and spacing between frames is at most 11 bytes
>> (4-byte delimiter, 4-byte FCS and 3-byte padding), which is ~0.7% of a
>> full-sized frame. I'm not too worried about errors on that scale,
>> TBH.
>
> I'm not sure - this really should take the whole frame exchange
> sequence into consideration, since the "dead" IFS time and the ACK etc.
> is also airtime consumed for that station, even if there's no actual
> transmission going on.
>
> If you factor that in, the overhead reduction with aggregation is
> considerable! With an 80 MHz 2x2 MCS 9 (866Mbps PHY rate) A-MPDU
> containing 64 packets, you can reach >650Mbps (with protection),
> without A-MPDU you can reach only about 45Mbps I think.
>
> You'd think that a 1500 byte frame takes 12ms for the 1Mbps client,
> and ~14µs for the above mentioned VHT rate.
>
> In reality, however, the overhead for both is comparable in absolute
> numbers, it's >200µs.
>
> If you don't take any of this overhead into account at all, then you'll
> vastly over-allocate time for clients sending small (non-aggregated)
> frames, because for those - even with slow rates - the overhead will
> dominate.

Right, but most of these are constant values that are straightforward
to add as long as you know how the frame was received, no? Maybe not as
a general function in mac80211, but the driver should be able to
perform a reasonable computation in the absence of information from the
hardware.
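
To make the arithmetic concrete, here is a minimal sketch (a Python toy
model, not mac80211 or driver code). The flat 200 µs per-exchange
overhead is an assumption loosely based on the ">200µs" figure quoted
above, and the function names and parameters are hypothetical. Note
that 1500 bytes is 12000 bits, so the payload alone takes 12 ms at
1 Mbps and about 14 µs at 866 Mbps.

```python
# Toy airtime estimate: fixed per-exchange overhead plus payload time.
# ASSUMPTION: a flat 200 us overhead (preamble, IFS, ACK, ...) per frame
# exchange, loosely based on the ">200us" figure quoted in the thread.
ASSUMED_OVERHEAD_US = 200.0


def airtime_us(payload_bytes, rate_mbps, n_frames=1):
    """Estimated airtime in microseconds for one frame exchange.

    n_frames > 1 models an aggregate: the overhead is paid once for all
    subframes (per-subframe delimiters and padding are ignored here).
    """
    # 1 Mbps is one bit per microsecond, so bits / Mbps gives microseconds.
    payload_us = n_frames * payload_bytes * 8 / rate_mbps
    return ASSUMED_OVERHEAD_US + payload_us


def overhead_fraction(payload_bytes, rate_mbps, n_frames=1):
    """Share of the estimated airtime that is fixed overhead."""
    return ASSUMED_OVERHEAD_US / airtime_us(payload_bytes, rate_mbps, n_frames)
```

Under these assumed numbers the overhead is under 2% of the exchange
for a single 1500-byte frame at 1 Mbps, but over 90% for the same frame
at 866 Mbps, which is why ignoring it skews the accounting for
non-aggregated frames.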

What does iwl put into the status.tx_time field of ieee80211_tx_info,
BTW? That was the only driver I could find that used the field, and it
looks like it just writes something it gets from the hardware into it.
So does that value include overhead? And what about retransmissions?

> I don't know if there's an easy answer. Perhaps not accounting for the
> overhead but assuming that clients won't be stupid and will actually
> do aggregation when they ramp up their rates is reasonable in most
> scenarios, but I'm afraid that we'll find interop issues - we found
> for example that if you enable U-APSD lots of devices won't do
> aggregation any more ...

What do you mean by "interop" here, exactly? Just that stations doing
weird things will see reduced performance?

>> > > Ideally, I would prefer the scheduling to be "two-pass": First,
>> > > decide which physical station to send to, then decide which TID
>> > > on that station to service. 
>> > 
>> > Yeah, that would make more sense.
>> > 
>> > > But because everything is done at the TID/TXQ level, that is not
>> > > quite trivial to achieve I think...
>> > 
>> > Well you can group the TXQs, I guess. They all have a STA pointer,
>> > so you could put a one- or two-bit "schedule color" field into each
>> > station and if you find a TXQ with the same station color you just
>> > skip it or something like that?
>> 
>> Couldn't we add something like a get_next_txq(phy) function to
>> mac80211 that the drivers can call to get the queue to pull packets
>> from? That way, responsibility for scheduling both stations and QoS
>> levels falls to mac80211, which makes it possible to do clever
>> scheduling stuff without having to re-implement it in every driver.
>> Also, this function could handle all the special TXQs for PS and
>> non-data frames that you were talking about in your other email?
>> 
>> Unless there's some reason I'm missing that the driver really needs
>> to schedule the TXQs, I think this would make a lot of sense?
>
> I have no idea, that's something you'll have to ask Felix I guess. I'd
> think it should work, but the scheduling might have other constraints
> like wanting to fill certain A-MPDU buffers, or getting a specific
> channel (though that's already decided once you pick the station).

I'm pretty sure it will work for ath9k. That just picks a TXQ and keeps
pulling packets until it has filled an aggregate. That would still be
possible if mac80211 picks the TXQ instead of the driver itself. So I
was asking more generally, but if you don't see anything obvious that
would prevent me from doing this, I guess I'll go and try it out :)
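
Roughly what I have in mind, as a Python toy model rather than kernel
code. Everything here is hypothetical: the Station, Txq and Phy
classes, the plain round-robin station pick (a stand-in for whatever
airtime-fair policy ends up being used), and the strict-priority choice
among a station's TIDs. Only the TID-to-AC table follows the usual
802.11 user-priority mapping.

```python
from collections import deque

# Usual 802.11 mapping from TID (user priority) to access category.
TID_TO_AC = {0: "BE", 1: "BK", 2: "BK", 3: "BE",
             4: "VI", 5: "VI", 6: "VO", 7: "VO"}
AC_PRIORITY = ["VO", "VI", "BE", "BK"]  # highest priority first


class Txq:
    def __init__(self, sta, tid):
        self.sta, self.tid = sta, tid
        self.ac = TID_TO_AC[tid]
        self.packets = deque()


class Station:
    def __init__(self, name):
        self.name = name
        self.txqs = {tid: Txq(self, tid) for tid in TID_TO_AC}

    def backlogged(self):
        return any(q.packets for q in self.txqs.values())


class Phy:
    def __init__(self, stations):
        self.rotation = deque(stations)  # round-robin order of stations


def get_next_txq(phy):
    """Two-pass pick: first a station, then one of its TXQs."""
    # Pass 1: next backlogged station in round-robin order.
    for _ in range(len(phy.rotation)):
        sta = phy.rotation[0]
        phy.rotation.rotate(-1)  # move it to the back for next time
        if sta.backlogged():
            break
    else:
        return None  # nothing queued anywhere
    # Pass 2: highest-priority backlogged TXQ on that station.
    candidates = [q for q in sta.txqs.values() if q.packets]
    return min(candidates, key=lambda q: (AC_PRIORITY.index(q.ac), q.tid))
```

The driver would still pull packets from the returned TXQ itself until
its aggregate is full; the only change is who chooses the TXQ.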

> It might also be hard to combine that - if you have space on your VI
> queue, how do you then pick the queue? We can't really go *all* the
> way and do scheduling *entirely* in software, getting rid of per-AC
> queues, since the per-AC queues also work to assign the EDCA
> parameters etc.

We'll need to keep putting packets into different hardware queues, sure.
But deciding which one can be done at the last instant (i.e., for ath9k,
ask mac80211 for a TXQ, look at which AC that TXQ belongs to, and start
putting packets into that hardware queue).
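
As a sketch of that last step (again a Python toy model, not ath9k
code): the hardware queue is looked up from the TXQ's TID only at the
moment packets are pulled. The function name, the dict of hardware
queues and the max_aggregate limit of 32 are all made up for
illustration.

```python
from collections import deque

# Usual 802.11 mapping from TID (user priority) to access category.
TID_TO_AC = {0: "BE", 1: "BK", 2: "BK", 3: "BE",
             4: "VI", 5: "VI", 6: "VO", 7: "VO"}


def fill_hw_queue(tid, txq_packets, hw_queues, max_aggregate=32):
    """Pull up to max_aggregate packets from one TXQ into its AC's HW queue.

    tid         -- TID of the TXQ handed out by the scheduler
    txq_packets -- deque of packets queued on that TXQ
    hw_queues   -- dict mapping AC name to a list standing in for a HW queue
    """
    ac = TID_TO_AC[tid]  # the AC is decided only now, at dequeue time
    pulled = 0
    while txq_packets and pulled < max_aggregate:
        hw_queues[ac].append(txq_packets.popleft())
        pulled += 1
    return ac, pulled
```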

One of the things I would also like to try is to sometimes promote or
demote packets between AC levels. E.g., if a station has one VO packet
and a bunch of BE packets queued, it may sometimes be more efficient to
just put the VO packet at the beginning of a BE aggregate. I still need
to figure out for which values of 'sometimes' this is a good idea, but
I'd like to at least be able to support this sort of shenanigans, which
I believe the approach I proposed above would allow.

> Also, in iwlwifi we actually have a HW queue per TID to facilitate
> aggregation, though we could just let mac80211 pick the next TXQ to
> serve and skip in the unlikely case that the HW queue for that is
> already full (which really shouldn't happen).

Yeah, there may be a need for the driver to be able to express some
constraints on the queues it can accept currently; maybe a bitmap of
eligible TID numbers, or just a way of saying "can't use this TXQ,
please give me another". But it may also be that it's enough for the
driver to just give up and try again later if it can't use the TXQ it is
assigned...
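
The two variants could look something like this (a Python toy model
with hypothetical names; neither function is an existing mac80211
interface). Here "candidates" stands for the TIDs in whatever order the
scheduler would otherwise hand them out.

```python
def next_tid(candidates, eligible_mask=0xFF):
    """Variant 1: the driver passes a bitmap of TIDs it can accept now.

    Bit n of eligible_mask set means TID n is acceptable.
    """
    for tid in candidates:
        if eligible_mask & (1 << tid):
            return tid
    return None


def next_tid_with_reject(candidates, driver_can_use):
    """Variant 2: offer TIDs one by one until the driver accepts one.

    driver_can_use is a callback standing in for "can't use this TXQ,
    please give me another".
    """
    for tid in candidates:
        if driver_can_use(tid):
            return tid
    return None  # the driver gives up and tries again later
```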

-Toke