[Make-wifi-fast] [PATCH v2] mac80211: Move crypto IV generation to after TXQ dequeue.
toke at toke.dk
Fri Aug 26 04:54:47 EDT 2016
Johannes Berg <johannes at sipsolutions.net> writes:
> On Mon, 2016-08-22 at 16:47 +0200, Toke Høiland-Jørgensen wrote:
>> I suppose that could be a way to do it (i.e. have
>> ieee80211_tx_dequeue call all the TX hooks etc), but am not sure
>> whether there would be problems doing all this work in the loop
>> that's building aggregates (which is what would happen for ath9k at
> I don't know, but it seems that it's worth trying.
>> An alternative could be to split the process up in two: An "early"
>> and "late" stage, where the early stage does everything that is not
>> sensitive to reordering and the occasional drop, and the late stage
>> is everything that is. Then the queueing step could happen in-between
>> the two stages, and the non-queueing path could just call both stages
>> at once. In effect, this would just make the current work-arounds be
>> more explicit in the structure, rather than being marked as
> I'm not sure that works the way you think it does.
> What you did works for fast-xmit, but *only* because that doesn't do
> software crypto. If, for some reason, the TXQ stuff combines with
> software crypto, which doesn't seem impossible (ath9k even has a module
> parameter, iirc), then you have no way for this to work.
Yeah, I realised that when I started reviewing the slow path (sorry for
not realising that straight away). The v3 takes the "split handlers"
approach for this reason. That saved having to deal with fragmentation
on TXQ dequeue, and it means that some of the processing can be done
before queueing (such as GSO splitting; having packets be as small as
possible before applying FQ to them is a good thing if we want to
realise the full potential).
It seems there are still some bugs to work out with that patch, but I'd
be grateful if you could glance at it and comment on whether you think
this is a viable way forward (provided we can work out all the bugs, of
>> > Now, it's unlikely to be that simple - fragmentation, for example,
>> > might mess this up.
>> > Overall though, I'm definitely wondering if it should be this way,
>> > since all the special cases just add complexity.
>> I agree that the work-arounds are iffy, but I do also think it's
>> important to keep in mind that we are improving latency by orders of
>> magnitude here. A few special cases is worth it to achieve that, IMO.
>> And then iterating towards a design that don't need them, of course
> I don't really agree, I'm not going to treat this unlike any other
> feature, which gets merged when it's ready for that.
> Right now, your code here obviously isn't, since it doesn't even
> address the cases that ath9k could run into, so either ath9k shouldn't
> use this mac80211 feature, or the mac80211 feature needs to be fixed
> before ath9k can use it.
Yeah, I agree now that I've looked at it some more :)
> I have no problems with documenting that the TXQ stuff can only be
> used with full hardware crypto, but then we should add some checks and
> warnings in mac80211 to ensure that, i.e. not allow software keys when
> TXQ stuff is used, nor allow keys with mac80211 PN assignment, etc.
I'd much rather fix things so it works in all cases. My patch to ath9k
to use this stuff completely removes the old TX path, and things like
the airtime fairness scheduler needs the intermediate queues to work.
More information about the Make-wifi-fast