* Re: [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues [not found] <E1b8sbm-00083e-00@www.xplot.org> @ 2016-06-04 14:32 ` Toke Høiland-Jørgensen [not found] ` <E1b9kkn-0000h7-00@www.xplot.org> 0 siblings, 1 reply; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2016-06-04 14:32 UTC (permalink / raw) To: Tim Shepard; +Cc: linux-wireless, make-wifi-fast, ath9k-devel Tim Shepard <shep@alum.mit.edu> writes: >> Reworked to not require the global variable renaming in ath9k. >> >> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> > > Huh. I wonder why you did that? Is it really better to call the > ieee80211_txq a swq and call the ath9k hardware queue a txq? I > thought doing the renaming made for more readable and much more > maintainable code (where searching for text strings produced much > cleaner results when trying to track down all references). Well, the immediate reason was that at the time I just spent two weeks figuring out how ath9k worked and what the different concepts were, and suddenly they start to mean something different. The txq->hwq renaming would probably make sense in itself, but suddenly the same variable (txq) meant something different, which was quite confusing. So I found it was easier to change your patch to not require the renaming. A secondary reason was to keep the patch delta as small as possible. I'm definitely not the right person to make that call, but I found the global renaming somewhat intrusive. As for the first point, I think it would be easier if you did not call the mac80211 queues 'txq' but something else ('swq' like I did, perhaps; I also considered 'imq' for intermediate queue). This will at least make it clear at a glance that it is something different than 'txq' was previously. > I am grateful to learn that someone has read my patched version of > ath9k at least enough to do this rework. Any comments on the actual > work? Well, it seems to work well enough (haven't noticed the pending_frames issue, but on the other hand I haven't been looking very hard). However, it does fell like somewhat of a temporary solution. Having another intermediate queue in the driver (tid->i_q) and having a side-effect in ath_tid_has_buffered() seems a bit iffy to me. Wouldn't the 'right' way to do it be to have ath_tid_has_buffered() ask the mac layer if it has any frames queued, and then have ath_tx_get_tid_subframe() basically translate straight to a call to ieee80211_tx_dequeue() (taking into account the retry queue)? Not sure if that covers all code paths, but the current approach seems like it has one too many layers of queues? Haven't given the above a lot of thought, but since you asked... :) -Toke ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <E1b9kkn-0000h7-00@www.xplot.org>]
* Re: [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues [not found] ` <E1b9kkn-0000h7-00@www.xplot.org> @ 2016-06-06 4:26 ` Dave Taht [not found] ` <E1b9nQv-0001fc-00@www.xplot.org> 2016-06-06 17:28 ` Toke Høiland-Jørgensen 1 sibling, 1 reply; 6+ messages in thread From: Dave Taht @ 2016-06-06 4:26 UTC (permalink / raw) To: Tim Shepard Cc: Toke Høiland-Jørgensen, linux-wireless, make-wifi-fast, ath9k-devel On Sun, Jun 5, 2016 at 7:59 PM, Tim Shepard <shep@alum.mit.edu> wrote: > > >> > Huh. I wonder why you did that? Is it really better to call the >> > ieee80211_txq a swq and call the ath9k hardware queue a txq? I >> > thought doing the renaming made for more readable and much more >> > maintainable code (where searching for text strings produced much >> > cleaner results when trying to track down all references). >> >> Well, the immediate reason was that at the time I just spent two weeks >> figuring out how ath9k worked and what the different concepts were, and >> suddenly they start to mean something different. The txq->hwq renaming >> would probably make sense in itself, but suddenly the same variable >> (txq) meant something different, which was quite confusing. So I found >> it was easier to change your patch to not require the renaming. >> >> A secondary reason was to keep the patch delta as small as possible. I'm >> definitely not the right person to make that call, but I found the >> global renaming somewhat intrusive. >> > > OK, makes sense. But you left it called txq in mac80211. So someone > reading the mac80211 code and ath9k at the same time (to understand > the whole stack) winds up with txq meaning different things, including > in places in ath9k where it has to reference a field in a structure > defined by mac80211's header files to point to one of its txqs. > >> As for the first point, I think it would be easier if you did not call >> the mac80211 queues 'txq' but something else ('swq' like I did, perhaps; >> I also considered 'imq' for intermediate queue). This will at least make >> it clear at a glance that it is something different than 'txq' was >> previously. > > I'm not the one who called them txq. That was Felix's patch. > He also wrote the mt76 driver and in that driver txq consistently > means the mac80211 intermediate queue and not a driver internal queue > or a hardware queue. So I was trying to converge ath9k to be more > like the one and only example I had of a driver that uses the > intermediate queue. Well, finding a common and minimally invasive (re)naming scheme would be good. > >> > I am grateful to learn that someone has read my patched version of >> > ath9k at least enough to do this rework. Any comments on the actual >> > work? >> >> Well, it seems to work well enough (haven't noticed the pending_frames >> issue, but on the other hand I haven't been looking very hard). However, >> it does fell like somewhat of a temporary solution. Having another >> intermediate queue in the driver (tid->i_q) and having a side-effect in >> ath_tid_has_buffered() seems a bit iffy to me. Wouldn't the 'right' way >> to do it be to have ath_tid_has_buffered() ask the mac layer if it has >> any frames queued, and then have ath_tx_get_tid_subframe() basically >> translate straight to a call to ieee80211_tx_dequeue() (taking into >> account the retry queue)? Not sure if that covers all code paths, but >> the current approach seems like it has one too many layers of queues? >> >> Haven't given the above a lot of thought, but since you asked... :) > > Thanks. I've gotten no other feedback that suggests anyone else has > read the code. So I very much appreciate it. I not only read it but tested it extensively against the ath9k stock, ath10k stock, ath10k -michal's fqmac 3.5 patches, and your ath9k patches... After losing my temper at the impact of channel scans... ( https://plus.google.com/u/0/107942175615993706558/posts/WA915Pt4SRN ), I got told how to get rid of them for testing, and started redoing the work when I got back from vacation. The reductions in latency without a decline in throughput were wonderful vs stock in all cases. The latency reductions at lower rates and speed of response to rate changes was not what I'd hoped for, but given all the other pieces flying in loose formation (like the need for an airtime fair scheduler also), I'm still pretty happy. http://blog.cerowrt.org/flent/nmfixed/20mbit_not_entirely_convinced_weve_won_completely.svg http://blog.cerowrt.org/flent/nmfixed/160_vs_80_half_delay.svg (More plots and data in that directory and in the channel_scan directory.) At the moment however I plan to pause until cleaner patch sets arrive.... > Yes, I didn't like that side effect either, but (at least for my first > attempt) wanted to leave the old transmit path working, in particular > because I couldn't see how to get all the chanctx stuff working right > without leaving the old driver-internal queues in place. (I'm not > even sure what I would have to do to test the driver with > CONFIG_ATH9K_CHANNEL_CONTEXT turned on. I did test that it compiles > without error when CONFIG_ATH9K_CHANNEL_CONTEXT=y before I released my > v2 patch, and I tried to understand it enough to avoid breaking > anything. (My v1 patch broke compilation when > CONFIG_ATH9K_CHANNEL_CONTEXT=y as some robot notified me after I > posted it.) > > I looked for a way to ask mac80211 if there are any packets left in > the intermediate queue without dequeueing a packet and I failed to > find such an interface. qdisc->peek like function? A bitmap that says these stations have data outstanding? > I guess I should have submitted a patch to > add that to mac80211. Or maybe there's a way to refactor the > aggregation code in ath9k so that it can cleanly work with the > existing ieee80211_tx_dequeue() without having to add another > interface to mac80211, but I didn't figure it out. It would have been > a bigger patch to ath9k, and require more thinking when reading the > patch to see if it works (assuming pre-patch ath9k works). > > My current approach was an attempt to get it working. Yes, the code > looks like it has another layer of queues, but at run time there's > only one packet at a time on the internal queue that packets wind up > on from the mac80211 intermediate queue and by putting it on that > queue it can interface to the same code that pulls from that queue or > the previously existing queues (which the chanctx code seems to > require remain around without major surgery). > > I thought about first patching ath9k to remove all the chanctx stuff > just to have a simpler starting point to work on, but I wouldn't > expect that patch to ever be accepted upstream (assuming that somebody > somewhere is getting useful stuff from the chanctx stuff). > > Again, thanks for looking at this patch. > > I'm now working on figuring out the right way to fix my bug in > the <= v2 patch where I fail to respect the hwq_max_pending > values on the new transmit path, and I plan to post a v3 patch > when I get that done. > > -Tim Shepard > shep@alum.mit.edu > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <E1b9nQv-0001fc-00@www.xplot.org>]
* Re: [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues [not found] ` <E1b9nQv-0001fc-00@www.xplot.org> @ 2016-06-06 16:55 ` Dave Taht 2016-06-06 17:26 ` Dave Taht 0 siblings, 1 reply; 6+ messages in thread From: Dave Taht @ 2016-06-06 16:55 UTC (permalink / raw) To: Tim Shepard Cc: Toke Høiland-Jørgensen, linux-wireless, make-wifi-fast, ath9k-devel On Sun, Jun 5, 2016 at 10:50 PM, Tim Shepard <shep@alum.mit.edu> wrote: > > >> > Thanks. I've gotten no other feedback that suggests anyone else has >> > read the code. So I very much appreciate it. >> >> I not only read it but tested it extensively against the ath9k stock, >> ath10k stock, ath10k -michal's fqmac 3.5 patches, and your ath9k >> patches... > > My patch to ath9k shouldn't have any effect on any of the ath10k > stuff. It only cuts ath9k over to use the new intermediate queues. Like the Sith, there are always at least two wifi chips on a link. One behaving significantly differently (like flow queuing and codeling and striving for airtime fairness) does affect the behavior of the other. While I did do some ath9k to ath9k testing, I was mostly testing the "whole enchalada". There are many, many moving parts left to swap out. And the patchset I was testing included your fq_codel port for ath9k but it was based on codel5.h. Michal's latest stuff reworked mainline codel to be generically usable, and it *is* a different variant of the algorithm. The airtime-fair stuff does not as yet include fq_codel on top. It's a MPU, and still puzzling as to why it did not get closer to perfect fairness. I also fiddled with the idea of dynamically altering the beacon's txop sizes. A really short best effort txop (94) was "Interesting". I need to take apart how more beacons are constructed from non-linux vendors. The whole enchalada is tasting pretty good thus far. > I should point out again that Avery's observation that michal's > mac80211 flow queueing patches and mac80211 codel stuff aren't needed > to the improvement between competing client stations. To *2* competing client stations, this is somewhat true at present. There are at least 5 other (pending) factors to factor in. * Toke's preliminary airtime-fair patches already showed a net gain in bandwidth for the higher rate competing station. The "performance anomaly", identified way back in 2003, is still with us without also striving for airtime fairness. * In order to hold latencies low with > 2 stations active, I advocate gradually using shorter txops, which will improve behavior for stations doing interactive tasks, and offering service sooner to infrequently seen stations. * In order to do gang scheduling for mu-mimo, we need to have several 2-3ms sized queues outstanding for the devices that can be mu-mimoed. * Getting minstrel-blues in there to sample more dynamically would be nice * Reducing retries and retransmits would be nice when already congested. I'd also like to try QosNoack. > All that's > needed is to use the new mac80211 per-station per-tid intermediate > queues and also set the IFF_NO_QUEUE bit. It's a heckofastart. > > For ath9k, Felix's mac80211 interemediate queues patch (already in > mainline Linux over a year ago), my patch to ath9k, and just > Michal's first patch alone "[PATCHv3 1/5] mac80211: skip netdev > queue control with software queuing" should (and seems to in > testing I've done so far) get all the latency improvement there is > to be had when the competing traffic is to a different client > station. I think it can be shaved from the presently observed 7-12ms minimum at 160mbit by another 2-3x. Also the codel implementation is not as yet as tightly controlling queue size as I'd like - I haven't pushed it as hard at sub 20mbit performance as I'd like (coping with being enraged at networkmanager's over-use of channel scans was what I was at, last) but I'm showing queue depth of well over 25ms at that rate right now on the ath9k patch. > > > >> After losing my temper at the impact of channel scans... >> >> ( https://plus.google.com/u/0/107942175615993706558/posts/WA915Pt4SRN >> ), I got told how to get rid of them for testing, and started redoing >> the work when I got back from vacation. > > Heh... I've never seen that problem. But I'm not running > network-manager on any machine in my testbed. I tend to think it is important to measure what happens in the real world, to clearly identify what the real world problems actually are. I let everybody else, with attenuators, and emulators, do whatever they want. Me, I'm perpetually setting up real-world labs like the yurtlab and sflab and the isclab to try to see what happens in practice. I now plan to put some work in on making channel scans less nasty and to also look into what it takes to implement 802.11k, 802.11r and 802.11v. or at the very least, nag people to get it more right. https://bugzilla.gnome.org/show_bug.cgi?id=766482 NetworkManager's author suggests here that https://blogs.gnome.org/dcbw/2016/05/16/networkmanager-and-wifi-scans/ "You can also advocate that your favorite WiFi interface add support for NetworkManager’s RequestScan()" > >> > I looked for a way to ask mac80211 if there are any packets left in >> > the intermediate queue without dequeueing a packet and I failed to >> > find such an interface. >> >> qdisc->peek like function? A bitmap that says these stations have data >> outstanding? >> > > I'm not sure what you mean here. Are you asking if that was what I > was looking for in the existing mac80211 code? If so, yes, that is > what I was looking for. Or are you suggesting (or agreeing) that > indeed such an interface should be added to mac80211? Yes, both. > Though I don't see how it could be a bitmap... stations don't have > any obvious stable associated index into such a bitmap. Mores the pity. I liked the idea of adapting something more qfq-like than fq_codel like in parts of this. Allocating a fixed number of bits (usually max_stations is set to like 128), and using array indexes for various station related params (including rate stats), seems to be simpler than chasing pointers around. > > The real problem might be the way all the code in ath9k/xmit.c is > designed where it needs ath_tid_has_buffered() in so many places. Yes, the ath9k is mroe than a bit hairy. :/ > Look at the mt76 driver, and see how it also maintains a small > (probably only at most one packet ever) queue "retry_q" which > sometimes does hold a packet which will get sent before it will > pull another packet from ieee80211_tx_dequeue. But the rest of the > driver is structured so that a NULL return from mt76_txq_dequeue() > is all it needs. That's probably the sort of structure we ought to > have in ath9k, but that's going to be a more invasive patch to > ath9k (and I haven't figured out how to do it yet). I did finally acquire a mt76 board to play with and I agree that that that driver appears the most ideal to be prototyping with. > > I think we can get there incrementally though. I hope the patch for > ath9k that I've got now (and the one I hope to have soon that fixes > the problem of not making use of hwq_max_pending() ) might be useful > intermediate steps (that work, and provide improvements) then we can > clean up ath9k removing mechanisms that are no longer needed (breaking > the old transmit path, and somehow handling the chanctx stuff). Groovy. > > > -Tim Shepard > shep@alum.mit.edu -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues 2016-06-06 16:55 ` Dave Taht @ 2016-06-06 17:26 ` Dave Taht 0 siblings, 0 replies; 6+ messages in thread From: Dave Taht @ 2016-06-06 17:26 UTC (permalink / raw) To: Tim Shepard Cc: Toke Høiland-Jørgensen, linux-wireless, make-wifi-fast, ath9k-devel For the record, michal's lastest patchset for the ath10k is here: https://github.com/kazikcz/linux/tree/fqmac-v5 which includes the reworked codel.h support (which also landed in net-next as of april 22) (no, haven't tried it yet, I'm only a day back from vacation) ... but it would pay to leverage rate control more, for the ath9k, and I'd like folk to agree on a standardized set of statistics in a std location that can be polled for all implementations (ath9k, ath10k, mt76) I am also reviewing this: http://info.iet.unipi.it/~luigi/papers/20160511-mysched-preprint.pdf as we have a chance to innovate and use less locking with all this stuff happening at the mac80211 layer. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues [not found] ` <E1b9kkn-0000h7-00@www.xplot.org> 2016-06-06 4:26 ` Dave Taht @ 2016-06-06 17:28 ` Toke Høiland-Jørgensen 1 sibling, 0 replies; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2016-06-06 17:28 UTC (permalink / raw) To: Tim Shepard; +Cc: linux-wireless, make-wifi-fast, ath9k-devel Tim Shepard <shep@alum.mit.edu> writes: > OK, makes sense. But you left it called txq in mac80211. So someone > reading the mac80211 code and ath9k at the same time (to understand > the whole stack) winds up with txq meaning different things, including > in places in ath9k where it has to reference a field in a structure > defined by mac80211's header files to point to one of its txqs. Yeah, realise there's a problem here. I was coming at it from the ath9k side, so to speak, and having the variable txq suddenly be something else was confusing. > >> As for the first point, I think it would be easier if you did not call >> the mac80211 queues 'txq' but something else ('swq' like I did, perhaps; >> I also considered 'imq' for intermediate queue). This will at least make >> it clear at a glance that it is something different than 'txq' was >> previously. > > I'm not the one who called them txq. I was referring to the variable names, not the struct name. Having 'txq->foo' suddenly be something else is what ticked me off. > That was Felix's patch. He also wrote the mt76 driver and in that > driver txq consistently means the mac80211 intermediate queue and not > a driver internal queue or a hardware queue. So I was trying to > converge ath9k to be more like the one and only example I had of a > driver that uses the intermediate queue. Well, that is certainly an argument for going ahead with the renaming. Hmm, would posting the renaming as a proper proposed patch explaining the reasoning be a way to get some feedback on whether this would be a tractable way forward (and also of keeping the delta against mainline lower)? > Thanks. I've gotten no other feedback that suggests anyone else has > read the code. So I very much appreciate it. You're very welcome; your patch made it possible for me to get straight to hacking on the parts that I wanted to, without having to first figure out how to best interface with the mac80211 stuff. Very helpful :) > Yes, I didn't like that side effect either, but (at least for my first > attempt) wanted to leave the old transmit path working, in particular > because I couldn't see how to get all the chanctx stuff working right > without leaving the old driver-internal queues in place. (I'm not even > sure what I would have to do to test the driver with > CONFIG_ATH9K_CHANNEL_CONTEXT turned on. I did test that it compiles > without error when CONFIG_ATH9K_CHANNEL_CONTEXT=y before I released my > v2 patch, and I tried to understand it enough to avoid breaking > anything. (My v1 patch broke compilation when > CONFIG_ATH9K_CHANNEL_CONTEXT=y as some robot notified me after I > posted it.) Right. Well I have been cheerfully ignoring the chanctx stuff so far. What does that do? > I looked for a way to ask mac80211 if there are any packets left in > the intermediate queue without dequeueing a packet and I failed to > find such an interface. I guess I should have submitted a patch to add > that to mac80211. Or maybe there's a way to refactor the aggregation > code in ath9k so that it can cleanly work with the existing > ieee80211_tx_dequeue() without having to add another interface to > mac80211, but I didn't figure it out. It would have been a bigger > patch to ath9k, and require more thinking when reading the patch to > see if it works (assuming pre-patch ath9k works). Well code actually building the aggregates is not the problem, I think. That just loops on while(ath_tid_has_buffered()) which is pretty straight forward to turn into a dequeue + checking for NULL. It's the aggr_{sleep,wakup,resume} that's conditioning txq wakeup on ath_tid_has_buffered() that's the main problem I guess. What would happen if that was changed to just unconditionally schedule the tid on wakeup? But maybe having an ieee80211_tx_peek() function would be useful in any case? It seems that there's an internal data structure in mac80211 (struct txq_info) that keeps a byte count for that queue, so exporting that would be trivial... > I'm now working on figuring out the right way to fix my bug in the <= > v2 patch where I fail to respect the hwq_max_pending values on the new > transmit path, and I plan to post a v3 patch when I get that done. What's the symptom of this? As I said I haven't noticed anything, but it might be worth looking out for. -Toke ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Make-wifi-fast] [RFC/RFT 0/5] Adding an airtime fairness scheduler to ath9k @ 2016-06-03 16:51 Toke Høiland-Jørgensen 2016-06-03 16:51 ` [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues Toke Høiland-Jørgensen 0 siblings, 1 reply; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2016-06-03 16:51 UTC (permalink / raw) To: linux-wireless, make-wifi-fast, ath9k-devel This patch set is my first pass at attempting to add an airtime fairness enforcing scheduler to ath9k (for use in access point mode). For some background on this approach and its possible benefits see this paper: https://www.usenix.org/event/usenix04/tech/general/full_papers/tan/tan.pdf This is still very preliminary work, but I thought I'd post it for comment anyway. Tentatively, it's looking like it gives some improvement, but is by no means perfect in what it is achieving. See https://blog.tohojo.dk/2016/06/a-first-stab-at-an-airtime-fairness-scheduler-for-ath9k.html for some graphs of the patch set. I'm seeing roughly a 15% improvement in aggregate throughput in the presence of a slow station. To work, the scheduler needs per-station queueing, so this patch series also includes Michal's patch to not have qdiscs on the interface when wake_tx_queue is present, and Tim's patch to add wake_tx_queue to ath9k. The latter I reworked to not require the global variable renaming, since I found that change to be too confusing after having spent time getting used to the ath9k names for things (I put my own name on this commit and added a second Signed-off-by; apologies if this is not the right way to do things for this kind of rework). The scheduler patch set itself is in three parts: The first patch simply adds airtime accounting and exports it to a per-station debugfs file (which the latest git version of Flent knows how to read). The second patch adds the airtime scheduler using only the TX airtime information. Finally the third patch adds in RX airtime to the deficit used by the scheduler. This last part is not always a win (see the blog post linked above), but I haven't tested it extensively. So feel free to test with or without the last patch. There is some code duplication between the debugfs code and the scheduler which I haven't removed yet to be able to easily switch between having the scheduler use the airtime values and simply accounting them for debugging purposes. Comments and test results very welcome! :) -Toke Michal Kazior (1): mac80211: skip netdev queue control with software queuing Toke Høiland-Jørgensen (4): ath9k: use mac80211 intermediate software queues ath9k: Add airstame stats to per-station debugfs ath9k: Add a per-station airtime deficit scheduler ath9k: Count RX airtime in airtime deficit. drivers/net/wireless/ath/ath9k/ath9k.h | 27 +++- drivers/net/wireless/ath/ath9k/channel.c | 12 +- drivers/net/wireless/ath/ath9k/debug.h | 29 ++++ drivers/net/wireless/ath/ath9k/debug_sta.c | 144 ++++++++++++++++++- drivers/net/wireless/ath/ath9k/init.c | 1 + drivers/net/wireless/ath/ath9k/main.c | 7 +- drivers/net/wireless/ath/ath9k/recv.c | 50 +++++++ drivers/net/wireless/ath/ath9k/xmit.c | 222 ++++++++++++++++++++++++----- include/net/mac80211.h | 4 - net/mac80211/ieee80211_i.h | 2 +- net/mac80211/iface.c | 18 ++- net/mac80211/main.c | 3 - net/mac80211/sta_info.c | 2 +- net/mac80211/tx.c | 82 ++++++----- net/mac80211/util.c | 11 +- 15 files changed, 511 insertions(+), 103 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues 2016-06-03 16:51 [Make-wifi-fast] [RFC/RFT 0/5] Adding an airtime fairness scheduler to ath9k Toke Høiland-Jørgensen @ 2016-06-03 16:51 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2016-06-03 16:51 UTC (permalink / raw) To: linux-wireless, make-wifi-fast, ath9k-devel Cc: Toke Høiland-Jørgensen, Tim Shepard This patch leaves the code for ath9k's internal per-node per-tid queues in place and just modifies the driver to also pull from the new mac80211 intermediate software queues, and implements the .wake_tx_queue method, which will cause mac80211 to deliver packets to be sent via the new intermediate queue. Signed-off-by: Tim Shepard <shep@alum.mit.edu> Reworked to not require the global variable renaming in ath9k. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> --- drivers/net/wireless/ath/ath9k/ath9k.h | 16 +++- drivers/net/wireless/ath/ath9k/debug_sta.c | 7 +- drivers/net/wireless/ath/ath9k/init.c | 1 + drivers/net/wireless/ath/ath9k/main.c | 1 + drivers/net/wireless/ath/ath9k/xmit.c | 119 +++++++++++++++++++++++++---- 5 files changed, 125 insertions(+), 19 deletions(-) diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h index 93b3793..caeae10 100644 --- a/drivers/net/wireless/ath/ath9k/ath9k.h +++ b/drivers/net/wireless/ath/ath9k/ath9k.h @@ -145,8 +145,6 @@ int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd, #define BAW_WITHIN(_start, _bawsz, _seqno) \ ((((_seqno) - (_start)) & 4095) < (_bawsz)) -#define ATH_AN_2_TID(_an, _tidno) (&(_an)->tid[(_tidno)]) - #define IS_HT_RATE(rate) (rate & 0x80) #define IS_CCK_RATE(rate) ((rate >= 0x18) && (rate <= 0x1e)) #define IS_OFDM_RATE(rate) ((rate >= 0x8) && (rate <= 0xf)) @@ -232,8 +230,10 @@ struct ath_buf { struct ath_atx_tid { struct list_head list; + struct sk_buff_head i_q; struct sk_buff_head buf_q; struct sk_buff_head retry_q; + struct ieee80211_txq *swq; struct ath_node *an; struct ath_txq *txq; unsigned long tx_buf[BITS_TO_LONGS(ATH_TID_MAX_BUFS)]; @@ -247,13 +247,13 @@ struct ath_atx_tid { s8 bar_index; bool active; bool clear_ps_filter; + bool swq_nonempty; }; struct ath_node { struct ath_softc *sc; struct ieee80211_sta *sta; /* station struct we're part of */ struct ieee80211_vif *vif; /* interface with which we're associated */ - struct ath_atx_tid tid[IEEE80211_NUM_TIDS]; u16 maxampdu; u8 mpdudensity; @@ -271,6 +271,15 @@ struct ath_node { struct list_head list; }; +static inline +struct ath_atx_tid *ath_an_2_tid(struct ath_node *an, u8 tidno) +{ + struct ieee80211_sta *sta = an->sta; + struct ieee80211_vif *vif = an->vif; + struct ieee80211_txq *swq = sta ? sta->txq[tidno] : vif->txq; + return (struct ath_atx_tid *) swq->drv_priv; +} + struct ath_tx_control { struct ath_txq *txq; struct ath_node *an; @@ -585,6 +594,7 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw, u16 tids, int nframes, enum ieee80211_frame_release_type reason, bool more_data); +void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *swq); /********/ /* VIFs */ diff --git a/drivers/net/wireless/ath/ath9k/debug_sta.c b/drivers/net/wireless/ath/ath9k/debug_sta.c index b66cfa9..0e7f6b5 100644 --- a/drivers/net/wireless/ath/ath9k/debug_sta.c +++ b/drivers/net/wireless/ath/ath9k/debug_sta.c @@ -25,6 +25,7 @@ static ssize_t read_file_node_aggr(struct file *file, char __user *user_buf, { struct ath_node *an = file->private_data; struct ath_softc *sc = an->sc; + struct ieee80211_txq *swq; struct ath_atx_tid *tid; struct ath_txq *txq; u32 len = 0, size = 4096; @@ -52,8 +53,10 @@ static ssize_t read_file_node_aggr(struct file *file, char __user *user_buf, "TID", "SEQ_START", "SEQ_NEXT", "BAW_SIZE", "BAW_HEAD", "BAW_TAIL", "BAR_IDX", "SCHED", "PAUSED"); - for (tidno = 0, tid = &an->tid[tidno]; - tidno < IEEE80211_NUM_TIDS; tidno++, tid++) { + for (tidno = 0; + tidno < IEEE80211_NUM_TIDS; tidno++) { + swq = an->sta->txq[tidno]; + tid = (struct ath_atx_tid *) swq->drv_priv; txq = tid->txq; ath_txq_lock(sc, txq); if (tid->active) { diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c index 2ee8624..211736c 100644 --- a/drivers/net/wireless/ath/ath9k/init.c +++ b/drivers/net/wireless/ath/ath9k/init.c @@ -873,6 +873,7 @@ static void ath9k_set_hw_capab(struct ath_softc *sc, struct ieee80211_hw *hw) hw->max_rate_tries = 10; hw->sta_data_size = sizeof(struct ath_node); hw->vif_data_size = sizeof(struct ath_vif); + hw->txq_data_size = sizeof(struct ath_atx_tid); hw->extra_tx_headroom = 4; hw->wiphy->available_antennas_rx = BIT(ah->caps.max_rxchains) - 1; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 8b63988..6ab56e5 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -2668,4 +2668,5 @@ struct ieee80211_ops ath9k_ops = { .sw_scan_start = ath9k_sw_scan_start, .sw_scan_complete = ath9k_sw_scan_complete, .get_txpower = ath9k_get_txpower, + .wake_tx_queue = ath9k_wake_tx_queue, }; diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c index 8ddd604..cdc8684 100644 --- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -65,6 +65,8 @@ static struct ath_buf *ath_tx_setup_buffer(struct ath_softc *sc, struct ath_txq *txq, struct ath_atx_tid *tid, struct sk_buff *skb); +static int ath_tx_prepare(struct ieee80211_hw *hw, struct sk_buff *skb, + struct ath_tx_control *txctl); enum { MCS_HT20, @@ -118,6 +120,21 @@ static void ath_tx_queue_tid(struct ath_softc *sc, struct ath_txq *txq, list_add_tail(&tid->list, list); } +void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *swq) +{ + struct ath_softc *sc = hw->priv; + struct ath_atx_tid *tid = (struct ath_atx_tid *) swq->drv_priv; + struct ath_txq *txq = tid->txq; + + spin_lock_bh(&txq->axq_lock); + + tid->swq_nonempty = true; + ath_tx_queue_tid(sc, txq, tid); + ath_txq_schedule(sc, txq); + + spin_unlock_bh(&txq->axq_lock); +} + static struct ath_frame_info *get_frame_info(struct sk_buff *skb) { struct ieee80211_tx_info *tx_info = IEEE80211_SKB_CB(skb); @@ -170,12 +187,51 @@ static struct ath_atx_tid * ath_get_skb_tid(struct ath_softc *sc, struct ath_node *an, struct sk_buff *skb) { u8 tidno = skb->priority & IEEE80211_QOS_CTL_TID_MASK; - return ATH_AN_2_TID(an, tidno); + return ath_an_2_tid(an, tidno); } +static void ath_swq_pull(struct ath_atx_tid *tid) +{ + struct sk_buff *skb; + struct ath_tx_control txctl; + struct ath_frame_info *fi; + int r; + + if (!skb_queue_empty(&tid->i_q)) + return; + + if (!tid->swq_nonempty) + return; + + skb = ieee80211_tx_dequeue(tid->an->sc->hw, tid->swq); + if (!skb) { + tid->swq_nonempty = false; + } else { + /* sad to do all this with axq_lock held */ + memset(&txctl, 0, sizeof txctl); + txctl.txq = tid->txq; + txctl.sta = tid->an->sta; + r = ath_tx_prepare(tid->an->sc->hw, skb, &txctl); + if (WARN_ON(r != 0)) { + /** should not happen ??? */ + } else { + /* perhaps not needed here ??? */ + fi = get_frame_info(skb); + fi->txq = skb_get_queue_mapping(skb); + + __skb_queue_tail(&tid->i_q, skb); + ++tid->txq->pending_frames; + } + } + } + + static bool ath_tid_has_buffered(struct ath_atx_tid *tid) { - return !skb_queue_empty(&tid->buf_q) || !skb_queue_empty(&tid->retry_q); + if (!skb_queue_empty(&tid->buf_q) || !skb_queue_empty(&tid->retry_q) || !skb_queue_empty(&tid->i_q)) + return true; + ath_swq_pull(tid); + return !skb_queue_empty(&tid->i_q); } static struct sk_buff *ath_tid_dequeue(struct ath_atx_tid *tid) @@ -185,6 +241,12 @@ static struct sk_buff *ath_tid_dequeue(struct ath_atx_tid *tid) skb = __skb_dequeue(&tid->retry_q); if (!skb) skb = __skb_dequeue(&tid->buf_q); + if (!skb) + skb = __skb_dequeue(&tid->i_q); + if (!skb) { + ath_swq_pull(tid); + skb = __skb_dequeue(&tid->i_q); + } return skb; } @@ -870,6 +932,10 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq, *q = &tid->retry_q; if (skb_queue_empty(*q)) *q = &tid->buf_q; + if (skb_queue_empty(*q)) + *q = &tid->i_q; + if (skb_queue_empty(*q)) + ath_swq_pull(tid); skb = skb_peek(*q); if (!skb) @@ -1482,7 +1548,7 @@ int ath_tx_aggr_start(struct ath_softc *sc, struct ieee80211_sta *sta, ath_dbg(common, XMIT, "%s called\n", __func__); an = (struct ath_node *)sta->drv_priv; - txtid = ATH_AN_2_TID(an, tid); + txtid = ath_an_2_tid(an, tid); txq = txtid->txq; ath_txq_lock(sc, txq); @@ -1517,7 +1583,7 @@ void ath_tx_aggr_stop(struct ath_softc *sc, struct ieee80211_sta *sta, u16 tid) { struct ath_common *common = ath9k_hw_common(sc->sc_ah); struct ath_node *an = (struct ath_node *)sta->drv_priv; - struct ath_atx_tid *txtid = ATH_AN_2_TID(an, tid); + struct ath_atx_tid *txtid = ath_an_2_tid(an, tid); struct ath_txq *txq = txtid->txq; ath_dbg(common, XMIT, "%s called\n", __func__); @@ -1533,6 +1599,7 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc, struct ath_node *an) { struct ath_common *common = ath9k_hw_common(sc->sc_ah); + struct ieee80211_txq *swq; struct ath_atx_tid *tid; struct ath_txq *txq; bool buffered; @@ -1540,9 +1607,11 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc, ath_dbg(common, XMIT, "%s called\n", __func__); - for (tidno = 0, tid = &an->tid[tidno]; - tidno < IEEE80211_NUM_TIDS; tidno++, tid++) { + for (tidno = 0; + tidno < IEEE80211_NUM_TIDS; tidno++) { + swq = an->sta->txq[tidno]; + tid = (struct ath_atx_tid *) swq->drv_priv; txq = tid->txq; ath_txq_lock(sc, txq); @@ -1565,15 +1634,18 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc, void ath_tx_aggr_wakeup(struct ath_softc *sc, struct ath_node *an) { struct ath_common *common = ath9k_hw_common(sc->sc_ah); + struct ieee80211_txq *swq; struct ath_atx_tid *tid; struct ath_txq *txq; int tidno; ath_dbg(common, XMIT, "%s called\n", __func__); - for (tidno = 0, tid = &an->tid[tidno]; - tidno < IEEE80211_NUM_TIDS; tidno++, tid++) { + for (tidno = 0; + tidno < IEEE80211_NUM_TIDS; tidno++) { + swq = an->sta->txq[tidno]; + tid = (struct ath_atx_tid *) swq->drv_priv; txq = tid->txq; ath_txq_lock(sc, txq); @@ -1599,7 +1671,7 @@ void ath_tx_aggr_resume(struct ath_softc *sc, struct ieee80211_sta *sta, ath_dbg(common, XMIT, "%s called\n", __func__); an = (struct ath_node *)sta->drv_priv; - tid = ATH_AN_2_TID(an, tidno); + tid = ath_an_2_tid(an, tidno); txq = tid->txq; ath_txq_lock(sc, txq); @@ -1637,7 +1709,7 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw, if (!(tids & 1)) continue; - tid = ATH_AN_2_TID(an, i); + tid = ath_an_2_tid(an, i); ath_txq_lock(sc, tid->txq); while (nframes > 0) { @@ -2853,12 +2925,18 @@ int ath_tx_init(struct ath_softc *sc, int nbufs) void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an) { + struct ieee80211_txq *swq; + struct ieee80211_sta *sta = an->sta; + struct ieee80211_vif *vif = an->vif; struct ath_atx_tid *tid; int tidno, acno; - for (tidno = 0, tid = &an->tid[tidno]; + for (tidno = 0; tidno < IEEE80211_NUM_TIDS; - tidno++, tid++) { + tidno++) { + swq = sta ? sta->txq[tidno] : vif->txq; + tid = (struct ath_atx_tid *) swq->drv_priv; + tid->swq = swq; tid->an = an; tid->tidno = tidno; tid->seq_start = tid->seq_next = 0; @@ -2866,23 +2944,33 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an) tid->baw_head = tid->baw_tail = 0; tid->active = false; tid->clear_ps_filter = true; + tid->swq_nonempty = false; + __skb_queue_head_init(&tid->i_q); __skb_queue_head_init(&tid->buf_q); __skb_queue_head_init(&tid->retry_q); INIT_LIST_HEAD(&tid->list); acno = TID_TO_WME_AC(tidno); tid->txq = sc->tx.txq_map[acno]; + + if (!sta) + break; /* just one multicast ath_atx_tid */ } } void ath_tx_node_cleanup(struct ath_softc *sc, struct ath_node *an) { + struct ieee80211_txq *swq; + struct ieee80211_sta *sta = an->sta; + struct ieee80211_vif *vif = an->vif; struct ath_atx_tid *tid; struct ath_txq *txq; int tidno; - for (tidno = 0, tid = &an->tid[tidno]; - tidno < IEEE80211_NUM_TIDS; tidno++, tid++) { + for (tidno = 0; + tidno < IEEE80211_NUM_TIDS; tidno++) { + swq = sta ? sta->txq[tidno] : vif->txq; + tid = (struct ath_atx_tid *) swq->drv_priv; txq = tid->txq; ath_txq_lock(sc, txq); @@ -2894,6 +2982,9 @@ void ath_tx_node_cleanup(struct ath_softc *sc, struct ath_node *an) tid->active = false; ath_txq_unlock(sc, txq); + + if (!sta) + break; /* just one multicast ath_atx_tid */ } } -- 2.7.4 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-06 17:28 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <E1b8sbm-00083e-00@www.xplot.org> 2016-06-04 14:32 ` [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues Toke Høiland-Jørgensen [not found] ` <E1b9kkn-0000h7-00@www.xplot.org> 2016-06-06 4:26 ` Dave Taht [not found] ` <E1b9nQv-0001fc-00@www.xplot.org> 2016-06-06 16:55 ` Dave Taht 2016-06-06 17:26 ` Dave Taht 2016-06-06 17:28 ` Toke Høiland-Jørgensen 2016-06-03 16:51 [Make-wifi-fast] [RFC/RFT 0/5] Adding an airtime fairness scheduler to ath9k Toke Høiland-Jørgensen 2016-06-03 16:51 ` [Make-wifi-fast] [RFC/RFT 2/5] ath9k: use mac80211 intermediate software queues Toke Høiland-Jørgensen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox