* Bufferbloat related patches for the iwl3945
  2011-02-13 20:22 UTC
  From: Dave Täht
  To: bloat-devel, bloat

Nathaniel Smith beat me to posting patches this weekend. [1] See:

http://thread.gmane.org/gmane.linux.kernel.wireless.general/64733

for details. Perhaps this will spark some discussion here and elsewhere.

Nathaniel Smith writes:

> Thanks for the poke. I just sent the patches. I guess we'll see if anyone responds this time.
>
> I actually suspect that the approach I use in the patch is wrong in the long run, because it needs that magic constant ("2 ms"), and even then it can't properly take into account task-switching latency (if it takes >2 ms to get back to the driver, then the queue may drain and we may lose some throughput, but who knows whether that's the case or not). What would be better would be some way to detect the case where:
> -- there were packets waiting at the next level up
> -- that would have been queued, except that the driver decided that it had enough packets queued already
> -- and then the driver's queue underflowed
>
> If we had this information, then the driver could do a better job of tuning; it can already measure excess buffer capacity, and then it'd be able to measure insufficient capacity directly and adjust its buffer size accordingly. I think that is a good start.
>
> If the network layer just had a counter that it incremented every time its queue transitioned from empty to non-empty or vice versa, then we could detect the above situation as: the driver's queue is empty, incoming packets are choked, and the counter is odd and has the same value as it did when we last accepted a packet. But it doesn't have that counter, and I didn't feel inspired to try to get changes accepted to the generic networking code.
>
> I think I'd better work on my thesis instead of getting involved in the bufferbloat list ;-), and similarly am unlikely to follow up much on the ideas above, but feel free to forward to anyone who might be interested (or to the list in general).
>
> Cheers,
> -- Nathaniel

--
Dave Taht
http://nex-6.taht.net

1: In the future I will keep code-related stuff on the bloat-devel list
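A minimal sketch of the transition counter Nathaniel describes, purely as an illustration -- no such counter exists in the networking stack, and every name below is made up:

#include <stdbool.h>

/*
 * Hypothetical empty<->non-empty transition counter.  Convention:
 * the counter starts at 0 with an empty queue and is bumped on every
 * flip, so an odd value means the upper queue is currently non-empty.
 */
struct upper_queue {
	unsigned long transitions;   /* ++ on every empty<->non-empty flip */
};

struct driver_txq {
	unsigned long transitions_at_last_accept;
	bool choked;                 /* we told the stack to stop feeding us */
};

/*
 * To be called when the driver's own queue underflows (drains to
 * empty): did we choke the stack even though it had packets for us?
 */
static bool upper_layer_was_starved(const struct upper_queue *uq,
                                    const struct driver_txq *txq)
{
	return txq->choked &&
	       (uq->transitions & 1) &&   /* upper queue is non-empty */
	       uq->transitions == txq->transitions_at_last_accept;
}

If this ever returned true, the driver would know its queue was sized too small and could grow it, mirroring how it already shrinks the queue when it measures excess capacity.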
* Re: Bemused to
  2011-02-14  2:16 UTC
  From: Felix Fietkau
  To: Dave Täht
  Cc: bloat-devel

On 2011-02-14 3:01 AM, Dave Täht wrote:
> Felix:
>
> Resend your original also to bloat-devel@lists.bufferbloat.net?

Cc'd.

> I'm editing down our chat now... Will respond in more detail after I review my notes there. You might also want to talk to Jean Tourrilhes.
>
> https://lists.bufferbloat.net/pipermail/bloat/2011-February/thread.html
>
> Felix Fietkau <nbd@openwrt.org> writes:
>
>> Hi Nathaniel, Dave,
>>
>> I'm currently trying to work out a way to combat the bufferbloat issue in ath9k, while still ensuring that aggregation does its job to reduce airtime utilization properly.
>
> It would help me if you could describe the current math behind how airtime utilization works in 802.11n. I know how aggregation works - and like it - but the timeslots used in 802.11n are part of a twisty maze of standards, all different.
>
> Pretend that I (and the others on the list) have been living on the moon for a couple of years.

The timeslot maze in 802.11n is pretty much the same as with 802.11a/g. I don't have a good description of it, but since there are so many environmental factors that play into it as well, I usually just treat it as an unpredictable source of wildly fluctuating extra latency and move on.

>> The nice thing about ath9k compared to Intel drivers is that all of the aggregation-related queueing is done inside the driver instead of some opaque hardware queue or firmware. That allows me to be more selective in which packets to drop and which ones to keep.
>
> I found the lagn driver to be, well, laggy.
>
>> One thing that I need to ensure (based on requirements of some commercial projects that I'm working on) is that bufferbloat countermeasures must not hurt maximum TCP throughput when testing under ideal conditions. Considering the flexibility of being able to work directly on an aggregation queueing level, this should not be much of a problem, IMHO.
>
> I'd like a knob that users could use to tune for better interactivity. One knob for benchmarks, one for users that like to do VoIP and play Quake, one knob to rule them all, and the dark buffers bind them.

Tunables are good, but I think for this to work properly we need very solid defaults.

>> I'm currently trying to come up with some good functions to determine at which point packets should be dropped. These will probably have to be different for the software aggregation queues compared to the default WMM hardware queues, where legacy frames are queued up directly.
>>
>> For aggregation I would like to allow at least the maximum number of packets that can fit into one A-MPDU, which depends on the selected rate. Since wireless driver queueing will really only have an effect when we're running short on airtime, we need to make sure that we reduce airtime waste caused by PHY headers, interframe spacing, etc. A-MPDU is a very neat way to do that...
>>
>> Do you have any good suggestions about the implementation of the queue length vs. measured latency relationship, or anything else that I should take into account?
>
> I've found in my testing that TCP Vegas interacts badly with the levels of buffering and delay in the current ath9k driver, by a lot, so its assumptions are invalid. I know that's of little help.
Yeah, ath9k currently does way too much buffering for its own good. That's mostly because the aggregation logic is tied closely to hardware buffer management (i.e. a frame gets a descriptor long before it enters the hardware queue for the first time). I've been planning to fix that for a while, and I will eventually get around to it, but it requires quite a bit of code refactoring.

The first step is to decouple the aggregation logic from the internal descriptors/buffer entries, to allow more flexible buffer management; later on I intend to move the aggregation logic to mac80211, where it's closer to things that happen in the network stack. This will also allow at least some of the other 802.11n drivers to leverage fixes made to this code.

- Felix
* Combating wireless bufferbloat while maximizing aggregation
  2011-02-14 17:18 UTC
  From: Dave Täht
  To: bloat-devel, nbd, njs

Just forwarding this thread to the list, in the hope that it moves there, or to linux-wireless, and out of just our mailboxes.

Nathaniel Smith <njs@pobox.com> writes:

> On Sun, Feb 13, 2011 at 5:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>> Hi Nathaniel, Dave,
>
> Hi Felix,
>
>> I'm currently trying to work out a way to combat the bufferbloat issue in ath9k, while still ensuring that aggregation does its job to reduce airtime utilization properly.
>
> Excellent! I'm not sure I have any particularly useful insights to give -- my day job is as a linguist, not a network engineer :-) -- but I'll throw out some quick thoughts, and if they're useful, great, and if not, I won't feel bad.
>
>> The nice thing about ath9k compared to Intel drivers is that all of the aggregation-related queueing is done inside the driver instead of some opaque hardware queue or firmware. That allows me to be more selective in which packets to drop and which ones to keep.
>
> This is the first place I'm confused. Why would you drop packets inside the driver? Shouldn't dropping packets be the responsibility of the Qdisc feeding your driver, since that's where all the smart AQM and QoS and user-specified-policy knobs live? My understanding is that the driver's job is just to take the minimum number of packets at a time (consistent with throughput, etc.) from the Qdisc and send them on. Or are you talking about dropping in the sense of spending more effort on driver-level retransmit in some cases than others?
>
> For that I have a crazy idea: what if the driver took each potentially retransmittable packet and handed it *back* to the Qdisc, which could then apply policy to send it to the back of the queue, jump it to the front of the queue for immediate retransmission, throw it away if higher-priority traffic has arrived and the queue is full now, etc.? You'd probably need to add some API to tell the Qdisc that the packet you want to enqueue has already waited once (I imagine the default dumb Qdisc would want to enqueue such packets at the head of the queue by default). Perhaps also some way to give up on a packet if it's waited "too long" (but then again, perhaps not!). But as I think about this idea it does grow on me.
>
>> For aggregation I would like to allow at least the maximum number of packets that can fit into one A-MPDU, which depends on the selected rate. Since wireless driver queueing will really only have an effect when we're running short on airtime, we need to make sure that we reduce airtime waste caused by PHY headers, interframe spacing, etc. A-MPDU is a very neat way to do that...
>
> If sending N packets is as cheap (in latency terms) as sending 1, then I don't see how queueing up N packets can hurt any!
>
> The iwlwifi patches I just sent do the dumbest possible fix, of making the tx queue have a fixed latency instead of a fixed number of packets. I found this attractive because I figured I wasn't smart enough to anticipate all the different things that might affect transmission rate, so better to just measure what was happening and adapt.
> In principle, if A-MPDU is in use, and that lets us send more packets for the price of one, then this approach would notice that reflected in our packet throughput, and the queue size should increase to match.
>
> Obviously this could break if the queue size ever dropped too low -- you might lose throughput because of the smaller queue size, and then that would lock in the small queue size, causing loss of throughput... but I don't see any major downsides to just setting a minimum allowable queue size, so long as it's accurate.
>
> In fact, my intuition is that the only way to improve on just queueing up a full A-MPDU aggregated packet would be to wait until *just before* your transmission time slot rolls around and *then* queue up a full A-MPDU aggregated packet. If you get to transmit every K milliseconds, and you always refill your queue immediately after transmitting, then in the worst case a high-priority packet might have to wait 2*K ms (K ms sitting at the head of the Qdisc waiting for you to open your queue, then another K ms in the driver waiting to be transmitted). This worst case drops to K ms if you always refill immediately before transmitting. But the possible gains here are bounded by whatever uncertainty you have about the upcoming transmission time, scheduling jitter, and K. I don't know what any of those constants look like in practice.
>
> Well, that's my brain dump for now. Make of it what you will!
> -- Nathaniel

--
Dave Taht
http://nex-6.taht.net
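A simplified sketch of the fixed-latency idea might look like the following. This is an illustration under assumptions, not the actual iwlwifi patch; the 2 ms figure is the "magic constant" mentioned earlier in the thread, and all names are invented:

#include <stdbool.h>

/* Latency-capped tx queue: size the queue in time, not packets. */
#define TARGET_QUEUE_USECS 2000UL   /* the "2 ms" magic constant */

struct lat_queue {
	unsigned int  len;           /* packets currently queued */
	unsigned int  limit;         /* adaptive packet limit */
	unsigned int  min_limit;     /* floor, e.g. one full A-MPDU */
	unsigned long usecs_per_pkt; /* measured drain rate, e.g. an EWMA
	                                updated on tx completions */
};

/* Re-derive the packet limit so the queue holds roughly 2 ms of
 * traffic at the currently measured drain rate. */
static void lat_queue_retune(struct lat_queue *q)
{
	unsigned long limit;

	if (q->usecs_per_pkt == 0)   /* no measurement yet */
		return;
	limit = TARGET_QUEUE_USECS / q->usecs_per_pkt;
	if (limit < q->min_limit)
		limit = q->min_limit;  /* don't starve aggregation */
	q->limit = (unsigned int)limit;
}

/* The stack may enqueue only while we hold less than the target. */
static bool lat_queue_may_accept(const struct lat_queue *q)
{
	return q->len < q->limit;
}

With A-MPDU, a burst of completions drives usecs_per_pkt down, so the limit rises to match -- exactly the "measure and adapt" behavior described above.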
* Re: Bemused to
  2011-02-15  3:08 UTC
  From: Felix Fietkau
  To: Nathaniel Smith
  Cc: bloat-devel

On 2011-02-14 6:33 AM, Nathaniel Smith wrote:
> On Sun, Feb 13, 2011 at 5:34 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>> The nice thing about ath9k compared to Intel drivers is that all of the aggregation-related queueing is done inside the driver instead of some opaque hardware queue or firmware. That allows me to be more selective in which packets to drop and which ones to keep.
>
> This is the first place I'm confused. Why would you drop packets inside the driver? Shouldn't dropping packets be the responsibility of the Qdisc feeding your driver, since that's where all the smart AQM and QoS and user-specified-policy knobs live? My understanding is that the driver's job is just to take the minimum number of packets at a time (consistent with throughput, etc.) from the Qdisc and send them on. Or are you talking about dropping in the sense of spending more effort on driver-level retransmit in some cases than others?

It's the driver's responsibility to aggregate packets. For that, I absolutely need queueing under the control of the driver. After a packet has an assigned sequence number, the driver cannot hand control over to an external qdisc unless it is guaranteed to get the packet back (possibly with status info that tells it whether the packet should be dropped or not). If packets were dropped outside of the driver after they've been tracked, gaps in the aggregation reorder window of the receiver would bring the data transfer to an immediate halt.

> For that I have a crazy idea: what if the driver took each potentially retransmittable packet and handed it *back* to the Qdisc, which could then apply policy to send it to the back of the queue, jump it to the front of the queue for immediate retransmission, throw it away if higher-priority traffic has arrived and the queue is full now, etc.? You'd probably need to add some API to tell the Qdisc that the packet you want to enqueue has already waited once (I imagine the default dumb Qdisc would want to enqueue such packets at the head of the queue by default). Perhaps also some way to give up on a packet if it's waited "too long" (but then again, perhaps not!). But as I think about this idea it does grow on me.

For the ath9k case that would mean having to turn the qdisc code into a library that can be completely controlled and that does not free packets by itself. I think that would require major changes to the network stack.

>> For aggregation I would like to allow at least the maximum number of packets that can fit into one A-MPDU, which depends on the selected rate. Since wireless driver queueing will really only have an effect when we're running short on airtime, we need to make sure that we reduce airtime waste caused by PHY headers, interframe spacing, etc. A-MPDU is a very neat way to do that...
>
> If sending N packets is as cheap (in latency terms) as sending 1, then I don't see how queueing up N packets can hurt any!
>
> The iwlwifi patches I just sent do the dumbest possible fix, of making the tx queue have a fixed latency instead of a fixed number of packets.
> I found this attractive because I figured I wasn't smart enough to anticipate all the different things that might affect transmission rate, so better to just measure what was happening and adapt. In principle, if A-MPDU is in use, and that lets us send more packets for the price of one, then this approach would notice that reflected in our packet throughput, and the queue size should increase to match.
>
> Obviously this could break if the queue size ever dropped too low -- you might lose throughput because of the smaller queue size, and then that would lock in the small queue size, causing loss of throughput... but I don't see any major downsides to just setting a minimum allowable queue size, so long as it's accurate.
>
> In fact, my intuition is that the only way to improve on just queueing up a full A-MPDU aggregated packet would be to wait until *just before* your transmission time slot rolls around and *then* queue up a full A-MPDU aggregated packet. If you get to transmit every K milliseconds, and you always refill your queue immediately after transmitting, then in the worst case a high-priority packet might have to wait 2*K ms (K ms sitting at the head of the Qdisc waiting for you to open your queue, then another K ms in the driver waiting to be transmitted). This worst case drops to K ms if you always refill immediately before transmitting. But the possible gains here are bounded by whatever uncertainty you have about the upcoming transmission time, scheduling jitter, and K. I don't know what any of those constants look like in practice.

The problem with that is that aggregation uses more queues inside the driver than can be made visible as network stack queues. Queueing is done for every traffic identifier (there are 8 of them, which map to 4 hardware queues), for every station individually. Because of that, the driver cannot simply pull in more frames at convenient points in time, because what ends up getting pulled in that case might just be the entirely wrong batch of frames, or a mix of packets for completely different stations, which would also completely kill aggregation performance.

For fixing this, I'm considering running the A* algorithm (proposed in the paper that jg mentioned) on each individual per-station per-TID queue.

- Felix
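To make the fan-out Felix describes concrete, a schematic might look like the following. These are illustrative structures, not the real ath9k ones, and the numeric access-category convention below is an assumption:

/* 8 TIDs per station, mapped down to 4 WMM hardware queues. */
#define NUM_TIDS       8
#define NUM_HW_QUEUES  4   /* access categories: VO, VI, BE, BK */

struct tid_swq {
	/* per-(station, TID) software queue of frames awaiting
	 * aggregation; the driver schedules among all of these */
	int num_frames;
};

struct station_queues {
	struct tid_swq tid[NUM_TIDS];
};

/* The standard 802.11e TID -> access category mapping
 * (numbering assumed here: 0 = VO, 1 = VI, 2 = BE, 3 = BK). */
static int tid_to_hw_queue(int tid)
{
	static const int tid_to_ac[NUM_TIDS] = {
		2, 3, 3, 2,   /* TIDs 0-3: BE, BK, BK, BE */
		1, 1, 0, 0,   /* TIDs 4-7: VI, VI, VO, VO */
	};
	return tid_to_ac[tid];
}

With dozens of stations each owning up to 8 such queues, a single network-stack queue per device cannot express which batch of frames would aggregate well, which is the point being made here.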
* Re: Bemused to
  2011-02-15  6:28 UTC
  From: Nathaniel Smith
  To: Felix Fietkau
  Cc: bloat-devel

On Mon, Feb 14, 2011 at 7:08 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2011-02-14 6:33 AM, Nathaniel Smith wrote:
>> This is the first place I'm confused. Why would you drop packets inside the driver? Shouldn't dropping packets be the responsibility of the Qdisc feeding your driver, since that's where all the smart AQM and QoS and user-specified-policy knobs live? My understanding is that the driver's job is just to take the minimum number of packets at a time (consistent with throughput, etc.) from the Qdisc and send them on. Or are you talking about dropping in the sense of spending more effort on driver-level retransmit in some cases than others?
>
> It's the driver's responsibility to aggregate packets. For that, I absolutely need queueing under the control of the driver. After a packet has an assigned sequence number, the driver cannot hand control over to an external qdisc unless it is guaranteed to get the packet back (possibly with status info that tells it whether the packet should be dropped or not). If packets were dropped outside of the driver after they've been tracked, gaps in the aggregation reorder window of the receiver would bring the data transfer to an immediate halt.

Ah. I think I understand, but to make sure: the problem is that the 802.11 MAC layer guarantees in-order delivery (at least for packets within the same QoS class). Therefore, if an A-MPDU aggregate is only partially received, then the receiving side can't pass *any* parts up the networking stack -- even the parts that were successfully received -- until after *all* of the parts are successfully retransmitted (or the transmitter says never mind, I'm not going to retransmit). Yes? And this is to avoid TCP getting confused by out-of-order packets (which it might think are lost packets, at least until they arrive and it has to do the D-SACK dance)?

How sad. It would obviously be so much better if some reordering were possible -- no one really wants the MAC layer to be holding onto packets for tens of milliseconds. Surely VoIP people hate this? So an interesting question is whether, in some circumstances, it would be better to reorder lost packets in the service of better queue handling. Because if you tell the receiving station to give up on this MPDU, then you can throw the packet back into the Qdisc...

>> For that I have a crazy idea: what if the driver took each potentially retransmittable packet and handed it *back* to the Qdisc, which could then apply policy to send it to the back of the queue, jump it to the front of the queue for immediate retransmission, throw it away if higher-priority traffic has arrived and the queue is full now, etc.? You'd probably need to add some API to tell the Qdisc that the packet you want to enqueue has already waited once (I imagine the default dumb Qdisc would want to enqueue such packets at the head of the queue by default). Perhaps also some way to give up on a packet if it's waited "too long" (but then again, perhaps not!). But as I think about this idea it does grow on me.
>
> For the ath9k case that would mean having to turn the qdisc code into a library that can be completely controlled and that does not free packets by itself.
> I think that would require major changes to the network stack.

Right -- it might well be a good idea in the long run to reorganize queue handling like this, if it turns out that drivers really need to have their dirty fingers mixed into AQM and stuff; the Qdisc machinery certainly doesn't strike me as the cleanest and most mature design in the kernel. But it's not the easiest place to start...

>>> For aggregation I would like to allow at least the maximum number of packets that can fit into one A-MPDU, which depends on the selected rate. Since wireless driver queueing will really only have an effect when we're running short on airtime, we need to make sure that we reduce airtime waste caused by PHY headers, interframe spacing, etc. A-MPDU is a very neat way to do that...
>>
>> If sending N packets is as cheap (in latency terms) as sending 1, then I don't see how queueing up N packets can hurt any!
>>
>> The iwlwifi patches I just sent do the dumbest possible fix, of making the tx queue have a fixed latency instead of a fixed number of packets. I found this attractive because I figured I wasn't smart enough to anticipate all the different things that might affect transmission rate, so better to just measure what was happening and adapt. In principle, if A-MPDU is in use, and that lets us send more packets for the price of one, then this approach would notice that reflected in our packet throughput, and the queue size should increase to match.
>>
>> Obviously this could break if the queue size ever dropped too low -- you might lose throughput because of the smaller queue size, and then that would lock in the small queue size, causing loss of throughput... but I don't see any major downsides to just setting a minimum allowable queue size, so long as it's accurate.
>>
>> In fact, my intuition is that the only way to improve on just queueing up a full A-MPDU aggregated packet would be to wait until *just before* your transmission time slot rolls around and *then* queue up a full A-MPDU aggregated packet. If you get to transmit every K milliseconds, and you always refill your queue immediately after transmitting, then in the worst case a high-priority packet might have to wait 2*K ms (K ms sitting at the head of the Qdisc waiting for you to open your queue, then another K ms in the driver waiting to be transmitted). This worst case drops to K ms if you always refill immediately before transmitting. But the possible gains here are bounded by whatever uncertainty you have about the upcoming transmission time, scheduling jitter, and K. I don't know what any of those constants look like in practice.
>
> The problem with that is that aggregation uses more queues inside the driver than can be made visible as network stack queues. Queueing is done for every traffic identifier (there are 8 of them, which map to 4 hardware queues), for every station individually. Because of that, the driver cannot simply pull in more frames at convenient points in time, because what ends up getting pulled in that case might just be the entirely wrong batch of frames, or a mix of packets for completely different stations, which would also completely kill aggregation performance.

Traffic identifier = 802.11-ese for QoS category, right? Another thing I really don't understand is how 802.11 QoS is expected to interact with everyone else's version of QoS.
Certainly it's useful to have some way to pass QoS categories from one station to another, but within a single station the actual spec'ed mechanisms for handling the different QoS categories seem to just come down to: you can and should reorder high-priority packets in front of low-priority packets? Is there a requirement that a single A-MPDU cannot contain a mix of different TIDs? It's not obvious to me how per-TID driver queues add value over standard traffic shaping.

> For fixing this, I'm considering running the A* algorithm (proposed in the paper that jg mentioned) on each individual per-station per-TID queue.

That seems like it would work, but the cost is that, IIUC, A* calculates the total appropriate queue size. So if you do this, then you should also set the default txqueuelen to 0, which will disable the kernel's generic QoS and AQM code entirely. Since bufferbloat means that the kernel's generic QoS and AQM code don't work *now*, this would still be a huge improvement. But it seems suboptimal in the long run, which is why I keep poking away at better ways to interact with the Qdisc.

I take the point about needing separate buffers for separate stations, though. That really does seem to require that your queue management has intimate knowledge of the network situation. (I guess you could somehow clone some template Qdisc for each station, with packets implicitly routed into the correct one? What a mess.) Does traffic for one station even contend with traffic for other stations?

-- Nathaniel
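For what it's worth, the "clone a template Qdisc per station" thought could be sketched roughly as below. Everything here is hypothetical -- qdisc_clone() and qdisc_enqueue_pkt() are not real kernel APIs, and struct Qdisc is treated as opaque:

/* Hypothetical sketch: every station gets its own instance of a
 * user-configured template qdisc, so policy applies per station. */
struct Qdisc;                                               /* opaque */
extern struct Qdisc *qdisc_clone(struct Qdisc *templ);      /* hypothetical */
extern int qdisc_enqueue_pkt(struct Qdisc *q, void *pkt);   /* hypothetical */

struct station {
	struct Qdisc *q;   /* lazily created per-station clone */
};

/* Route each packet into that station's private copy of the
 * user-configured template. */
static int station_enqueue(struct station *sta, struct Qdisc *template,
                           void *pkt)
{
	if (!sta->q)
		sta->q = qdisc_clone(template);   /* inherit user policy */
	return qdisc_enqueue_pkt(sta->q, pkt);
}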