Bemused to

Mon Feb 14 19:08:10 PST 2011

On 2011-02-14 6:33 AM, Nathaniel Smith wrote:
> On Sun, Feb 13, 2011 at 5:34 PM, Felix Fietkau <nbd at openwrt.org> wrote:
>> The nice thing about ath9k compared to Intel drivers is that all of the
>> aggregation related queueing is done inside the driver instead of some
>> opaque hardware queue or firmware. That allows me to be more selective
>> in which packets to drop and which ones to keep.
> 
> This is the first place I'm confused. Why would you drop packets
> inside the driver? Shouldn't dropping packets be the responsibility of
> the Qdisc feeding your driver, since that's where all the smart AQM
> and QoS and user-specified-policy knobs live? My understanding is that
> the driver's job is just to take the minimum number of packets at a
> time (consistent with throughput, etc.) from the Qdisc and send them
> on. Or are you talking about dropping in the sense of spending more
> effort on driver-level retransmit in some cases than others?
It's the driver's responsibility to aggregate packets. For that, I
absolutely need queuing under the control of the driver. After a packet
has an assigned sequence number, the driver cannot hand control over to
an external qdisc unless it is guaranteed that it gets the packet back
(possibly with status info that tells it whether the packet should be
dropped or not). If packets were dropped outside of the driver after
they've been tracked, gaps in the aggregation reorder window of the
receiver would bring the data transfer to an immediate halt.
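
To illustrate the stall described above, here is a minimal sketch of a
receiver-side reorder buffer. It is a simplified model, not ath9k or
mac80211 code: the function name and its arguments are hypothetical, and
it ignores any timeout or BlockAckReq that could eventually move the
window forward.

```python
def deliver_in_order(received_seqs, window_start=0):
    """Hypothetical reorder buffer: frames are released to the upper
    layer strictly in sequence order; anything after a gap is held."""
    buffered = set()
    delivered = []
    next_seq = window_start
    for seq in received_seqs:
        buffered.add(seq)
        # Release every frame that is now contiguous with the window start.
        while next_seq in buffered:
            buffered.remove(next_seq)
            delivered.append(next_seq)
            next_seq += 1
    return delivered, sorted(buffered)


# Sequence number 2 was assigned by the sender but dropped before
# transmission, so the receiver never sees it.
delivered, held = deliver_in_order([0, 1, 3, 4, 5])
print("delivered:", delivered)
print("held back:", held)
```

In this model, frames 3 to 5 stay in the buffer because the frame with
sequence number 2 never arrives, which is why a frame must not vanish
once it has been given a sequence number.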

> For that I have a crazy idea: what if the driver took each potentially
> retransmittable packet and handed it *back* to the Qdisc, who then
> could apply policy to send it to the back of the queue, jump it to the
> front of the queue for immediate retransmission, throw it away if
> higher priority traffic has arrived and the queue is full now, etc.
> You'd probably need to add some API to tell the Qdisc that the packet
> you want to enqueue has already waited once (I imagine the default
> dumb Qdisc would want to enqueue such packets at the head of the queue
> by default). Perhaps also some way to give up on a packet if it's
> waited "too long" (but then again, perhaps not!). But as I think about
> this idea it does grow on me.
For the ath9k case that would mean having to turn the qdisc code into a
library that can be completely controlled and that does not free packets
by itself. I think that would require major changes to the network stack.

>> For aggregation I would like to allow at least the maximum number of
>> packets that can fit into one A-MPDU, which depends on the selected
>> rate. Since wireless driver queueing will really only have an effect
>> when we're running short on airtime, we need to make sure that we reduce
>> airtime waste caused by PHY headers, interframe spacing, etc.
>> A-MPDU is a very neat way to do that...
> 
> If sending N packets is as cheap (in latency terms) as sending 1, then
> I don't see how queueing up N packets can hurt any!
> 
> The iwlwifi patches I just sent do the dumbest possible fix, of making
> the tx queue have a fixed latency instead of a fixed number of
> packets. I found this attractive because I figured I wasn't smart
> enough to anticipate all the different things that might affect
> transmission rate, so better to just measure what was happening and
> adapt. In principle, if A-MPDU is in use, and that lets us send more
> packets for the price of one, then this approach would notice that
> reflected in our packet throughput and the queue size should increase
> to match.
> 
> Obviously this could break if the queue size ever dropped too low --
> you might lose throughput because of the smaller queue size, and then
> that would lock in the small queue size, causing loss of throughput...
> but I don't see any major downsides to just setting a minimum
> allowable queue size, so long as it's accurate.
> 
> In fact, my intuition is that the only way to improve on just
> queueing up a full A-MPDU aggregated packet would be to wait until
> *just before* your transmission time slot rolls around and *then*
> queue up a full A-MPDU aggregated packet. If you get to transmit
> every K milliseconds, and you always refill your queue immediately
> after transmitting, then in the worst case a high-priority packet
> might have to wait 2*K ms (K ms sitting at the head of the Qdisc
> waiting for you to open your queue, then another K ms in the driver
> waiting to be transmitted). This worst case drops to K ms if you
> always refill immediately before transmitting. But the possible gains
> here are bounded by whatever uncertainty you have about the upcoming
> transmission time, scheduling jitter, and K. I don't know what any of
> those constants look like in practice.
The problem with that is that aggregation uses more queues inside the
driver than can be made visible as network stack queues.
Queueing is done for every traffic identifier (there are 8 of them,
which map to 4 hardware queues), for every station individually.
Because of that, the driver cannot simply pull in more frames at
convenient points in time, because what it ends up getting in that case
might just be the entirely wrong batch of frames, or a mix of packets
for completely different stations, which would also completely kill
aggregation performance.
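
The queue layout described above can be sketched roughly as follows.
This is an illustration only, not the ath9k data structures: the class
and method names are hypothetical, and the TID-to-queue table is the
usual 802.11e access category mapping, assumed here to be what the four
hardware queues correspond to.

```python
from collections import defaultdict, deque

# Assumed mapping of the 8 TIDs onto 4 access categories (hardware queues).
TID_TO_AC = {0: "BE", 3: "BE", 1: "BK", 2: "BK",
             4: "VI", 5: "VI", 6: "VO", 7: "VO"}


class AggregationQueues:
    """Hypothetical model: one software queue per (station, TID) pair."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def enqueue(self, station, tid, frame):
        self.queues[(station, tid)].append(frame)

    def build_aggregate(self, station, tid, max_frames):
        # An aggregate may only contain frames for one station and one
        # TID, so frames are pulled from exactly one queue.
        q = self.queues[(station, tid)]
        batch = []
        while q and len(batch) < max_frames:
            batch.append(q.popleft())
        return TID_TO_AC[tid], batch


qs = AggregationQueues()
# Frames arrive from the network stack interleaved across stations.
for station, tid, frame in [("A", 0, "a1"), ("B", 0, "b1"),
                            ("A", 0, "a2"), ("B", 6, "b2")]:
    qs.enqueue(station, tid, frame)

print(qs.build_aggregate("A", 0, max_frames=4))
```

Pulling the same four frames straight off a single shared queue would
give a mix of stations A and B, which could not be sent as one
aggregate.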

For fixing this, I'm considering running the A* algorithm (proposed in
the paper that jg mentioned) on each individual per-station per-tid queue.

- Felix