[Codel] hardware multiqueue in fq_codel?

Dave Taht dave.taht at gmail.com
Thu Jul 11 11:06:58 PDT 2013


On Thu, Jul 11, 2013 at 10:44 AM, Eric Dumazet <eric.dumazet at gmail.com> wrote:
> On Thu, 2013-07-11 at 10:09 -0700, Dave Taht wrote:
>> In my default environments (wifi, mainly) the hardware queues have
>> very different properties.
>>
>> I'm under the impression that in at least a few ethernet devices they
>> are essentially the same. That said, in the sch_mq case, an entirely
>> separate qdisc is created per hardware queue, and it's always been
>> puzzling to me as to how to attempt to use them within a single qdisc
>> in the pull-through manner.
>>
>> Logically, you should be able to take the fq_codel hash index (idx %
>> dev->num_tx_queues) and spread out across the hardware queues that
>> way, but I have no idea where that info would go (the skb? the flow?)
>> or even if it were possible as per the pull through problem...
>>
>> (This does not mean that I necessarily think hardware multiqueues are
>> a good idea; certainly the results I get out of 802.11e are
>> terrible. But it would be nice to have a unified solution for hw
>> multiqueue devices.)
>>
>
> We do not have a fixed/unified queue selection.
>
> It can be tweaked by many different things, depending on exact needs.
>
> MQ is not a qdisc per se, it's only a fake one, a demux if you want, so
> that each tx queue has a separate qdisc lock.
>
> If you stick one fq_codel at the top of the hierarchy (instead of MQ),
> then you lose all the pros of having multiple locks: sending packets
> from fq_codel to different queues on hardware makes no sense, since the
> single qdisc lock is the bottleneck.
>
> So if you want fq_codel and MQ, to be able to drive 40G links from many
> cpus, just use :
>
> ETH=eth0
> NQUEUES=16  # or more, check how many tx queues your NIC supports
> tc qd del dev $ETH root 2>/dev/null
> tc qd add dev $ETH root handle 1: mq
> for i in `seq 1 $NQUEUES`
> do
>  # tc parses class minor numbers as hex, so queue 10 is 1:a
>  tc qd add dev $ETH parent 1:`printf '%x' $i` fq_codel
> done
>
> That only replaces the default pfifo_fast on each slave qdisc with
> fq_codel.

Gotcha. So what I did (Felix did it, in openwrt, actually) was
just make fq_codel the default qdisc to avoid having to inspect things
to set the number of queues in mq and mqprio. I see, for example, that
mq is the default for tg3...
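
For setups where fq_codel is not compiled in as the default, the
inspection step can be scripted. What follows is only a minimal sketch:
the function names are made up for illustration, and it prints the tc
commands rather than running them. It assumes the device exposes its tx
queues as /sys/class/net/<dev>/queues/tx-N, and it formats the mq child
number in hex because tc parses class minor numbers as hexadecimal.

```shell
#!/bin/sh
# Count the tx queues of a device by looking at sysfs.
# $1 = device name, $2 = sysfs net root (defaults to /sys/class/net;
# overridable only so the function can be tried without a real NIC).
count_tx_queues() {
    dev=$1
    root=${2:-/sys/class/net}
    n=0
    for q in "$root/$dev/queues"/tx-*; do
        # If the glob matched nothing, the literal pattern is left behind.
        [ -e "$q" ] && n=$((n + 1))
    done
    echo "$n"
}

# Print the tc commands that would put fq_codel under each mq child.
# $1 = device name, $2 = number of tx queues.
# tc reads the minor part of a class id as hex, so queue 10 is 1:a.
print_mq_fq_codel() {
    dev=$1
    nqueues=$2
    echo "tc qdisc replace dev $dev root handle 1: mq"
    i=1
    while [ "$i" -le "$nqueues" ]; do
        echo "tc qdisc replace dev $dev parent 1:$(printf '%x' "$i") fq_codel"
        i=$((i + 1))
    done
}
```

Something like print_mq_fq_codel eth0 "$(count_tx_queues eth0)" shows
what would be run; review the output before feeding it to a root shell.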

http://snapon.lab.bufferbloat.net/~cero2/deb/patches/0003-Use-FQ_codel-by-default.patch

I just added it to htb and hfsc too:

http://snapon.lab.bufferbloat.net/~cero2/deb/patches/0008-Make-fq_codel-the-default-qdisc-for-htb-and-hfsc.patch

There's a patch to obsolete pfifo_fast entirely in openwrt, which is a
tad premature.

A remaining concern is who this affects:

A) people that expect ifconfig X txqueuelen Y to do anything will be
misled. Perhaps this could be fixed by having the fq_codel default
limit be txqueuelen rather than the default (and overlarge) limit of
10k, but as tons of people are supplying oddball txqueuelens, I tend
to think just ignoring txqueuelen going forward is more the right
thing.

Do you actually get close to 10k packets outstanding in 10GigE under
any sane circumstances?
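
The txqueuelen idea in A) can be approximated from userspace today. This
is a hedged sketch, not what the patch does: the function name is
hypothetical, it only prints the command, and it assumes the device's
txqueuelen is readable from /sys/class/net/<dev>/tx_queue_len and that
fq_codel's "limit" parameter (in packets) is the right knob.

```shell
#!/bin/sh
# Print a tc command that sets fq_codel's packet limit to the device's
# txqueuelen. $1 = device, $2 = sysfs net root (default /sys/class/net,
# overridable only for trying this out without a real device).
print_fq_codel_with_txqueuelen() {
    dev=$1
    root=${2:-/sys/class/net}
    qlen=$(cat "$root/$dev/tx_queue_len") || return 1
    # A txqueuelen of 0 (seen on some virtual devices) would make a
    # useless limit, so leave fq_codel's own default in place instead.
    if [ "$qlen" -gt 0 ]; then
        echo "tc qdisc replace dev $dev root fq_codel limit $qlen"
    else
        echo "tc qdisc replace dev $dev root fq_codel"
    fi
}
```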

B) people that expect pfifo_fast semantics, for which substituting
fq_codel behaves oddly in a few ways:

1) if you are explicitly setting skb->priority for the default
pfifo_fast 3 bands and expecting a result, nothing happens - but in
the general case, people setting skb->priority are trying to get
better latency in the first place, and I really think almost
nobody will notice.

2) if you are using a filter on pfifo_fast that expects 3 bands, and
end up using fq_codel by default anyway, you get DRR-like behavior over
codel rather than strict prioritization and lose fq_codel's full
benefits... which is still a win IMHO. I am not fond of being able to
starve the other two bands....

3) trying to explicitly set pfifo_fast via tc doesn't work with this patch.

4) ECN processing is enabled by default in fq_codel (but the
net.ipv4.tcp_ecn sysctl does not request ECN on outgoing connections
by default)
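
For point 4, a small sketch of how to check what the host's TCP stack
will do with ECN. The helper name is made up for illustration; the
value meanings are the ones documented for net.ipv4.tcp_ecn.

```shell
#!/bin/sh
# Describe a net.ipv4.tcp_ecn value.
describe_tcp_ecn() {
    case "$1" in
        0) echo "ECN disabled" ;;
        1) echo "ECN requested on outgoing and accepted on incoming connections" ;;
        2) echo "ECN accepted on incoming connections only" ;;
        *) echo "unknown tcp_ecn value: $1"; return 1 ;;
    esac
}

# Typical use on a live system:
#   describe_tcp_ecn "$(sysctl -n net.ipv4.tcp_ecn)"
# To have fq_codel drop instead of mark, ECN can be turned off per qdisc:
#   tc qdisc replace dev eth0 root fq_codel noecn
```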





-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

