[Bloat] capturing packets and applying qdiscs

Thu Mar 26 21:19:47 EDT 2015

On Thu, Mar 26, 2015 at 2:39 PM, Isaac Konikoff
<konikofi at candelatech.com> wrote:
> Hi All,
>
> Looking for some feedback in my test setup...
>
> Can you please review my setup and let me know how to improve my application
> of the qdiscs? I've been applying manually, but I'm not sure that is the
> best method, or if the values really make sense. Sorry if this has been
> covered ad nauseam in codel or bloat threads over the past 4+ years...
>
> I've been capturing packets on a dedicated monitor box using the following
> method:
>
> tshark -i moni1 -w <file>
>
> where moni1 is ath9k on channel 149 (5745 MHz), width: 40 MHz, center1: 5755
> MHz

For those of you that don't know how to do aircaps, it is pretty easy.
We are going to be doing a lot more of this as make-wifi-fast goes
along, so...

Install aircrack-ng via whatever means you have available (it works best
on ath9k and seems to work on iwl; I don't know about other devices).

run:

airmon-ng start your_wifi_device your_channel

This will create a monX device of some sort, which you can then
capture with tshark or wireshark. There are all sorts of other cool
features here where - for example - you can post-hoc decrypt a wpa
session, etc.

Note that usually you will have trouble using the device for other
things, so I tend to just run it with the ethernet also connected.
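
A minimal sketch of the capture steps above. The device name wlan0, the
monitor device name mon0 and the output file name are assumptions for
illustration (check what airmon-ng actually reports on your box); channel
149 is taken from Isaac's setup. The script only prints the commands
unless DRY_RUN=0 is set, since they need root and take the device away
from normal use.

```shell
#!/bin/sh
# Sketch only: device names and the capture file are assumptions.
WIFI_DEV="${WIFI_DEV:-wlan0}"   # hypothetical wifi device
CHANNEL="${CHANNEL:-149}"       # channel to monitor
MON_DEV="${MON_DEV:-mon0}"      # monitor device airmon-ng is assumed to create
OUT="${OUT:-aircap.pcap}"       # hypothetical capture file
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands, 0 = run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Put the device into monitor mode on the channel, then capture from it.
aircap() {
    run airmon-ng start "$WIFI_DEV" "$CHANNEL"
    run tshark -i "$MON_DEV" -w "$OUT"
}

aircap
```

Running it as-is shows the two commands; something like
DRY_RUN=0 WIFI_DEV=wlan1 sh aircap.sh (as root) would actually start the
capture.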

We are in dire need of tools that can analyze aircap'd stuff at
different rates, look at beacons, interpacket gaps, wireless g
fallbacks, etc. If anyone knows of anything good, please post to the
list.

> The system under test is a lanforge ath10k ap being driven by another
> lanforge system using ath9k clients to associate and run traffic tests.
>
> The two traffic tests I'm running are:
>
> 1. netperf-wrapper batch consisting of:  tcp_download, tcp_upload,
> tcp_bidirectional, rrul, rrul_be and rtt_fair4be on 4 sta's.

Cool.

> 2. lanforge wifi capacity test using tcp-download incrementing 4 sta's per
> minute up to 64 sta's with each iteration attempting 500Mbps download per x
> number of sta's.
>
> The qdiscs I am using are applied to the virtual ap interface which is the
> egress interface for download tests. I also applied the same qdisc to the
> ap's eth1 for the few upload tests. Is this sane?
>
> qdiscs used, deleting each before trying the next:
> 1. default pfifo_fast
> 2. tc qdisc add dev vap1 root fq_codel
> 3. tc qdisc add dev vap1 root fq_codel target 5ms interval 100ms noecn

1) Qdiscs 2 and 3 are essentially the same, since target 5ms and
interval 100ms are the fq_codel defaults, unless you have also
enabled ecn on both sides of the tcp connection with

sysctl -w net.ipv4.tcp_ecn=1 #or the equivalent in sysctl.conf

With ecn enabled, the results tend to be smoother for tcp, with mildly
higher packet loss on the measurement flows.
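
Before comparing the ecn and noecn runs it may be worth confirming what
each end is actually set to. A small sketch follows; the helper name
ecn_mode is made up for illustration, and the meaning of the values 0, 1
and 2 is as I understand the kernel's ip-sysctl documentation.

```shell
#!/bin/sh
# Describe a net.ipv4.tcp_ecn value. ecn_mode is a hypothetical helper name.
ecn_mode() {
    case "$1" in
        0) echo "off: ECN is neither requested nor accepted" ;;
        1) echo "on: ECN is requested on outgoing and accepted on incoming connections" ;;
        2) echo "passive: ECN is accepted on incoming connections but not requested" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Read the live setting; report "unknown" if /proc is not available.
current="$(cat /proc/sys/net/ipv4/tcp_ecn 2>/dev/null || echo unknown)"
echo "tcp_ecn=$current ($(ecn_mode "$current"))"
```

Run it on both the AP side and the traffic generator side.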

2) I do not have an ath10k in front of me. The ath9k presents 4 queues
controlled by mq (and then some sub-qdisc) when it is in operation, as
does the iwl. Does the ath10k only present one queue?
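
One way to answer that question on the box itself is to count the
transmit queues the device exposes under sysfs. This is a sketch; vap1 is
the AP interface name from Isaac's mail, and the SYSFS_NET override
exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Count the transmit queues a network device exposes via sysfs.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

tx_queue_count() {
    dev="$1"
    n=0
    # Each hardware/software tx queue shows up as a tx-N directory.
    for q in "$SYSFS_NET/$dev"/queues/tx-*; do
        if [ -d "$q" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

echo "vap1 tx queues: $(tx_queue_count vap1)"
```

As root, tc qdisc show dev vap1 should also list an mq root with its
per-queue children if more than one queue is present.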

On the two chipsets mentioned first, the queues are mapped to the 802.11e
VO, VI, BE, and BK queues - very inefficiently. I have long maintained
the VO queue should be obsoleted in favor of the VI queue, and in
general I find wireless-n works better if these queues are entirely
disabled on the AP.

This extremely old piece of code does more of the right thing for the
mq'd style of wifi interface, although it is pretty wrong for
everything else (notably, we typically only use a reduced quantum of
300 on some low speed devices, we never got around to making the tg3
work right, and the tc filter is not the right thing for wifi, either):

https://github.com/dtaht/deBloat/blob/master/src/debloat.sh

> 4. tc qdisc add dev vap1 root fq_codel limit 2000 target 3ms interval 40ms
> noecn

There are about 3 wrong assumptions here.

1) A limit of 1000 packets is still quite enough for even 802.11ac wifi
(or so I think).
2) Although the target and interval are being fiddled with here, there
is so much underlying buffering on wifi that these numbers are not
going to help much. I typically run with a much larger target (30ms)
to cope with wifi's mac access jitter, with the default interval,
when trying to improve per-station performance, along with...

3) What will really help wifi on an AP somewhat is to use a tc dst
filter (rather than the default 5 tuple filter) on fq_codel to sort
stuff into per station queues, and to use a quantum in the 4500 range.
That quantum accounts for either the max number of packets that can be
put in a txop (42 on wireless-n), and/or 3 big packets. Neither
solution is a good one when wifi can handle 64k in a single burst, and
ac, more.
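
A rough sketch of what points 2 and 3 might look like as tc commands.
This is not the filter from my own test setup; the handle 1:, the
divisor of 1024, the prio value and the use of tc's flow classifier
hashing on dst are all assumptions for illustration, and the quantum is
simply 3 full-size (1514 byte) frames. The script only prints the
commands so they can be inspected before being run as root.

```shell
#!/bin/sh
# Sketch: print (not run) tc commands for a per-destination fq_codel setup.
# DEV, the handle 1:, prio and the divisor are assumptions for illustration.
DEV="${DEV:-vap1}"
FRAME_BYTES=1514                   # full-size ethernet frame
QUANTUM=$((3 * FRAME_BYTES))       # "3 big packets": 4542, i.e. the 4500 range

wifi_fq_codel_cmds() {
    # fq_codel with a wifi-tolerant target and the default 100ms interval
    echo "tc qdisc replace dev $DEV root handle 1: fq_codel target 30ms interval 100ms quantum $QUANTUM"
    # hash on destination address only, so traffic to one address shares a queue
    echo "tc filter add dev $DEV parent 1: protocol all prio 1 handle 1 flow hash keys dst divisor 1024"
}

wifi_fq_codel_cmds
```

Piping the output to sh as root would apply it; tc -s qdisc show dev vap1
afterwards shows whether drops and marks are happening at all.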

Even then, the results are far less than pleasing. What is needed, and
what we are going to do, is add real per-station queuing at the lowest
layer and then put something fq_codel like on top of each... and that
work hasn't started yet. The tc filter method I just described will
not work on station ids and thus will treat ipv4 and ipv6 traffic for
the same destination differently.

Now I do have wifi results for this stuff - somewhere - and the right
tc filter for dst filtering on a per mq basis, but it turns out I
think I left all that behind a natted box that I can't get back to
until Thursday next week.

And as always I appreciate every scrap of data, every experiment,
every result obtained via every method, in order to more fully bracket
the real problems and demonstrate progress against wifi's problems, if
and when we start making it. A tarball of what you have would be nice
to have around.

You will see absolutely terrible per-sta download performance on the
rrul and rrul_be tests in particular with any of the qdiscs.
>
> Any suggestions you have would be helpful.
>
> Thanks,
> Isaac

-- 
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb