[Make-wifi-fast] QoS and test setups
Dave Taht
dave.taht at gmail.com
Sat May 7 12:50:50 EDT 2016
On Thu, May 5, 2016 at 7:08 PM, Aaron Wood <woody77 at gmail.com> wrote:
> I saw Dave's tests on WMM vs. without, and started thinking about test
> setups for systems when QoS is in use (using classification, not just
> SQM/AQM).
>
> There are a LOT of assumptions made when QoS systems based on marked packets
> are used:
>
> - That traffic X can starve others
> - That traffic X is more/most important
>
> Our test tools are not particularly good at anything other than hammering
> the network (UDP or TCP). At least TCP has a built-in congestion control.
> I've seen many UDP (or even raw IP) test setups that didn't look anything
> like "real" traffic.
I sat back on this in the hope that someone else would jump forward...
but you asked...
I ran across this distribution today:
https://en.wikipedia.org/wiki/Rayleigh_distribution which looks closer
to the latency/bandwidth distributions we're always looking at.
I found this via this thread:
https://news.ycombinator.com/item?id=11644845 which was fascinating.
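As a quick sanity check on that intuition: a Rayleigh variate is just the
magnitude of two independent zero-mean gaussians, and its single parameter
has a closed-form estimate from samples. A stdlib-only sketch (all numbers
here are illustrative, not measured latencies):

```python
import math
import random

random.seed(42)

# A Rayleigh(sigma) variate is the magnitude of two independent
# zero-mean gaussians, each with standard deviation sigma.
def rayleigh(sigma):
    return math.hypot(random.gauss(0, sigma), random.gauss(0, sigma))

samples = [rayleigh(2.0) for _ in range(100_000)]

# Maximum-likelihood estimate of sigma: sqrt(mean(x^2) / 2)
sigma_hat = math.sqrt(sum(x * x for x in samples) / (2 * len(samples)))

# The median is sigma * sqrt(2 ln 2), with a long right tail above it --
# much like a latency distribution under load.
median = sorted(samples)[len(samples) // 2]
print(f"sigma_hat={sigma_hat:.3f} median={median:.3f} "
      f"theoretical_median={2.0 * math.sqrt(2 * math.log(2)):.3f}")
```

The long tail above the median is what single-number summaries (and means
in particular) tend to hide.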
I have to admit I've learnt most of what I know of statistics through
osmosis, and by looking at (largely realtime) data that does not fit
"normal" distributions like the gaussian. So, rather than coming up
with useful methods to reduce stuff to single numbers, I rely on curves
and graphs, stay painfully aware of how sampling intervals can smooth
out real spikes and problems, and try to convey intuition... and the
wifi industry is wedded to charts of "rate over range for tcp and udp".
Getting rate+latency over range into their test tools would be nice to
see happen.
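To make the sampling-interval point concrete, here's a toy sketch (all
numbers synthetic and of my own choosing) of how averaging over coarser
windows erases a real latency spike:

```python
# Baseline latency 5 ms, with a 10 ms-long burst to 100 ms starting at
# t=100 ms, in a 1-second trace sampled once per millisecond.
trace_ms = [5.0] * 1000
for i in range(100, 110):
    trace_ms[i] = 100.0

def peak_at_interval(trace, interval):
    """Max of per-interval means -- what a sampler at that interval reports."""
    bins = [trace[i:i + interval] for i in range(0, len(trace), interval)]
    return max(sum(b) / len(b) for b in bins)

for interval in (1, 10, 20, 200):
    print(f"{interval:4d} ms sampling -> apparent peak "
          f"{peak_at_interval(trace_ms, interval):6.2f} ms")
```

At 1-10ms sampling the 100ms spike is fully visible; at a 200ms interval
it averages down to under 10ms and looks like noise.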
There is another distribution that andrew was very hot on a few years
ago: https://en.wikipedia.org/wiki/Tracy%E2%80%93Widom_distribution
I thought something like it could be used to look at basic problems in
factoring in (or factoring out) header overheads, for example.
It would be good if we had a good statistician (or several) "on
staff"... or there must be a whole set of mathematicians' mailing
lists somewhere, all aching to dive into a more real-world problem?
> I know Dave has wanted an isochronous traffic tool that could simulate voip
> traffic (with in-band one-way latency/jitter/loss measurement capabilities).
d-itg, which flent has some support for, "does that", but it's a pita
to set up and not exactly safe to use over the open internet.
*Yes*, the fact that the current rrul test suite and most others in
flent lack an isochronous baseline measurement - and use an rtt-bound
measurement instead - leads to very misleading comparisons when the
measurement traffic gets a huge latency reduction. The measurement
traffic becomes larger, and the observed bandwidth in most flent tests
drops, as we are only measuring the bulk flows, not the measurement
traffic, nor the acks.
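The rtt-bound effect is easy to put rough numbers on. A back-of-the-envelope
sketch (packet sizes, rates, and flow counts are illustrative assumptions,
not flent's actual parameters):

```python
# How much traffic do measurement flows generate, and how does that
# change when latency drops?  All figures below are assumptions.
LINK_BPS = 100e6   # assumed 100 Mbit link

def flow_bps(pkt_bytes, pkts_per_sec, flows):
    """Offered load of a set of identical measurement flows, in bits/s."""
    return pkt_bytes * 8 * pkts_per_sec * flows

# An rtt-bound probe (one packet in flight, ping-style) sends 1/rtt
# packets per second, so cutting latency from 100 ms to 20 ms
# multiplies its traffic by 5:
for rtt in (0.100, 0.020):
    bps = flow_bps(pkt_bytes=64, pkts_per_sec=1 / rtt, flows=4)
    print(f"rtt {rtt * 1e3:5.0f} ms -> {bps / 1e3:7.2f} kbit/s "
          f"({100 * bps / LINK_BPS:.4f}% of link)")

# An isochronous probe sends at a fixed rate regardless of rtt:
iso = flow_bps(pkt_bytes=172, pkts_per_sec=50, flows=4)
print(f"isochronous -> {iso / 1e3:7.2f} kbit/s "
      f"({100 * iso / LINK_BPS:.3f}% of link)")
```

The absolute numbers are tiny either way; the point is that rtt-bound
probe traffic grows exactly when the thing under test improves, which
skews before/after bandwidth comparisons.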
Using ping-like traffic was "good enough" when we started and were
cutting latencies by orders of magnitude on a regular basis. But, for
example, I just showed a long-term 5x latency reduction for stock wifi
vs michal's patches at 100mbit - from 100ms to 20ms or so - and I have
no idea how the corresponding bandwidth loss is correlated. In a
couple of tests the measurement flows also drop into another wifi hw
queue entirely (and I'm pretty convinced that we should always fold
stuff into the nearest queue when we're busy, no matter the marking).
Anyway, I'm digesting a ton of the short-term results we got from the
last week of testing michal's patches (see the cerowrt blog github
repo and compare the stock vs fqmac35 results on the short tests). I
*think* that most of the difference in performance is due to noise on
the test (the 120ms burps downward in bandwidth caused by something
else), some of the rest can be accounted for by more measurement
traffic, and probably all the rest is due to dql taking too long to
ramp up.
The long-term result of the fq_codel wifi patch at the mac80211 layer
was *better* all round: bandwidth stayed the same, and latency and
jitter got tons better. (If only I could figure out what is causing
the burps - they don't happen on OSX, just linux.) Anyway, compare the
baseline to the patch on the second plot here:
http://blog.cerowrt.org/post/predictive_codeling/
Lovely stuff.
But the short-term results were noisy, and the tens-of-seconds-long
dql ramp was visible on some of those tests (sorry, no link for those
yet; it was in one of michal's mails).
Also (in flent) I increasingly dislike sampling at 200ms intervals,
and would prefer to be getting insights at 10-20ms intervals. Or
lower! 1ms would be *perfect*. :) I can get --step-size in flent down
to about 40ms before fping starts to "get behind" - fixing that would
require changing fping to use fdtimers to fire things off more
precisely than it does, or finding/writing another ping tool. Linux
fdtimers are *amazing*; we use them in tc_iterate.c.
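fdtimers (timerfd) are Linux-specific, but the portable core of the idea
is to schedule each tick against an absolute deadline rather than a
relative sleep, so one late tick doesn't push every later tick back. A
minimal sketch of that scheduling discipline:

```python
import time

def isochronous_ticks(interval_s, count, fire):
    """Call fire(i) every interval_s seconds against absolute deadlines,
    so lateness on one tick does not accumulate as drift."""
    start = time.monotonic()
    for i in range(count):
        deadline = start + (i + 1) * interval_s
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)   # a late tick just fires immediately
        fire(i)

stamps = []
isochronous_ticks(0.01, 20, lambda i: stamps.append(time.monotonic()))
intervals = [b - a for a, b in zip(stamps, stamps[1:])]
print(f"mean interval {sum(intervals) / len(intervals) * 1e3:.2f} ms")
```

timerfd (and fdtimers generally) does the same thing in the kernel with
far less wakeup jitter, which is why it matters for a sub-40ms ping tool.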
The only way I can think of to get below 5ms would be to have better
tools for looking at packet captures. I have not had much chance to
look at "teacup" as yet; tcptrace -G + xplot.org and wireshark's tools
are as far as I go. Any other tools out there for taking apart
captures? In particular, aircaps of wifi traffic - retransmits, rate
changes - have been giving me enough of a headache to want to sit down
and tear them apart with wireshark's lua stuff... or something.
It would be nice to measure latencies in bulk flows, directly.
...
I've long figured that if we ever got to the basic isochronous test at
the 10ms interval I originally specified, we'd either revise the
rrul-related tests to suit (the rrul2016 "standard"), or create a new
set called "crrul" - "correct rrul".
We have a few isochronous tests to choose from. There are d-itg tests
in the suite that emulate voip fairly well. The show-stopper thus far
has been that tests like those (or iperf/netperf's udp flooding tests)
are unsafe to use on the general internet, and I wanted some form of
test that negotiated a 3-way handshake, at least, and also enforced a
time limit on how long it sent traffic.
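A minimal sketch of that shape of test - an explicit consent exchange
before any traffic flows, plus a hard stop on the sender - might look
like this (the HELLO/OK exchange is my own invention for illustration,
not an existing flent or netperf protocol):

```python
import socket
import threading
import time

# Receiver: accepts one connection, insists on a HELLO before counting
# any payload, and times out rather than hanging forever.
def receiver(ready, port_box, results):
    srv = socket.create_server(("127.0.0.1", 0))
    port_box.append(srv.getsockname()[1])
    ready.set()
    conn, _ = srv.accept()
    with conn, srv:
        hello = conn.recv(64).decode()      # "HELLO <seconds>"
        assert hello.startswith("HELLO ")
        conn.sendall(b"OK\n")                # explicit consent to stream
        conn.settimeout(2.0)                 # never hang on a dead sender
        total = 0
        try:
            while (data := conn.recv(65536)):
                total += len(data)
        except socket.timeout:
            pass
        results["bytes"] = total

ready, port_box, results = threading.Event(), [], {}
t = threading.Thread(target=receiver, args=(ready, port_box, results))
t.start()
ready.wait()

# Sender: propose a duration, wait for consent, then stream with an
# unconditional deadline - no consent or expired deadline means no traffic.
duration = 0.2
with socket.create_connection(("127.0.0.1", port_box[0])) as c:
    c.sendall(f"HELLO {duration}".encode())
    assert c.recv(16).startswith(b"OK")
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        c.sendall(b"x" * 1024)
        time.sleep(0.01)
t.join()
print(f"received {results['bytes']} bytes in {duration} s")
```

A real tool would also want the receiver to enforce the negotiated
duration itself, so a buggy or hostile sender can't stream forever.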
That said, to heck with it for internal tests as we are doing now.
We have a few simpler tools than d-itg that could be built upon.
Avery has the isoping tests, which I'd forked mildly at one point but
never got around to doing much with. There's also the thing I was
calling "twd", which I gave up on, and there's a very precise
>
> What other tools do we need, for replicating traffic types that match how
> these QoS types in wifi are meant to be used? I think we're doing an
> excellent job of showing how they can be abused. Abusing is pretty easy, at
> this point (rrul, iPerf, etc).
:) Solving for abuse is also useful, I think.
Solving for real traffic types like HAS and videoconferencing would be better.
Having a steady, non-greedy flow (like a basic music or video stream)
test would be good.
I'd love to have a 3-5 flow HAS-like test to fold into the others.
I was unaware that iperf3 can output json; I'm not sure what else can
be done with it.
We had tried to use the web10g stuff at one point, but the kernel
patches were too invasive. A lot of what was in web10g has probably
made it into the kernel by now; perhaps we can start pulling out more
complete stats with things like netstat -ss or TCP_INFO?
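As a sketch of the TCP_INFO route (Linux-only; the field offsets below
follow the stable opening of struct tcp_info in <linux/tcp.h> - 8 one-byte
fields followed by u32s - but verify against your kernel headers before
trusting them):

```python
import socket
import struct

def tcp_rtt_us(sock):
    """Return (tcpi_rtt, tcpi_rttvar) in microseconds for a connected
    TCP socket, pulled from the Linux TCP_INFO sockopt."""
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    # u32 fields after the 8 header bytes: rto, ato, snd_mss, rcv_mss,
    # unacked, sacked, lost, retrans, fackets, last_data_sent,
    # last_ack_sent, last_data_recv, last_ack_recv, pmtu, rcv_ssthresh,
    # rtt, rttvar, ...
    words = struct.unpack_from("17I", info, 8)
    return words[15], words[16]

# Demo over loopback: connect, exchange a little data, read the
# kernel's smoothed rtt estimate for the client side.
srv = socket.create_server(("127.0.0.1", 0))
with socket.create_connection(srv.getsockname()) as c, srv:
    conn, _ = srv.accept()
    with conn:
        c.sendall(b"ping")
        conn.recv(16)
        rtt, rttvar = tcp_rtt_us(c)
        print(f"smoothed rtt {rtt} us, rttvar {rttvar} us")
```

This gets at the "measure latencies in bulk flows, directly" wish above:
the kernel already keeps a per-connection smoothed rtt; we just have to
read it out periodically.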
Incidentally - I don't trust d-itg very far. It could use fdtimers,
and it could use realtime privs.
> -Aaron Wood
>
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org