[Make-wifi-fast] [Starlink] RFC: Latency test case text and example report.

Tue Sep 13 15:09:48 EDT 2022

On 9/13/22 11:32 AM, Dave Taht wrote:
> On Tue, Sep 13, 2022 at 9:57 AM Ben Greear <greearb at candelatech.com> wrote:
>>
>> On 9/13/22 9:12 AM, Dave Taht wrote:
>>> On Tue, Sep 13, 2022 at 8:58 AM Ben Greear <greearb at candelatech.com> wrote:
>>>>
>>>> On 9/13/22 8:39 AM, Dave Taht wrote:
>>>>> hey, ben, I'm curious if this test made it into TR398? Is it possible
>>>>> to set up some of this or parts of TR398 to run over starlink?
>>>>>
>>>>> I'm also curious as to if any commercial ax APs were testing out
>>>>> better than when you tested about this time last year.  I've just gone
>>>>> through 9 months of pure hell getting openwrt's implementation of the
>>>>> mt76 and ath10k to multiplex a lot better, and making some forward
>>>>> progress again (
>>>>> https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002/830 )
>>>>> and along the way ran into new problems with location scanning and
>>>>> apple's airdrop....
>>>>>
>>>>> but I just got a batch of dismal results back from the ax210 and
>>>>> mt79... tell me that there's an AP shipping from someone that scales a
>>>>> bit better? Lie if you must...
>>>>
>>>> An mtk7915 based AP that is running recent owrt did better than others.
>>>>
>>>> http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/
>>>
>>> I wanted to be happy, but... tcp...
>>>
>>> http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/chart-31.png
>>>
>>> what's the chipset driving these tests nowadays?
>>
>> That test was done with MTK virtual stations doing the station load (and multi-gig Eth port
>> sending traffic towards the DUT in download direction).
> 
> Openwrt driver or factory?

I run my own kernel, but it would have been 5.17 plus a bunch of patches from mtk tree that
owrt uses, plus my own hackings.

> 
> The last major patches for openwrt mt76 wifi landed aug 4, I think.
> There are a few more under test now that the OS is stable.
> 
>> My assumption is that much of the TCP latency is very likely caused on the
>> traffic generator itself, so that is why we measure udp latency for pass/fail
>> metrics.
> 
> I fear a great deal of it is real, on the path, in the DUT. However
> there is a lot in the local stack too.
> 
> Here's some things to try. TCP small queues stops being effective (at
> this rate) at oh, 8-12 flows,
> and they start accruing in the stack and look like an RTT inflation.
> A big help is to set TCP_NOTSENT_LOWAT to a low value (16k).
> 
> sch_fq is actually worse than fq_codel on the driving host as it too
> accrues packets.
> 
> Trying out tcp reno, and BBR on this workload might show a difference.
> I wish LEDBAT++ was available for linux...
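
For reference, the host-side knobs suggested above (a low unsent-data watermark,
TCP_NOTSENT_LOWAT on Linux, and a per-socket congestion control choice) could be
set from Python roughly like this.  It is only a sketch: sender_options and
make_sender_socket are made-up helper names, the numeric fallbacks are the Linux
values from linux/tcp.h, and whether 16k is the right watermark for this workload
is not something this snippet can tell you.

```python
import socket

# Fall back to the Linux <linux/tcp.h> values if this Python build lacks the names.
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", 13)


def sender_options(notsent_lowat=16 * 1024, congestion=None):
    """Return (level, optname, value) tuples for a latency-sensitive sender.

    Hypothetical helper: notsent_lowat caps how much unsent data the kernel
    will queue per socket; congestion is e.g. "reno" or "bbr".
    """
    opts = [(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, notsent_lowat)]
    if congestion is not None:
        opts.append((socket.IPPROTO_TCP, TCP_CONGESTION, congestion.encode()))
    return opts


def make_sender_socket(**kwargs):
    """Create a TCP socket and apply the options above (Linux assumed)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    for level, name, value in sender_options(**kwargs):
        # Raises OSError if the kernel rejects the option, for example when
        # the requested congestion control module is not loaded or allowed.
        s.setsockopt(level, name, value)
    return s
```

Something like make_sender_socket(congestion="bbr") would then be used for each
generated flow, so reno and BBR runs can be compared with the same watermark.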

I don't have much interest in tuning the traffic generator to report less TCP latency,
because whatever it reports, I can't trust that the generator itself is not a
significant part of the overall end-to-end latency.

So the better question for me is how to get precise info on TCP latency through the DUT
with a generic traffic generator.

We put sequence numbers and time-stamps in our traffic generator payloads, so we could
use wireshark/tshark captures on Eth and WiFi to measure latency from when the DUT
received a pkt on its Eth port to when it transmitted it on WiFi.  It would
be...interesting...to take a multi-million packet capture of 32 stations doing 4 tcp
streams each and try to make sense of that.  I don't think I'd want to spend time on
that now, but if you'd like a pair of packet captures to try it yourself, I'd be happy
to generate them and make them available.
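
A minimal sketch of that matching step, assuming each capture has already been
reduced to (flow_id, seq, timestamp) records and that the Eth and WiFi sniffers
share a clock.  The record layout and the function names are made up for
illustration, not what our tooling does.  It keeps the first sighting on each
side, so the result is latency to the first over-the-air attempt:

```python
from statistics import median


def dut_latencies(eth_records, wifi_records):
    """Match packets seen on both sides of the DUT by (flow_id, seq).

    Each record is (flow_id, seq, timestamp_seconds).  Only the first
    sighting per key is kept on each side, so WiFi retries and TCP
    retransmits do not overwrite the earlier timestamp.
    """
    first_eth = {}
    for flow, seq, ts in eth_records:
        first_eth.setdefault((flow, seq), ts)

    first_air = {}
    for flow, seq, ts in wifi_records:
        first_air.setdefault((flow, seq), ts)

    # Packets never seen on the air (dropped, or missed by the sniffer)
    # are skipped rather than counted as infinite latency.
    return {
        key: first_air[key] - ts
        for key, ts in first_eth.items()
        if key in first_air
    }


def summarize(latencies):
    """Return count, median, 99th percentile and max of the matched packets."""
    values = sorted(latencies.values())
    if not values:
        return None
    p99 = values[min(len(values) - 1, int(0.99 * len(values)))]
    return {
        "count": len(values),
        "median": median(values),
        "p99": p99,
        "max": values[-1],
    }
```

Dict lookups keep this linear in capture size, which matters for a multi-million
packet capture, but both captures still have to fit in memory.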

Or, if you have other ideas for how to test DUT tcp latency under load/scale without having to overly
trust the packet generator's latency reporting, please let me know.

> 
> 
> ... going theoretical ...
> 
> There was some really great work on fractional windows that went into
> google's swift congestion control, this is an earlier paper on it:
> 
> https://research.google/pubs/pub49448/
> 
> and a couple really great papers from google and others last week
> from: https://conferences.sigcomm.org/sigcomm/2022/program.html
> 
> 
>>
>> It would take some special logic, like sniffing eth port and air at same time,
>> and matching packets by looking at the packet content closely to really understand DUT TCP latency.
>> I'm not sure that is worth the effort.
> 
> Heh. I of course, think it is, as TCP is the dominant protocol on the
> internet... anyway,
> to get a baseline comparison between tcp behaviors, you could just do
> a pure ethernet test, set it
> to what bandwidth you are getting out of this test via cake, and
> measure the tcp rtts that way. It would be nice to know what the test
> does without wifi in the way.

With no backpressure, I think that test is useless, and with backpressure, I'm
sure there will be lots of latency on the generator, so we are again back to needing
a way to test the DUT directly.

>> But, assuming we can properly measure slow-speed UDP latency through DUT, do you still
>> think that it is possible that DUT is causing significantly different latency to TCP
>> packets?
> 
> Absolutely. It's the return path that's mostly at fault - every two
> tcp packets need an ack, so
> even if you have working mu-mimo for 4 streams, that's 4 txops
> (minimum) that the clients are going to respond on.
> 
> std Packet caps of this 32 station tcp test would be useful, and
> aircaps would show how efficiently the clients are responding. A lot
> of stations burn a whole txop on a single ack, then get the rest on
> another....

I'll get you some captures next time I have a chance to run this test.

> No, thank you, for sharing. Can you point at some commercial AP we
> could test that does
> better than this on the tcp test?

Not that I recall testing.

Thanks,
Ben

-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com