Let's make wifi fast again!
* Re: [Make-wifi-fast] [Starlink] RFC: Latency test case text and example report.
       [not found] <e9d000cf-14de-ed43-f604-72b02d367eb4@candelatech.com>
@ 2021-09-26 22:23 ` Dave Taht
       [not found] ` <CAA93jw6wCD16dcxaHTM-KRAN8sqpmnpZbNakL_rarQWpHuuaQg@mail.gmail.com>
  1 sibling, 0 replies; 4+ messages in thread
From: Dave Taht @ 2021-09-26 22:23 UTC (permalink / raw)
  To: Ben Greear, rpm; +Cc: starlink, Make-Wifi-fast

Thx Ben. Why is it we get the most work done on bloat on the weekends?

Adding in the RPM folk (mostly Apple at this point). Their test
shipped last week as part of iOS 15 and related releases, and is documented here:

https://support.apple.com/en-us/HT212313

I am glad to hear more folk are working on extending TR398. The
numbers you are reporting are at least better than what we were
getting from the ath10k 5+ years ago, before we reworked the stack.
See the 100-station test here:

https://blog.linuxplumbersconf.org/2016/ocw/system/presentations/3963/original/linuxplumbers_wifi_latency-3Nov.pdf

And I'd hoped that a gang scheduler could be applied on top of that
work to take advantage of the new features in wifi 6.

That said, I don't have any reports of OFDMA or MU-MIMO working at
all, from anyone, at this point.

On Sun, Sep 26, 2021 at 2:59 PM Ben Greear <greearb@candelatech.com> wrote:
>
> I have been working on a latency test that I hope can be included in the TR398 issue 3
> document.  It is based somewhat on Toke's paper on buffer bloat and latency testing,
> with a notable change that I'm doing this on 32 stations in part of the test.
>
> I implemented this test case, and an example run against an enterprise grade AX AP
> is here.  There could still be bugs in my implementation, but I think it is at least
> close to correct:
>
> http://www.candelatech.com/examples/tr398v3-latency-report.pdf
>
> TLDR:  Runs OK with single station, but sees 1+second one-way latency with 32 stations and high load, and UDP often
>    is not able to see any throughput at all, I guess due to too many packets being lost
>    or something.  I hope to run against some cutting-edge OpenWRT APs soon.

packet caps are helpful.

>
> One note on TCP Latency:  This is time to transmit a 64k chunk of data over TCP, not a single
> frame.

This number is dependent on the size of the IW, which sets the minimum
number of round trips required. It's 10 packets in Linux, and
recently OSX moved from 4 to 10. After that the actual completion time
is governed by loss or marking - and in the case of
truly excessive latencies such as you are experiencing, TCP tends to send
more packets after a timeout.
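
To put rough numbers on that, here is a back-of-the-envelope sketch (mine,
purely illustrative) of the best-case round trips needed to move a 64k chunk
under slow start, for a given initial window and a typical MSS:

# Best-case round trips to deliver one chunk under slow start: the
# congestion window roughly doubles each RTT until the chunk is sent.
# Loss, marking, or timeouts only add to this. mss=1448 assumes a
# typical 1500-byte MTU minus headers.
def min_rtts(chunk_bytes, iw_packets=10, mss=1448):
    segments = -(-chunk_bytes // mss)   # ceiling division
    window, sent, rtts = iw_packets, 0, 0
    while sent < segments:
        sent += window
        window *= 2
        rtts += 1
    return rtts

print(min_rtts(64 * 1024, iw_packets=10))  # Linux IW 10 -> 3 RTTs
print(min_rtts(64 * 1024, iw_packets=4))   # older IW 4  -> 4 RTTs

So even a clean, unloaded path costs a few round trips per 64k chunk before
any queueing delay is added on top.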

(packet caps are helpful)

> My testbed used 32 Intel ax210 radios as stations in this test.
>
> I am interested in feedback from this list if anyone has opinions.

So far as I knew, the wifi stack rework and API were now supported by
most of the Intel chipsets. AQL was also needed.

Please see if you have any "aqm" files: cat /sys/kernel/debug/ieee80211/phy*/aqm
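
If it helps, a tiny script to dump whatever is there (a sketch: it assumes
debugfs is mounted at the usual /sys/kernel/debug, and reading it typically
needs root):

# Dump the per-phy mac80211 AQM/fq debug stats, if the driver exposes them.
import glob

paths = glob.glob("/sys/kernel/debug/ieee80211/phy*/aqm")
if not paths:
    print("no aqm files found - driver may predate the fq/AQL rework")
for path in paths:
    print("==", path, "==")
    print(open(path).read())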

>
> Here is text of the test case:
>
> The Latency test intends to verify latency under low, high, and maximum AP traffic load, with
> 1 and 32 stations. Traffic load is 4 bi-directional TCP streams for each station, plus a
> low speed UDP connection to probe latency.
>
> Test Procedure
>
> DUT should be configured for 20 MHz on 2.4 GHz and 80 MHz on 5 GHz, and stations should use
> two spatial streams.
>
> 1: For each combination of:  2.4 GHz N, 5 GHz AC, 2.4 GHz AX, 5 GHz AX:
>
> 2: Configure attenuators to emulate 2-meter distance between stations and AP.
>
> 3: Create 32 stations and allow one to associate with the DUT.  The other 31 are admin-down.
>
> 4: Create AP to Station (download) TCP stream, and run for 120 seconds, record
>     throughput as 'maximum_load'.  Stop this connection.
>
> 5: Calculate offered_load as 1% of maximum_load.
>
> 6: Create 4 TCP streams on each active station, each configured for Upload and Download rate of
>     offered_load / (4 * active_station_count * 2).  (See the worked example below.)
>
> 7: Create 1 UDP stream on each active station, configured for 56kbps traffic Upload and 56kbps traffic Download.
>
> 8: Start all TCP and UDP connections.  Wait 30 seconds to let traffic settle.
>
> 9: Every 10 seconds for 120 seconds, record one-way download latency over the last 10 seconds for each UDP connection.  Depending on test
>     equipment features, this may mean you need to start/stop the UDP every 10 seconds or clear the UDP connection
>     counters.
>
> 10: Calculate offered_load as 70% of maximum_load, and repeat steps 6 - 9 inclusive.
>
> 11: Calculate offered_load as 125% of maximum_load, and repeat steps 6 - 9 inclusive.
>
> 12: Allow the other 31 stations to associate, and repeat steps 5 - 11 inclusive with all 32 stations active.
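
For illustration, here is the per-stream rate arithmetic from step 6 worked
out as a quick sketch (the 600 Mbps maximum_load is just a made-up example
value, not a measurement):

# Per-stream TCP rate = offered_load / (4 streams * active stations * 2 directions)
def per_stream_rate_bps(maximum_load_bps, load_fraction, active_stations):
    offered_load = maximum_load_bps * load_fraction
    return offered_load / (4 * active_stations * 2)

max_load = 600e6  # hypothetical maximum_load from step 4, in bits/sec
for frac in (0.01, 0.70, 1.25):
    for stations in (1, 32):
        rate = per_stream_rate_bps(max_load, frac, stations)
        print("%3d%% load, %2d station(s): %8.3f Mbps per TCP stream"
              % (frac * 100, stations, rate / 1e6))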
>
>
> Pass/Fail Criteria
>
> 1: For each test configuration running at 1% of maximum load:  Average of all UDP latency samples must be less than 10ms.
> 2: For each test configuration running at 1% of maximum load:  Maximum of all UDP latency samples must be less than 20ms.
> 3: For each test configuration running at 70% of maximum load:  Average of all UDP latency samples must be less than 20ms.
> 4: For each test configuration running at 70% of maximum load:  Maximum of all UDP latency samples must be less than 40ms.
> 5: For each test configuration running at 125% of maximum load:  Average of all UDP latency samples must be less than 50ms.
> 6: For each test configuration running at 125% of maximum load:  Maximum of all UDP latency samples must be less than 100ms.
> 7: For each test configuration: Each UDP connection upload throughput must be at least 1/2 of requested UDP speed for final 10-second test interval.
> 8: For each test configuration: Each UDP connection download throughput must be at least 1/2 of requested UDP speed for final 10-second test interval.
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc  http://www.candelatech.com
> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink



-- 
Fixing Starlink's Latencies: https://www.youtube.com/watch?v=c9gLo6Xrwgw

Dave Täht CEO, TekLibre, LLC


* Re: [Make-wifi-fast] [Starlink] RFC: Latency test case text and example report.
       [not found]       ` <95b2dd55-9265-9976-5bd0-52f9a46dd118@candelatech.com>
@ 2022-09-13 18:32         ` Dave Taht
  2022-09-13 19:09           ` Ben Greear
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Taht @ 2022-09-13 18:32 UTC (permalink / raw)
  To: Ben Greear; +Cc: Dave Taht via Starlink, Make-Wifi-fast

On Tue, Sep 13, 2022 at 9:57 AM Ben Greear <greearb@candelatech.com> wrote:
>
> On 9/13/22 9:12 AM, Dave Taht wrote:
> > On Tue, Sep 13, 2022 at 8:58 AM Ben Greear <greearb@candelatech.com> wrote:
> >>
> >> On 9/13/22 8:39 AM, Dave Taht wrote:
> >>> hey, ben, I'm curious if this test made it into TR398? Is it possible
> >>> to setup some of this or parts of TR398 to run over starlink?
> >>>
> >>> I'm also curious as to if any commercial ax APs were testing out
> >>> better than when you tested about this time last year.  I've just gone
> >>> through 9 months of pure hell getting openwrt's implementation of the
> >>> mt76 and ath10k to multiplex a lot better, and making some forward
> >>> progress again (
> >>> https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002/830 )
> >>> and along the way ran into new problems with location scanning and
> >>> apple's airdrop....
> >>>
> >>> but I just got a batch of dismal results back from the ax210 and
> >>> mt79... tell me that there's an AP shipping from someone that scales a
> >>> bit better? Lie if you must...
> >>
> >> An mtk7915 based AP that is running recent owrt did better than others.
> >>
> >> http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/
> >
> > I wanted to be happy, but... tcp...
> >
> > http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/chart-31.png
> >
> > what's the chipset driving these tests nowadays?
>
> That test was done with MTK virtual stations doing the station load (and multi-gig Eth port
> sending traffic towards the DUT in download direction).

Openwrt driver or factory?

The last major patches for openwrt mt76 wifi landed aug 4, I think.
There are a few more under test now that the OS is stable.

> My assumption is that much of the TCP latency is very likely caused on the
> traffic generator itself, so that is why we measure udp latency for pass/fail
> metrics.

I fear a great deal of it is real, on the path, in the DUT. However
there is a lot in the local stack too.

Here are some things to try. TCP small queues stops being effective (at
this rate) at, oh, 8-12 flows,
and packets start accruing in the stack and look like RTT inflation.
A big help is to set TCP_NOTSENT_LOWAT to a low value (16k).
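
For reference, a minimal sketch of setting that on a sender socket under
Linux or macOS (16k is just the value suggested above):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Keep no more than ~16 KB of not-yet-sent data queued in the kernel for
# this socket, so the sender backs off instead of building a deep queue.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NOTSENT_LOWAT, 16 * 1024)

(The same thing can be applied system-wide on Linux via the
net.ipv4.tcp_notsent_lowat sysctl.)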

sch_fq is actually worse than fq_codel on the driving host as it too
accrues packets.

Trying out tcp reno, and BBR on this workload might show a difference.
I wish LEDBAT++ was available for linux...
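
Switching congestion control per connection is easy enough to experiment
with on Linux, e.g. this sketch (the chosen module has to be built/loaded
and allowed by the kernel):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Ask the kernel to use BBR (or "reno", "cubic", ...) for this connection.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")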


... going theoretical ...

There was some really great work on fractional windows that went into
Google's Swift congestion control; this is an earlier paper on it:

https://research.google/pubs/pub49448/

and a couple really great papers from google and others last week
from: https://conferences.sigcomm.org/sigcomm/2022/program.html


>
> It would take some special logic, like sniffing eth port and air at same time,
> and matching packets by looking at the packet content closely to really understand DUT TCP latency.
> I'm not sure that is worth the effort.

Heh. I of course, think it is, as TCP is the dominant protocol on the
internet... anyway,
to get a baseline comparison between tcp behaviors, you could just do
a pure ethernet test, set it
to what bandwidth you are getting out of this test via cake, and
measure the tcp rtts that way. It would be nice to know what the test
does without wifi in the way.
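
Something along these lines, perhaps (a sketch; the interface name and the
600mbit rate are placeholders for whatever the wifi run actually measured):

# Shape the wired test path with cake to the rate the wifi test achieved,
# then rerun the TCP streams over ethernet and compare RTTs.
import subprocess

subprocess.run(
    ["tc", "qdisc", "replace", "dev", "eth0", "root",
     "cake", "bandwidth", "600mbit"],
    check=True,
)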

>
> But, assuming we can properly measure slow-speed UDP latency through DUT, do you still
> think that it is possible that DUT is causing significantly different latency to TCP
> packets?

Absolutely. It's the return path that's mostly at fault - every two
tcp packets needs an ack, so
even if you have working mu-mimo for 4 streams, that's 4 txops
(minimum) that the clients are going to respond on.

Standard packet caps of this 32 station tcp test would be useful, and
aircaps would show how efficiently the clients are responding. A lot
of stations burn a whole txop on a single ack, then get the rest on
another....

>
> Thanks,

No - thank you for sharing. Can you point at some commercial AP we
could test that does
better than this on the tcp test?

> Ben
>
>
> >
> >> The test was at least tentatively accepted into tr398v3, but I don't think anyone other than ourselves has implemented
> >> or tested it.  I think the pass/fail will need to be adjusted to make it easier to pass.  Some APs were showing
> >> multiple seconds of latency, so maybe a few hundred ms is really OK.
> >>
> >> The test should be able to run over WAN if desired, though it would take a bit
> >> of extra setup to place an upstream LANforge endpoint on a cloud VM.
> >>
> >> If someone@spacex wants to run this test, please contact me off list and we can help
> >> make it happen.
> >>
> >> Thanks,
> >> Ben
> >>
> >> --
> >> Ben Greear <greearb@candelatech.com>
> >> Candela Technologies Inc  http://www.candelatech.com
> >>
> >
> >
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc  http://www.candelatech.com
>


-- 
FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
Dave Täht CEO, TekLibre, LLC


* Re: [Make-wifi-fast] [Starlink] RFC: Latency test case text and example report.
  2022-09-13 18:32         ` Dave Taht
@ 2022-09-13 19:09           ` Ben Greear
  2022-09-13 19:25             ` Bob McMahon
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2022-09-13 19:09 UTC (permalink / raw)
  To: Dave Taht; +Cc: Dave Taht via Starlink, Make-Wifi-fast

On 9/13/22 11:32 AM, Dave Taht wrote:
> On Tue, Sep 13, 2022 at 9:57 AM Ben Greear <greearb@candelatech.com> wrote:
>>
>> On 9/13/22 9:12 AM, Dave Taht wrote:
>>> On Tue, Sep 13, 2022 at 8:58 AM Ben Greear <greearb@candelatech.com> wrote:
>>>>
>>>> On 9/13/22 8:39 AM, Dave Taht wrote:
>>>>> hey, ben, I'm curious if this test made it into TR398? Is it possible
>>>>> to setup some of this or parts of TR398 to run over starlink?
>>>>>
>>>>> I'm also curious as to if any commercial ax APs were testing out
>>>>> better than when you tested about this time last year.  I've just gone
>>>>> through 9 months of pure hell getting openwrt's implementation of the
>>>>> mt76 and ath10k to multiplex a lot better, and making some forward
>>>>> progress again (
>>>>> https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002/830 )
>>>>> and along the way ran into new problems with location scanning and
>>>>> apple's airdrop....
>>>>>
>>>>> but I just got a batch of dismal results back from the ax210 and
>>>>> mt79... tell me that there's an AP shipping from someone that scales a
>>>>> bit better? Lie if you must...
>>>>
>>>> An mtk7915 based AP that is running recent owrt did better than others.
>>>>
>>>> http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/
>>>
>>> I wanted to be happy, but... tcp...
>>>
>>> http://www.candelatech.com/examples/TR-398v2-2022-06-05-08-28-57-6.2.6-latency-virt-sta-new-atf-c/chart-31.png
>>>
>>> what's the chipset driving these tests nowadays?
>>
>> That test was done with MTK virtual stations doing the station load (and multi-gig Eth port
>> sending traffic towards the DUT in download direction).
> 
> Openwrt driver or factory?

I run my own kernel, but it would have been 5.17 plus a bunch of patches from the mtk tree that
owrt uses, plus my own hacking.

> 
> The last major patches for openwrt mt76 wifi landed aug 4, I think.
> There are a few more under test now that the OS is stable.
> 
>> My assumption is that much of the TCP latency is very likely caused on the
>> traffic generator itself, so that is why we measure udp latency for pass/fail
>> metrics.
> 
> I fear a great deal of it is real, on the path, in the DUT. However
> there is a lot in the local stack too.
> 
> > Here are some things to try. TCP small queues stops being effective (at
> > this rate) at, oh, 8-12 flows,
> > and packets start accruing in the stack and look like RTT inflation.
> > A big help is to set TCP_NOTSENT_LOWAT to a low value (16k).
> 
> sch_fq is actually worse than fq_codel on the driving host as it too
> accrues packets.
> 
> Trying out tcp reno, and BBR on this workload might show a difference.
> I wish LEDBAT++ was available for linux...

I don't have much interest in trying to get the traffic generator to report less TCP latency
by tuning the traffic generator, because whatever it reports, I do not trust it not to be a
significant part of the overall end-to-end latency.

So, the better question for me is how to get precise info on TCP latency through the DUT for
a generic traffic generator.

We put sequence numbers and time-stamps in our traffic generator payloads, so we could
use wireshark/tshark captures on Eth and WiFi to detect latency from when the DUT would have received
the packet on the Eth port and transmitted it on WiFi.  It would be...interesting...to take a multi-million packet
capture of 32 stations doing 4 TCP streams each and try to make sense of that.  I don't think
I'd want to spend time on that now, but if you'd like a pair of packet captures to try it yourself, I'd be happy
to generate them and make them available.
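
The matching step itself would not be much code. A rough sketch, assuming the
generator sequence numbers and capture timestamps have already been pulled out
of the two captures (that extraction is specific to our payload format and is
just assumed here):

# Match packets seen on the Eth side to the same packets on the WiFi side
# by the generator's payload sequence number, then summarize DUT latency.
def dut_latency_ms(eth_pkts, wifi_pkts):
    # eth_pkts / wifi_pkts: iterables of (generator_seq, timestamp_seconds)
    eth_time = dict(eth_pkts)
    deltas = sorted((wifi_ts - eth_time[seq]) * 1000.0
                    for seq, wifi_ts in wifi_pkts if seq in eth_time)
    n = len(deltas)
    return {"samples": n,
            "median_ms": deltas[n // 2] if n else None,
            "p99_ms": deltas[min(n - 1, int(n * 0.99))] if n else None}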

Or, if you have other ideas for how to test DUT tcp latency under load/scale without having to overly
trust the packet generator's latency reporting, please let me know.

> 
> 
> > ... going theoretical ...
> 
> > There was some really great work on fractional windows that went into
> > Google's Swift congestion control; this is an earlier paper on it:
> 
> https://research.google/pubs/pub49448/
> 
> and a couple really great papers from google and others last week
> from: https://conferences.sigcomm.org/sigcomm/2022/program.html
> 
> 
>>
>> It would take some special logic, like sniffing eth port and air at same time,
>> and matching packets by looking at the packet content closely to really understand DUT TCP latency.
>> I'm not sure that is worth the effort.
> 
> Heh. I of course, think it is, as TCP is the dominant protocol on the
> internet... anyway,
> to get a baseline comparison between tcp behaviors, you could just do
> a pure ethernet test, set it
> to what bandwidth you are getting out of this test via cake, and
> measure the tcp rtts that way. It would be nice to know what the test
> does without wifi in the way.

With no backpressure, I think that test is useless, and with backpressure, I'm
sure there will be lots of latency on the generator, so again back to needing a way to
test DUT directly.


>> But, assuming we can properly measure slow-speed UDP latency through DUT, do you still
>> think that it is possible that DUT is causing significantly different latency to TCP
>> packets?
> 
> Absolutely. It's the return path that's mostly at fault - every two
> tcp packets needs an ack, so
> even if you have working mu-mimo for 4 streams, that's 4 txops
> (minimum) that the clients are going to respond on.
> 
> > Standard packet caps of this 32 station tcp test would be useful, and
> > aircaps would show how efficiently the clients are responding. A lot
> > of stations burn a whole txop on a single ack, then get the rest on
> > another....

I'll get you some captures next time I have a chance to run this test.

> No - thank you for sharing. Can you point at some commercial AP we
> could test that does
> better than this on the tcp test?

Not that I recall testing.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: [Make-wifi-fast] [Starlink] RFC: Latency test case text and example report.
  2022-09-13 19:09           ` Ben Greear
@ 2022-09-13 19:25             ` Bob McMahon
  0 siblings, 0 replies; 4+ messages in thread
From: Bob McMahon @ 2022-09-13 19:25 UTC (permalink / raw)
  To: Ben Greear; +Cc: Dave Taht, Dave Taht via Starlink, Make-Wifi-fast


We find that iperf2's --tcp-write-prefetch (https://sourceforge.net/projects/iperf2/)
set to some small value can be trusted and eliminates send-side bloat.
It sets TCP_NOTSENT_LOWAT. This probably should be used generally,
as send-side bloat adds no value to e2e traffic.

We find the iperf 2 bounce-back feature useful too. It has both RTT
sampling (i.e. network-level RTTs) and app-level round trip times in the
same sample report. You'll need version 2.1.8 for this. OpenWrt doesn't
have a maintainer for iperf 2 and ships a very old version.

In general, this test seems contrived.  Also, probably a good idea to add
some variable phase shifters to the rig.  Use a 5 branch tree to produce
the distance matrices.  Anyway, a lot can be done on the RF side beyond
attenuators.

Pass/fail should be statistically based. Latency should be analyzed at the
tails of the distributions.
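
For instance, something like this per sample set (a sketch; the percentiles
shown are illustrative, not proposed limits):

import statistics

def tail_stats(samples_ms):
    s = sorted(samples_ms)
    n = len(s)
    return {"mean_ms": statistics.fmean(s),
            "p99_ms": s[min(n - 1, int(n * 0.99))],
            "p999_ms": s[min(n - 1, int(n * 0.999))]}

Pass/fail would then key off the upper percentiles rather than the mean or a
single worst-case sample.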

Bob



