Let's make wifi fast again!
* [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
@ 2020-05-14 16:43 Tim Higgins
  2020-05-14 21:38 ` Bob McMahon
  2020-05-15  6:47 ` Erkki Lintunen
  0 siblings, 2 replies; 16+ messages in thread
From: Tim Higgins @ 2020-05-14 16:43 UTC (permalink / raw)
  To: Make-Wifi-fast

[-- Attachment #1: Type: text/html, Size: 1066 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-14 16:43 [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work? Tim Higgins
@ 2020-05-14 21:38 ` Bob McMahon
  2020-05-14 21:42   ` Bob McMahon
  2020-05-15  6:47 ` Erkki Lintunen
  1 sibling, 1 reply; 16+ messages in thread
From: Bob McMahon @ 2020-05-14 21:38 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 2248 bytes --]

I haven't looked closely at OFDMA but these latency numbers seem way too
high for it to matter.  Why is the latency so high?  It suggests there may
be queueing delay (bloat) unrelated to media access.

Also, one aspect is that OFDMA replaces EDCA with AP scheduling per
trigger frame.  EDCA kinda sucks because of listen-before-talk, which
averages about 100 microseconds and has to be paid even when there is no
energy detect.  This limits performance to 10K transmits per second
(1/0.0001). Also remember that WiFi aggregates, so transmissions carry
multiple packets and long transmits will consume those 10K tx ops. One way
to get around aggregation is to use the voice (VO) access class, which many
devices won't aggregate (mileage will vary). Then take a packets per second
measurement with small packets.  This would give an idea of whether the
frame scheduling is AP based vs EDCA.
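
As a back-of-the-envelope sketch of that tx-op budget (pure arithmetic;
the aggregation sizes are assumed values and payload airtime is ignored):

    # ~100 us of average listen-before-talk per transmit caps the channel
    # at roughly 1 / 100e-6 = 10,000 tx ops per second, regardless of payload.
    EDCA_OVERHEAD_S = 100e-6
    TX_OPS_PER_SEC = 1 / EDCA_OVERHEAD_S

    for pkts_per_txop in (1, 8, 64):      # 1 is roughly VO with no aggregation
        pps = TX_OPS_PER_SEC * pkts_per_txop
        print("%3d pkts/txop -> %8.0f tx ops/s, %10.0f packets/s"
              % (pkts_per_txop, TX_OPS_PER_SEC, pps))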

Also, measuring ping time as a proxy for latency isn't ideal. Better to
measure trip times of the actual traffic.  This requires clock sync to a
common reference. GPS atomic clocks are available but it does take some
setup work.
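
As a minimal sketch of the idea, assuming both ends are already disciplined
to a common clock (the port number and payload format are arbitrary choices):

    import socket, struct, sys, time

    PORT = 5005  # arbitrary port for this sketch

    def sender(dst, count=100):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        for _ in range(count):
            # stamp the departure time (ns since the epoch) into the payload
            s.sendto(struct.pack("!q", time.time_ns()), (dst, PORT))
            time.sleep(0.01)

    def receiver():
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", PORT))
        while True:
            data, _ = s.recvfrom(64)
            sent_ns, = struct.unpack("!q", data[:8])
            # only meaningful if sender and receiver share a clock reference
            print("one-way trip: %.3f ms" % ((time.time_ns() - sent_ns) / 1e6))

    if __name__ == "__main__":
        receiver() if sys.argv[1] == "recv" else sender(sys.argv[1])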

I haven't thought about RU optimizations and that testing so can't really
comment there.

Also, I'd consider replacing the mechanical turntable with variable phase
shifters set in the MIMO (or H-matrix) path.  I use model 8421
from Aeroflex
<https://www.apitech.com/globalassets/documents/products/rf-microwave-microelectronics-power-solutions/rf-components/phase-shifter-subsystem/wmod84208421.pdf>.
Others make them too.

Bob

On Thu, May 14, 2020 at 9:43 AM Tim Higgins <tim@smallnetbuilder.com> wrote:

> Hi folks,
>
> I decided to publish some details of the hoops I've been jumping through
> to try to find benefit from OFDMA.
> It's proving very hard to do.
>
>
> https://www.smallnetbuilder.com/wireless/wireless-features/33222-does-ofdma-really-work-part-1
>
> I'll publish results from real devices next. But I'm still trying
> different things to get SOMETHING to show an improvement from OFDMA.
>
> Suggestions are welcome.
> ===========
> Tim
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast

[-- Attachment #2: Type: text/html, Size: 3256 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-14 21:38 ` Bob McMahon
@ 2020-05-14 21:42   ` Bob McMahon
  2020-05-15 15:20     ` Tim Higgins
  0 siblings, 1 reply; 16+ messages in thread
From: Bob McMahon @ 2020-05-14 21:42 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 2823 bytes --]

Also, forgot to mention: for latency, don't rely on the average, as most
don't care about that.  Maybe use the upper 3 stdev, i.e. the 99.97% point.
Our latency runs repeat 20 seconds' worth of packets and find that point,
then we calculate CDFs of this tail point across hundreds of runs under
different conditions. One "slow packet" is all it takes to screw up the
user experience when it comes to latency.
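
Pulling that tail point out of raw samples is only a few lines (plain
Python; the sample values below are made up):

    import statistics

    def tail_latency(samples_ms, quantile=0.9997):
        # Latency at the given CDF point (~mean + 3 stdev for a normal dist).
        xs = sorted(samples_ms)
        idx = min(len(xs) - 1, int(quantile * len(xs)))
        return xs[idx]

    samples = [1.2, 1.3, 1.1, 1.4, 9.8, 1.2, 1.3]   # made-up run
    print("mean    %.2f ms" % statistics.mean(samples))
    print("99.97%%  %.2f ms" % tail_latency(samples))  # one slow packet dominates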

Bob

On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com>
wrote:

> I haven't looked closely at OFDMA but these latency numbers seem way too
> high for it to matter.  Why is the latency so high?  It suggests there may
> be queueing delay (bloat) unrelated to media access.
>
> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling per
> trigger frame.  EDCA kinda sucks per listen before talk which is about 100
> microseconds on average which has to be paid even when no energy detect.
> This limits the transmits per second performance to 10K (1/0.0001.). Also
> remember that WiFi aggregates so transmissions have multiple packets and
> long transmits will consume those 10K tx ops. One way to get around
> aggregation is to use voice (VO) access class which many devices won't
> aggregate (mileage will vary.). Then take a packets per second
> measurement with small packets.  This would give an idea on the frame
> scheduling being AP based vs EDCA.
>
> Also, measuring ping time as a proxy for latency isn't ideal. Better to
> measure trip times of the actual traffic.  This requires clock sync to a
> common reference. GPS atomic clocks are available but it does take some
> setup work.
>
> I haven't thought about RU optimizations and that testing so can't really
> comment there.
>
> Also, I'd consider replacing the mechanical turn table with variable phase
> shifters and set them in the MIMO (or H-Matrix) path.  I use model 8421
> from Aeroflex
> <https://www.apitech.com/globalassets/documents/products/rf-microwave-microelectronics-power-solutions/rf-components/phase-shifter-subsystem/wmod84208421.pdf>.
> Others make them too.
>
> Bob
>
> On Thu, May 14, 2020 at 9:43 AM Tim Higgins <tim@smallnetbuilder.com>
> wrote:
>
>> Hi folks,
>>
>> I decided to publish some details of the hoops I've been jumping through
>> to try to find benefit from OFDMA.
>> It's proving very hard to do.
>>
>>
>> https://www.smallnetbuilder.com/wireless/wireless-features/33222-does-ofdma-really-work-part-1
>>
>> I'll publish results from real devices next. But I'm still trying
>> different things to get SOMETHING to show an improvement from OFDMA.
>>
>> Suggestions are welcome.
>> ===========
>> Tim
>> _______________________________________________
>> Make-wifi-fast mailing list
>> Make-wifi-fast@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
>

[-- Attachment #2: Type: text/html, Size: 4092 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-14 16:43 [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work? Tim Higgins
  2020-05-14 21:38 ` Bob McMahon
@ 2020-05-15  6:47 ` Erkki Lintunen
  2020-05-15 15:34   ` Tim Higgins
  1 sibling, 1 reply; 16+ messages in thread
From: Erkki Lintunen @ 2020-05-15  6:47 UTC (permalink / raw)
  To: make-wifi-fast


Hi,

thank you for the article, it provided interesting tidbits. What baffles
me (I'm just a smallnet builder, not a wifi RF-chip designer or a wifi
firmware writer) is why not run a simple comparative benchmark between APs
from different generations? Say an AX router with OFDMA on (no need to
rely on the firmware doing its best whether OFDMA is set on or off; it is
very interesting that the tests showed differences between OFDMA on/off,
and there is, I think, a tertiary feature to switch it on/off, as marketing
material and publications that want to be taken seriously tout OFDMA as a
silver bullet with WiFi6 written on it), an AC router with stock, latest
and greatest firmware in it, and an AC router with OpenWRT and SQM/Cake in
it. All three put through the same tests to reveal differences in
throughput, latency, and airtime congestion between STAs, and most
importantly all of the previous quantities measured under maximum load.


- Erkki


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-14 21:42   ` Bob McMahon
@ 2020-05-15 15:20     ` Tim Higgins
  2020-05-15 19:36       ` Bob McMahon
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Higgins @ 2020-05-15 15:20 UTC (permalink / raw)
  To: Bob McMahon; +Cc: tim, Make-Wifi-fast

[-- Attachment #1: Type: text/html, Size: 4982 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15  6:47 ` Erkki Lintunen
@ 2020-05-15 15:34   ` Tim Higgins
  2020-05-15 16:25     ` Dave Taht
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Higgins @ 2020-05-15 15:34 UTC (permalink / raw)
  To: Erkki Lintunen; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/html, Size: 2412 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 15:34   ` Tim Higgins
@ 2020-05-15 16:25     ` Dave Taht
  2020-05-15 16:41       ` Tim Higgins
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Taht @ 2020-05-15 16:25 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Erkki Lintunen, Make-Wifi-fast

On Fri, May 15, 2020 at 8:34 AM Tim Higgins <tim@timhiggins.com> wrote:
>
> Hi Erkki,
>
> Thanks for your comments.
>
> A simple test is exactly what I'm after. The Part 1 article was basically a look behind the scenes to describe why I used the benchmark I did for the Part 2 article. It will compare an AC router and multiple AX routers. If there's an off-the-shelf consumer router that implements SQM/cake on Wi-Fi, I'm happy to try it. I have an evenroute IQrouter here, but I'm not sure if it implements SQM on Wi-Fi. I'll ask.

"SQM" isn't the thing. SQM is for better managing the ISP uplinks/downlinks.

it's the fq_codel and aqm scheduler for wifi. I'm under the impression
that the evenroute v2 production version now has some of the
aql/fq_codel related code backported to it. I've been off tuning that
up over here: https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/
but that code has not made it back into mainline.

The v3 has long had it, but it was never optimized properly because I
didn't have one. Still, I've never had a testbed as nice as yours;
it would be great, absolutely great, to get some feedback on the
methods you are applying to see how decent it is today! :)

The principal mis-feature in the current openwrt codebases in
wifi's case is some code we left in there keeping the codel target
too high for 5ghz wifi, and far too many hw retries in the firmware
and driver for these two chipsets. However there is no express
support for mu-mimo in our versions as, like you, we discovered the
concept didn't work - and couldn't ever work as designed - so we
discarded it in favor of better per-station scheduling in the first
place.

https://arxiv.org/pdf/1703.00064.pdf
There were so many things left unfinished in that project. We could
have handled 802.11e much better than we do.





-- 
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 16:25     ` Dave Taht
@ 2020-05-15 16:41       ` Tim Higgins
  0 siblings, 0 replies; 16+ messages in thread
From: Tim Higgins @ 2020-05-15 16:41 UTC (permalink / raw)
  To: Dave Taht; +Cc: tim, Erkki Lintunen, Make-Wifi-fast

[-- Attachment #1: Type: text/html, Size: 3303 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 15:20     ` Tim Higgins
@ 2020-05-15 19:36       ` Bob McMahon
  2020-05-15 19:50         ` Tim Higgins
  0 siblings, 1 reply; 16+ messages in thread
From: Bob McMahon @ 2020-05-15 19:36 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 4291 bytes --]

Latency testing under "heavy traffic" isn't ideal.  If the input
rate exceeds the service rate of any queue for any period of time, the queue
fills up and latency hits a worst case set by that queue depth.  I'd say take
latency measurements when the input rates are below the service rates. The
measurements when service rates are less than input rates are less
about latency and more about bloat.

Also, a good paper is this one on trading bandwidth for ultra low latency
<https://people.csail.mit.edu/alizadeh/papers/hull-nsdi12.pdf> using phantom
queues and ECN.

Another thing to consider is that network engineers tend to have a myopic
view of latency.  The queueing or delay between the socket writes/reads and
network stack matters too.  Network engineers focus on packets or TCP RTTs
and somewhat overlook a user's true end to end experience.  Avoiding bloat
by slowing down the writes, e.g. ECN or different scheduling, still
contributes to end/end latency between the writes() and the reads() that
too few test for and monitor.

Note: We're moving to trip times of writes to reads (or frames for video)
for our testing. We are also replacing/supplementing pings with TCP
connects as other "latency related" measurements. TCP connects are more
important than ping.
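
A bare-bones version of that connect measurement is just timing the
three-way handshake (standard library only; host and port are placeholders):

    import socket, time

    def tcp_connect_ms(host, port=80, timeout=2.0):
        # Time TCP connection establishment to host:port, in milliseconds.
        # If host is a name, DNS resolution time is included as well.
        t0 = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass
        return (time.perf_counter() - t0) * 1000.0

    print("connect time: %.1f ms" % tcp_connect_ms("example.com"))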

Bob

On Fri, May 15, 2020 at 8:20 AM Tim Higgins <tim@smallnetbuilder.com> wrote:

> Hi Bob,
>
> Thanks for your comments and feedback. Responses below:
>
> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>
> Also, forgot to mention, for latency don't rely on average as most don't
> care about that.  Maybe use the upper 3 stdev, i.e. the 99.97% point.  Our
> latency runs will repeat 20 seconds worth of packets and find that then
> calculate CDFs of this point in the tail across hundreds of runs under
> different conditions. One "slow packet" is all that it takes to screw up
> user experience when it comes to latency.
>
> Thanks for the guidance.
>
>
> On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com>
> wrote:
>
>> I haven't looked closely at OFDMA but these latency numbers seem way too
>> high for it to matter.  Why is the latency so high?  It suggests there may
>> be queueing delay (bloat) unrelated to media access.
>>
>> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling per
>> trigger frame.  EDCA kinda sucks per listen before talk which is about 100
>> microseconds on average which has to be paid even when no energy detect.
>> This limits the transmits per second performance to 10K (1/0.0001.). Also
>> remember that WiFi aggregates so transmissions have multiple packets and
>> long transmits will consume those 10K tx ops. One way to get around
>> aggregation is to use voice (VO) access class which many devices won't
>> aggregate (mileage will vary.). Then take a packets per second
>> measurement with small packets.  This would give an idea on the frame
>> scheduling being AP based vs EDCA.
>>
>> Also, measuring ping time as a proxy for latency isn't ideal. Better to
>> measure trip times of the actual traffic.  This requires clock sync to a
>> common reference. GPS atomic clocks are available but it does take some
>> setup work.
>>
>> I haven't thought about RU optimizations and that testing so can't really
>> comment there.
>>
>> Also, I'd consider replacing the mechanical turn table with variable
>> phase shifters and set them in the MIMO (or H-Matrix) path.  I use model
>> 8421 from Aeroflex
>> <https://www.apitech.com/globalassets/documents/products/rf-microwave-microelectronics-power-solutions/rf-components/phase-shifter-subsystem/wmod84208421.pdf>.
>> Others make them too.
>>
>> Thanks again for the suggestions. I agree latency is very high when I
> remove the traffic bandwidth caps. I don't know why. One of the key
> questions I've had since starting to mess with OFDMA is whether it helps
> under light or heavy traffic load. All I do know is that things go to hell
> when you load the channel. And RRUL test methods essentially break OFDMA.
>
> I agree using ping isn't ideal. But I'm approaching this as creating a
> test that a consumer audience can understand. Ping is something consumers
> care about and understand.  The octoScope STApals are all ntp sync'd and
> latency measurements using iperf have been done by them.
>
>
>

[-- Attachment #2: Type: text/html, Size: 6328 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 19:36       ` Bob McMahon
@ 2020-05-15 19:50         ` Tim Higgins
  2020-05-15 20:05           ` Bob McMahon
                             ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Tim Higgins @ 2020-05-15 19:50 UTC (permalink / raw)
  To: Bob McMahon; +Cc: tim, Make-Wifi-fast

[-- Attachment #1: Type: text/html, Size: 8740 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 19:50         ` Tim Higgins
@ 2020-05-15 20:05           ` Bob McMahon
  2020-05-15 20:24           ` Toke Høiland-Jørgensen
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Bob McMahon @ 2020-05-15 20:05 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 5826 bytes --]

iperf 2.0.14 supports --connect-only tests and also shows the connect
times.  It's currently broken and I plan to fix it soon.

My brother works for NASA; when designing the shuttle, engineers focused
on weight/mass because the energy required to achieve low earth orbit is
driven by that. My brother, a PhD in fracture mechanics, said weight
without consideration for structural integrity trade-offs was a mistake.

In that analogy, latency and bloat, while correlated, aren't the same
thing. I think by separating them one can better understand how a system
will perform.  I suspect your tests with 120ms of latency are really
measuring bloat.  Little's law is average queue depth = average effective
arrival rate * average service time.  Bloat is mostly about excessive queue
depths and latency mostly about excessive service times. Since they affect
one another it's easy to conflate them.
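
To put rough numbers on the queue-depth point (the buffer depth and link
rate below are arbitrary assumptions):

    # Little's law: average occupancy = arrival rate * average time in queue.
    # If the input rate exceeds the service rate, a dumb FIFO fills to its
    # limit and the worst-case queueing delay is set by the buffer depth.
    PKT_BYTES   = 1500
    QUEUE_PKTS  = 256          # assumed FIFO depth
    SERVICE_BPS = 50e6         # what the link can actually drain

    standing_delay_s = QUEUE_PKTS * PKT_BYTES * 8 / SERVICE_BPS
    print("standing-queue (bloat) delay: %.1f ms" % (standing_delay_s * 1e3))
    # ~61 ms of measured "latency" that is really queue depth, not media access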

Bob



On Fri, May 15, 2020 at 12:50 PM Tim Higgins <tim@smallnetbuilder.com>
wrote:

> Thanks for the additional insights, Bob. How do you measure TCP connects?
>
> Does Dave or anyone else on the bufferbloat team want to comment on Bob's
> comment that latency testing under "heavy traffic" isn't ideal?
>
> My impression is that the rtt_fair_var test I used in the article and
> other RRUL-related Flent tests fully load the connection under test. Am I
> incorrect?
>
> ===
> On 5/15/2020 3:36 PM, Bob McMahon wrote:
>
> Latency testing under "heavy traffic" isn't ideal.  If the input
> rate exceeds the service rate of any queue for any period of time the queue
> fills up and latency hits a worst case per that queue depth.  I'd say take
> latency measurements when the input rates are below the service rates. The
> measurements when service rates are less than input rates are less
> about latency and more about bloat.
>
> Also, a good paper is this one on trading bandwidth for ultra low latency
> <https://people.csail.mit.edu/alizadeh/papers/hull-nsdi12.pdf>using
> phantom queues and ECN.
>
> Another thing to consider is that network engineers tend to have a mioptic
> view of latency.  The queueing or delay between the socket writes/reads and
> network stack matters too.  Network engineers focus on packets or TCP RTTs
> and somewhat overlook a user's true end to end experience.  Avoiding bloat
> by slowing down the writes, e.g. ECN or different scheduling, still
> contributes to end/end latency between the writes() and the reads() that
> too few test for and monitor.
>
> Note: We're moving to trip times of writes to reads (or frames for video)
> for our testing. We are also replacing/supplementing pings with TCP
> connects as other "latency related" measurements. TCP connects are more
> important than ping.
>
> Bob
>
> On Fri, May 15, 2020 at 8:20 AM Tim Higgins <tim@smallnetbuilder.com>
> wrote:
>
>> Hi Bob,
>>
>> Thanks for your comments and feedback. Responses below:
>>
>> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>>
>> Also, forgot to mention, for latency don't rely on average as most don't
>> care about that.  Maybe use the upper 3 stdev, i.e. the 99.97% point.  Our
>> latency runs will repeat 20 seconds worth of packets and find that then
>> calculate CDFs of this point in the tail across hundreds of runs under
>> different conditions. One "slow packet" is all that it takes to screw up
>> user experience when it comes to latency.
>>
>> Thanks for the guidance.
>>
>>
>> On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com>
>> wrote:
>>
>>> I haven't looked closely at OFDMA but these latency numbers seem way too
>>> high for it to matter.  Why is the latency so high?  It suggests there may
>>> be queueing delay (bloat) unrelated to media access.
>>>
>>> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling per
>>> trigger frame.  EDCA kinda sucks per listen before talk which is about 100
>>> microseconds on average which has to be paid even when no energy detect.
>>> This limits the transmits per second performance to 10K (1/0.0001.). Also
>>> remember that WiFi aggregates so transmissions have multiple packets and
>>> long transmits will consume those 10K tx ops. One way to get around
>>> aggregation is to use voice (VO) access class which many devices won't
>>> aggregate (mileage will vary.). Then take a packets per second
>>> measurement with small packets.  This would give an idea on the frame
>>> scheduling being AP based vs EDCA.
>>>
>>> Also, measuring ping time as a proxy for latency isn't ideal. Better to
>>> measure trip times of the actual traffic.  This requires clock sync to a
>>> common reference. GPS atomic clocks are available but it does take some
>>> setup work.
>>>
>>> I haven't thought about RU optimizations and that testing so can't
>>> really comment there.
>>>
>>> Also, I'd consider replacing the mechanical turn table with variable
>>> phase shifters and set them in the MIMO (or H-Matrix) path.  I use model
>>> 8421 from Aeroflex
>>> <https://www.apitech.com/globalassets/documents/products/rf-microwave-microelectronics-power-solutions/rf-components/phase-shifter-subsystem/wmod84208421.pdf>.
>>> Others make them too.
>>>
>>> Thanks again for the suggestions. I agree latency is very high when I
>> remove the traffic bandwidth caps. I don't know why. One of the key
>> questions I've had since starting to mess with OFDMA is whether it helps
>> under light or heavy traffic load. All I do know is that things go to hell
>> when you load the channel. And RRUL test methods essentially break OFDMA.
>>
>> I agree using ping isn't ideal. But I'm approaching this as creating a
>> test that a consumer audience can understand. Ping is something consumers
>> care about and understand.  The octoScope STApals are all ntp sync'd and
>> latency measurements using iperf have been done by them.
>>
>>
>>
>

[-- Attachment #2: Type: text/html, Size: 9630 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 19:50         ` Tim Higgins
  2020-05-15 20:05           ` Bob McMahon
@ 2020-05-15 20:24           ` Toke Høiland-Jørgensen
  2020-05-15 20:30           ` Dave Taht
       [not found]           ` <mailman.342.1589573120.24343.make-wifi-fast@lists.bufferbloat.net>
  3 siblings, 0 replies; 16+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-05-15 20:24 UTC (permalink / raw)
  To: Tim Higgins, Bob McMahon; +Cc: Make-Wifi-fast

Tim Higgins <tim@smallnetbuilder.com> writes:

> Thanks for the additional insights, Bob. How do you measure TCP connects?
>
> Does Dave or anyone else on the bufferbloat team want to comment on
> Bob's comment that latency testing under "heavy traffic" isn't ideal?

Well, it depends on what you want to measure. Loading the link with
heavy traffic is a good way to show the worst-case behaviour of the
system, as that will fill up the buffers and expose any latent
bufferbloat. Which, as Bob points out, will tend to drown out any other
source of latency, at least if all the queues are dumb FIFOs.

However, if you want to specifically study, say, the media access
latencies of the WiFi link, drowning it out with the order-of-magnitude
higher latencies of bloat in the layers above is obviously going to
obscure the signal somewhat. Which I think was basically Bob's point
about "testing under heavy traffic"?

> My impression is that the rtt_fair_var test I used in the article and
> other RRUL-related Flent tests fully load the connection under test.
> Am I incorrect?

Yeah, the RRUL test (and friends) are specifically designed to load up
the link to show the worst-case latency behaviour, including any
bufferbloat. And, as per the above, as long as the system you're testing
still has unresolved bloat issues, well, that is what you're going to be
seeing most of... :)

-Toke



* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 19:50         ` Tim Higgins
  2020-05-15 20:05           ` Bob McMahon
  2020-05-15 20:24           ` Toke Høiland-Jørgensen
@ 2020-05-15 20:30           ` Dave Taht
  2020-05-15 21:35             ` Bob McMahon
       [not found]           ` <mailman.342.1589573120.24343.make-wifi-fast@lists.bufferbloat.net>
  3 siblings, 1 reply; 16+ messages in thread
From: Dave Taht @ 2020-05-15 20:30 UTC (permalink / raw)
  To: Tim Higgins; +Cc: Bob McMahon, Make-Wifi-fast

On Fri, May 15, 2020 at 12:50 PM Tim Higgins <tim@smallnetbuilder.com> wrote:
>
> Thanks for the additional insights, Bob. How do you measure TCP connects?
>
> Does Dave or anyone else on the bufferbloat team want to comment on Bob's comment that latency testing under "heavy traffic" isn't ideal?

I hit save before deciding to reply.

> My impression is that the rtt_fair_var test I used in the article and other RRUL-related Flent tests fully load the connection under test. Am I incorrect?

well, to whatever extent possible given other limits in the hardware.
Under loads like these, other things - such as the rx path, or cpu -
start to fail. I had one box that had a memory leak; overnight testing
like this showed it up. Another test - with ipv6 - ultimately showed
serious ipv6 traffic was causing a performance-sucking cpu trap.
Another test showed IPv6 being seriously outcompeted by ipv4 because
there were 4096 ipv4 flow offloads in the hardware, and only 64 for ipv6....

There are many other tests in the suite - testing a fully loaded
station while other stations are moping along, stuff near and far
away (ATF), and so on.


>
> ===
> On 5/15/2020 3:36 PM, Bob McMahon wrote:
>
> Latency testing under "heavy traffic" isn't ideal.

Of course not. But in any real time control system, retaining control
and degrading predictably under load is a hard requirement
in most other industries besides networking. Imagine if you only
tested your car, at speeds no more than 55mph, on roads that were
never slippery and with curves never exceeding 6 degrees. Then shipped
it, without a performance governor, and rubber bands holding
the steering wheel on that would break at 65mph, and with tires that
only worked at those speeds on those kind of curves.

To stick with the heavy traffic analogy, but in a slower case... I
used to have a car that overheated in heavy stop and go traffic.
Eventually, it caught on fire. (The full story is really funny,
because I was naked at the time, but I'll save it for a posthumous
biography)

> If the input rate exceeds the service rate of any queue for any period of time the queue fills up and latency hits a worst case per that queue depth.

which is what we're all about managing well, and predictably, here at
bufferbloat.net

>I'd say take latency measurements when the input rates are below the service rates.

That is ridiculous. It requires an oracle. It requires a belief system
where users will never exceed your mysterious parameters.

> The measurements when service rates are less than input rates are less about latency and more about bloat.

I have to note that latency measurements are certainly useful on less
loaded networks. Getting an AP out of sleep state is a good one;
another is how fast you can switch stations, under a minimal (say,
voip mostly) load, in the presence of interference.

> Also, a good paper is this one on trading bandwidth for ultra low latency using phantom queues and ECN.

I'm burned out on ecn today. On the high end I rather like cisco's AFD...

https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-738488.html

> Another thing to consider is that network engineers tend to have a mioptic view of latency.  The queueing or delay between the socket writes/reads and network stack matters too.

It certainly does! I'm always giving a long list of everything we've
done to improve the linux stack from app to endpoint.

Over on reddit recently (can't find the link) I talked about how bad
the linux ethernet stack was, pre-bql. I don't think anyone in the
industry really understood deeply the effects of packet aggregation in
the multistation case, for wifi. (I'm still unsure if anyone does!) Also
endless retries starving out other stations is a huge problem in wifi,
and lte, and is going to become more of one on cable...

We've worked on tons of things - like tcp_lowat, fq, and queuing in
general - jeeze -
https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf
See the slide on smashing latency everywhere in the stack.

And I certainly, now that I can regularly get fiber down below 2ms,
regard the overhead of opus (2.7ms at the highest sampling rate) as a real
problem, along with scheduling delay and jitter in the OS in the jamophone
project. It pays to bypass the OS, when you can.

Latency is everywhere, and you have to tackle it everywhere, but it
helps to focus on whatever is costing you the most latency at a time,
re:

https://en.wikipedia.org/wiki/Gustafson%27s_law

My biggest complaint nowadays about modern cpu architectures is that
they can't context switch faster than a few thousand cycles. I've
advocated that folk look over Mill Computing's design, which can do it in 5.

>Network engineers focus on packets or TCP RTTs and somewhat overlook a user's true end to end experience.

Heh. I don't. Despite all I say here (because I viewed the network as
the biggest problem 10 years ago), I have been doing voip and
videoconferencing apps for over 25 years, and I have always hoped more
people used basic benchmarks like eye-to-eye/ear-to-ear delay and jitter.

>  Avoiding bloat by slowing down the writes, e.g. ECN or different scheduling, still contributes to end/end latency between the writes() and the reads() that too few test for and monitor.

I agree that iperf had issues. I hope they are fixed now.

>
> Note: We're moving to trip times of writes to reads (or frames for video) for our testing.

ear to ear or eye to eye delay measurements are GOOD. And a lot of
that delay is still in the stack. One day, perhaps
we can go back to scan lines and not complicated encodings.

>We are also replacing/supplementing pings with TCP connects as other "latency related" measurements. TCP connects are more important than ping.

I wish more folk measured dns lookup delay...

Given the prevalence of ssl, I'd be measuring not just the 3whs, but
that additional set of handshakes.

We do have a bunch of http oriented tests in the flent suite, as well
as for voip. At the time we were developing it,
though, videoconferencing was in its infancy and difficult to model, so
we tended towards using what flows we could get
from real servers and services. I think we now have tools to model
videoconferencing traffic much better today than
we could, but until now, it wasn't much of a priority.

It's also important to note that videoconferencing and gaming traffic
put a very different load on the network - very sensitive to jitter,
not so sensitive to loss. Both are VERY low bandwidth compared to tcp
- gaming is 35kbit/sec for example, on 10 or 20ms intervals.
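
To put numbers on that (just the arithmetic on the figures above):

    RATE_BPS   = 35_000     # gaming-style flow
    INTERVAL_S = 0.020      # one packet every 20 ms
    payload_bytes = RATE_BPS * INTERVAL_S / 8
    print("%.0f packets/s of ~%.0f-byte payloads"
          % (1 / INTERVAL_S, payload_bytes))
    # ~50 pps of ~88-byte packets: jitter matters far more than raw bandwidth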

>
> Bob
>
> On Fri, May 15, 2020 at 8:20 AM Tim Higgins <tim@smallnetbuilder.com> wrote:
>>
>> Hi Bob,
>>
>> Thanks for your comments and feedback. Responses below:
>>
>> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>>
>> Also, forgot to mention, for latency don't rely on average as most don't care about that.  Maybe use the upper 3 stdev, i.e. the 99.97% point.  Our latency runs will repeat 20 seconds worth of packets and find that then calculate CDFs of this point in the tail across hundreds of runs under different conditions. One "slow packet" is all that it takes to screw up user experience when it comes to latency.
>>
>> Thanks for the guidance.
>>
>>
>> On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com> wrote:
>>>
>>> I haven't looked closely at OFDMA but these latency numbers seem way too high for it to matter.  Why is the latency so high?  It suggests there may be queueing delay (bloat) unrelated to media access.
>>>
>>> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling per trigger frame.  EDCA kinda sucks per listen before talk which is about 100 microseconds on average which has to be paid even when no energy detect.  This limits the transmits per second performance to 10K (1/0.0001.). Also remember that WiFi aggregates so transmissions have multiple packets and long transmits will consume those 10K tx ops. One way to get around aggregation is to use voice (VO) access class which many devices won't aggregate (mileage will vary.). Then take a packets per second measurement with small packets.  This would give an idea on the frame scheduling being AP based vs EDCA.
>>>
>>> Also, measuring ping time as a proxy for latency isn't ideal. Better to measure trip times of the actual traffic.  This requires clock sync to a common reference. GPS atomic clocks are available but it does take some setup work.
>>>
>>> I haven't thought about RU optimizations and that testing so can't really comment there.
>>>
>>> Also, I'd consider replacing the mechanical turn table with variable phase shifters and set them in the MIMO (or H-Matrix) path.  I use model 8421 from Aeroflex. Others make them too.
>>>
>> Thanks again for the suggestions. I agree latency is very high when I remove the traffic bandwidth caps. I don't know why. One of the key questions I've had since starting to mess with OFDMA is whether it helps under light or heavy traffic load. All I do know is that things go to hell when you load the channel. And RRUL test methods essentially break OFDMA.
>>
>> I agree using ping isn't ideal. But I'm approaching this as creating a test that a consumer audience can understand. Ping is something consumers care about and understand.  The octoScope STApals are all ntp sync'd and latency measurements using iperf have been done by them.
>>
>>
>
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast



-- 
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
       [not found]           ` <mailman.342.1589573120.24343.make-wifi-fast@lists.bufferbloat.net>
@ 2020-05-15 20:38             ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 16+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-05-15 20:38 UTC (permalink / raw)
  To: Bob McMahon, Tim Higgins; +Cc: Make-Wifi-fast

Bob McMahon via Make-wifi-fast <make-wifi-fast@lists.bufferbloat.net>
writes:

> In that analogy, latency and bloat, while correlated, aren't the same
> thing. I think by separating them one can better understand how a system
> will perform.  I suspect your tests with 120ms of latency are really
> measuring bloat.  Little's law is average queue depth = average effective
> arrival rate * average service time.  Bloat is mostly about excessive queue
> depths and latency mostly about excessive service times. Since they affect
> one another it's easy to conflate them.

I'd say this is a bit of an odd definition? In my view, "latency" is the
time it takes a packet to get from one place to another, and it can have
many sources, one of which is bloat. For a rather comprehensive survey
and categorisation of other sources, see [0].

As you say, bloat does tend to drown out other sources of latency, but
that is a bug as far as I'm concerned. Once you fix that, looking at all
the other sources becomes much easier. Optimising those other sources of
latency is worthwhile too, of course, it just doesn't help as much until
you also fix the bloat.

-Toke

[0] https://ieeexplore.ieee.org/abstract/document/6967689



* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 20:30           ` Dave Taht
@ 2020-05-15 21:35             ` Bob McMahon
  2020-05-16  5:02               ` Dave Taht
  0 siblings, 1 reply; 16+ messages in thread
From: Bob McMahon @ 2020-05-15 21:35 UTC (permalink / raw)
  To: Dave Taht; +Cc: Tim Higgins, Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 12747 bytes --]

I'm in alignment with Dave's and Toke's posts. I do disagree somewhat with:

>I'd say take latency measurements when the input rates are below the
service rates.

That is ridiculous. It requires an oracle. It requires a belief system
where users will never exceed your mysterious parameters.


What zero-queue or low-queue latency measurements provide is a top end or
best performance, even when monitoring the tail of that CDF.  Things like
AR gaming are driving WiFi "ultra low latency" requirements where
phase/spatial streams matter.  How well an algorithm decides between 1 and
2 spatial streams is starting to matter; 2->1 streams is a relatively easy
decision.  Then there is 802.11ax AP scheduling vs EDCA, which is a very
difficult engineering problem but sorely needed.

A major issue as a WiFi QA engineer is how to measure a multivariate system
in a meaningful (and automated) way. Easier said than done. (I find trying
to present Mahalanobis distances
<https://en.wikipedia.org/wiki/Mahalanobis_distance> doesn't work well,
especially when compared to a scalar or single number.)  The first scalar
relied upon too much is peak average throughput, particularly without
concern for bloat. This was a huge flaw by the industry, as bloat was
inserted everywhere by most everyone while providing little to no benefit -
actually a design flaw per energy, transistors, etc.  Engineers attacking
bloat has been a very good thing in my judgment.
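
For what it's worth, a tiny sketch of that multivariate view (the metric
names and numbers are invented; it just uses numpy directly):

    import numpy as np

    def mahalanobis(x, runs):
        # Distance of one run's metric vector from a cloud of baseline runs,
        # scaled by the covariance of the baseline metrics.
        mu  = runs.mean(axis=0)
        cov = np.cov(runs, rowvar=False)
        d   = x - mu
        return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

    # columns: throughput_mbps, p99.97_latency_ms, energy_j_per_gbit (made up)
    rng      = np.random.default_rng(0)
    baseline = rng.normal([500, 10, 3], [30, 2, 0.3], size=(200, 3))
    run      = np.array([480.0, 45.0, 3.2])   # a run with a bad latency tail
    print("Mahalanobis distance from baseline: %.1f" % mahalanobis(run, baseline))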

Some divide peak average throughput by latency to "get network power."  But
then there is "bloat latency" vs. "service latency."  Note:  with iperf
2.0.14 it's easy to see the difference by using socket read or write rate
limiting. If the link is read rate limited (with -b on the server) the
bloat is going to be exacerbated by a read congestion point.  If it's
write rate limited (-b on the client), the queues shouldn't be in a standing
state.
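
As a sketch, that "network power" scalar is just a ratio (the numbers below
are invented, only to show how a bloated run swamps it):

    def network_power(throughput_mbps, latency_ms):
        # Crude figure of merit: throughput divided by latency.
        return throughput_mbps / latency_ms

    print(network_power(600.0, 120.0))  # fast but bloated run       ->  5.0
    print(network_power(550.0, 8.0))    # slightly slower, unbloated -> ~68.8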

And then of course, the "real world" class of measurements is very hard. And
a chip is usually powered by a battery, so energy per useful xfer bit matters
too.

So parameters can be seen as mysterious for sure. Figuring out how to
demystify them can be the fun part ;)

Bob



On Fri, May 15, 2020 at 1:30 PM Dave Taht <dave.taht@gmail.com> wrote:

> On Fri, May 15, 2020 at 12:50 PM Tim Higgins <tim@smallnetbuilder.com>
> wrote:
> >
> > Thanks for the additional insights, Bob. How do you measure TCP connects?
> >
> > Does Dave or anyone else on the bufferbloat team want to comment on
> Bob's comment that latency testing under "heavy traffic" isn't ideal?
>
> I hit save before deciding to reply.
>
> > My impression is that the rtt_fair_var test I used in the article and
> other RRUL-related Flent tests fully load the connection under test. Am I
> incorrect?
>
> well, to whatever extent possible by other limits in the hardware.
> Under loads like these, other things - such as the rx path, or cpu,
> start to fail. I had one box that had a memory leak, overnight testing
> like this, showed it up. Another test - with ipv6 - ultimately showed
> serious ipv6 traffic was causing a performance sucking cpu trap.
> Another test showed IPv6 being seriously outcompeted by ipv4 because
> there was 4096 ipv4 flow offloads in the hardware, and only 64 for ipv6....
>
> There are many other tests in the suite - testing a fully loaded
> station while other stations are moping along... stuff near and far
> away (ATF),
>
>
> >
> > ===
> > On 5/15/2020 3:36 PM, Bob McMahon wrote:
> >
> > Latency testing under "heavy traffic" isn't ideal.
>
> Of course not. But in any real time control system, retaining control
> and degrading predictably under load, is a hard requirement
> in most other industries besides networking. Imagine if you only
> tested your car, at speeds no more than 55mph, on roads that were
> never slippery and with curves never exceeding 6 degrees. Then shipped
> it, without a performance governor, and rubber bands holding
> the steering wheel on that would break at 65mph, and with tires that
> only worked at those speeds on those kind of curves.
>
> To stick with the heavy traffic analogy, but in a slower case... I
> used to have a car that overheated in heavy stop and go traffic.
> Eventually, it caught on fire. (The full story is really funny,
> because I was naked at the time, but I'll save it for a posthumous
> biography)
>
> > If the input rate exceeds the service rate of any queue for any period
> of time the queue fills up and latency hits a worst case per that queue
> depth.
>
> which is what we're all about managing well, and predictably, here at
> bufferbloat.net
>
> >I'd say take latency measurements when the input rates are below the
> service rates.
>
> That is ridiculous. It requires an oracle. It requires a belief system
> where users will never exceed your mysterious parameters.
>
> > The measurements when service rates are less than input rates are less
> about latency and more about bloat.
>
> I have to note that latency measurements are certainly useful on less
> loaded networks. Getting an AP out of sleep state is a good one,
> another is how fast can you switch stations, under a minimal (say,
> voip mostly) load, in the presence of interference.
>
> > Also, a good paper is this one on trading bandwidth for ultra low
> latency using phantom queues and ECN.
>
> I'm burned out on ecn today. on the high end I rather like cisco's AFD...
>
>
> https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-738488.html
>
> > Another thing to consider is that network engineers tend to have a
> mioptic view of latency.  The queueing or delay between the socket
> writes/reads and network stack matters too.
>
> It certainly does! I'm always giving a long list of everything we've
> done to improve the linux stack from app to endpoint.
>
> Over on reddit recently (can't find the link) I talked about how bad
> the linux ethernet stack was, pre-bql. I don't think anyone in the
> industry
> really understood deeply, the effects of packet aggregation in the
> multistation case, for wifi. (I'm still unsure if anyone does!). Also
> endless retries starving out other stations is huge problem in wifi,
> and lte, and is going to become more of one on cable...
>
> We've worked on tons of things - like tcp_lowat, fq, and queuing in
> general - jeeze -
> https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf
> See the slide on smashing latency everywhere in the stack.
>
> And I certainly, now that I can regularly get fiber down below 2ms,
> regard the overhead of opus (2.7ms at the higest sampling rate) a real
> problem,
> along with scheduling delay and jitter in the os in the jamophone
> project. It pays to bypass the OS, when you can.
>
> Latency is everywhere, and you have to tackle it, everywhere, but it
> helps to focus on what ever is costing you the most latency at a time,
> re:
>
> https://en.wikipedia.org/wiki/Gustafson%27s_law
>
> My biggest complaint nowadays about modern cpu architectures is that
> they can't context switch faster than a few thousand cycles. I've
> advocated that folk look over mill computer's design, which can do it in 5.
>
> >Network engineers focus on packets or TCP RTTs and somewhat overlook a
> user's true end to end experience.
>
> Heh. I don't. Despite all I say here (Because I viewed the network as
> the biggest problem 10 years ago), I have been doing voip and
> videoconferencing apps for over 25 years, and basic benchmarks like
> eye to eye/ear ear delay and jitter I have always hoped more used.
>
> >  Avoiding bloat by slowing down the writes, e.g. ECN or different
> scheduling, still contributes to end/end latency between the writes() and
> the reads() that too few test for and monitor.
>
> I agree that iperf had issues. I hope they are fixed now.
>
> >
> > Note: We're moving to trip times of writes to reads (or frames for
> video) for our testing.
>
> ear to ear or eye to eye delay measurements are GOOD. And a lot of
> that delay is still in the stack. One day, perhaps
> we can go back to scan lines and not complicated encodings.
>
> >We are also replacing/supplementing pings with TCP connects as other
> "latency related" measurements. TCP connects are more important than ping.
>
> I wish more folk measured dns lookup delay...
>
> Given the prevalance of ssl, I'd be measuring not just the 3whs, but
> that additional set of handshakes.
>
> We do have a bunch of http oriented tests in the flent suite, as well
> as for voip. At the time we were developing it,
> though, videoconferncing was in its infancy and difficult to model, so
> we tended towards using what flows we could get
> from real servers and services. I think we now have tools to model
> videoconferencing traffic much better today than
> we could, but until now, it wasn't much of a priority.
>
> It's also important to note that videoconferencing and gaming traffic
> put a very different load on the network - very sensitive to jitter,
> not so sensitive to loss. Both are VERY low bandwidth compared to tcp
> - gaming is 35kbit/sec for example, on 10 or 20ms intervals.
>
> >
> > Bob
> >
> > On Fri, May 15, 2020 at 8:20 AM Tim Higgins <tim@smallnetbuilder.com>
> wrote:
> >>
> >> Hi Bob,
> >>
> >> Thanks for your comments and feedback. Responses below:
> >>
> >> On 5/14/2020 5:42 PM, Bob McMahon wrote:
> >>
> >> Also, forgot to mention, for latency don't rely on average as most
> don't care about that.  Maybe use the upper 3 stdev, i.e. the 99.97%
> point.  Our latency runs will repeat 20 seconds worth of packets and find
> that then calculate CDFs of this point in the tail across hundreds of runs
> under different conditions. One "slow packet" is all that it takes to screw
> up user experience when it comes to latency.
> >>
> >> Thanks for the guidance.
> >>
> >>
> >> On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com>
> wrote:
> >>>
> >>> I haven't looked closely at OFDMA but these latency numbers seem way
> too high for it to matter.  Why is the latency so high?  It suggests there
> may be queueing delay (bloat) unrelated to media access.
> >>>
> >>> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling
> per trigger frame.  EDCA kinda sucks per listen before talk which is about
> 100 microseconds on average which has to be paid even when no energy
> detect.  This limits the transmits per second performance to 10K
> (1/0.0001.). Also remember that WiFi aggregates so transmissions have
> multiple packets and long transmits will consume those 10K tx ops. One way
> to get around aggregation is to use voice (VO) access class which many
> devices won't aggregate (mileage will vary.). Then take a packets per
> second measurement with small packets.  This would give an idea on the
> frame scheduling being AP based vs EDCA.
> >>>
> >>> Also, measuring ping time as a proxy for latency isn't ideal. Better
> to measure trip times of the actual traffic.  This requires clock sync to a
> common reference. GPS atomic clocks are available but it does take some
> setup work.
> >>>
> >>> I haven't thought about RU optimizations and that testing so can't
> really comment there.
> >>>
> >>> Also, I'd consider replacing the mechanical turn table with variable
> phase shifters and set them in the MIMO (or H-Matrix) path.  I use model
> 8421 from Aeroflex. Others make them too.
> >>>
> >> Thanks again for the suggestions. I agree latency is very high when I
> remove the traffic bandwidth caps. I don't know why. One of the key
> questions I've had since starting to mess with OFDMA is whether it helps
> under light or heavy traffic load. All I do know is that things go to hell
> when you load the channel. And RRUL test methods essentially break OFDMA.
> >>
> >> I agree using ping isn't ideal. But I'm approaching this as creating a
> test that a consumer audience can understand. Ping is something consumers
> care about and understand.  The octoScope STApals are all ntp sync'd and
> latency measurements using iperf have been done by them.
> >>
> >>
> >
> > _______________________________________________
> > Make-wifi-fast mailing list
> > Make-wifi-fast@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
>
>
> --
> "For a successful technology, reality must take precedence over public
> relations, for Mother Nature cannot be fooled" - Richard Feynman
>
> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
>

[-- Attachment #2: Type: text/html, Size: 15083 bytes --]


* Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
  2020-05-15 21:35             ` Bob McMahon
@ 2020-05-16  5:02               ` Dave Taht
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Taht @ 2020-05-16  5:02 UTC (permalink / raw)
  To: Bob McMahon; +Cc: Tim Higgins, Make-Wifi-fast

On Fri, May 15, 2020 at 2:35 PM Bob McMahon <bob.mcmahon@broadcom.com> wrote:
>
> I'm in alignment with Dave's and Toke's posts. I do disagree somewhat with:
>
> >I'd say take latency measurements when the input rates are below the service rates.
>
> That is ridiculous. It requires an oracle. It requires a belief system
> where users will never exceed your mysterious parameters.

If you couldn't tell, I've had a long week. Here's to shared virtual
beverage of choice, all round?

>
> What zero-queue or low-queue latency measurements provide is a top end or best performance, even when monitoring the tail of that CDF.   Things like AR gaming are driving WiFi "ultra low latency" requirements where phase/spatial stream matter.  How well an algorithm detects 1 to 2 spatial streams is starting to matter. 2->1 stream is a relatively easy decision. Then there is the 802.11ax AP scheduling vs EDCA which is a very difficult engineering problem but sorely needed.
>
> A major issue as a WiFi QA engineer is how to measure a multivariate system in a meaningful (and automated) way. Easier said than done. (I find trying to present Mahalanobis distances doesn't work well especially when compared to a scalar or single number.)  The first scalar relied upon too much is peak average throughput, particularly without concern for bloat. This was a huge flaw by the industry as bloat was inserted everywhere by most everyone providing little to no benefit - actually a design flaw per energy, transistors, etc.  Engineers attacking bloat has been a very good thing by my judgment.
>
> Some divide peak average throughput by latency to "get network power"  But then there is "bloat latency" vs. "service latency."  Note:  with iperf 2.0.14 it's easy to see the difference by using socket read or write rate limiting. If the link is read rate limited (with -b on the server) the bloat is going to be exacerbated per a read congestion point.  If it's write rate limited, -b on the client, the queues shouldn't be in a standing state.
>
> And then of course, "real world" class of measurements are very hard. And a chip is usually powered by a battery so energy per useful xfer bit matters too.

yes, most are. the ones I worry about most, aren't.

Aside from that, agree totally with what you say, especially
multivariate systems. Need an AI to interpret the rrul tests.



>
> So parameters can be seen as mysterious for sure. Figuring out how to demystify can be the fun part ;)

/me clinks glass over the virtual bar
gnight!
>
> Bob
>
>
>
> On Fri, May 15, 2020 at 1:30 PM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> On Fri, May 15, 2020 at 12:50 PM Tim Higgins <tim@smallnetbuilder.com> wrote:
>> >
>> > Thanks for the additional insights, Bob. How do you measure TCP connects?
>> >
>> > Does Dave or anyone else on the bufferbloat team want to comment on Bob's comment that latency testing under "heavy traffic" isn't ideal?
>>
>> I hit save before deciding to reply.
>>
>> > My impression is that the rtt_fair_var test I used in the article and other RRUL-related Flent tests fully load the connection under test. Am I incorrect?
>>
>> well, to whatever extent possible by other limits in the hardware.
>> Under loads like these, other things - such as the rx path, or cpu,
>> start to fail. I had one box that had a memory leak, overnight testing
>> like this, showed it up. Another test - with ipv6 - ultimately showed
>> serious ipv6 traffic was causing a performance sucking cpu trap.
>> Another test showed IPv6 being seriously outcompeted by ipv4 because
>> there was 4096 ipv4 flow offloads in the hardware, and only 64 for ipv6....
>>
>> There are many other tests in the suite - testing a fully loaded
>> station while other stations are moping along... stuff near and far
>> away (ATF),
>>
>>
>> >
>> > ===
>> > On 5/15/2020 3:36 PM, Bob McMahon wrote:
>> >
>> > Latency testing under "heavy traffic" isn't ideal.
>>
>> Of course not. But in any real time control system, retaining control
>> and degrading predictably under load, is a hard requirement
>> in most other industries besides networking. Imagine if you only
>> tested your car, at speeds no more than 55mph, on roads that were
>> never slippery and with curves never exceeding 6 degrees. Then shipped
>> it, without a performance governor, and rubber bands holding
>> the steering wheel on that would break at 65mph, and with tires that
>> only worked at those speeds on those kind of curves.
>>
>> To stick with the heavy traffic analogy, but in a slower case... I
>> used to have a car that overheated in heavy stop and go traffic.
>> Eventually, it caught on fire. (The full story is really funny,
>> because I was naked at the time, but I'll save it for a posthumous
>> biography)
>>
>> > If the input rate exceeds the service rate of any queue for any period of time the queue fills up and latency hits a worst case per that queue depth.
>>
>> which is what we're all about managing well, and predictably, here at
>> bufferbloat.net
>>
>> >I'd say take latency measurements when the input rates are below the service rates.
>>
>> That is ridiculous. It requires an oracle. It requires a belief system
>> where users will never exceed your mysterious parameters.
>>
>> > The measurements when service rates are less than input rates are less about latency and more about bloat.
>>
>> I have to note that latency measurements are certainly useful on less
>> loaded networks. Getting an AP out of sleep state is a good one,
>> another is how fast can you switch stations, under a minimal (say,
>> voip mostly) load, in the presence of interference.
>>
>> > Also, a good paper is this one on trading bandwidth for ultra low latency using phantom queues and ECN.
>>
>> I'm burned out on ecn today. on the high end I rather like cisco's AFD...
>>
>> https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-738488.html
>>
>> > Another thing to consider is that network engineers tend to have a mioptic view of latency.  The queueing or delay between the socket writes/reads and network stack matters too.
>>
>> It certainly does! I'm always giving a long list of everything we've
>> done to improve the linux stack from app to endpoint.
>>
>> Over on reddit recently (can't find the link) I talked about how bad
>> the linux ethernet stack was, pre-bql. I don't think anyone in the
>> industry
>> really understood, deeply, the effects of packet aggregation in the
>> multistation case for wifi. (I'm still unsure if anyone does!) Also,
>> endless retries starving out other stations is a huge problem in wifi
>> and lte, and is going to become more of one on cable...
>>
>> We've worked on tons of things - like tcp_lowat, fq, and queuing in
>> general - jeeze -
>> https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf
>> See the slide on smashing latency everywhere in the stack.
>>
>> And I certainly, now that I can regularly get fiber down below 2ms,
>> regard the overhead of opus (2.7ms at the highest sampling rate) as a
>> real problem, along with scheduling delay and jitter in the os in the
>> jamophone project. It pays to bypass the OS when you can.
>>
>> Latency is everywhere, and you have to tackle it everywhere, but it
>> helps to focus on whatever is costing you the most latency at a time,
>> re:
>>
>> https://en.wikipedia.org/wiki/Gustafson%27s_law
>>
>> My biggest complaint nowadays about modern cpu architectures is that
>> they can't context switch faster than a few thousand cycles. I've
>> advocated that folk look over Mill Computing's design, which can do it in 5.
>>
>> >Network engineers focus on packets or TCP RTTs and somewhat overlook a user's true end to end experience.
>>
>> Heh. I don't. Despite all I say here (because I viewed the network as
>> the biggest problem 10 years ago), I have been doing voip and
>> videoconferencing apps for over 25 years, and I have always hoped that
>> basic benchmarks like eye-to-eye/ear-to-ear delay and jitter would see more use.
>>
>> >  Avoiding bloat by slowing down the writes, e.g. ECN or different scheduling, still contributes to end/end latency between the writes() and the reads() that too few test for and monitor.
>>
>> I agree that iperf had issues. I hope they are fixed now.
>>
>> >
>> > Note: We're moving to trip times of writes to reads (or frames for video) for our testing.
>>
>> Ear-to-ear or eye-to-eye delay measurements are GOOD. And a lot of
>> that delay is still in the stack. One day, perhaps
>> we can go back to scan lines and not complicated encodings.
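>>
>> The write-to-read mechanics themselves are simple enough to sketch in a
>> few lines of python; the hard part, as noted elsewhere in the thread, is
>> syncing the sender and receiver clocks (ntp/ptp/gps). The port below is a
>> placeholder:
>>
>>     # One-way "write to read" trip time: sender stamps each message,
>>     # receiver subtracts. Only meaningful with synced clocks.
>>     import socket, struct, time
>>
>>     PORT = 5201   # placeholder
>>
>>     def sender(dest_ip, count=100):
>>         s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>>         for _ in range(count):
>>             s.sendto(struct.pack("!d", time.time()), (dest_ip, PORT))
>>             time.sleep(0.02)
>>
>>     def receiver():
>>         s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>>         s.bind(("", PORT))
>>         while True:
>>             data, _ = s.recvfrom(64)
>>             (t_send,) = struct.unpack("!d", data[:8])
>>             print(f"trip time {(time.time() - t_send) * 1000:.2f} ms")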
>>
>> >We are also replacing/supplementing pings with TCP connects as other "latency related" measurements. TCP connects are more important than ping.
>>
>> I wish more folk measured dns lookup delay...
>>
>> Given the prevalence of ssl, I'd be measuring not just the 3whs (the
>> TCP three-way handshake), but that additional set of handshakes.
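>>
>> Splitting those pieces apart is easy enough to sketch with just the
>> python standard library (the host is only an example; this is a rough
>> sketch, not a benchmark tool):
>>
>>     # Time dns lookup, tcp connect (3whs) and tls handshake separately.
>>     import socket, ssl, time
>>
>>     host, port = "www.example.com", 443   # example target
>>
>>     t0 = time.monotonic()
>>     ip = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0][4][0]
>>     t1 = time.monotonic()                           # dns done
>>     sock = socket.create_connection((ip, port), timeout=5)
>>     t2 = time.monotonic()                           # tcp 3whs done
>>     tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
>>     t3 = time.monotonic()                           # tls handshake done
>>     tls.close()
>>     print(f"dns {1000*(t1-t0):.1f} ms, tcp {1000*(t2-t1):.1f} ms, "
>>           f"tls {1000*(t3-t2):.1f} ms")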
>>
>> We do have a bunch of http-oriented tests in the flent suite, as well
>> as for voip. At the time we were developing it,
>> though, videoconferencing was in its infancy and difficult to model, so
>> we tended towards using what flows we could get
>> from real servers and services. I think we now have tools to model
>> videoconferencing traffic much better than
>> we could then, but until now it wasn't much of a priority.
>>
>> It's also important to note that videoconferencing and gaming traffic
>> put a very different load on the network - very sensitive to jitter,
>> not so sensitive to loss. Both are VERY low bandwidth compared to tcp
>> - gaming is 35kbit/sec for example, on 10 or 20ms intervals.
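>>
>> As a rough illustration of how light that load is, a gaming-like sender
>> is just a small packet on a timer: ~88 bytes on a 20ms tick works out to
>> about 35kbit/sec. A sketch (address and sizes are placeholders):
>>
>>     # Toy gaming-like udp load: ~88-byte payload every 20ms ~= 35 kbit/s.
>>     import socket, time
>>
>>     DEST = ("192.0.2.10", 5001)     # placeholder address/port
>>     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>>     next_send = time.monotonic()
>>     for seq in range(500):          # ~10 seconds of traffic
>>         sock.sendto(seq.to_bytes(4, "big") + bytes(84), DEST)
>>         next_send += 0.020
>>         time.sleep(max(0.0, next_send - time.monotonic()))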
>>
>> >
>> > Bob
>> >
>> > On Fri, May 15, 2020 at 8:20 AM Tim Higgins <tim@smallnetbuilder.com> wrote:
>> >>
>> >> Hi Bob,
>> >>
>> >> Thanks for your comments and feedback. Responses below:
>> >>
>> >> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>> >>
>> >> Also, forgot to mention: for latency, don't rely on the average, as most don't care about that.  Maybe use the upper 3 stdev, i.e. the 99.97% point.  Our latency runs will repeat 20 seconds' worth of packets, find that point, and then calculate CDFs of this tail point across hundreds of runs under different conditions. One "slow packet" is all that it takes to screw up user experience when it comes to latency.
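>> >>
>> >> Pulling that tail number out of a run is only a couple of lines once the
>> >> samples are collected; a sketch with fabricated sample data, just to show
>> >> the percentile math:
>> >>
>> >>     # 99.97th percentile ("upper 3 stdev" tail) of latency samples.
>> >>     # The sample data below is fabricated for illustration.
>> >>     import random, statistics
>> >>
>> >>     samples_ms = sorted(random.gauss(4.0, 1.0) for _ in range(100000))
>> >>     p9997 = samples_ms[int(0.9997 * (len(samples_ms) - 1))]
>> >>     print(f"mean {statistics.mean(samples_ms):.2f} ms, "
>> >>           f"p99.97 {p9997:.2f} ms, worst {samples_ms[-1]:.2f} ms")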
>> >>
>> >> Thanks for the guidance.
>> >>
>> >>
>> >> On Thu, May 14, 2020 at 2:38 PM Bob McMahon <bob.mcmahon@broadcom.com> wrote:
>> >>>
>> >>> I haven't looked closely at OFDMA but these latency numbers seem way too high for it to matter.  Why is the latency so high?  It suggests there may be queueing delay (bloat) unrelated to media access.
>> >>>
>> >>> Also, one aspect is that OFDMA is replacing EDCA with AP scheduling per trigger frame.  EDCA kinda sucks per listen before talk which is about 100 microseconds on average which has to be paid even when no energy detect.  This limits the transmits per second performance to 10K (1/0.0001.). Also remember that WiFi aggregates so transmissions have multiple packets and long transmits will consume those 10K tx ops. One way to get around aggregation is to use voice (VO) access class which many devices won't aggregate (mileage will vary.). Then take a packets per second measurement with small packets.  This would give an idea on the frame scheduling being AP based vs EDCA.
>> >>>
>> >>> Also, measuring ping time as a proxy for latency isn't ideal. Better to measure trip times of the actual traffic.  This requires clock sync to a common reference. GPS atomic clocks are available but it does take some setup work.
>> >>>
>> >>> I haven't thought about RU optimizations and that testing so can't really comment there.
>> >>>
>> >>> Also, I'd consider replacing the mechanical turn table with variable phase shifters and set them in the MIMO (or H-Matrix) path.  I use model 8421 from Aeroflex. Others make them too.
>> >>>
>> >> Thanks again for the suggestions. I agree latency is very high when I remove the traffic bandwidth caps. I don't know why. One of the key questions I've had since starting to mess with OFDMA is whether it helps under light or heavy traffic load. All I do know is that things go to hell when you load the channel. And RRUL test methods essentially break OFDMA.
>> >>
>> >> I agree using ping isn't ideal. But I'm approaching this as creating a test that a consumer audience can understand. Ping is something consumers care about and understand.  The octoScope STApals are all ntp-sync'd, and latency measurements using iperf have been run on them.
>> >>
>> >>
>> >
>> > _______________________________________________
>> > Make-wifi-fast mailing list
>> > Make-wifi-fast@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/make-wifi-fast
>>
>>
>>
>> --
>> "For a successful technology, reality must take precedence over public
>> relations, for Mother Nature cannot be fooled" - Richard Feynman
>>
>> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729



-- 
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-05-16  5:02 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-14 16:43 [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work? Tim Higgins
2020-05-14 21:38 ` Bob McMahon
2020-05-14 21:42   ` Bob McMahon
2020-05-15 15:20     ` Tim Higgins
2020-05-15 19:36       ` Bob McMahon
2020-05-15 19:50         ` Tim Higgins
2020-05-15 20:05           ` Bob McMahon
2020-05-15 20:24           ` Toke Høiland-Jørgensen
2020-05-15 20:30           ` Dave Taht
2020-05-15 21:35             ` Bob McMahon
2020-05-16  5:02               ` Dave Taht
     [not found]           ` <mailman.342.1589573120.24343.make-wifi-fast@lists.bufferbloat.net>
2020-05-15 20:38             ` Toke Høiland-Jørgensen
2020-05-15  6:47 ` Erkki Lintunen
2020-05-15 15:34   ` Tim Higgins
2020-05-15 16:25     ` Dave Taht
2020-05-15 16:41       ` Tim Higgins
