From: Bob McMahon
Date: Fri, 15 May 2020 13:05:06 -0700
Subject: Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?
To: Tim Higgins
Cc: Make-Wifi-fast

iperf 2.0.14 supports --connect-only tests and also shows the connect times. It's currently broken and I plan to fix it soon.

My brother works for NASA, and when designing the shuttle the engineers focused on weight/mass because the energy required to reach low Earth orbit is driven by it. My brother, a PhD in fracture mechanics, said that optimizing for weight without considering the structural-integrity trade-offs was a mistake.

In that analogy, latency and bloat, while correlated, aren't the same thing. I think by separating them one can better understand how a system will perform. I suspect your tests with 120 ms of latency are really measuring bloat. Little's law says average queue depth = average effective arrival rate * average service time. Bloat is mostly about excessive queue depths, and latency mostly about excessive service times. Since they affect one another, it's easy to conflate them.
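As a back-of-the-envelope sketch of that distinction (the link speed and buffer size below are made-up illustration values, not your test setup):

    # Little's law: average queue depth = average arrival rate * average time in queue.
    # Illustrative numbers only: a 100 Mbit/s link draining 1500-byte packets
    # out of a 1000-packet buffer that an overload has filled.
    link_pps = 100e6 / (1500 * 8)              # ~8,333 packets/s the queue can drain
    buffer_pkts = 1000                         # a bloated, full buffer
    queueing_delay_s = buffer_pkts / link_pps  # time a new packet waits in line
    print(f"{queueing_delay_s * 1000:.0f} ms") # ~120 ms of pure queueing (bloat)

In other words, a figure in that range can come entirely from queue depth rather than from the path itself.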
Bob

On Fri, May 15, 2020 at 12:50 PM Tim Higgins wrote:

> Thanks for the additional insights, Bob. How do you measure TCP connects?
>
> Does Dave or anyone else on the bufferbloat team want to comment on Bob's comment that latency testing under "heavy traffic" isn't ideal?
>
> My impression is that the rtt_fair_var test I used in the article and other RRUL-related Flent tests fully load the connection under test. Am I incorrect?
>
> ===
> On 5/15/2020 3:36 PM, Bob McMahon wrote:
>
> Latency testing under "heavy traffic" isn't ideal. If the input rate exceeds the service rate of any queue for any period of time, the queue fills up and latency hits a worst case set by that queue depth. I'd say take latency measurements when the input rates are below the service rates. The measurements when service rates are less than input rates are less about latency and more about bloat.
>
> Also, a good paper is this one on trading bandwidth for ultra-low latency using phantom queues and ECN.
>
> Another thing to consider is that network engineers tend to have a myopic view of latency. The queueing or delay between the socket writes/reads and the network stack matters too. Network engineers focus on packets or TCP RTTs and somewhat overlook a user's true end-to-end experience. Avoiding bloat by slowing down the writes, e.g. via ECN or different scheduling, still contributes to end-to-end latency between the writes() and the reads(), which too few test for and monitor.
>
> Note: We're moving to trip times from writes to reads (or frames, for video) for our testing. We are also replacing/supplementing pings with TCP connects as other "latency related" measurements. TCP connects are more important than ping.
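A rough sketch of those two measurements with plain sockets (this is not the iperf implementation; the host/port are placeholders, and the write-to-read stamp is only meaningful when both ends have synchronized clocks):

    import socket
    import struct
    import time

    HOST, PORT = "192.0.2.10", 5001          # placeholder test endpoint

    # Sample 1: TCP connect (3-way handshake) time as a "latency related" metric.
    t0 = time.perf_counter()
    sock = socket.create_connection((HOST, PORT), timeout=5)
    print(f"TCP connect time: {(time.perf_counter() - t0) * 1e3:.2f} ms")

    # Sample 2: write-to-read trip time. Stamp the payload at write(); the
    # receiver unpacks the first 8 bytes after its read() and computes
    # time.time() - sent_stamp. Requires sender/receiver clock sync (NTP/GPS).
    payload = struct.pack("!d", time.time()) + b"\x00" * 1000
    sock.sendall(payload)
    sock.close()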
> Bob
>
> On Fri, May 15, 2020 at 8:20 AM Tim Higgins wrote:
>
>> Hi Bob,
>>
>> Thanks for your comments and feedback. Responses below:
>>
>> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>>
>>> Also, forgot to mention: for latency, don't rely on the average, as most don't care about that. Maybe use the upper 3 stdev, i.e. the 99.97% point. Our latency runs will repeat 20 seconds' worth of packets and find that point, then calculate CDFs of this tail point across hundreds of runs under different conditions. One "slow packet" is all it takes to screw up the user experience when it comes to latency.
>>
>> Thanks for the guidance.
>>
>> On Thu, May 14, 2020 at 2:38 PM Bob McMahon wrote:
>>
>>> I haven't looked closely at OFDMA, but these latency numbers seem way too high for it to matter. Why is the latency so high? It suggests there may be queueing delay (bloat) unrelated to media access.
>>>
>>> Also, one aspect is that OFDMA replaces EDCA with AP scheduling per trigger frame. EDCA kinda sucks because of listen-before-talk, which is about 100 microseconds on average and has to be paid even when there is no energy detect. That limits transmits-per-second performance to 10K (1/0.0001). Also remember that WiFi aggregates, so transmissions carry multiple packets, and long transmits will consume those 10K tx ops. One way to get around aggregation is to use the voice (VO) access class, which many devices won't aggregate (mileage will vary). Then take a packets-per-second measurement with small packets. This would give an idea of whether the frame scheduling is AP based or EDCA.
>>>
>>> Also, measuring ping time as a proxy for latency isn't ideal. It's better to measure trip times of the actual traffic. This requires clock sync to a common reference; GPS atomic clocks are available, but they do take some setup work.
>>>
>>> I haven't thought about RU optimizations and that testing, so I can't really comment there.
>>>
>>> Also, I'd consider replacing the mechanical turntable with variable phase shifters and set them in the MIMO (or H-Matrix) path. I use model 8421 from Aeroflex. Others make them too.
>>
>> Thanks again for the suggestions. I agree latency is very high when I remove the traffic bandwidth caps. I don't know why. One of the key questions I've had since starting to mess with OFDMA is whether it helps under light or heavy traffic load. All I do know is that things go to hell when you load the channel. And RRUL test methods essentially break OFDMA.
>>
>> I agree using ping isn't ideal. But I'm approaching this as creating a test that a consumer audience can understand. Ping is something consumers care about and understand. The octoScope STApals are all NTP sync'd, and latency measurements using iperf have been done by them.
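A minimal sketch of the tail-latency summary described in the 5/14 note above (report the upper ~3-sigma, 99.97% point per run rather than the mean; the sample data below is a placeholder for per-packet latencies collected elsewhere):

    import statistics

    def tail_latency(samples, q=0.9997):
        """99.97% ("upper 3 sigma") point of one run's per-packet latencies."""
        ordered = sorted(samples)
        return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

    # Placeholder data: each inner list is one run's per-packet latency in ms.
    latencies_by_run = [
        [1.1, 1.2, 1.3, 35.0],
        [1.0, 1.1, 1.2, 80.0],
    ]

    # One tail value per run; a single "slow packet" dominates it. CDFs are
    # built from these per-run tails across hundreds of runs.
    tails = [tail_latency(run) for run in latencies_by_run]
    print("median of per-run tails:", statistics.median(tails), "ms")
    print("worst per-run tail:     ", max(tails), "ms")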