From: Dave Taht
Date: Fri, 15 May 2020 13:30:33 -0700
To: Tim Higgins
Cc: Bob McMahon, Make-Wifi-fast
Subject: Re: [Make-wifi-fast] SmallNetBuilder article: Does OFDMA Really Work?

On Fri, May 15, 2020 at 12:50 PM Tim Higgins wrote:
>
> Thanks for the additional insights, Bob. How do you measure TCP connects?
>
> Does Dave or anyone else on the bufferbloat team want to comment on
> Bob's comment that latency testing under "heavy traffic" isn't ideal?

I hit save before deciding to reply.

> My impression is that the rtt_fair_var test I used in the article and
> other RRUL-related Flent tests fully load the connection under test.
> Am I incorrect?

Well, to whatever extent possible given other limits in the hardware.
Under loads like these, other things - such as the rx path, or the cpu
- start to fail. I had one box with a memory leak; overnight testing
like this showed it up. Another test - with ipv6 - ultimately showed
that serious ipv6 traffic was triggering a performance-sucking cpu
trap. Yet another showed ipv6 being seriously outcompeted by ipv4,
because there were 4096 ipv4 flow offloads in the hardware, and only
64 for ipv6....

There are many other tests in the suite - testing a fully loaded
station while other stations are moping along, stuff near and far away
(ATF), and so on.
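For anyone who wants to eyeball the core "latency under load" effect
without pulling in all of flent, a minimal python sketch looks
something like the below - sample rtt with ping while iperf3 saturates
the link, and compare against idle. The target host is a placeholder,
and this is only a sketch; flent does all of this far more carefully.

    #!/usr/bin/env python3
    # Latency-under-load sketch: sample RTT with ping while iperf3
    # saturates the link, then compare against an idle baseline.
    # Assumes ping and iperf3 are installed and an iperf3 server is
    # running on TARGET; "test.example.org" is a placeholder.
    import re
    import subprocess

    TARGET = "test.example.org"  # placeholder - use your own server

    def ping_rtts(host, count=50):
        """Return RTT samples in ms from one ping run."""
        out = subprocess.run(["ping", "-c", str(count), "-i", "0.2", host],
                             capture_output=True, text=True).stdout
        return [float(m) for m in re.findall(r"time=([0-9.]+)", out)]

    idle = ping_rtts(TARGET)
    load = subprocess.Popen(["iperf3", "-c", TARGET, "-t", "15"],
                            stdout=subprocess.DEVNULL)
    busy = ping_rtts(TARGET)  # sampled while the link is saturated
    load.wait()

    print("idle:   min %.1f / max %.1f ms" % (min(idle), max(idle)))
    print("loaded: min %.1f / max %.1f ms" % (min(busy), max(busy)))

On a bloated link the loaded max is often an order of magnitude (or
three) above the idle max, which is the whole point of testing this way.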
> ===
> On 5/15/2020 3:36 PM, Bob McMahon wrote:
>
> Latency testing under "heavy traffic" isn't ideal.

Of course not. But in any real-time control system, retaining control
and degrading predictably under load is a hard requirement in most
industries besides networking. Imagine if you only tested your car at
speeds no greater than 55mph, on roads that were never slippery, with
curves never exceeding 6 degrees. Then you shipped it without a
performance governor, with rubber bands holding the steering wheel on
that would break at 65mph, and with tires that only worked at those
speeds on those kinds of curves.

To stick with the heavy traffic analogy, but in a slower case... I used
to have a car that overheated in heavy stop-and-go traffic. Eventually,
it caught on fire. (The full story is really funny, because I was naked
at the time, but I'll save it for a posthumous biography.)

> If the input rate exceeds the service rate of any queue for any
> period of time, the queue fills up and latency hits a worst case per
> that queue depth.

Which is what we're all about managing well, and predictably, here at
bufferbloat.net.

> I'd say take latency measurements when the input rates are below the
> service rates.

That is ridiculous. It requires an oracle. It requires a belief system
in which users will never exceed your mysterious parameters.

> The measurements when service rates are less than input rates are
> less about latency and more about bloat.

I have to note that latency measurements are certainly useful on less
loaded networks. Getting an AP out of sleep state is a good one;
another is how fast you can switch stations under a minimal (say,
mostly voip) load, in the presence of interference.

> Also, a good paper is this one on trading bandwidth for ultra low
> latency using phantom queues and ECN.

I'm burned out on ecn today. On the high end, I rather like cisco's AFD:

https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-738488.html

> Another thing to consider is that network engineers tend to have a
> myopic view of latency. The queueing or delay between the socket
> writes/reads and the network stack matters too.

It certainly does! I'm always giving a long list of everything we've
done to improve the linux stack from app to endpoint. Over on reddit
recently (can't find the link) I talked about how bad the linux
ethernet stack was, pre-BQL.

I don't think anyone in the industry really understood, deeply, the
effects of packet aggregation in the multistation case for wifi. (I'm
still unsure if anyone does!) Also, endless retries starving out other
stations is a huge problem in wifi and lte, and is going to become more
of one on cable...

We've worked on tons of things - like tcp_lowat, fq, and queueing in
general - jeeze -

https://conferences.sigcomm.org/sigcomm/2014/doc/slides/137.pdf

See the slide on smashing latency everywhere in the stack.

And now that I can regularly get fiber down below 2ms, I regard the
overhead of opus (2.7ms at the highest sampling rate) as a real
problem, along with scheduling delay and jitter in the os, in the
jamophone project. It pays to bypass the OS, when you can.
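tcp_lowat, by the way, is easy to play with from userspace. A sketch of
the idea - with the caveats that the TCP_NOTSENT_LOWAT option value
(25) is linux-specific, the endpoint is a placeholder, and the 16KB
threshold is an arbitrary example, not a recommendation:

    #!/usr/bin/env python3
    # Sketch: cap how much *unsent* data the kernel will buffer on a
    # TCP socket, so the application keeps control over what goes out
    # next instead of pre-queueing tens of ms of stale data.
    # Linux-specific; 16 KB is an arbitrary example threshold.
    import socket

    # Newer pythons expose the constant; 25 is the value from
    # linux/tcp.h for older ones.
    TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    s.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, 16384)
    s.connect(("example.com", 80))  # placeholder endpoint

    # From here on, poll()/select() report the socket writable only
    # when the unsent backlog drops below 16 KB, so latency-critical
    # data never sits behind a deep kernel queue of bulk writes.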
Latency is everywhere, and you have to tackle it everywhere, but it
helps to focus on whatever is costing you the most latency at a given
time. re: https://en.wikipedia.org/wiki/Gustafson%27s_law

My biggest complaint nowadays about modern cpu architectures is that
they can't context switch in less than a few thousand cycles. I've
advocated that folk look over Mill Computing's design, which can do it
in 5.

> Network engineers focus on packets or TCP RTTs and somewhat overlook
> a user's true end to end experience.

Heh. I don't. Despite all I say here (because I viewed the network as
the biggest problem 10 years ago), I have been doing voip and
videoconferencing apps for over 25 years, and I have always hoped basic
benchmarks like eye-to-eye/ear-to-ear delay and jitter would see more
use.

> Avoiding bloat by slowing down the writes, e.g. ECN or different
> scheduling, still contributes to end/end latency between the writes()
> and the reads() that too few test for and monitor.

I agree that iperf had issues. I hope they are fixed now.

> Note: We're moving to trip times of writes to reads (or frames for
> video) for our testing.

Ear-to-ear or eye-to-eye delay measurements are GOOD. And a lot of that
delay is still in the stack. One day, perhaps, we can go back to scan
lines and away from complicated encodings.

> We are also replacing/supplementing pings with TCP connects as other
> "latency related" measurements.

TCP connects are more important than ping. I wish more folk measured
dns lookup delay... and given the prevalence of ssl, I'd be measuring
not just the 3whs, but that additional set of handshakes.
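Timing all three stages - the dns lookup, the 3whs, and the ssl
handshakes on top - only takes a few lines of python. A sketch, with
example.com as a stand-in target:

    #!/usr/bin/env python3
    # Sketch: time what a bare ping never sees - the dns lookup, the
    # TCP 3-way handshake, and the TLS handshake stacked on top.
    # "example.com" is just a stand-in target.
    import socket
    import ssl
    import time

    HOST, PORT = "example.com", 443

    t0 = time.monotonic()
    ip = socket.getaddrinfo(HOST, PORT, proto=socket.IPPROTO_TCP)[0][4][0]
    t1 = time.monotonic()                             # dns done

    raw = socket.create_connection((ip, PORT))        # TCP 3whs
    t2 = time.monotonic()

    ctx = ssl.create_default_context()
    tls = ctx.wrap_socket(raw, server_hostname=HOST)  # TLS handshake
    t3 = time.monotonic()
    tls.close()

    print("dns:  %6.1f ms" % ((t1 - t0) * 1000))
    print("3whs: %6.1f ms" % ((t2 - t1) * 1000))
    print("tls:  %6.1f ms" % ((t3 - t2) * 1000))

Run against a far-away server, the handshake lines usually dwarf the
icmp rtt, which is the point.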
We do have a bunch of http-oriented tests in the flent suite, as well
as for voip. At the time we were developing it, though,
videoconferencing was in its infancy and difficult to model, so we
tended towards using what flows we could get from real servers and
services. I think we now have tools to model videoconferencing traffic
much better than we could then, but until now it wasn't much of a
priority.

It's also important to note that videoconferencing and gaming traffic
put a very different load on the network - very sensitive to jitter,
not so sensitive to loss. Both are VERY low bandwidth compared to tcp -
gaming is 35kbit/sec, for example, on 10 or 20ms intervals.

>
> Bob
>
> On Fri, May 15, 2020 at 8:20 AM Tim Higgins wrote:
>>
>> Hi Bob,
>>
>> Thanks for your comments and feedback. Responses below:
>>
>> On 5/14/2020 5:42 PM, Bob McMahon wrote:
>>
>> Also, forgot to mention: for latency, don't rely on the average, as
>> most don't care about that. Maybe use the upper 3 stdev, i.e. the
>> 99.97% point. Our latency runs will repeat 20 seconds' worth of
>> packets, find that point, then calculate CDFs of it in the tail
>> across hundreds of runs under different conditions. One "slow
>> packet" is all it takes to screw up user experience when it comes to
>> latency.
>>
>> Thanks for the guidance.
>>
>> On Thu, May 14, 2020 at 2:38 PM Bob McMahon wrote:
>>>
>>> I haven't looked closely at OFDMA, but these latency numbers seem
>>> way too high for it to matter. Why is the latency so high? It
>>> suggests there may be queueing delay (bloat) unrelated to media
>>> access.
>>>
>>> Also, one aspect is that OFDMA replaces EDCA with AP scheduling per
>>> trigger frame. EDCA kinda sucks per listen-before-talk, which is
>>> about 100 microseconds on average and has to be paid even when
>>> there's no energy detect. This limits performance to 10K transmits
>>> per second (1/0.0001). Also remember that WiFi aggregates, so
>>> transmissions carry multiple packets, and long transmits will
>>> consume those 10K tx ops. One way to get around aggregation is to
>>> use the voice (VO) access class, which many devices won't aggregate
>>> (mileage will vary). Then take a packets-per-second measurement
>>> with small packets. This would give an idea of whether the frame
>>> scheduling is AP based vs EDCA.
>>>
>>> Also, measuring ping time as a proxy for latency isn't ideal.
>>> Better to measure trip times of the actual traffic. This requires
>>> clock sync to a common reference. GPS atomic clocks are available,
>>> but it does take some setup work.
>>>
>>> I haven't thought about RU optimizations and that testing, so I
>>> can't really comment there.
>>>
>>> Also, I'd consider replacing the mechanical turntable with variable
>>> phase shifters set in the MIMO (or H-matrix) path. I use model 8421
>>> from Aeroflex. Others make them too.
>>>
>> Thanks again for the suggestions. I agree latency is very high when
>> I remove the traffic bandwidth caps. I don't know why. One of the
>> key questions I've had since starting to mess with OFDMA is whether
>> it helps under light or heavy traffic load. All I do know is that
>> things go to hell when you load the channel. And RRUL test methods
>> essentially break OFDMA.
>>
>> I agree using ping isn't ideal. But I'm approaching this as creating
>> a test that a consumer audience can understand. Ping is something
>> consumers care about and understand. The octoScope STApals are all
>> ntp-synced, and latency measurements using iperf have been done by
>> them.

--
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net CTO, TekLibre, LLC Tel: 1-831-435-0729