[Make-wifi-fast] Status of the industry on over buffering at the WiFi air interface

Bob McMahon bob.mcmahon at broadcom.com
Thu Feb 13 01:27:14 EST 2020

Internally, we have telemetry as packets move through the end/end logic
subsystems.  A python controller receives all the telemetry from separate
netlink sockets.  It also maps all the time domains, e.g., TSF, into the
GPS time domain.  Then one can see exactly where packets are at any moment
in time.  We also produce stacked bar plots of each packet's latency as
it moves from end to end, then produce clusters from there, as there are
millions of packets.  Typically our main goal is to show our customers we're
not the problem and show that it's either their OS/stack or air time, things
we don't control. (I argue we have more control over EDCA than we'd admit,
late bindings, e.g. MCS rate selection, etc., and per-packet adaptive EDCAs
seem interesting.)
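
As a rough illustration of the time-domain mapping and per-packet latency
breakdown described above, here is a minimal sketch. It is not the internal
tooling: the function names, the microsecond units, the stage names, and the
assumption that paired (TSF, GPS) samples are available and related by a
simple linear offset-plus-drift model are all hypothetical.

```python
from statistics import mean

def fit_clock_map(pairs):
    """Least-squares fit of gps = slope * tsf + offset.

    pairs: list of (tsf_us, gps_us) samples taken at the same instants
    (assumed to be available; how they are captured is not shown here).
    """
    tsf = [p[0] for p in pairs]
    gps = [p[1] for p in pairs]
    t0, g0 = mean(tsf), mean(gps)
    var = sum((t - t0) ** 2 for t in tsf)
    if var == 0:
        raise ValueError("need at least two distinct TSF samples")
    slope = sum((t - t0) * (g - g0) for t, g in zip(tsf, gps)) / var
    offset = g0 - slope * t0
    return slope, offset

def tsf_to_gps(tsf_us, slope, offset):
    """Map a TSF timestamp into the GPS time domain using the fitted line."""
    return slope * tsf_us + offset

def stage_latencies(stamps):
    """Per-hop latency for one packet.

    stamps: list of (stage_name, gps_us) in path order, already mapped into
    the common time domain. Each returned segment is one slice of a stacked
    bar for that packet.
    """
    return [(f"{a}->{b}", tb - ta)
            for (a, ta), (b, tb) in zip(stamps, stamps[1:])]
```

The per-hop segments from stage_latencies() sum to the packet's end/end
latency, which is what makes them usable as a stacked bar. With millions of
packets one would cluster those segment vectors rather than plot each one.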

This type of WiFi network telemetry isn't supported outside of internal
tools.  There is some movement towards inserting network telemetry inside
TCP headers but not much. I believe SDN guys use it inside of data
centers.  If it's useful, adding it to open source tooling might be doable
though I'd need to do some thinking about the technical details a bit.  A
first obstacle is figuring out a common time domain or how to provide
sufficient information without one.
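
One possible way to get useful information without a common time domain
(an assumption on my part, not something worked out above) is to report
delay relative to the minimum observed, so that a fixed clock offset between
sender and receiver cancels out. A minimal sketch, with a hypothetical
function name and made-up timestamps:

```python
def relative_delays(samples):
    """Delay above the observed minimum, from unsynchronized clocks.

    samples: list of (tx_ts, rx_ts), where tx_ts comes from the sender's
    clock and rx_ts from the receiver's clock, in the same units.
    A constant offset between the two clocks cancels in the subtraction;
    clock drift over the measurement window does not.
    """
    raw = [rx - tx for tx, rx in samples]
    base = min(raw)  # treated as the "no queueing" baseline
    return [d - base for d in raw]
```

This only shows how much delay grew above the best case seen, which hints at
queueing, but it cannot give absolute one-way latency the way a GPS
disciplined clock can.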

Something like this could help drive ECN-type features, though I'm not sure.
The network engineering teams are so silo'd, both within orgs and across
companies, that it's hard to truly optimize end/end problems.  The OSI
layering model tends to get in the way too, at least from an eng silo'ing
perspective.

On Wed, Feb 12, 2020 at 5:56 PM David P. Reed <dpreed at deepplum.com> wrote:

> I know this is hard to measure, in general. Especially to isolate the
> issue because it combines packet scheduling, the AP's own activity, and the
> insertion of excess buffering in each device's hardware and driver
> software.
>
> However, what I'm looking for is evidence that helps locate the problem,
> which of course is a "distributed scheduling and buffering" problem, unlike
> the simple bufferbloat we all saw in the CMTS's of DOCSIS 2.0, ALU's LTE
> deployments in the early days of 4G (at ATT Wireless), or the overbuffering
> in Arista Networks' switches, which were quite simple to measure and
> diagnose.
>
> On Wednesday, February 12, 2020 7:36pm, "Bob McMahon" <
> bob.mcmahon at broadcom.com> said:
> > hmm, not sure if this helps but "excess queueing" can be hard to define.
> >
> > Do you know the operating systems for the WiFi devices and if tooling can
> > be loaded upon them?  iperf clients sample RTT and CWND on Linux
> > machines. Iperf 2.0.14 (in development) has a lot of latency-related
> > features.
> >
> > Also, if there is control over the AIFS one can set that for the high
> > rate devices such that they always win and the lower rate ones always
> > lose. If that solves things it does suggest WiFi tx queues developing per
> > the TXOP arbitration and air transmission as an issue.  Standard
> > cwmin/cwmax isn't as effective, though it won't allow high rate devices
> > to starve low rate devices as AIFS might (depending upon the values).
> >
> > I use latency to measure the performance and define bounds that way and
> > it's very specific to use cases.  It does require clock sync. My devices
> > have GPS disciplined oscillators which aren't common.
> >
> > As an aside, the HULL approach of phantom queues looks interesting.
> > https://people.csail.mit.edu/alizadeh/papers/hull-nsdi12.pdf
> >
> > Bob
> >
> > On Wed, Feb 12, 2020 at 4:08 PM David P. Reed <dpreed at deepplum.com> wrote:
> >
> >> A friend of mine (not a network expert, but a gadget freak) has been
> >> deploying wireless security cameras at his home and vacation home. He
> >> uses a single WiFi AP in each place, serving the security cameras etc.
> >>
> >> What he observes is this:
> >>
> >> Whenever anyone on a laptop in one of the homes uploads a modest sized
> >> file (over the same WiFi) the security systems all lose data.
> >>
> >> Now I can't go to his home to diagnose this, but I've asked him to check
> >> out his cable bufferbloat using dslreports, and he gets no bufferbloat
> >> there. But it sure looks like *severe* lag under load is affecting the
> >> security camera feed to the cloud servers that the company that sells
> >> the security cameras provides.
> >>
> >> So, is there a way to simply *diagnose* the WiFi air link for excess
> >> queueing in all the high rate WiFi devices? Something a non-net-head
> >> could do?
> >>
> >> The situation around congestion control in the industry continues to
> >> royally suck, in my opinion. The vendors don't care, the ISPs don't care
> >> (they can sell a higher speed connection than is actually needed and
> >> super-fabulous MIMO gadgets that still don't quite solve the problem).
> >>
> >> I'm an old guy, basically retired. I'm sad because the young folks
> >> remain clueless.
> >>
> >> And it's been decades since bufferbloat was discovered, and the basic
> >> issue of congestion signalling being needed. I'm sure 5G (whatever it
> >> really is) is not paying attention to this network level congestion
> >> issue...
> >>