From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id A26483B29D for ; Thu, 13 Feb 2020 01:27:26 -0500 (EST) Received: by mail-wr1-x429.google.com with SMTP id w15so5191785wru.4 for ; Wed, 12 Feb 2020 22:27:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=FvR2XL2u8gmCZftx/XYn2zgg42D5tS6YpMM/eYqZvo8=; b=f1FyptE0voU7mL5iY+HQ5CAAipfEupK35zESWVBfNPePmUONcosaMouiolwXEYxrrV tUj2ufQqvf5IEp1W2qWc/2UNKz6BTXe4duW0Z4a8pGaRcJZoIgLDILdO/t7DQhJfWDkF mkYgex5yqDswVPpTF7x7qj7a087VI8cbkayFU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=FvR2XL2u8gmCZftx/XYn2zgg42D5tS6YpMM/eYqZvo8=; b=LoORwGU1eZYVZJMQja7Mb7+Ryy3f+qFV9edA3V3VAuSNZiriB0V7YaYTozl1H7nsPF 88LJUjEpcWBVb+xYjJ95edgfy2ZwqyZS0+RQS8hmyLs1Xpb+8mWCx+D14ujw5UtsFEp6 CoZvNjr92NP4beHhCHW8LD89whKqbXTv5OGh5b3k+yQsCzfvKiHANdMwwU6Bn9YENjtJ WP58nKynTJybd5Zh6MD7R4Wm77hHCEqGj+6kjsFovP8GXRvm5PXkpGuZvjmfmaC0+SFO fmVgUby8uIioZLXnNnNvLXCkar68hkL/2YII9HeCsEdZiQpsIpPL3juNM6PL0P+TP6pT o70A== X-Gm-Message-State: APjAAAUR+us3loIdt8/BYWyiG8/9qxkMjdEIFgkOpqveXvrpaaho5E6R gz5I17pTa6TDqeqtW4rDTBa8OcAUHc+NmbPinH7HcQtR X-Google-Smtp-Source: APXvYqxEsEJNwXCW0pSp+D+v3C4S9jp0MomwHxCf43Ih5b688o4i++JqA5LuJ3FzPnWg3F6SZtEroN5YP7vlSd/+n8c= X-Received: by 2002:a5d:6087:: with SMTP id w7mr19394376wrt.36.1581575245441; Wed, 12 Feb 2020 22:27:25 -0800 (PST) MIME-Version: 1.0 References: <1581552513.586428831@apps.rackspace.com> <1581559003.730714516@apps.rackspace.com> In-Reply-To: <1581559003.730714516@apps.rackspace.com> From: Bob McMahon Date: Wed, 12 Feb 2020 
22:27:14 -0800
Message-ID:
Subject: Re: [Make-wifi-fast] Status of the industry on over buffering at the WiFi air interface
To: "David P. Reed"
Cc: Make-Wifi-fast
Content-Type: multipart/alternative; boundary="000000000000b10fc2059e6f2efd"
X-List-Received-Date: Thu, 13 Feb 2020 06:27:26 -0000

--000000000000b10fc2059e6f2efd
Content-Type: text/plain; charset="UTF-8"

Internally, we have telemetry as packets move through the end/end logic
subsystems. A Python controller receives all the telemetry from separate
netlink sockets. It also maps all the time domains, e.g., TSF, into the
GPS time domain. Then one can see exactly where packets are at any moment
in time. We also produce stacked bar plots for each packet's latency after
it moves from end to end. We then produce clusters from those, as there
are millions of packets. Typically our main goal is to show our customers
we're not the problem and that it's either their OS/stack or air time,
things we don't control. (I argue we have more control over EDCA than we'd
admit; late bindings, e.g. MCS rate selection, etc., and per-packet
adaptive EDCAs seem interesting.)

This type of WiFi network telemetry isn't supported outside of internal
tools. There is some movement towards inserting network telemetry inside
TCP headers but not much. I believe SDN guys use it inside of data
centers. If it's useful, adding it to open source tooling might be doable,
though I'd need to think through the technical details a bit. A first
obstacle is figuring out a common time domain or how to provide sufficient
information without one.

Something like this could help drive ECN type features - not sure. The
network engineering teams are so silo'd, both within orgs and across
companies, that it's hard to truly optimize end/end problems. The OSI
layering model tends to get in the way too, at least from an eng silo'ing
perspective.

Bob

On Wed, Feb 12, 2020 at 5:56 PM David P. Reed wrote:

> I know this is hard to measure, in general.
> Especially to isolate the issue, because it combines packet scheduling,
> the AP's own activity, and the insertion of excess buffering in each
> device's hardware and driver software.
>
> However, what I'm looking for is evidence that helps locate the problem,
> which of course is a "distributed scheduling and buffering" problem,
> unlike the simple bufferbloat we all saw in the CMTS's of DOCSIS 2.0,
> ALU's LTE deployments in the early days of 4G (at ATT Wireless), or the
> overbuffering in Arista Networks's switches, which were quite simple to
> measure and diagnose.
>
> On Wednesday, February 12, 2020 7:36pm, "Bob McMahon"
> <bob.mcmahon@broadcom.com> said:
>
> > hmm, not sure if this helps but "excess queueing" can be hard to
> > define.
> >
> > Do you know the operating systems for the WiFi devices and if tooling
> > can be loaded upon them? iperf clients sample RTT and CWND on Linux
> > machines. Iperf 2.0.14 (in development) has a lot of latency-related
> > features.
> >
> > Also, if there is control over the AIFS, one can set that for the
> > high-rate devices such that they always win and the lower-rate ones
> > always lose. If that solves things, it does suggest WiFi tx queues
> > developing per the TXOP arbitration and air transmission as an issue.
> > Standard cwmin/cwmax isn't as effective, though it won't allow
> > high-rate devices to starve low-rate devices as AIFS might (depending
> > upon the values).
> >
> > I use latency to measure the performance and define bounds that way,
> > and it's very specific to use cases. It does require clock sync. My
> > devices have GPS disciplined oscillators, which aren't common.
> >
> > As an aside, the HULL approach of phantom queues looks interesting.
> > https://people.csail.mit.edu/alizadeh/papers/hull-nsdi12.pdf
> >
> > Bob
> >
> > On Wed, Feb 12, 2020 at 4:08 PM David P.
> > Reed wrote:
> >
> >> A friend of mine (not a network expert, but a gadget freak), has been
> >> deploying wireless security cameras at his home and vacation home. He
> >> uses a single WiFi AP in each place, serving the security cameras etc.
> >>
> >> What he observes is this:
> >>
> >> Whenever anyone on a laptop in one of the homes uploads a modest-sized
> >> file (over the same WiFi) the security systems all lose data.
> >>
> >> Now I can't go to his home to diagnose this, but I've asked him to
> >> check out his cable bufferbloat using dslreports, and he gets no
> >> bufferbloat there. But it sure looks like *severe* lag under load is
> >> affecting the security camera feed to the cloud servers that the
> >> company that sells the security cameras provides.
> >>
> >> So, is there a way to simply *diagnose* the WiFi air link for excess
> >> queueing in all the high rate WiFi devices? Something a non-net-head
> >> could do?
> >>
> >> The situation around congestion control in the industry continues to
> >> royally suck, in my opinion. The vendors don't care, the ISPs don't
> >> care (they can sell a higher speed connection than is actually needed
> >> and super-fabulous MIMO gadgets that still don't quite solve the
> >> problem).
> >>
> >> I'm an old guy, basically retired. I'm sad because the young folks
> >> remain clueless.
> >>
> >> And it's been decades since bufferbloat was discovered, and the basic
> >> issue of congestion signalling being needed. I'm sure 5G (whatever it
> >> really is) is not paying attention to this network level congestion
> >> issue...
> >>
> >> _______________________________________________
> >> Make-wifi-fast mailing list
> >> Make-wifi-fast@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/make-wifi-fast
> >
>

--000000000000b10fc2059e6f2efd
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000b10fc2059e6f2efd--