From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id EC44A3B2A4 for ; Mon, 11 Feb 2019 13:07:49 -0500 (EST) Received: by mail-ed1-x536.google.com with SMTP id d9so4806396edh.12 for ; Mon, 11 Feb 2019 10:07:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=EFfcELt46suMDZiHDQS3pmEoi12fETdGR0x3OrryCEw=; b=BIKuvYG1faSLk8yFJDAgGXdIATi1uRPFmyogDrdYOjW0r1ex/qbc1cOdnOsS8l3tXa Y9Yja/AuURXDxTIQav91jR+0/zFdvcw8/BdP9BwvCJ8AYp+pnGUVrbXJrUm3x1Zf8NGu YfaRlaWD3s1QOyAvOM0lFwiI31bE7zRmFCumk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=EFfcELt46suMDZiHDQS3pmEoi12fETdGR0x3OrryCEw=; b=g6rkXYdQWAfQ6CXOqv0rS3s/MeVrZZPPlWWZ3o+qpmckiND+43+okloMRkS0j/jGWU CMn69wrLCJQG0hM29caRQ0Fdme5DRYDMXWvzjyrKiEtAlXErM0i6KCoAhDSG4wE2CPYQ 44yjU8AoTKdNidy+/eTGFS1XELN8XhbM4fWQ8Zd9g863/TuztZIdMZHe77p5tW3vLP2p JmG2Xdn7taycSkOS13ykim3qR/vNEPDF4Ga553HX0Gl+KPmKYs6GFB+oTrO8OAR+qiJq cErJLiRiB9/U3OLgfNsPzYARBRVLMXRHSCv6pLRCfABe0qySJZ8f3/chLFdvdBrs9AVT UyRQ== X-Gm-Message-State: AHQUAubQH1UgPE9y2yu0Wsj0U6tTUo7vjIphaICWzHurvkK0HlgtdYR7 hTYbF/fcMyR04Sh+5dkI0zn3PSRpZw6tQBReaINVD5TnRz0= X-Google-Smtp-Source: AHgI3IYEvpIQmdaPnweeYucgM1lq3Uvx5FGwOrFRL02NCXjh5yzuShIjlzKRQQPN9eLUNhtz8/7JGV2Pfp5nPfNcQEc= X-Received: by 2002:a17:906:6043:: with SMTP id p3mr27531159ejj.72.1549908468613; Mon, 11 Feb 2019 10:07:48 -0800 (PST) MIME-Version: 1.0 References: <3CB198EB-2844-42DB-9E5A-26708BFD7304@gmail.com> <1549825225.478726910@apps.rackspace.com> <1549896429.35521705@apps.rackspace.com> In-Reply-To: 
<1549896429.35521705@apps.rackspace.com>
From: Bob McMahon
Date: Mon, 11 Feb 2019 23:37:37 +0530
Message-ID:
To: "David P. Reed"
Cc: Jonathan Morton , Make-Wifi-fast
Content-Type: multipart/alternative; boundary="000000000000b503010581a22f2e"
Subject: Re: [Make-wifi-fast] bloated ath10k
X-BeenThere: make-wifi-fast@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 11 Feb 2019 18:07:50 -0000

--000000000000b503010581a22f2e
Content-Type: text/plain; charset="UTF-8"

Hi David,

Thanks for the discussion and for tolerating hearing what you already know.

Per Jaffe, TCP can't be optimized to achieve maximum network power (avg
throughput / latency) because an end host (or TCP end) has only local
knowledge. I think that's why things like HULL are setting the CE bit and
why we have so many TCP control loops, none of which can determine the true
"near congestion point." With wireless links, the STAs aren't limited to
local knowledge because their PHYs are always listening to the medium. I
think this helps chip fw towards the ability to detect the input rate as
exceeding the possible service rate (as defined by "congestion" and/or "free
spectrum" - whatever those terms mean, as neither is related to physics. ;)
If this is the case, the chip fw can determine, and may be able to predict,
with STA state when the "queued traffic on LAN reached threshold."

Per 802.11ax the AP will need to have virtual output queues to work with
downlink OFDMA. Those transmits are truly "in parallel", which requires
multiple queues on the AP. So I don't perceive the two queues feeding one
queue "engineering phenomena" lasting forever.

I'm no mesh expert but it does seem suboptimal that a transmit from STA A to
STA B needs to be forwarded by an AP if STA B received the energy from A at
an SINR that allows it to decode the packet. In this case, the ack could
come from B to A directly and the AP could drop what it heard. No ack from
the real destination, then the AP forwards.

Though with 802.11ax, it's all centered around the AP for scheduling, i.e.
all transmits even on STAs are "triggered" and the AP sets the transmit
powers so uplink OFDMA works. I'm not sure how well 802.11ax plays with
mesh.

Just some more random thoughts and thanks again for the discussion.

Bob

On Mon, Feb 11, 2019 at 8:17 PM David P. Reed wrote:

> Bob - I was focusing on the standard Listen-Before-Talk CSMA MAC approach
> in the current 802.11 through the ac variant. 802.11ax is, of course, a
> whole different MAC layer, and its queuing management issues for IP
> end-to-end congestion management will be different.
>
> It seems likely to me that by being more oriented around AP-centered
> scheduling it will behave more like a single shared queue without fairness
> at the IP flow level. Since bufferbloat is essentially caused by queuing
> below the IP layer without providing timely feedback to the IP endpoints, I
> don't think 802.11ax can fix bufferbloat by itself. Some kind of "queued
> traffic on LAN reached threshold" message needs to be made available to the
> IP forwarding mechanism in each MAC STA and AP in 802.11ax in order to
> mitigate lag under load. And IP flow-level fairness needs to be
> implementable by the IP forwarding layer (ideally collectively among all
> sharing the up channel, which AP scheduling of the uplink can achieve,
> maybe).
>
> I was peripherally involved in the "self-cancellation" PHY research at MIT
> at one point, and I know of the other experiments out in the Bay Area. It
> does look promising, but the issue I was pointing out is still there: if
> STA A and STA B send to each other, the AP-STA links are still a shared
> queue that can transmit only one packet at a time. So while there is some
> simultaneity from the PHY level, there are still two uplink queues feeding
> into a single downlink queue in the paths between A and B when there is an
> AP involved as an intermediary. That's what I was pointing out as the
> complex queue-dynamics issue.
>
> Glad to hear that full duplex is being explored for commercial use now. It
> will certainly double the capacity, using the same airtime for twice as
> much traffic. I won't spend time pointing out that there are even better
> opportunities that get more than 2x for an N-node WLAN, by going to
> physical level repeaters. (but that's still all theory, not reduced to
> practice in labs, except a little bit).
>
> -----Original Message-----
> From: "Bob McMahon"
> Sent: Sunday, February 10, 2019 9:18pm
> To: "David P. Reed"
> Cc: "Jonathan Morton" , "Make-Wifi-fast" <make-wifi-fast@lists.bufferbloat.net>
> Subject: Re: [Make-wifi-fast] bloated ath10k
>
> Just a reminder with 802.11ax uplink OFDMA that STA A and B transmits can
> be "concurrent." Also, there are companies working on full duplex per
> self cancellation. Kumu networks is one
> http://kumunetworks.com/
> Bob
>
> On Mon, Feb 11, 2019 at 12:30 AM David P. Reed wrote:
>
>> Side note: between two stations talking through an AP, it's not half
>> duplex. It's kind-of quarter-duplex. Each packet between STA A and STA B
>> cannot be concurrent with the subsequent packet from A to B, and both
>> transmissions of a packet from B to A.
>>
>> That actually has a significant effect on "queue depth". Please don't
>> call it "interference", because no packet corruption happens.
>>
>> This is why merely fixing the queueing discipline in the AP alone doesn't
>> necessarily ameliorate bufferbloat. The queues in the STAs need to be
>> managed, too.
>>
>> You guys know that implicitly; I know I'm not telling you anything you
>> don't know. But this queueing needs to be managed in such a way that the
>> backoff in TCP is signaled. That is, packets need to be dropped or marked.
>> You can't fix this in the forwarding paths alone.
>>
>> -----Original Message-----
>> From: "Jonathan Morton"
>> Sent: Sunday, February 10, 2019 7:38am
>> To: "Adrian Popescu"
>> Cc: make-wifi-fast@lists.bufferbloat.net
>> Subject: Re: [Make-wifi-fast] bloated ath10k
>>
>> > On 10 Feb, 2019, at 2:24 pm, Adrian Popescu wrote:
>> >
>> > My attempts to use SQM and codel to reduce wifi bloat didn't seem to
>> > get me very far. 802.11ac seems more reliable and it seems to be more
>> > bloated. ath9k can go as low as 3-5 milliseconds. ath10k is usually in
>> > the 20-50 milliseconds range (or more, based on the number of
>> > stations). I usually test with a single client as I don't expect
>> > latency to improve with more clients.
>>
>> Some things are unavoidable when you move to a shared, half-duplex, noisy
>> medium like wifi versus a switched, full-duplex, error-free medium like
>> Ethernet.
>>
>> If you're getting 20-50ms under load, then I think things are working
>> quite well with wifi. We can wish for better, but not long ago you could be
>> looking at multiple seconds in bad cases. At the levels you're now seeing,
>> ordinary interactive protocols like DNS and HTTP will work reliably and
>> quickly, and even VoIP should be able to cope; you'll likely only really
>> notice a problem with online games.
>>
>> Ath9k has some advantages over ath10k in this area. Its MAC is managed at
>> a lower level by the driver, so we have much more control over it when
>> trying to debloat. I think we still have more control over ath10k than most
>> of the alternatives.
>>
>> If low latency is mission critical, though - just run a wire.
>>
>> - Jonathan Morton
>>
>> _______________________________________________
>> Make-wifi-fast mailing list
>> Make-wifi-fast@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/make-wifi-fast
>

--000000000000b503010581a22f2e
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi David,

Thanks for the discussion and for tolerating hearing what you already know.

Per Jaffe, TCP can't be optimized to achieve maximum network power (avg throughput / latency) because an end host (or TCP end) has only local knowledge. I think that's why things like HULL are setting the CE bit and why we have so many TCP control loops, none of which can determine the true "near congestion point." With wireless links, the STAs aren't limited to local knowledge because their PHYs are always listening to the medium. I think this helps chip fw towards the ability to detect the input rate as exceeding the possible service rate (as defined by "congestion" and/or "free spectrum" - whatever those terms mean, as neither is related to physics. ;) If this is the case, the chip fw can determine, and may be able to predict, with STA state when the "queued traffic on LAN reached threshold."
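
To make the "network power" ratio concrete, here is a minimal Python sketch. It assumes a single M/M/1 queue, which is a textbook simplification and not something claimed in this thread; the names `network_power` and `best_load` are hypothetical. Under that assumption, power peaks when the offered load is half the service rate, a point an end host cannot locate without knowing the service rate.

```python
def network_power(arrival_rate, service_rate):
    """Power = throughput / mean delay, for an assumed M/M/1 queue."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate")
    # Mean time in system for M/M/1 is 1 / (mu - lambda).
    mean_delay = 1.0 / (service_rate - arrival_rate)
    return arrival_rate / mean_delay


def best_load(service_rate, steps=100):
    """Scan evenly spaced offered loads and return the one with highest power."""
    candidates = [service_rate * i / steps for i in range(1, steps)]
    return max(candidates, key=lambda load: network_power(load, service_rate))


if __name__ == "__main__":
    mu = 100.0  # hypothetical service rate, packets per second
    print("load with maximum power:", best_load(mu))
```

The scan is deliberately brute force; the point is only that the optimum depends on the service rate, which is the non-local knowledge the paragraph above refers to.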

Per 802.11ax the AP will need to have virtual output queues to work with downlink OFDMA. Those transmits are truly "in parallel", which requires multiple queues on the AP. So I don't perceive the two queues feeding one queue "engineering phenomena" lasting forever.
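
As a rough illustration of per-station virtual output queues, the Python sketch below keeps one FIFO per STA and picks the head frame of several backlogged STAs for one parallel downlink transmission. It is only a sketch under stated assumptions: the class name, the rotation policy, and the `max_stations` parameter are hypothetical, and a real 802.11ax scheduler also has to choose resource units, rates, and transmit powers.

```python
from collections import OrderedDict, deque


class VirtualOutputQueues:
    """One FIFO per destination station (hypothetical illustration)."""

    def __init__(self):
        self._queues = OrderedDict()

    def enqueue(self, station, frame):
        self._queues.setdefault(station, deque()).append(frame)

    def next_batch(self, max_stations):
        """Take the head frame of up to max_stations backlogged stations."""
        batch = {}
        for station in list(self._queues):
            if len(batch) == max_stations:
                break
            queue = self._queues[station]
            batch[station] = queue.popleft()
            if queue:
                # Still backlogged: rotate to the back for rough fairness.
                self._queues.move_to_end(station)
            else:
                del self._queues[station]
        return batch


if __name__ == "__main__":
    voq = VirtualOutputQueues()
    voq.enqueue("A", "a1")
    voq.enqueue("A", "a2")
    voq.enqueue("B", "b1")
    print(voq.next_batch(max_stations=2))
```

With a single shared queue, a frame for a slow station at the head would block frames for every other station; separate queues are what let the AP fill one parallel transmission.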

I'm no mesh expert but it does seem suboptimal that a transmit from STA A to STA B needs to be forwarded by an AP if STA B received the energy from A at an SINR that allows it to decode the packet. In this case, the ack could come from B to A directly and the AP could drop what it heard. No ack from the real destination, then the AP forwards.

Though with 802.11ax, it's all centered around the AP for scheduling, i.e. all transmits even on STAs are "triggered" and the AP sets the transmit powers so uplink OFDMA works. I'm not sure how well 802.11ax plays with mesh.

Just some more random thoughts and thanks again for the discussion.
Bob

On Mon, Feb 11, 2019 at 8:17 PM David P. Reed <dpreed@deepplum.com> wrote:

Bob - I was focusing on the standard Listen-Before-Talk CSMA MAC approach in the current 802.11 through the ac variant. 802.11ax is, of course, a whole different MAC layer, and its queuing management issues for IP end-to-end congestion management will be different.

It seems likely to me that by being more oriented around AP-centered scheduling it will behave more like a single shared queue without fairness at the IP flow level. Since bufferbloat is essentially caused by queuing below the IP layer without providing timely feedback to the IP endpoints, I don't think 802.11ax can fix bufferbloat by itself. Some kind of "queued traffic on LAN reached threshold" message needs to be made available to the IP forwarding mechanism in each MAC STA and AP in 802.11ax in order to mitigate lag under load. And IP flow-level fairness needs to be implementable by the IP forwarding layer (ideally collectively among all sharing the up channel, which AP scheduling of the uplink can achieve, maybe).

I was peripherally involved in the "self-cancellation" PHY research at MIT at one point, and I know of the other experiments out in the Bay Area. It does look promising, but the issue I was pointing out is still there: if STA A and STA B send to each other, the AP-STA links are still a shared queue that can transmit only one packet at a time. So while there is some simultaneity from the PHY level, there are still two uplink queues feeding into a single downlink queue in the paths between A and B when there is an AP involved as an intermediary. That's what I was pointing out as the complex queue-dynamics issue.

Glad to hear that full duplex is being explored for commercial use now. It will certainly double the capacity, using the same airtime for twice as much traffic. I won't spend time pointing out that there are even better opportunities that get more than 2x for an N-node WLAN, by going to physical level repeaters. (but that's still all theory, not reduced to practice in labs, except a little bit).

-----Original Message-----
From: "Bob McMahon" <bob.mcmahon@broadcom.com>
Sent: Sunday, February 10, 2019 9:18pm
To: "David P. Reed" <dpreed@deepplum.com>
Cc: "Jonathan Morton" <chromatix99@gmail.com>, "Make-Wifi-fast" <make-wifi-fast@lists.bufferbloat.net>
Subject: Re: [Make-wifi-fast] bloated ath10k

Just a reminder with 802.11ax uplink OFDMA that STA A and B transmits can be "concurrent." Also, there are companies working on full duplex per self cancellation. Kumu networks is one: http://kumunetworks.com/
Bob

On Mon, Feb 11, 2019 at 12:30 AM David P. Reed <dpreed@deepplum.com> wrote:

Side note: between two stations talking through an AP, it's not half duplex. It's kind-of quarter-duplex. Each packet between STA A and STA B cannot be concurrent with the subsequent packet from A to B, and both transmissions of a packet from B to A.

That actually has a significant effect on "queue depth". Please don't call it "interference", because no packet corruption happens.
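
The "quarter-duplex" arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope model with loud assumptions that are not from the thread: every hop runs at the same rate, the medium is shared equally, and contention and protocol overhead are ignored. The function name and parameters are hypothetical.

```python
def per_direction_share(link_rate_mbps, transmissions_per_packet=2, directions=2):
    """Best-case rate for one direction of an A <-> B exchange via an AP.

    Each packet crosses the medium once per hop (STA -> AP, then AP -> STA),
    and none of those transmissions can overlap on a shared channel.
    """
    return link_rate_mbps / (transmissions_per_packet * directions)


if __name__ == "__main__":
    # With a hypothetical 100 Mbit/s link rate, each direction gets a quarter.
    print(per_direction_share(100.0))
```

Because every relayed packet occupies the medium twice, frames wait in both the STA queue and the AP queue, which is why the queue depth grows even though nothing is corrupted.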

This is why merely fixing the queueing discipline in the AP alone doesn't necessarily ameliorate bufferbloat. The queues in the STAs need to be managed, too.

You guys know that implicitly; I know I'm not telling you anything you don't know. But this queueing needs to be managed in such a way that the backoff in TCP is signaled. That is, packets need to be dropped or marked. You can't fix this in the forwarding paths alone.
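
A minimal Python sketch of "dropped or marked" follows. It is a simplified delay-threshold queue, not CoDel and not any driver's actual code: a packet that has waited longer than a target delay is CE-marked if its flow is ECN-capable and dropped otherwise, so the TCP sender sees a congestion signal either way. The class names and the 5 ms target are assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow: str
    ecn_capable: bool
    ce_marked: bool = False


class ThresholdQueue:
    """FIFO that signals congestion once queueing delay exceeds a target."""

    def __init__(self, target_delay_s=0.005):
        self.target_delay_s = target_delay_s
        self._items = deque()

    def enqueue(self, packet, now):
        self._items.append((now, packet))

    def dequeue(self, now):
        """Return the next packet to transmit, or None if nothing is left."""
        while self._items:
            enqueued_at, packet = self._items.popleft()
            if now - enqueued_at <= self.target_delay_s:
                return packet  # waited briefly: send unchanged
            if packet.ecn_capable:
                packet.ce_marked = True  # mark instead of dropping
                return packet
            # Not ECN-capable: drop it and look at the next packet.
        return None
```

The same logic would have to run wherever packets actually queue, including on the STAs, for the signal to reach the sender in time.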

-----Original Message-----
From: "Jonathan Morton" <chromatix99@gmail.com>
Sent: Sunday, February 10, 2019 7:38am
To: "Adrian Popescu" <adriannnpopescu@gmail.com>
Cc: make-wifi-fast@lists.bufferbloat.net
Subject: Re: [Make-wifi-fast] bloated ath10k

> On 10 Feb, 2019, at 2:24 pm, Adrian Popescu <adriannnpopescu@gmail.com> wrote:
>
> My attempts to use SQM and codel to reduce wifi bloat didn't seem to get me very far. 802.11ac seems more reliable and it seems to be more bloated. ath9k can go as low as 3-5 milliseconds. ath10k is usually in the 20-50 milliseconds range (or more, based on the number of stations). I usually test with a single client as I don't expect latency to improve with more clients.

Some things are unavoidable when you move to a shared, half-duplex, noisy medium like wifi versus a switched, full-duplex, error-free medium like Ethernet.

If you're getting 20-50ms under load, then I think things are working quite well with wifi. We can wish for better, but not long ago you could be looking at multiple seconds in bad cases. At the levels you're now seeing, ordinary interactive protocols like DNS and HTTP will work reliably and quickly, and even VoIP should be able to cope; you'll likely only really notice a problem with online games.

Ath9k has some advantages over ath10k in this area. Its MAC is managed at a lower level by the driver, so we have much more control over it when trying to debloat. I think we still have more control over ath10k than most of the alternatives.

If low latency is mission critical, though - just run a wire.

- Jonathan Morton

_______________________________________________
Make-wifi-fast mailing list
Make-wifi-fast@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/make-wifi-fast

--000000000000b503010581a22f2e--