From: dpreed@reed.com
Date: Mon, 9 Oct 2017 16:41:51 -0400 (EDT)
To: "Dave Taht" <dave.taht@gmail.com>
Cc: make-wifi-fast@lists.bufferbloat.net, "Johannes Berg"
Subject: Re: [Make-wifi-fast] less latency, more filling... for wifi

It's worth setting a stretch latency goal that is in principle achievable.

I get the sense that the wireless group obsesses over maximum channel utilization rather than excellent latency. This is where it's important to make latency the primary goal and utilization the secondary goal, rather than vice versa.

It's easy to get at this by observing that the minimum latency on the shared channel is achieved by round-robin scheduling of packets that are of sufficient size that per-packet overhead doesn't dominate.
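
A back-of-envelope sketch of that bound (all numbers here are illustrative assumptions, not figures from this mail): with N stations served round-robin, a new arrival waits at most N-1 turns, and each turn costs payload airtime plus per-packet overhead.

    # Worst-case channel-access latency under round-robin service.
    # All figures are illustrative assumptions, not measurements.
    def worst_case_latency_ms(n_stations, payload_us, overhead_us):
        """A new arrival waits at most the other N-1 stations' turns."""
        per_turn_us = payload_us + overhead_us
        return (n_stations - 1) * per_turn_us / 1000.0

    # 10 stations, 1500-byte frames at ~100 Mb/s of goodput (~120 us payload)
    # plus ~100 us of preamble/IFS/ACK overhead per turn:
    print(worst_case_latency_ms(10, 120, 100))  # ~2.0 ms
    # The same 10 stations sending 64-byte frames (~5 us payload) pay almost
    # the full overhead per turn, so per-packet overhead dominates:
    print(worst_case_latency_ms(10, 5, 100))    # ~0.9 ms
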

So only aggregate when there are few contenders for the channel, or the packets are quite small compared to the per-packet overhead. When there are more contenders, still aggregate small packets, but only those that are actually waiting. But large packets shouldn't be aggregated.
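
A minimal sketch of that policy as a decision function; the thresholds, the packet type, and the queue interface are my own assumptions for illustration, not any driver's API.

    # Sketch of the aggregation policy above; thresholds and types are assumed.
    from collections import namedtuple

    SMALL_PKT_BYTES = 256   # "small" relative to per-packet overhead (assumed)
    FEW_CONTENDERS = 2      # "few" stations contending for the channel (assumed)

    def packets_for_txop(queued, n_contenders):
        """Pick which already-waiting packets to aggregate into one burst."""
        if n_contenders <= FEW_CONTENDERS:
            return queued                        # light contention: aggregate freely
        small = [p for p in queued if p.size <= SMALL_PKT_BYTES]
        if small:
            return small                         # aggregate only small waiting packets
        return queued[:1]                        # large packets go out unaggregated

    # Example with a tiny stand-in packet type:
    Pkt = namedtuple("Pkt", "size")
    print(packets_for_txop([Pkt(1500), Pkt(64), Pkt(200)], n_contenders=5))
    # -> [Pkt(size=64), Pkt(size=200)]
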

Multicast should be avoided by higher-level protocols for the most part, and the latency of multicast should be a non-issue. In wireless, it's kind of a dumb idea anyway, given that stations have widely varying propagation characteristics. Do just enough to support DHCP and so forth.

It's so much fun for the hardware designers to throw in features that only help in marketing benchmarks (like gaining a few percent of throughput in lab conditions that never happen in the field) that it is tempting for OS driver writers to use those features (like deep queues and offload-processing bells and whistles). But the real issue to be solved is the turn-taking "bloat" that comes from trying too hard to aggregate, to handle the "sole transmitter to dedicated receiver" case, etc.

I use 10 GigE in my house. I don't use it because I want to do 10 Gig file transfers all day and measure them. I use it because (properly managed) it gives me *low latency*. That low latency is what matters, not throughput. My average load, if spread out across 24 hours, could be handled by 802.11b for the entire house.

We are soon going to have 802.11ax in the home. That's approximately 10 Gb/sec, but wireless. No TV streaming can fill it. It's not for continuous isochronous traffic at all.
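
For scale (the stream bitrates here are my own rough assumptions, not figures from this mail):

    # Fraction of a ~10 Gb/s link that typical streams occupy (assumed bitrates).
    link_mbps = 10_000
    for name, mbps in {"HD stream": 8, "4K stream": 25}.items():
        print(f"{name}: {100 * mbps / link_mbps:.2f}% of the link")
    # HD stream: 0.08% of the link
    # 4K stream: 0.25% of the link
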

What it is for is *low latency*. So if the adapters and the drivers won't give me that low latency, what good is 10 Gb/sec at all? This is true for 802.11ac as well.

We aren't building dragsters fueled with nitro, able to run down a quarter mile of track but unable to steer.

Instead, we want to be able to connect musical instruments in an electronic symphony, where timing is everything.

On Monday, October 9, 2017 4:13pm, "Dave Taht" <dave.taht@gmail.com> said:

> There were five ideas I'd wanted to pursue at some point. I'm not
> presently on linux-wireless, nor do I have time to pay attention right
> now - but I'm enjoying that thread passively.
>
> To get those ideas "out there" again:
>
> * adding a fixed-length fq'd queue for multicast.
>
> * Reducing retransmits at low rates
>
> See the recent paper:
>
> "Resolving Bufferbloat in TCP Communication over IEEE 802.11n WLAN by
> Reducing MAC Retransmission Limit at Low Data Rate" (I'd paste a link
> but for some reason that doesn't work well)
>
> Even with their simple bi-modal model it worked pretty well.
>
> It also reduces contention with "bad" stations more automagically.
>
> * Less buffering at the driver.
>
> Presently (ath9k) there are two or three aggregates stacked up at the driver.
>
> With a good estimate for how long it will take to service one, forming
> another within that deadline seems feasible, so you only need to have
> one in the hardware itself.
>
> Simple example: you have data in the hardware projected to take a
> minimum of 4ms to transmit. Don't form a new aggregate and submit it
> to the hardware for 3.5ms.
>
> I know full well that a "good" estimate is hard, and things like
> mu-mimo complicate things. Still, I'd like to get below 20ms of
> latency within the driver, and this is one way to get there.
>
> * Reducing the size of a txop under contention
>
> If you have 5 stations getting blasted away at 5ms each, and one that
> only wants 1ms worth of traffic, "soon", temporarily reducing the size
> of the txop for everybody so you can service more stations faster
> seems useful.
>
> * Merging ACs when sane to do so
>
> Sane aggregation in general works better than prioritizing does, as
> shown in "Ending the Anomaly".
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast
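
A minimal sketch of the driver-buffering deadline idea in the quoted mail (the 4 ms / 3.5 ms example), written as a standalone helper; the 0.5 ms guard value and the function name are my own assumptions, not anything from ath9k.

    # Hold back the next aggregate until just before the hardware runs dry.
    # The guard interval is an assumed tuning knob, not an ath9k parameter.
    def next_aggregate_delay_ms(hw_busy_ms, guard_ms=0.5):
        """Time to wait before forming/submitting the next aggregate.

        hw_busy_ms: estimated airtime (ms) already queued in the hardware.
        Returns 0 when the next aggregate should be built right away.
        """
        return max(0.0, hw_busy_ms - guard_ms)

    print(next_aggregate_delay_ms(4.0))  # 3.5 -- the example in the quoted mail
    print(next_aggregate_delay_ms(0.2))  # 0.0 -- hardware nearly idle, build now
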