Let's make wifi fast again!
* [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
@ 2020-09-18 16:05 Dave Taht
  2020-09-18 20:58 ` Jonathan Foulkes
  2020-09-19  7:33 ` Bob McMahon
  0 siblings, 2 replies; 6+ messages in thread
From: Dave Taht @ 2020-09-18 16:05 UTC (permalink / raw)
  To: Make-Wifi-fast

I recently had cause to go review the original make-wifi-fast project
plan ( https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit)

(and related presentation:
https://www.youtube.com/watch?v=Rb-UnHDw02o&t=25m30s had the fun bit)

I'm glad that since that time ATF and mesh networking became
realities, fq_codel and per station queuing gained support in various
products, and AQL started to work on ath10k, but I'm pretty sure
things in that document like rate and power aware scheduling
(minstrel-bluse), excessive counter based hw retries, and other
problems we identified back then are still problems, not to mention
the recent ofdma work....

I have been observing pretty bad behavior with a lot of 802.11ac
access points around (recently one that
went 4 Mbit/s over 40 feet through glass outdoors, but 600 Mbit/s
indoors at 10 feet), but have nothing but guesses as to the causes.
Infinite retries? Everything on 160 MHz wide channels?

Has there been any good news or good tools lately?

I pulled my ax200s out of the box and was going to see if there was
any progress there.

-- 
"For a successful technology, reality must take precedence over public
relations, for Mother Nature cannot be fooled" - Richard Feynman

dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
  2020-09-18 16:05 [Make-wifi-fast] Make - fast - wifi project plan review from 2014 Dave Taht
@ 2020-09-18 20:58 ` Jonathan Foulkes
  2020-09-19  7:33 ` Bob McMahon
  1 sibling, 0 replies; 6+ messages in thread
From: Jonathan Foulkes @ 2020-09-18 20:58 UTC (permalink / raw)
  To: Dave Taht; +Cc: Make-Wifi-fast

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]

Dave, You are not the only one noticing widespread wifi issues, and I’m beginning to think something is amiss in the core 802.11 or related code, as ever since 18.06 (and up to the recently released 19.07.4) I’m observing issues with sensitivity to congestion, drastic decreases in throughput at distance, and on MT76, very frequent ‘drops’ of network connectivity, yet stations remain enrolled and show good signal levels. Sometimes it comes back in 10 seconds, other times minutes. And often, a restart is the only answer.

LEDE 17.x had very good Wifi on the Archer C7, then 18.x and 19.x are nowhere near as good, especially on 2.4.

Current 5.x master looks better on MT76, but I still see some issues.

A shame, as when the units are operating and one is reasonably within range of 5GHz, the throughput and latency are quite good.

Are there any guides on how to gather relevant stats to help document these issues so the devs can zero in on the problems?

Cheers,

Jonathan

> On Sep 18, 2020, at 12:05 PM, Dave Taht <dave.taht@gmail.com> wrote:
> 
> I have been observing pretty bad behavior with a lot of 802.11ac
> access points around, 




* Re: [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
  2020-09-18 16:05 [Make-wifi-fast] Make - fast - wifi project plan review from 2014 Dave Taht
  2020-09-18 20:58 ` Jonathan Foulkes
@ 2020-09-19  7:33 ` Bob McMahon
  2020-09-19 10:40   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 6+ messages in thread
From: Bob McMahon @ 2020-09-19  7:33 UTC (permalink / raw)
  To: Dave Taht; +Cc: Make-Wifi-fast


[-- Attachment #1.1: Type: text/plain, Size: 11608 bytes --]

On the tools, iperf 2.0.14 is going through a lot of development.  My hope
is to have the code done soon so it can be tested internally at Broadcom.
We're testing with everything from WiFi to 100G NICs, and thousands of
parallel threads. I've been able to find time for this refactoring thanks
to COVID-19 stay-at-home work.

What I think the industry should move to is measuring both throughput and
latency in a direct manner.  2.0.14 also supports full duplex traffic (as
well as --reverse). TCP server output shows the following (these are 10G
NICs):

[rjmcmahon@localhost iperf2-code]$ src/iperf -s -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.10%enp2s0 port 5001 connected with 192.168.1.80 port
47420 (trip-times) (MSS=1448) (peer 2.0.14-alpha)
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=16.0K)
  Burst Latency avg/min/max/stdev (cnt/size) inP NetPwr
[  4] 0.00-1.00 sec  1.09 GBytes  9.34 Gbits/sec  18733
 2469:2552:2753:2456:2230:2272:1859:2142     2.988/ 0.971/ 3.668/ 0.370 ms
(8908/131072) 3.34 MByte 390759.84
[  4] 1.00-2.00 sec  1.10 GBytes  9.42 Gbits/sec  19844
 2690:2984:3211:2858:2255:2039:1893:1914     3.000/ 2.320/ 3.704/ 0.346 ms
(8979/131073) 3.37 MByte 392263.52
[  4] 2.00-3.00 sec  1.10 GBytes  9.41 Gbits/sec  18897
 2458:2668:2764:2412:2216:2300:2019:2060     3.003/ 2.310/ 3.665/ 0.347 ms
(8978/131070) 3.37 MByte 391878.92
[  4] 3.00-4.00 sec  1.10 GBytes  9.42 Gbits/sec  18389
 2339:2542:2443:2268:2211:2232:2144:2210     3.009/ 2.315/ 3.659/ 0.347 ms
(8979/131073) 3.38 MByte 391101.00
[  4] 4.00-5.00 sec  1.10 GBytes  9.41 Gbits/sec  19468
 2588:2889:3017:2623:2250:2221:1947:1933     2.971/ 2.259/ 3.671/ 0.364 ms
(8979/131069) 3.33 MByte 396075.85
[  4] 5.00-6.00 sec  1.10 GBytes  9.41 Gbits/sec  18547
 2357:2596:2582:2344:2170:2192:2104:2202     2.971/ 2.276/ 3.699/ 0.365 ms
(8978/131072) 3.34 MByte 396149.20
[  4] 6.00-7.00 sec  1.10 GBytes  9.42 Gbits/sec  18479
 2363:2598:2430:2332:2234:2184:2155:2183     2.976/ 2.279/ 3.667/ 0.363 ms
(8978/131084) 3.34 MByte 395486.89
[  4] 7.00-8.00 sec  1.10 GBytes  9.42 Gbits/sec  18506
 2387:2549:2519:2339:2229:2183:2060:2240     2.971/ 2.266/ 3.667/ 0.365 ms
(8979/131071) 3.33 MByte 396155.84
[  4] 8.00-9.00 sec  1.10 GBytes  9.41 Gbits/sec  18732
 2398:2640:2750:2352:2113:2286:2030:2163     2.973/ 2.271/ 3.691/ 0.364 ms
(8979/131059) 3.34 MByte 395780.90
[  4] 9.00-10.00 sec  1.10 GBytes  9.41 Gbits/sec  19585
 2659:2901:3073:2619:2285:2221:1854:1973     2.976/ 2.264/ 3.666/ 0.361 ms
(8978/131081) 3.34 MByte 395467.57
[  4] 10.00-10.00 sec  3.17 MBytes  9.51 Gbits/sec  51    0:6:20:0:0:19:6:0
    3.112/ 2.410/ 3.609/ 0.406 ms (26/127692) 2.92 MByte 381912.79
[  4] 0.00-10.00 sec  11.0 GBytes  9.41 Gbits/sec  189231
 24708:26925:27562:24603:22193:22149:20071:21020     2.983/ 0.971/ 3.704/
0.360 ms (89741/131072) 3.35 MByte 394144.05

Some bidir output looks like:

[rjmcmahon@localhost iperf2-code]$ src/iperf -c 192.168.1.10 --trip-times
--bidir
------------------------------------------------------------
Client connecting to 192.168.1.10, TCP port 5001 with pid 4322 (1 flows)
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.80%enp2s0 port 47928 connected with 192.168.1.10 port
5001 (bidir) (trip-times) (MSS=1448) (ct=0.37 ms)
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry
Cwnd/RTT        NetPwr
[  3] 0.00-10.00 sec  10.9 GBytes  9.35 Gbits/sec  89183/0          0
3021K/2079 us  562251.48
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=16.0K)
  Burst Latency avg/min/max/stdev (cnt/size) inP NetPwr
[  3] 0.00-10.00 sec  10.9 GBytes  9.39 Gbits/sec  174319
 21097:23110:24661:21619:18723:17600:13153:34356     2.664/ 1.045/ 6.521/
0.235 ms (89550/131072) 2.98 MByte 440455.93
[ ID] Interval       Transfer     Bandwidth
[FD3] 0.00-10.00 sec  21.8 GBytes  18.7 Gbits/sec


Man page notes:

NOTES
       Numeric options: Some numeric options support format characters per
       '<value>c' (e.g. 10M) where the c format characters are k,m,g,K,M,G.
       Lowercase format characters are 10^3 based and uppercase are 2^n
       based, e.g. 1k = 1000, 1K = 1024, 1m = 1,000,000 and 1M = 1,048,576.
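That suffix rule can be sketched as a small parser (a hypothetical
post-processing helper for illustration, not code from iperf itself):

```python
# Parse iperf-style value suffixes: lowercase suffixes are powers of 10,
# uppercase suffixes are powers of 2, per the man page note above.
SUFFIXES = {
    'k': 10**3, 'm': 10**6, 'g': 10**9,
    'K': 2**10, 'M': 2**20, 'G': 2**30,
}

def parse_value(text):
    """Convert e.g. '10M' -> 10485760 and '10m' -> 10000000."""
    if text and text[-1] in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)
```

So parse_value('1k') gives 1000 while parse_value('1K') gives 1024,
matching the lowercase/uppercase distinction described above.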

       Rate limiting: The -b option supports read and write rate limiting
       at the application level. The -b option on the client also supports
       variable offered loads through the <mean>,<standard deviation>
       format, e.g. -b 100m,10m. The distribution used is log normal.
       Similar for the isochronous option. The -b on the server rate limits
       the reads. Socket based pacing is also supported using the --fq-rate
       long option. This will work with the --reverse and --bidir options
       as well.
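A sketch of what sampling such a variable offered load might look like.
The man page only says the distribution is log normal; the moment-matching
parameterization below (so the arithmetic mean and standard deviation come
out as requested) is my assumption, not necessarily what iperf does
internally:

```python
import math
import random

def lognormal_rate(mean, stdev, rng=random):
    """Sample an offered load (bits/sec) from a log-normal distribution
    whose arithmetic mean and standard deviation match the requested
    values, e.g. -b 100m,10m -> mean=100e6, stdev=10e6."""
    # Moment matching: choose mu, sigma so E[X] = mean, Std[X] = stdev.
    sigma2 = math.log(1.0 + (stdev / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return rng.lognormvariate(mu, math.sqrt(sigma2))
```

Averaged over many samples, lognormal_rate(100e6, 10e6) comes out near
100 Mbit/s, as -b 100m,10m would request.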

       Synchronized clocks: The --trip-times option indicates that the
       client's and server's clocks are synchronized to a common reference.
       Network Time Protocol (NTP) or Precision Time Protocol (PTP) are
       commonly used for this. The reference clock(s) error and the
       synchronization protocols will affect the accuracy of any end to end
       latency measurements.

       Binding is done at the logical level (ip address or layer 3) using
       the -B option and at the device (or layer 2) level using the percent
       (%) separator for both the client and the server. On the client, the
       -B option affects the bind(2) system call, and will set the source
       ip address and the source port, e.g. iperf -c <host> -B
       192.168.100.2:6002. This controls the packet's source values but not
       routing. These can be confusing in that a route or device lookup may
       not be that of the device with the configured source IP. So, for
       example, if the IP address of eth0 is used for -B and the routing
       table for the destination IP address resolves the output interface
       to be eth1, then the host will send the packet out device eth1 while
       using the source IP address of eth0 in the packet. To affect the
       physical output interface (e.g. dual homed systems) either use -c
       <host>%<dev> (requires root), which bypasses this host route table
       lookup, or configure policy routing per each -B source address and
       set the output interface appropriately in the policy routes. On the
       server or receive side, only packets destined to the -B IP address
       will be received. It's also useful for multicast. For example, iperf
       -s -B 224.0.0.1%eth0 will only accept ip multicast packets with dest
       ip 224.0.0.1 that are received on the eth0 interface, while iperf -s
       -B 224.0.0.1 will receive those packets on any interface. Finally,
       the device specifier is required for v6 link-local, e.g. -c
       [v6addr]%<dev> -V, to select the output interface.

       Reverse and bidirectional traffic: The --reverse (-R) and --bidir
       options can be confusing when compared to the legacy options of -r
       and -d. It's suggested to use --reverse if you want to test through
       a NAT firewall (or -R on non-windows systems). The --reverse option
       applies role reversal of the test after opening the full duplex
       socket. The legacy -d and -r options remain supported for
       compatibility reasons; these open new sockets in the opposite
       direction rather than treating the originating socket as full
       duplex. Firewall piercing is typically required to use -d and -r if
       a NAT gateway is in the path. That's part of the reason it's highly
       encouraged to use the newer --reverse and --bidir and to deprecate
       the use of the -r and -d options.

       Also, the --reverse -b <rate> setting behaves differently for TCP
       and UDP. For TCP it will rate limit the read side, i.e. the iperf
       client (role reversed to act as a server) reading from the full
       duplex socket. This will in turn flow control the reverse traffic
       per standard TCP congestion control. For UDP, the --reverse -b
       <rate> will be applied on transmit (i.e. the server role reversed to
       act as a client) since there is no flow control with UDP. There is
       no option to directly rate limit the writes with TCP testing when
       using --reverse.

       TCP connect times: The TCP connect time (or three way handshake)
       can be seen on the iperf client when the -e (--enhancedreports)
       option is set. Look for the ct=<value> in the connected message,
       e.g. in '[ 3] local 192.168.1.4 port 48736 connected with
       192.168.1.1 port 5001 (ct=1.84 ms)' the 3WHS took 1.84 milliseconds.
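Pulling that handshake time out of a log can be done with a one-line
pattern (a hypothetical post-processing helper, not an iperf feature):

```python
import re

def connect_time_ms(line):
    """Extract the TCP 3WHS time (ms) from an iperf -e connected message,
    or return None if the line carries no ct= field."""
    m = re.search(r'\(ct=([\d.]+) ms\)', line)
    return float(m.group(1)) if m else None

line = ("[  3] local 192.168.1.4 port 48736 connected "
        "with 192.168.1.1 port 5001 (ct=1.84 ms)")
print(connect_time_ms(line))  # 1.84
```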

       Little's Law in queueing theory is a theorem that determines the
       average number of items (L) in a stationary queuing system based on
       the average waiting time (W) of an item within the system and the
       average number of items arriving at the system per unit of time
       (lambda). Mathematically, it's L = lambda * W. As used here, the
       units are bytes. The arrival rate is taken from the writes.
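As a worked example, the inP column in the server output above can be
reproduced from Little's Law using the final summary line's figures (small
differences come from rounding in the printed throughput):

```python
# Little's Law: L = lambda * W, in bytes.
# lambda = arrival rate in bytes/sec (from the measured throughput),
# W = average burst latency in seconds.
throughput_bits = 9.41e9   # 9.41 Gbits/sec from the summary line
latency_s = 2.983e-3       # 2.983 ms average burst latency

bytes_in_progress = (throughput_bits / 8) * latency_s
print(round(bytes_in_progress / 2**20, 2))  # 3.35, matching the inP column
```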

       Network power: The network power (NetPwr) metric is experimental.
       It's a convenience function defined as throughput/delay. For TCP
       transmits, the delay is the sampled RTT times. For TCP receives, the
       delay is the write to read latency. For UDP the delay is the end/end
       latency. Don't confuse this with the physics definition of power
       (delta energy/delta time); it's more of a measure of a desirable
       property divided by an undesirable property. Also note, one must use
       -i interval with TCP to get this as that's what sets the RTT
       sampling rate. The metric is scaled to assist with human
       readability.
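Working through the summary line above, NetPwr appears to be throughput
(bytes/sec) divided by delay with a readability scaling; the 1e-6 scale
factor below is inferred from the printed numbers, not documented:

```python
# NetPwr ~= throughput / delay, scaled for readability.
# The 1e-6 scaling is an inference from the sample output, not documented.
throughput_bits = 9.41e9   # 9.41 Gbits/sec from the summary line
delay_s = 2.983e-3         # 2.983 ms average burst latency

net_pwr = (throughput_bits / 8) / delay_s * 1e-6
print(net_pwr)  # within ~0.1% of the printed 394144.05
```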

       Fast Sampling: Use ./configure --enable-fastsampling and then
compile from source to enable four digit (e.g. 1.0000) precision in
reports' timestamps. Useful for sub-millisecond sampling.

Bob


On Fri, Sep 18, 2020 at 9:05 AM Dave Taht <dave.taht@gmail.com> wrote:

> I recently had cause to go review the original make-wifi-fast project
> plan (
> https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit
> )
>
> (and related presentation:
> https://www.youtube.com/watch?v=Rb-UnHDw02o&t=25m30s had the fun bit)
>
> I'm glad that since that time ATF and mesh networking became
> realities, fq_codel and per station queuing gained support in various
> products, and AQL started to work on ath10k, but I'm pretty sure
> things in that document like rate and power aware scheduling
> (minstrel-bluse), excessive counter based hw retries, and other
> problems we identified back then are still problems, not to mention
> the recent ofdma work....
>
> I have been observing pretty bad behavior with a lot of 802.11ac
> access points around (recently one that
> went 4 Mbit/s over 40 feet through glass outdoors, but 600 Mbit/s
> indoors at 10 feet), but have nothing but guesses as to the causes.
> Infinite retries? Everything on 160 MHz wide channels?
>
> Has there been any good news or good tools lately?
>
> I pulled my ax200s out of the box and was going to see if there was
> any progress there.
>
> --
> "For a successful technology, reality must take precedence over public
> relations, for Mother Nature cannot be fooled" - Richard Feynman
>
> dave@taht.net <Dave Täht> CTO, TekLibre, LLC Tel: 1-831-435-0729
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast



* Re: [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
  2020-09-19  7:33 ` Bob McMahon
@ 2020-09-19 10:40   ` Toke Høiland-Jørgensen
  2020-09-19 18:11     ` Bob McMahon
  0 siblings, 1 reply; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-09-19 10:40 UTC (permalink / raw)
  To: Bob McMahon, Dave Taht; +Cc: Make-Wifi-fast

Bob McMahon via Make-wifi-fast <make-wifi-fast@lists.bufferbloat.net>
writes:

> On the tools, iperf 2.0.14 is going through a lot of development.  My hope
> is to have the code done soon so it can be tested internally at Broadcom.
> We're testing with WiFi , to 100G NICs and thousands of parallel threads.
> I've been able to find time for this refactoring per COVID-19 stay at home
> work.

Any updates on my wishlist for Flent support on-par with netperf[0]? :)

-Toke

[0] https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-January/002648.html


* Re: [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
  2020-09-19 10:40   ` Toke Høiland-Jørgensen
@ 2020-09-19 18:11     ` Bob McMahon
  0 siblings, 0 replies; 6+ messages in thread
From: Bob McMahon @ 2020-09-19 18:11 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Dave Taht, Make-Wifi-fast


[-- Attachment #1.1: Type: text/plain, Size: 752 bytes --]

Sorry, no movement on flent and iperf 2.0.14 by me.

Bob

On Sat, Sep 19, 2020 at 3:40 AM Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> Bob McMahon via Make-wifi-fast <make-wifi-fast@lists.bufferbloat.net>
> writes:
>
> > On the tools, iperf 2.0.14 is going through a lot of development.  My
> hope
> > is to have the code done soon so it can be tested internally at Broadcom.
> > We're testing with WiFi , to 100G NICs and thousands of parallel threads.
> > I've been able to find time for this refactoring per COVID-19 stay at
> home
> > work.
>
> Any updates on my wishlist for Flent support on-par with netperf[0]? :)
>
> -Toke
>
> [0]
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-January/002648.html
>



* Re: [Make-wifi-fast] Make - fast - wifi project plan review from 2014....
       [not found] <mailman.0.1600531202.4968.make-wifi-fast@lists.bufferbloat.net>
@ 2020-09-25  7:24 ` Jon Pike
  0 siblings, 0 replies; 6+ messages in thread
From: Jon Pike @ 2020-09-25  7:24 UTC (permalink / raw)
  To: make-wifi-fast

[-- Attachment #1: Type: text/plain, Size: 2319 bytes --]

>
>
> ---------- Forwarded message ----------
> From: Jonathan Foulkes <jf@jonathanfoulkes.com>
> To: Dave Taht <dave.taht@gmail.com>
> Cc: Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>
> Bcc:
> Date: Fri, 18 Sep 2020 16:58:32 -0400
> Subject: Re: [Make-wifi-fast] Make - fast - wifi project plan review from
> 2014....
> Dave, You are not the only one noticing widespread wifi issues, and I’m
> beginning to think something is amiss in the core 802.11 or related code, as
> ever since 18.06 (and up to recently released 19.07.4) I’m observing issues
> with sensitivity to congestion, drastic decreases in throughput at
> distance, and on MT76, very frequent ‘drops’ of network connectivity, yet
> stations remain enrolled and show good signal levels. Sometimes it comes
> back in 10 seconds, other times minutes. And often, a restart is the only
> answer.
>
> LEDE 17.x had very good Wifi on the Archer C7, then 18.x and 19.x are
> nowhere near as good, especially on 2.4.
>
> Current 5.x Master looks better on Mt76, but still see some issues.
>
> A shame, as when the units are operating, and one is reasonably within
> range of 5Ghz, the throughput and low-latencies are quite good.
>
> Are there any guides on how to gather relevant stats to help document
> these issues so the devs can zero in on the problems?
>
> Cheers,
>
> Jonathan
>


You definitely aren't among the few.  It's been a continuing issue I've
been having, just as you describe.  It gets mentioned occasionally on the
OpenWrt forums... but never seems to get a critical mass of attention.  I
and others have started threads that have faded out over time.

Now it seems a general notion is developing that "C7s just aren't good for
2.4GHz", hiding the fact that they WERE, and now have an issue.

Anyway, even though I'm mostly a lurking layperson... I have lots of
collected logs over time, and lots of observational data on this issue.
Not much in the regular logs, though.  A while back it was more the 5GHz
radio dropping, but in the last several months to a year it's been almost
always the 2.4GHz radio, similar to what you are describing.  I would like
very much to know how to find out the necessary info so that the devs can
get a handle on it.
Count me in...



Thread overview: 6+ messages
2020-09-18 16:05 [Make-wifi-fast] Make - fast - wifi project plan review from 2014 Dave Taht
2020-09-18 20:58 ` Jonathan Foulkes
2020-09-19  7:33 ` Bob McMahon
2020-09-19 10:40   ` Toke Høiland-Jørgensen
2020-09-19 18:11     ` Bob McMahon
     [not found] <mailman.0.1600531202.4968.make-wifi-fast@lists.bufferbloat.net>
2020-09-25  7:24 ` Jon Pike
