[Make-wifi-fast] emulating wifi better - coupling qdiscs in netem?
Pete Heist
pete at heistp.net
Sat Jun 16 18:53:03 EDT 2018
> On Jun 16, 2018, at 12:30 AM, Dave Taht <dave.taht at gmail.com> wrote:
>
> Eric just suggested using the iptables NFQUEUE ability to toss
> packets to userspace.
>
> https://home.regit.org/netfilter-en/using-nfqueue-and-libnetfilter_queue/
> For wifi, at least, timings are not hugely critical, a few hundred
> usec is something userspace can handle reasonably accurately. I like
> very much being able to separate out mcast and treat that correctly in
> userspace, also. I did want to be below 10usec (wifi "bus"
> arbitration), which I am dubious about....
>
> Now as for an implementation language? C++ C? Go? Python? The
> condition of the wrapper library for go leaves a bit to be desired
> ( https://github.com/chifflier/nfqueue-go ) and given a choice I'd
> MUCH rather use a go than a C.
This sounds cool... So for fun, I compared ping and iperf3 with no-op nfqueue callbacks in both C and Go. As for the hardware setup, I used two lxc containers (effectively just veth) on an APU2.
For the Go program, I used test_nfqueue from the wrapper above (which yes, does need some work) and removed debugging / logging.
For the C program I used this:
https://github.com/irontec/netfilter-nfqueue-samples/blob/master/sample-helloworld.c
I removed any per-packet printf calls and compiled with "gcc sample-helloworld.c -o nfq -lnfnetlink -lnetfilter_queue".
Ping results:
ping without nfqueue:
root@lsrv:~# iptables -F OUTPUT
root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7985ms
rtt min/avg/max/mdev = 0.056/0.058/0.185/0.011 ms
ping with no-op nfqueue callback in C:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~/nfqueue# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7981ms
rtt min/avg/max/mdev = 0.117/0.123/0.384/0.020 ms
ping with no-op nfqueue callback in Go:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7982ms
rtt min/avg/max/mdev = 0.095/0.172/0.532/0.042 ms
The mean induced latency of 65us for C or 114us for Go might be within your parameters, except you mentioned 10us for WiFi bus arbitration, which does indeed look impossible with this setup, even in C.
Iperf3 results:
iperf3 without nfqueue:
root@lsrv:~# iptables -F OUTPUT
root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55810 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   452 MBytes  3.79 Gbits/sec    0    178 KBytes
[  4]   1.00-2.00   sec   454 MBytes  3.82 Gbits/sec    0    320 KBytes
[  4]   2.00-3.00   sec   450 MBytes  3.77 Gbits/sec    0    320 KBytes
[  4]   3.00-4.00   sec   451 MBytes  3.79 Gbits/sec    0    352 KBytes
[  4]   4.00-5.00   sec   451 MBytes  3.79 Gbits/sec    0    352 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  2.21 GBytes  3.79 Gbits/sec    0             sender
[  4]   0.00-5.00   sec  2.21 GBytes  3.79 Gbits/sec                  receiver
iperf Done.
iperf3 with no-op nfqueue callback in C:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~/nfqueue# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55868 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  17.4 MBytes   146 Mbits/sec    0    107 KBytes
[  4]   1.00-2.00   sec  16.9 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   2.00-3.00   sec  17.0 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   3.00-4.00   sec  17.0 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   4.00-5.00   sec  17.0 MBytes   143 Mbits/sec    0    115 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  85.3 MBytes   143 Mbits/sec    0             sender
[  4]   0.00-5.00   sec  84.7 MBytes   142 Mbits/sec                  receiver
iperf3 with no-op nfqueue callback in Go:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55864 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  14.6 MBytes   122 Mbits/sec    0   96.2 KBytes
[  4]   1.00-2.00   sec  14.1 MBytes   118 Mbits/sec    0   96.2 KBytes
[  4]   2.00-3.00   sec  14.0 MBytes   118 Mbits/sec    0    102 KBytes
[  4]   3.00-4.00   sec  14.0 MBytes   117 Mbits/sec    0    102 KBytes
[  4]   4.00-5.00   sec  13.7 MBytes   115 Mbits/sec    0    107 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  70.5 MBytes   118 Mbits/sec    0             sender
[  4]   0.00-5.00   sec  69.9 MBytes   117 Mbits/sec                  receiver
iperf Done.
So rats, throughput gets brutalized for both C and Go. For Go, a rate of 117 Mbit/sec with 1500-byte packets is 9750 packets/sec, which is about 103us per packet. Mean induced latency measured by ping is 114us, which is not far off 103us, so the rate slowdown looks to be mostly caused by the per-packet nfqueue calls. The core running test_nfqueue is pinned at 100% during the test, and "nice -n -20" does nothing.
Presumably you’ll sometimes be releasing more than one packet at a time(?) so I guess whether or not this is workable depends on how many you release at once, what hardware you’re on and what rates you need to test at. But when you’re trying to test a qdisc, I guess you’d want to minimize the burden you add to the CPU, or else move it to a core the qdisc isn’t running on, or something, so the qdisc itself isn’t affected by the test rig.
> There is of course a hideous amount of complexity moved to the daemon,
I can only imagine.
> as a pure fifo ap queue forms aggregates much differently
> than a fq_codeled one. But, yea! userspace....
This would be awesome if it works out! After that iperf3 test though, I think I may have smashed my dreams of writing a libnetfilter_queue userspace qdisc in Go, or C for that matter.
If this does somehow turn out to be good enough performance-wise, I think you’d have a lot more fun and spend a lot less time on it in Go than C, but that’s just an opinion... :)