From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Pete Heist
Cc: Dave Taht, Make-Wifi-fast, brouer@redhat.com, Florian Westphal, Marek Majkowski
Date: Sun, 17 Jun 2018 13:19:21 +0200
Subject: Re: [Make-wifi-fast] emulating wifi better - coupling qdiscs in netem?
Message-ID: <20180617131921.09bf5353@redhat.com>
In-Reply-To: <150ABF21-FAFC-48E2-9E55-CAA609EAE449@heistp.net>
References: <1527721073.171416827@apps.rackspace.com> <150ABF21-FAFC-48E2-9E55-CAA609EAE449@heistp.net>

Hi Pete,

I happened to be at the Netfilter Workshop, where I discussed nfqueue
with Florian and Marek. I saw this attempt to use nfqueue, and Florian
points out that you are not using the GSO facility of nfqueue. I'll
quote what Florian said below:

On Sun, 17 Jun 2018 12:45:52 +0200 Florian Westphal wrote:

> The linked example code is old and does not set
>   mnl_attr_put_u32(nlh, NFQA_CFG_FLAGS, htonl(NFQA_CFG_F_GSO));
> when requesting the queue.
>
> This means the kernel has to do software segmentation of GSO skbs.
>
> Consider using
>   https://git.netfilter.org/libnetfilter_queue/tree/examples/nf-queue.c
> instead if you need a template; it does this correctly.
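To make that concrete, requesting the queue with the GSO flag set looks
roughly like this with the libmnl-based API (condensed from the
nf-queue.c example Florian points to; an untested sketch with error
handling omitted, and the open_queue() helper name is just for
illustration):

  #include <stdint.h>
  #include <arpa/inet.h>
  #include <libmnl/libmnl.h>
  #include <linux/netfilter.h>
  #include <linux/netfilter/nfnetlink_queue.h>
  #include <libnetfilter_queue/libnetfilter_queue.h>

  static struct mnl_socket *open_queue(uint32_t queue_num)
  {
          char buf[MNL_SOCKET_BUFFER_SIZE];
          struct mnl_socket *nl = mnl_socket_open(NETLINK_NETFILTER);
          struct nlmsghdr *nlh;

          mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID);

          /* bind this socket to the queue */
          nlh = nfq_nlmsg_put(buf, NFQNL_MSG_CONFIG, queue_num);
          nfq_nlmsg_cfg_put_cmd(nlh, AF_INET, NFQNL_CFG_CMD_BIND);
          mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);

          /* copy packet payloads, and ask for GSO skbs as-is so the
           * kernel does not software-segment them before queueing */
          nlh = nfq_nlmsg_put(buf, NFQNL_MSG_CONFIG, queue_num);
          nfq_nlmsg_cfg_put_params(nlh, NFQNL_COPY_PACKET, 0xffff);
          mnl_attr_put_u32(nlh, NFQA_CFG_FLAGS, htonl(NFQA_CFG_F_GSO));
          mnl_attr_put_u32(nlh, NFQA_CFG_MASK, htonl(NFQA_CFG_F_GSO));
          mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);

          return nl;
  }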
--Jesper

On Sun, 17 Jun 2018 00:53:03 +0200 Pete Heist wrote:

> > On Jun 16, 2018, at 12:30 AM, Dave Taht wrote:
> >
> > Eric just suggested using the iptables NFQUEUE ability to toss
> > packets to userspace.
> >
> > https://home.regit.org/netfilter-en/using-nfqueue-and-libnetfilter_queue/
> >
> > For wifi, at least, timings are not hugely critical, a few hundred
> > usec is something userspace can handle reasonably accurately. I like
> > very much being able to separate out mcast and treat that correctly
> > in userspace, also. I did want to be below 10 usec (wifi "bus"
> > arbitration), which I am dubious about...
> >
> > Now as for an implementation language? C++? C? Go? Python? The
> > condition of the wrapper library for Go leaves a bit to be desired
> > (https://github.com/chifflier/nfqueue-go), and given a choice I'd
> > MUCH rather use Go than C.
>
> This sounds cool... So for fun, I compared ping and iperf3 with no-op
> nfqueue callbacks in both C and Go. As for the hardware setup, I used
> two lxc containers (effectively just veth) on an APU2.
>
> For the Go program, I used test_nfqueue from the wrapper above (which
> yes, does need some work) and removed debugging / logging.
>
> For the C program I used this:
> https://github.com/irontec/netfilter-nfqueue-samples/blob/master/sample-helloworld.c
> I removed any per-packet printf calls and compiled with
> "gcc sample-helloworld.c -o nfq -lnfnetlink -lnetfilter_queue".
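(For reference: with the classic libnetfilter_queue API that sample
uses, a no-op accept-everything program boils down to roughly the
following. This is an untested sketch with error handling omitted, not
the exact code used in the tests below.)

  #include <stdint.h>
  #include <sys/socket.h>
  #include <arpa/inet.h>
  #include <linux/netfilter.h>
  #include <libnetfilter_queue/libnetfilter_queue.h>

  /* accept every packet unmodified, so only nfqueue overhead is measured */
  static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
                struct nfq_data *nfa, void *data)
  {
          struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
          uint32_t id = ph ? ntohl(ph->packet_id) : 0;

          return nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);
  }

  int main(void)
  {
          struct nfq_handle *h = nfq_open();
          struct nfq_q_handle *qh = nfq_create_queue(h, 0, &cb, NULL);
          char buf[0xffff];
          int fd, rv;

          nfq_set_mode(qh, NFQNL_COPY_PACKET, 0xffff);

          fd = nfq_fd(h);
          while ((rv = recv(fd, buf, sizeof(buf), 0)) >= 0)
                  nfq_handle_packet(h, buf, rv);

          nfq_destroy_queue(qh);
          nfq_close(h);
          return 0;
  }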
> Ping results:
>
> ping without nfqueue:
> root@lsrv:~# iptables -F OUTPUT
> root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
> 500 packets transmitted, 500 received, 0% packet loss, time 7985ms
> rtt min/avg/max/mdev = 0.056/0.058/0.185/0.011 ms
>
> ping with no-op nfqueue callback in C:
> root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
> root@lsrv:~/nfqueue# ping -c 500 -i 0.01 -q 10.182.122.11
> 500 packets transmitted, 500 received, 0% packet loss, time 7981ms
> rtt min/avg/max/mdev = 0.117/0.123/0.384/0.020 ms
>
> ping with no-op nfqueue callback in Go:
> root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
> root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
> 500 packets transmitted, 500 received, 0% packet loss, time 7982ms
> rtt min/avg/max/mdev = 0.095/0.172/0.532/0.042 ms
>
> The mean induced latency of 65us for C or 114us for Go might be within
> your parameters, except you mentioned 10us for WiFi bus arbitration,
> which does indeed look impossible with this setup, even in C.
>
> Iperf3 results:
>
> iperf3 without nfqueue:
> root@lsrv:~# iptables -F OUTPUT
> root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
> Connecting to host 10.182.122.11, port 5201
> [  4] local 10.182.122.1 port 55810 connected to 10.182.122.11 port 5201
> [ ID] Interval         Transfer     Bandwidth       Retr  Cwnd
> [  4]  0.00-1.00  sec   452 MBytes  3.79 Gbits/sec    0   178 KBytes
> [  4]  1.00-2.00  sec   454 MBytes  3.82 Gbits/sec    0   320 KBytes
> [  4]  2.00-3.00  sec   450 MBytes  3.77 Gbits/sec    0   320 KBytes
> [  4]  3.00-4.00  sec   451 MBytes  3.79 Gbits/sec    0   352 KBytes
> [  4]  4.00-5.00  sec   451 MBytes  3.79 Gbits/sec    0   352 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval         Transfer     Bandwidth       Retr
> [  4]  0.00-5.00  sec  2.21 GBytes  3.79 Gbits/sec    0       sender
> [  4]  0.00-5.00  sec  2.21 GBytes  3.79 Gbits/sec            receiver
> iperf Done.
>
> iperf3 with no-op nfqueue callback in C:
> root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
> root@lsrv:~/nfqueue# iperf3 -t 5 -c 10.182.122.11
> Connecting to host 10.182.122.11, port 5201
> [  4] local 10.182.122.1 port 55868 connected to 10.182.122.11 port 5201
> [ ID] Interval         Transfer     Bandwidth       Retr  Cwnd
> [  4]  0.00-1.00  sec  17.4 MBytes   146 Mbits/sec    0   107 KBytes
> [  4]  1.00-2.00  sec  16.9 MBytes   142 Mbits/sec    0   107 KBytes
> [  4]  2.00-3.00  sec  17.0 MBytes   142 Mbits/sec    0   107 KBytes
> [  4]  3.00-4.00  sec  17.0 MBytes   142 Mbits/sec    0   107 KBytes
> [  4]  4.00-5.00  sec  17.0 MBytes   143 Mbits/sec    0   115 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval         Transfer     Bandwidth       Retr
> [  4]  0.00-5.00  sec  85.3 MBytes   143 Mbits/sec    0       sender
> [  4]  0.00-5.00  sec  84.7 MBytes   142 Mbits/sec            receiver
>
> iperf3 with no-op nfqueue callback in Go:
> root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
> root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
> Connecting to host 10.182.122.11, port 5201
> [  4] local 10.182.122.1 port 55864 connected to 10.182.122.11 port 5201
> [ ID] Interval         Transfer     Bandwidth       Retr  Cwnd
> [  4]  0.00-1.00  sec  14.6 MBytes   122 Mbits/sec    0  96.2 KBytes
> [  4]  1.00-2.00  sec  14.1 MBytes   118 Mbits/sec    0  96.2 KBytes
> [  4]  2.00-3.00  sec  14.0 MBytes   118 Mbits/sec    0   102 KBytes
> [  4]  3.00-4.00  sec  14.0 MBytes   117 Mbits/sec    0   102 KBytes
> [  4]  4.00-5.00  sec  13.7 MBytes   115 Mbits/sec    0   107 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval         Transfer     Bandwidth       Retr
> [  4]  0.00-5.00  sec  70.5 MBytes   118 Mbits/sec    0       sender
> [  4]  0.00-5.00  sec  69.9 MBytes   117 Mbits/sec            receiver
> iperf Done.
>
> So rats, throughput gets brutalized for both C and Go. For Go, a rate
> of 117 Mbit with a 1500 byte MTU is 9750 packets/sec, which is 103us
> per packet. The mean induced latency measured by ping is 114us, which
> is not far off 103us, so the rate slowdown looks to be mostly caused
> by the per-packet nfqueue calls. The core running test_nfqueue is
> pinned at 100% during the test. "nice -n -20" does nothing.
>
> Presumably you'll sometimes be releasing more than one packet at a
> time(?), so I guess whether or not this is workable depends on how
> many you release at once, what hardware you're on, and what rates you
> need to test at. But when you're trying to test a qdisc, I guess you'd
> want to minimize the burden you add to the CPU, or else move it to a
> core the qdisc isn't running on, or something, so the qdisc itself
> isn't affected by the test rig.
>
> > There is of course a hideous amount of complexity moved to the daemon,
>
> I can only imagine.
>
> > as a pure fifo ap queue forms aggregates much differently
> > than a fq_codeled one. But, yea! userspace...
>
> This would be awesome if it works out! After that iperf3 test though,
> I think I may have smashed my dreams of writing a libnetfilter_queue
> userspace qdisc in Go, or C for that matter.
>
> If this does somehow turn out to be good enough performance-wise, I
> think you'd have a lot more fun and spend a lot less time on it in Go
> than C, but that's just an opinion... :)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer