From: Pete Heist <pete@heistp.net>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Dave Taht <dave.taht@gmail.com>, Florian Westphal <fw@strlen.de>, Marek Majkowski, Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>
Date: Sun, 17 Jun 2018 17:16:30 +0200
Subject: Re: [Make-wifi-fast] emulating wifi better - coupling qdiscs in netem?
In-Reply-To: <20180617131921.09bf5353@redhat.com>

Hi Jesper/Florian, thanks for noticing that. Not surprisingly, it doesn't change the ping results much, but it improves throughput a lot (now only ~20% less than without nfqueue):

root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55936 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   375 MBytes  3.14 Gbits/sec  173    372 KBytes
[  4]   1.00-2.00   sec   365 MBytes  3.06 Gbits/sec  316    382 KBytes
[  4]   2.00-3.00   sec   372 MBytes  3.13 Gbits/sec  368    427 KBytes
[  4]   3.00-4.00   sec   364 MBytes  3.05 Gbits/sec  137    402 KBytes
[  4]   4.00-5.00   sec   364 MBytes  3.05 Gbits/sec  342    382 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  1.80 GBytes  3.09 Gbits/sec  1336             sender
[  4]   0.00-5.00   sec  1.79 GBytes  3.08 Gbits/sec                  receiver
iperf Done.

I don't know if/how the use of GSO affects Dave's simulation work, but I'll leave that to him. I only wanted to contribute a quick evaluation. :)

Pete

On Jun 17, 2018, at 1:19 PM, Jesper Dangaard Brouer <brouer@redhat.com> wrote:


Hi Pete,

I happened to be at the Netfilter Workshop and discussed nfqueue with
Florian and Marek. I saw this attempt to use nfqueue, and Florian
points out that you are not using the GRO facility of nfqueue.

I'll quote what Florian said below:

On Sun, 17 Jun 2018 12:45:52 +0200 Florian Westphal <fw@strlen.de> wrote:

The linked example code is old and does not set

mnl_attr_put_u32(nlh, NFQA_CFG_FLAGS, htonl(NFQA_CFG_F_GSO));

when requesting the queue.

This means the kernel has to do software segmentation of GSO skbs.

Consider using
https://git.netfilter.org/libnetfilter_queue/tree/examples/nf-queue.c
instead if you need a template; it does this correctly.
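
For reference, the flag is set while binding/configuring the queue over netlink. Here is a minimal sketch along the lines of that nf-queue.c example; the function name, queue number, and the omitted error handling are illustrative rather than copied from the file:

#include <sys/socket.h>
#include <arpa/inet.h>          /* htonl() */
#include <linux/netfilter.h>
#include <linux/netfilter/nfnetlink_queue.h>
#include <libmnl/libmnl.h>
#include <libnetfilter_queue/libnetfilter_queue.h>

/* Bind a queue and ask the kernel to deliver GSO skbs unsegmented. */
static int bind_queue_gso(struct mnl_socket *nl, uint32_t queue_num)
{
    char buf[MNL_SOCKET_BUFFER_SIZE];
    struct nlmsghdr *nlh;

    nlh = nfq_nlmsg_put(buf, NFQNL_MSG_CONFIG, queue_num);
    nfq_nlmsg_cfg_put_cmd(nlh, AF_INET, NFQNL_CFG_CMD_BIND);
    if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
        return -1;

    nlh = nfq_nlmsg_put(buf, NFQNL_MSG_CONFIG, queue_num);
    nfq_nlmsg_cfg_put_params(nlh, NFQNL_COPY_PACKET, 0xffff);
    /* the part the old sample misses: accept GSO skbs as-is */
    mnl_attr_put_u32(nlh, NFQA_CFG_FLAGS, htonl(NFQA_CFG_F_GSO));
    mnl_attr_put_u32(nlh, NFQA_CFG_MASK, htonl(NFQA_CFG_F_GSO));
    return mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0 ? -1 : 0;
}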

--Jesper


On Sun, 17 Jun 2018 00:53:03 +0200 Pete Heist <pete@heistp.net> wrote:

On Jun 16, 2018, at 12:30 AM, Dave Taht <dave.taht@gmail.com> wrote:

Eric just suggested using the iptables NFQUEUE ability to toss
packets to userspace.

https://home.regit.org/netfilter-en/using-nfqueue-and-libnetfilter_queue/
For wifi, at least, timings are not hugely critical; a few hundred
usec is something userspace can handle reasonably accurately. I like
very much being able to separate out mcast and treat that correctly in
userspace, also. I did want to be below 10usec (wifi "bus"
arbitration), which I am dubious about....

Now as for an implementation language? C++? C? Go? Python? The
condition of the wrapper library for Go leaves a bit to be desired
( https://github.com/chifflier/nfqueue-go ) and given a choice I'd
MUCH rather use Go than C.

This sounds cool... So for fun, I compared ping and iperf3 with no-op nfqueue callbacks in both C and Go. As for the hardware setup, I used two lxc containers (effectively just veth) on an APU2.

For the Go program, I used test_nfqueue from the wrapper above (which, yes, does need some work) and removed debugging / logging.

For the C program I used this:
https://github.com/irontec/netfilter-nfqueue-samples/blob/master/sample-helloworld.c
I removed any per-packet printf calls and compiled with "gcc sample-helloworld.c -o nfq -lnfnetlink -lnetfilter_queue".
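
For anyone curious what "no-op callback" means here: stripped of the printfs, the program boils down to roughly the shape below, a handler that just returns an NF_ACCEPT verdict for every packet on queue 0. This is a sketch of the general structure, not the exact file; error handling is omitted. It compiles the same way as above (-lnfnetlink -lnetfilter_queue).

#include <sys/socket.h>
#include <arpa/inet.h>                              /* ntohl() */
#include <linux/netfilter.h>                        /* NF_ACCEPT */
#include <libnetfilter_queue/libnetfilter_queue.h>

/* No-op verdict: accept every queued packet immediately. */
static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
              struct nfq_data *nfa, void *data)
{
    struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
    return nfq_set_verdict(qh, ntohl(ph->packet_id), NF_ACCEPT, 0, NULL);
}

int main(void)
{
    struct nfq_handle *h = nfq_open();
    struct nfq_q_handle *qh;
    char buf[65536];
    int fd, rv;

    nfq_unbind_pf(h, AF_INET);
    nfq_bind_pf(h, AF_INET);
    qh = nfq_create_queue(h, 0, &cb, NULL);        /* queue-num 0, matching the iptables rule */
    nfq_set_mode(qh, NFQNL_COPY_PACKET, 0xffff);   /* copy whole packets to userspace */

    fd = nfq_fd(h);
    while ((rv = recv(fd, buf, sizeof(buf), 0)) >= 0)
        nfq_handle_packet(h, buf, rv);             /* dispatches to cb() */

    nfq_destroy_queue(qh);
    nfq_close(h);
    return 0;
}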

Ping results:

ping without nfqueue:
root@lsrv:~# iptables -F OUTPUT
root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7985ms
rtt min/avg/max/mdev = 0.056/0.058/0.185/0.011 ms

ping with no-op nfqueue callback in C:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~/nfqueue# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7981ms
rtt min/avg/max/mdev = 0.117/0.123/0.384/0.020 ms

ping with = no-op nfqueue callback in Go:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~# ping -c 500 -i 0.01 -q 10.182.122.11
500 packets transmitted, 500 received, 0% packet loss, time 7982ms
rtt min/avg/max/mdev = 0.095/0.172/0.532/0.042 ms

The mean induced latency of 65us for C or 114us for Go might be within your parameters, except you mentioned 10us for WiFi bus arbitration, which does indeed look impossible with this setup, even in C.

Iperf3 results:

iperf3 without nfqueue:
root@lsrv:~# iptables -F OUTPUT
root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55810 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   452 MBytes  3.79 Gbits/sec    0    178 KBytes
[  4]   1.00-2.00   sec   454 MBytes  3.82 Gbits/sec    0    320 KBytes
[  4]   2.00-3.00   sec   450 MBytes  3.77 Gbits/sec    0    320 KBytes
[  4]   3.00-4.00   sec   451 MBytes  3.79 Gbits/sec    0    352 KBytes
[  4]   4.00-5.00   sec   451 MBytes  3.79 Gbits/sec    0    352 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  2.21 GBytes  3.79 Gbits/sec    0             sender
[  4]   0.00-5.00   sec  2.21 GBytes  3.79 Gbits/sec                  receiver
iperf Done.

iperf3 with no-op nfqueue callback in C:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~/nfqueue# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55868 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  17.4 MBytes   146 Mbits/sec    0    107 KBytes
[  4]   1.00-2.00   sec  16.9 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   2.00-3.00   sec  17.0 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   3.00-4.00   sec  17.0 MBytes   142 Mbits/sec    0    107 KBytes
[  4]   4.00-5.00   sec  17.0 MBytes   143 Mbits/sec    0    115 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  85.3 MBytes   143 Mbits/sec    0             sender
[  4]   0.00-5.00   sec  84.7 MBytes   142 Mbits/sec                  receiver

iperf3 with no-op nfqueue callback in Go:
root@lsrv:~# iptables -A OUTPUT -d 10.182.122.11/32 -j NFQUEUE --queue-num 0
root@lsrv:~# iperf3 -t 5 -c 10.182.122.11
Connecting to host 10.182.122.11, port 5201
[  4] local 10.182.122.1 port 55864 connected to 10.182.122.11 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  14.6 MBytes   122 Mbits/sec    0   96.2 KBytes
[  4]   1.00-2.00   sec  14.1 MBytes   118 Mbits/sec    0   96.2 KBytes
[  4]   2.00-3.00   sec  14.0 MBytes   118 Mbits/sec    0    102 KBytes
[  4]   3.00-4.00   sec  14.0 MBytes   117 Mbits/sec    0    102 KBytes
[  4]   4.00-5.00   sec  13.7 MBytes   115 Mbits/sec    0    107 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-5.00   sec  70.5 MBytes   118 Mbits/sec    0             sender
[  4]   0.00-5.00   sec  69.9 MBytes   117 Mbits/sec                  receiver
iperf Done.

So rats, throughput gets brutalized for both C and Go. For Go, a rate of 117 Mbit/sec with a 1500-byte MTU is 9750 packets/sec, which is 103us per packet. Mean induced latency measured by ping is 114us, which is not far off 103us, so the rate slowdown looks to be mostly caused by the per-packet nfqueue calls. The core running test_nfqueue is pinned at 100% during the test. "nice -n -20" does nothing.

Presumably you'll sometimes be releasing more than one packet at a time(?), so I guess whether or not this is workable depends on how many you release at once, what hardware you're on, and what rates you need to test at. But when you're trying to test a qdisc, I guess you'd want to minimize the burden you add to the CPU, or else move it to a core the qdisc isn't running on, or something, so the qdisc itself isn't affected by the test rig.
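
If it comes to that, pinning is just CPU affinity, e.g. for the no-op daemon built above (the core number here is chosen arbitrarily):

root@lsrv:~# taskset -c 3 ./nfq

That keeps the daemon off whatever core the qdisc's traffic is being processed on.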

There is of course a hideous amount of complexity moved to the daemon,

I can only imagine.

as a pure fifo ap queue forms aggregates much differently
than a fq_codeled one. But, yea! userspace....

This would be awesome if it works out! After that iperf3 test though, I think I may have smashed my dreams of writing a libnetfilter_queue userspace qdisc in Go, or C for that matter.

If this does somehow turn out to be good enough performance-wise, I think you'd have a lot more fun and spend a lot less time on it in Go than C, but that's just an opinion... :)




-- 
Best regards,
 Jesper Dangaard Brouer
 MSc.CS, Principal Kernel Engineer at Red Hat
 LinkedIn: http://www.linkedin.com/in/brouer