From: Aaron Wood
Date: Sun, 4 Dec 2016 10:12:28 -0800
To: Dave Taht
Cc: Rich Brown, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Reasons to prefer netperf vs iperf?


On Sun, Dec 4, 2016 at 9:13 AM, Dave Taht <dave.taht@gmail.com> wrote:
> On Sun, Dec 4, 2016 at 5:40 AM, Rich Brown <richb.hanover@gmail.com> wrote:
> > As I browse the web, I see several sets of performance measurements using either netperf or iperf, and never know if either offers an advantage.
> >
> > I know Flent uses netperf by default: what are the reason(s) for selecting it? Thanks

> *netperf
> +supports multiple tests in parallel on the same server

> * iperf
> + More widely available

Sort of... Given the variants, less so. But iperf3 is coded to be pretty portable, and so it's pretty widely available.

It has a pretty good JSON format for the client results, but the server results are returned in plain text. And it doesn't report anything finer-grained than 100ms.
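
That JSON is at least easy to consume programmatically. A minimal sketch in C, assuming the jansson library and the "intervals"/"sum"/"bits_per_second" field names that recent iperf3 releases emit (generate the input with `iperf3 -c <server> -J > result.json`; worth re-checking the field names against your version):

#include <stdio.h>
#include <jansson.h>   /* assumes the jansson JSON library is installed */

int main(void) {
    json_error_t err;
    json_t *root = json_load_file("result.json", 0, &err);
    if (!root) {
        fprintf(stderr, "parse error: %s\n", err.text);
        return 1;
    }

    /* Walk the per-interval summaries and print throughput. */
    json_t *intervals = json_object_get(root, "intervals");
    size_t i;
    json_t *ival;
    json_array_foreach(intervals, i, ival) {
        json_t *sum = json_object_get(ival, "sum");
        double mbps = json_number_value(json_object_get(sum, "bits_per_second")) / 1e6;
        printf("interval %zu: %.1f Mbit/s\n", i, mbps);
    }

    json_decref(root);
    return 0;
}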

> - I have generally not trusted the results published either - but
> Aaron finding that major bug in iperf's UDP measurements explains a
> LOT of that, I think.

I've found something else with it that I need to write up: with UDP, the application-layer pacing and the fq socket pacing cause it to report a lot of invalid packet loss. The application pacing is focused on layer-6 goodput, but the fq pacing appears to enforce wire rates (or calculated Ethernet rates), so with small packets (64-byte payloads) it's typical to see 40% packet loss, as the fq layer discards UDP frames to cut, say, 100 Mbps of application-layer rate down to 100 Mbps of wire rate. I need to actually do the tcpdump analysis of that and get it written up.
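
Back-of-the-envelope, the framing overhead alone predicts loss in that range (my math, assuming IPv4, no VLAN tag, and counting preamble + inter-frame gap as wire time):

#include <stdio.h>

int main(void) {
    const double payload = 64.0;  /* UDP payload bytes */
    /* UDP(8) + IPv4(20) + Ethernet(14) + FCS(4) + preamble(8) + IFG(12) */
    const double overhead = 8 + 20 + 14 + 4 + 8 + 12;
    const double wire_bytes = payload + overhead;

    /* Fraction of the wire rate that is actual payload. */
    double efficiency = payload / wire_bytes;
    printf("payload efficiency: %.1f%%\n", efficiency * 100);

    /* If the app paces payload at the same rate fq enforces on the
     * wire, only this fraction of the datagrams fit; the rest drop. */
    printf("expected drop rate: %.1f%%\n", (1 - efficiency) * 100);
    return 0;
}

That works out to roughly half the datagrams dropped, the same ballpark as the 40% I'm seeing.
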
> - Has, like, 3-8 non-interoperable versions.
> - is available in java, for example
>
> there *might* be an iperf version worth adopting but I have no idea
> which one it would be.

Part of the issue with iPerf is that there are two main variants: iperf and iperf3. iperf3 is currently maintained by the ESnet folks, and their use-case is wildly different from ours:

- Very high bandwidth (>=100 Gbps)
- Latency insensitive (long-running bulk data transfers)
- Private networks (jumbo frames are an assumed use)

I'm also happy to take the fork of it that I have (https://github.com/woody77/iperf) and tune that for our uses. There are certain aspects that I wouldn't want to dive into changing at the moment (like the single-threaded nature of the server), but I can easily bang on the corners, get its defaults better suited to our uses, and make it behave better when running without offloads. On my test boxes, it starts to get I/O limited around 4-5 Gbps when using 1400-byte UDP payloads. With TCP and TSO, it merrily runs at 9.4 Gbps of goodput over a 10 Gbps NIC.

But the application and kernel-level "packet rate" is really quite different at that point. By default it's dropping 128KB blocks into the TCP send buffer, and letting the TCP stack and offloads do their thing.
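
The pattern is basically just a big blocking write loop (a sketch of the general technique, not iperf3's actual send path):

#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Bulk TCP sender: hand the kernel large blocks and let the stack
 * plus TSO do the segmenting. `sock` is a connected TCP socket. */
static int blast(int sock, size_t total_bytes) {
    static char block[128 * 1024];   /* 128KB per write, as described */
    memset(block, 'x', sizeof(block));

    size_t sent = 0;
    while (sent < total_bytes) {
        ssize_t n = send(sock, block, sizeof(block), 0);
        if (n < 0)
            return -1;               /* real code would inspect errno */
        sent += (size_t)n;
    }
    return 0;
}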

At the higher-performance end of things, I think it would benefit from using sendmmsg()/recvmmsg() on platforms that support them. I think that would let it better work with fq pacing at rates of 5 Gbps and up.
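
On Linux the batching looks roughly like this (a sketch assuming a connected UDP socket; this isn't code from my fork):

#define _GNU_SOURCE   /* for sendmmsg() */
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <sys/socket.h>

#define BATCH 32

/* Queue BATCH copies of one datagram with a single syscall. */
static int send_batch(int sock, char *payload, size_t len) {
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];
    memset(msgs, 0, sizeof(msgs));

    for (int i = 0; i < BATCH; i++) {
        iovs[i].iov_base = payload;
        iovs[i].iov_len = len;
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* Returns how many datagrams were actually queued. */
    int sent = sendmmsg(sock, msgs, BATCH, 0);
    if (sent < 0)
        perror("sendmmsg");
    return sent;
}

One sendmmsg() replaces 32 send() calls, and at small payload sizes the per-syscall overhead is most of the cost.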

> I started speccing out a flent-specific netperf/iperf replacement
> *years* ago (twd), but the enormous amount of money/effort required
> to do it right caused me to dump the project. Also, at the time
> (because of the need for reliable high speed measurements AND for
> measurements on obscure, weak, cpus) my preferred language was going
> to be C, and that too raised the time/money metric into the
> stratosphere.

iperf3's internal structure might be useful for bootstrapping a project like that. It already has all the application logic infrastructure, a central timer heap (which could be made more efficient/exact), and a notion of different kinds of workloads. It wouldn't be too much work to make its tests more modular.
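
Concretely, "modular" could be as simple as a vtable of per-test hooks (a hypothetical shape, not iperf3's current internals):

/* Hypothetical interface for pluggable test workloads. */
struct test_ops {
    const char *name;                       /* e.g. "tcp_stream", "udp_rr" */
    int  (*setup)(void *ctx);               /* open sockets, allocate state */
    int  (*run_interval)(void *ctx);        /* one reporting interval's work */
    void (*report)(void *ctx, double secs); /* emit per-interval results */
    void (*teardown)(void *ctx);            /* close and free */
};

/* The core loop then just drives whichever workload was registered. */
static int run_test(const struct test_ops *ops, void *ctx, int intervals) {
    if (ops->setup(ctx) != 0)
        return -1;
    for (int i = 0; i < intervals; i++) {
        if (ops->run_interval(ctx) != 0)
            break;
        ops->report(ctx, 1.0);  /* e.g. 1-second reporting intervals */
    }
    ops->teardown(ctx);
    return 0;
}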

-Aaron