From: Pete Heist
Date: Mon, 27 Nov 2017 12:04:19 +0100
To: cake@lists.bufferbloat.net
Subject: [Cake] cake flenter results round 1

http://www.drhleny.cz/bufferbloat/cake/round1/

Round 1 Tarball: http://www.drhleny.cz/bufferbloat/cake/round1.tgz
Round 0 Tarball (previous run): http://www.drhleny.cz/bufferbloat/cake/round0.tgz

*** Notes/Analysis ***

* Do the new bql tests show the effectiveness of cake's TSO/GSO/GRO "peeling" vs fq_codel, or am I seeing an mq artifact on my 4-queue device?

http://www.drhleny.cz/bufferbloat/cake/round1/bql_csrt_rrulbe_eg_fq_codel_nolimit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/bql_csrt_rrulbe_eg_cakeeth_nolimit/index.html
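
To help tell a peeling effect apart from an mq artifact, it might be worth capturing the offload and BQL state of the device under test alongside each run. A minimal sketch, assuming eth0 as the interface (the sysfs paths and ethtool flags are standard Linux; only the interface name is my placeholder):

  # current offload settings (TSO/GSO/GRO)
  ethtool -k eth0 | egrep 'segmentation-offload|generic-receive-offload'

  # current BQL limit for each tx queue (one directory per hardware queue)
  for q in /sys/class/net/eth0/queues/tx-*; do
    echo "$q: $(cat $q/byte_queue_limits/limit)"
  done

If the four tx queues end up with very different BQL limits, that would point toward an mq artifact rather than peeling.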


* Cake holds TCP RTT to half that of fq_codel at 10mbit bandwidth. I like to call this technique of rate limiting well below the interface's maximum "over-limiting", which seems to work well on stable point-to-point WiFi links. (Point-to-multipoint or unstable rates require the new ath9k/ath10k driver changes instead, as limiting this way would not be effective there; this is well explained in https://www.youtube.com/watch?v=Rb-UnHDw02o):

http://www.drhleny.cz/bufferbloat/cake/round1/eg_csrt_rrulbe_eg_sfq_10.0mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/eg_csrt_rrulbe_eg_fq_codel_10.0mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/eg_csrt_rrulbe_eg_cakeeth_10.0mbit/index.html
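
For reference, a minimal sketch of the kind of egress setup I mean by "over-limiting" (interface name and rate are placeholders; cake shapes internally, while fq_codel and sfq need an external shaper such as htb):

  # cake: built-in shaper set well below the link's real capacity
  tc qdisc replace dev eth0 root cake bandwidth 10mbit ethernet

  # fq_codel: no built-in shaper, so pair it with htb at the same rate
  tc qdisc replace dev eth0 root handle 1: htb default 1
  tc class add dev eth0 parent 1: classid 1:1 htb rate 10mbit
  tc qdisc add dev eth0 parent 1:1 fq_codel
  # (sfq attaches under htb the same way: tc qdisc add dev eth0 parent 1:1 sfq)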


* Cake at 950mbit performed just as well as fq_codel, whereas in the round0 runs fq_codel had a bit of an advantage. Perhaps the addition of the "ethernet" keyword did this?

http://www.drhleny.cz/bufferbloat/cake/round1/eg_csrt_rrulbe_eg_fq_codel_950mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/eg_csrt_rrulbe_eg_cakeeth_950mbit/index.html


** I'm finding the "32 Flows, RRUL Best-Effort" tests fascinating to look at. It might be possible to spot implementation differences between fq_codel and cake from these.

* At 10mbit, cake and fq_codel are better at most things than sfq by an order of magnitude or more. But interestingly, at this bandwidth fq_codel's results look a bit better than cake's: total bandwidth for fq_codel is higher (4.78/9.12mbit for fq_codel vs 3.91/8.63mbit for cake), ping latency is a bit lower (1.79ms vs 1.92ms), and TCP RTT is significantly better (~30ms vs ~45ms). Maybe cake's "ethernet" keyword affects a test like this disproportionately at low bandwidths?

http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_sfq_10.0mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_fq_codel_10.0mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_cakeeth_10.0mbit/index.html


* At 100mbit, the situation reverses, with fq_codel's TCP RTT above 10ms and cake's around 4.75ms.

http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_fq_codel_100mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_cakeeth_100mbit/index.html


* And then above 200mbit, fq_codel performs considerably better than cake in the 32/32 flow tests. At 900mbit, UDP/ping latency is 1.1ms for fq_codel and 10ms for cake, and TCP RTT is ~6.5ms for fq_codel and ~12ms for cake. Dave's earlier explanation probably applies here: "Since fq_codel supports superpackets and cake peels them, we have a cpu and latency hit that originates from that. Also the codel derived algorithm in cake differs quite significantly from mainline codel, and my principal gripe about it has been that it has not been extensively tested against higher delays."

http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_fq_codel_900mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/32flows_eg_cakeeth_900mbit/index.html
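
One way to test the superpacket/peeling theory would be to re-run the high-bandwidth tests with offloads disabled, so that neither qdisc ever sees superpackets (the ethtool flags are standard; eth0 is a placeholder):

  # disable TSO/GSO/GRO so no superpackets reach the qdisc
  ethtool -K eth0 tso off gso off gro off

If the gap between fq_codel and cake narrows with offloads off, the peeling-cost explanation looks stronger.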


* On the Cake RTT tests, we take about a 15% hit in total TCP throughput at rtt 1ms vs rtt 10ms (1454mbit vs 1700mbit), and a 55% hit at rtt 100us (which is why you'd probably only consider using that on 10gbit links). If we don't remove the 'ethernet' keyword altogether, I'd like to see it be at least 10ms, as TCP RTT then only goes from around 0.8ms to 1.8ms, which I don't think makes a huge latency difference in real-world terms. Or this might be another argument for removing datacentre, ethernet and metro altogether, since there are tradeoffs to decide about.

http://www.drhleny.cz/bufferbloat/cake/round1/cake_rtt_10ms_rrulbe_eg_cake_900mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/cake_rtt_1ms_rrulbe_eg_cake_900mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/cake_rtt_100us_rrulbe_eg_cake_900mbit/index.html
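
For reference, the same tradeoff can be explored without the named keywords, using cake's rtt parameter directly (I'm assuming the version under test matches current cake syntax here; interface and bandwidth are placeholders):

  # explicit target RTT instead of a named preset
  tc qdisc replace dev eth0 root cake bandwidth 900mbit rtt 10ms
  tc qdisc replace dev eth0 root cake bandwidth 900mbit rtt 1ms
  tc qdisc replace dev eth0 root cake bandwidth 900mbit rtt 100us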


* I wonder if the UDP flood tests really work at 900mbit:

http://www.drhleny.cz/bufferbloat/cake/round1/udpflood_eg_fq_codel_900mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/udpflood_eg_cakeeth_900mbit/index.html
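
For anyone wanting to reproduce these, I believe this is flent's udp_flood test underneath (the hostname and length are placeholders, and the test name should be checked against your flent version):

  # UDP flood plus latency measurement against a netserver host
  flent udp_flood -H netperf.example.com -l 60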


* Again, as before, I'm surprised that srchost/dsthost is much more fair than dual-srchost/dual-dsthost. The numbers that follow are 1-flow/12-flow throughput: for srchost/dsthost it's 413/439mbit up and 413/447mbit down, while for dual-srchost/dual-dsthost it's 126/647mbit up and 77/749mbit down. Rampant speculation: does this have to do with the "peeling"? And should we / do we even do peeling with soft rate limiting? I think I saw it help with bql(?), but I'm not sure I've seen it help when rate limited below the interface's rate.

http://www.drhleny.cz/bufferbloat/cake/round1/hostiso_eg_cake_src_cake_dst_900mbit/index.html
http://www.drhleny.cz/bufferbloat/cake/round1/hostiso_eg_cake_dsrc_cake_ddst_900mbit/index.html


* I still need a better understanding of what triple-isolate does; it isn't clear to me from the man page. Results here are similar to dual-srchost/dual-dsthost:

http://www.drhleny.cz/bufferbloat/cake/round1/hostiso_eg_cake_dsrc_cake_ddst_900mbit/index.html
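
For reference, the isolation modes compared in these host isolation tests are all single cake keywords (syntax as in the cake man page; interface and bandwidth are my placeholders):

  # fairness between hosts, keyed on source or destination address only
  tc qdisc replace dev eth0 root cake bandwidth 900mbit srchost
  tc qdisc replace dev eth0 root cake bandwidth 900mbit dsthost

  # host fairness plus flow fairness within each host, direction-aware
  tc qdisc replace dev eth0 root cake bandwidth 900mbit dual-srchost
  tc qdisc replace dev eth0 root cake bandwidth 900mbit dual-dsthost

  # fairness over source and destination hosts at the same time
  tc qdisc replace dev eth0 root cake bandwidth 900mbit triple-isolate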



*** Round 2 Plans ***

- Add bql tests wherever rate limiting is used
- Add ethernet keyword to host isolation tests
- Add ethtool output to host info
- Remove or improve flow isolation tests
- Add host isolation tests with rtt variation (to look again at a problem I reported in an earlier thread)

*** Future Plans ***

- Use netem to create a spread of rtts and bandwidths (a possible starting point is sketched after this list)
- Add VoIP tests (I hope to do this with irtt; see the sketch after this list)
- Add ack filtering tests
- Test BBR
- Use qemu to test other archs (I may never get to this, honestly)
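
As a possible starting point for the netem and VoIP items above (interface, hostname, delay and rate are placeholders; the irtt flags are from its docs, so treat them as assumptions):

  # netem on a middlebox to create one point in the rtt/bandwidth spread
  tc qdisc replace dev eth0 root netem delay 20ms rate 100mbit

  # isochronous UDP round-trips as a stand-in for VoIP traffic
  irtt server                                  # on the server
  irtt client -i 20ms -d 30s irtt.example.com  # on the client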
