From: Pete Heist
To: Sebastian Moeller
Cc: Cake List
Date: Mon, 31 Dec 2018 09:53:59 +0100
Subject: Re: [Cake] cake and hfsc rate limiters outperforming htb on one-armed router
Message-Id: <33EE284A-03FA-404A-B474-DD5C267A911C@heistp.net>
In-Reply-To: <967991BD-108C-4151-8DF8-737569CAC653@gmx.de>

So more specifically, for 500mbit I can use a calculated burst/cburst of 62500 (1000 * 500000 / 8000), here’s the change:

default: 320mbit up / 268mbit down, 3ms latency, 8.8ms tcp rtt
burst/cburst 62500: 200mbit up / 480mbit down, 40ms latency, 40ms tcp rtt

Aggregate throughput goes from 588mbit to 680mbit, but latency skyrockets.

A burst of only 2xMTU doesn’t change much, and 4xMTU already jumps to 30ms latency and only 630mbit aggregate bandwidth.

So for this one-armed router setup on this hardware, I don’t see a worthwhile tradeoff of latency for aggregate throughput. Perhaps in other situations, it could be useful. :)
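
The burst figure above follows from simple unit arithmetic: a duration in microseconds times a rate in kbit/s, divided by 8000, gives bytes. A minimal sketch of that calculation is below. The function names htb_burst_bytes and htb_class_cmd are made up for illustration, the interface and class id are borrowed from the test script quoted later in this thread, and the tc command is only printed, never run.

```shell
# Burst size in bytes covering DURATION_US microseconds at RATE_KBIT.
# Same arithmetic as in the message: 1000 * 500000 / 8000 = 62500.
htb_burst_bytes() {
    local duration_us=$1 rate_kbit=$2
    echo $(( duration_us * rate_kbit / 8000 ))
}

# Print (do not execute) an htb class command with burst and cburst set.
# The 1: parent and 1:1 class id are assumptions taken from the test script.
htb_class_cmd() {
    local iface=$1 rate_kbit=$2 duration_us=$3
    local burst
    burst=$(htb_burst_bytes "$duration_us" "$rate_kbit")
    echo "tc class add dev $iface parent 1: classid 1:1 htb rate ${rate_kbit}kbit ceil ${rate_kbit}kbit burst $burst cburst $burst"
}

# 500mbit with a 1ms (1000us) burst allowance
htb_class_cmd eth0 500000 1000
```

A bare number for burst/cburst is read by tc as bytes, so no unit suffix is needed on the computed value.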

On Dec 31, 2018, at 1:10 AM, Sebastian Moeller <moeller0@gmx.de> wrote:

Well, the idea would be to scale the buffer to cover, say, X ms at the configured bandwidth, so HTB could deal with CPU stalls of up to X-Y ms (with Y << X)... We just switched sqm-scripts to automatically scale the buffering to 1ms...
I would be interested to learn whether that increases HTB's utilisation.


On December 30, 2018 11:36:27 PM GMT+01:00, Pete Heist <pete@heistp.net> wrote:
The experiments I did with those didn’t yield great results, with changing a value by one MTU sometimes causing sudden throughput or inter-flow latency increases, with the tradeoffs not being clear. I’m afraid admins could easily cause problems fiddling with these. Fortunately most customer-facing routers have aggregate bitrates that an APU1 can handle even with default htb, or cake. I also appreciate that such settings don’t exist in cake… :)

On Dec 30, 2018, at 10:51 PM, Sebastian Moeller <moeller0@gmx.de> wrote:

Hi Pete,

you might want to have a look at htb's burst and cburst parameters, as these should let you trade latency under load for bandwidth utilization.


Best Regards
Sebastian

On Dec 30, 2018, at 21:42, Pete Heist <pete@heistp.net> wrote:

It’s a bit more complicated than this. It looks like the htb rate limiter is different in that, as rates increase, the actual rate starts to deviate from the specified rate early on, but it handles the “out of CPU” situation rather gracefully: it still maintains control of the queue, and just falls short of the specified rate by greater and greater percentages.

Instead of a single-flow test with iperf3, here are the rates that each limiter can reach on egress of both apu1a interfaces during an rrul_be test:

# - max limit on APU for one-armed routing, rrul_be test 4+4 flows (firewall on):
#   - cake: 210mbit
#   - htb+fq_codel: 93%@100mbit, 90%@200mbit, 84%@300mbit, 72%@400mbit, 59%@500mbit
#   - hfsc+fq_codel: 310mbit
#   - hfsc+cake: 300mbit

The numbers for cake and hfsc are right before loss of queue, and with htb the queue isn’t lost even at 500mbit, for example, just the actual rate is only 59% of what was specified.
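
To make the htb percentages concrete, here is a small sketch that converts each configured rate and percentage into an achieved rate. The function name achieved_mbit is made up for illustration; the rate/percent pairs are taken from the htb+fq_codel line above.

```shell
# Achieved rate in mbit, given the configured rate and the percentage reached.
achieved_mbit() {
    local rate_mbit=$1 percent=$2
    echo $(( rate_mbit * percent / 100 ))
}

# Pairs from the htb+fq_codel results: configured mbit, percent achieved.
for pair in 100:93 200:90 300:84 400:72 500:59; do
    rate=${pair%%:*}
    pct=${pair##*:}
    echo "${rate}mbit configured -> $(achieved_mbit "$rate" "$pct")mbit achieved"
done
```

By this arithmetic the achieved rate rises from 93mbit to 295mbit, and the last two steps (288mbit and 295mbit) are nearly level.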

I really need to graph the specified rate vs the actual rate, inter-flow and intra-flow latency, stepped 25mbit at a time. I think it would be interesting, so this is on my todo list if there’s time after the ISP config gets done.
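
A stepped sweep like the one described could start from a loop such as the sketch below. The function name rate_steps is hypothetical, and the 500mbit upper bound is an assumption (it is simply the highest htb rate in the results above). The loop body only prints what it would do; reconfiguring the qdisc and running the test are left out.

```shell
# Print specified rates from STEP up to MAX in STEP-sized increments (mbit).
rate_steps() {
    local step=$1 max=$2 rate
    rate=$step
    while [ "$rate" -le "$max" ]; do
        echo "${rate}mbit"
        rate=$(( rate + step ))
    done
}

# Assumed sweep: 25mbit steps up to 500mbit.
for r in $(rate_steps 25 500); do
    # Placeholder: a real sweep would apply the limiter at $r and run the test here.
    echo "would test at $r"
done
```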

On Dec 28, 2018, at 1:17 AM, Pete Heist <pete@heistp.net> wrote:

For whatever reason, I’m seeing the rate limiters in cake and hfsc vastly outperform htb in the one-armed router configuration I described in my previous thread. To simplify things, I apply the qdiscs with a single class only at egress of eth0 on apu1a:

apu2a <-- default VLAN --> apu1a <-- VLAN 3300 --> apu2b

I use iperf3 from apu2a to apu2b and find the rate at which things break down. Whereas cake and hfsc can both reach around 850mbit, htb is breaking down at around 200mbit, which seems rather strange. This could be a function of the older kernel I have to use, the hardware, or maybe htb just isn’t suited well to this task for some reason. I wish I knew, as I’d rather be using htb for this task than hfsc (especially given the lockup issue with cake)...

----

#!/bin/bash

# point where iperf3 throughput drops below ~93% of theoretical:
# htb: 200mbit
# hfsc: 850mbit
# cake: 850mbit

IFACE=eth0
RATE=850mbit

# htb root with a single class at $RATE, fq_codel as the leaf qdisc
start_htb() {
    stop
    tc qdisc add dev $IFACE root handle 1: htb default 1
    tc class add dev $IFACE parent 1: classid 1:1 htb rate $RATE ceil $RATE
    tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel
}

# hfsc root with a single class at $RATE, fq_codel as the leaf qdisc
start_hfsc() {
    stop
    tc qdisc add dev $IFACE root handle 1: hfsc default 1
    tc class add dev $IFACE parent 1: classid 1:1 hfsc sc rate $RATE ul rate $RATE
    tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel
}

# cake as the root qdisc, using its built-in shaper
start_cake() {
    stop
    tc qdisc add dev $IFACE root cake bandwidth $RATE
}

# remove any existing root and ingress qdiscs, ignoring errors
stop() {
    tc qdisc del dev $IFACE root &>/dev/null
    tc qdisc del dev $IFACE ingress &>/dev/null
}

# run the function named on the command line, e.g. start_htb or stop
"$@"
----
root@apu1a:~/rate_limiters# uname -a
Linux apu1a 3.16.7-ckt9-voyage #1 SMP Thu Apr 23 11:10:44 HKT 2015 i686 GNU/Linux

root@apu1a:~/rate_limiters# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 20
model : 2
model name : AMD G-T40E Processor
stepping : 0
microcode : 0x5000101
cpu MHz : 800.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall
bogomips : 1999.83
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 1
vendor_id : AuthenticAMD
cpu family : 20
model : 2
model name : AMD G-T40E Processor
stepping : 0
microcode : 0x5000101
cpu MHz : 800.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall
bogomips : 1999.83
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

root@apu1a:~/rate_limiters# ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl_nic/rtl8168e-2.fw
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no


