From: Sebastian Moeller
To: Pete Heist
CC: Cake List
Date: Mon, 31 Dec 2018 01:10:01 +0100
Subject: Re: [Cake] cake and hfsc rate limiters outperforming htb on one-armed router
Message-ID: <967991BD-108C-4151-8DF8-737569CAC653@gmx.de>
In-Reply-To: <52F7AEE6-D156-4569-94C4-9E9E1590C84F@heistp.net>
References: <555BACDF-7A1E-4C6B-BFB1-0C5ACB77715E@heistp.net> <79875813-63E8-48D4-9A1F-B7C18F1325D0@gmx.de> <52F7AEE6-D156-4569-94C4-9E9E1590C84F@heistp.net>
List-Id: Cake - FQ_codel the next generation
User-Agent: K-9 Mail for Android

Well, the idea would be to scale the buffer to cover, say, X ms at the
configured bandwidth, so that HTB could ride out CPU stalls of up to
X-Y ms (with Y << X). We just switched sqm-scripts to automatically
scale the buffering to 1 ms.

I would be interested to learn whether that increases HTB's utilisation.

On December 30, 2018 11:36:27 PM GMT+01:00, Pete Heist <pete@heistp.net> wrote:
> The experiments I did with those didn’t yield great results, with
> changing a value by one MTU sometimes causing sudden throughput or
> inter-flow latency increases, with the tradeoffs not being clear. I’m
> afraid admins could easily cause problems fiddling with these.
> Fortunately most customer facing routers have aggregate bitrates that
> an APU1 can handle even with default htb, or cake. I also appreciate
> that such settings don’t exist in cake… :)
>
>> On Dec 30, 2018, at 10:51 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> Hi Pete,
>>
>> you might want to have a look at htb's burst and cburst parameters,
>> as these should allow to trade in latency under load for bandwidth
>> utilization.
>>
>>
>> Best Regards
>>         Sebastian
>>
>>> On Dec 30, 2018, at 21:42, Pete Heist <pete@heistp.net> wrote:
>>>
>>> It’s a bit more complicated than this. It looks like the htb rate
>>> limiter is different in that as rates increase the actual rate
>>> starts to deviate from the specified rate early on, but it rather
>>> gracefully handles the “out of CPU” situation, where it still
>>> maintains control of the queue, just gradually fails to meet the
>>> rate specified by greater and greater percentages.
>>>
>>> Instead of a single flow test with iperf3, here are rates that each
>>> limiter can reach on egress of both apu1a interfaces during an
>>> rrul_be test:
>>>
>>> # - max limit on APU for one-armed routing, rrul_be test 4+4 flows (firewall on):
>>> #   - cake: 210mbit
>>> #   - htb+fq_codel: 93%@100mbit, 90%@200mbit, 84%@300mbit, 72%@400mbit, 59%@500mbit
>>> #   - hfsc+fq_codel: 310mbit
>>> #   - hfsc+cake: 300mbit
>>>
>>> The numbers for cake and hfsc are right before loss of queue, and
>>> with htb the queue isn’t lost even at 500mbit, for example, just the
>>> actual rate is only 59% of what was specified.
>>>
>>> I really need to graph the specified rate vs the actual rate,
>>> inter-flow and intra-flow latency, stepped 25mbit at a time. I think
>>> it would be interesting, so this is on my todo list if there’s time
>>> after the ISP config gets done.
>>>
>>>> On Dec 28, 2018, at 1:17 AM, Pete Heist <pete@heistp.net> wrote:
>>>>
>>>> For whatever reason, I’m seeing the rate limiters in cake and hfsc
>>>> vastly outperform htb in the one-armed router configuration I
>>>> described in my previous thread. To simplify things, I apply the
>>>> qdiscs with a single class only at egress of eth0 on apu1a:
>>>>
>>>> apu2a <-- default VLAN --> apu1a <-- VLAN 3300 --> apu2b
>>>>
>>>> I use iperf3 from apu2a to apu2b and find the rate at which things
>>>> break down. Whereas cake and hfsc can both reach around 850mbit,
>>>> htb is breaking down at around 200mbit, which seems rather strange.
>>>> This could be a function of the older kernel I have to use, the
>>>> hardware, or maybe htb just isn’t suited well to this task for some
>>>> reason. I wish I knew, as I’d rather be using htb for this task
>>>> than hfsc (especially given the lockup issue with cake)...
>>>>
>>>> ----
>>>>
>>>> #!/bin/bash
>>>>
>>>> # point where iperf3 throughput drops below ~93% of theoretical:
>>>> # htb: 200mbit
>>>> # hfsc: 850mbit
>>>> # cake: 850mbit
>>>>
>>>> IFACE=eth0
>>>> RATE=850mbit
>>>>
>>>> start_htb() {
>>>>     stop
>>>>     tc qdisc add dev $IFACE root handle 1: htb default 1
>>>>     tc class add dev $IFACE parent 1: classid 1:1 htb rate $RATE ceil $RATE
>>>>     tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel
>>>> }
>>>>
>>>> start_hfsc() {
>>>>     stop
>>>>     tc qdisc add dev $IFACE root handle 1: hfsc default 1
>>>>     tc class add dev $IFACE parent 1: classid 1:1 hfsc sc rate $RATE ul rate $RATE
>>>>     tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel
>>>> }
>>>>
>>>> start_cake() {
>>>>     stop
>>>>     tc qdisc add dev $IFACE root cake bandwidth $RATE
>>>> }
>>>>
>>>> stop() {
>>>>     tc qdisc del dev $IFACE root &>/dev/null
>>>>     tc qdisc del dev $IFACE ingress &>/dev/null
>>>> }
>>>>
>>>> "$@"
>>>> ----
>>>>
>>>> root@apu1a:~/rate_limiters# uname -a
>>>> Linux apu1a 3.16.7-ckt9-voyage #1 SMP Thu Apr 23 11:10:44 HKT 2015 i686 GNU/Linux
>>>>
>>>> root@apu1a:~/rate_limiters# cat /proc/cpuinfo
>>>> processor : 0
>>>> vendor_id : AuthenticAMD
>>>> cpu family : 20
>>>> model : 2
>>>> model name : AMD G-T40E Processor
>>>> stepping : 0
>>>> microcode : 0x5000101
>>>> cpu MHz : 800.000
>>>> cache size : 512 KB
>>>> physical id : 0
>>>> siblings : 2
>>>> core id : 0
>>>> cpu cores : 2
>>>> apicid : 0
>>>> initial apicid : 0
>>>> fdiv_bug : no
>>>> f00f_bug : no
>>>> coma_bug : no
>>>> fpu : yes
>>>> fpu_exception : yes
>>>> cpuid level : 6
>>>> wp : yes
>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall
>>>> bogomips : 1999.83
>>>> clflush size : 64
>>>> cache_alignment : 64
>>>> address sizes : 36 bits physical, 48 bits virtual
>>>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>>>
>>>> processor : 1
>>>> vendor_id : AuthenticAMD
>>>> cpu family : 20
>>>> model : 2
>>>> model name : AMD G-T40E Processor
>>>> stepping : 0
>>>> microcode : 0x5000101
>>>> cpu MHz : 800.000
>>>> cache size : 512 KB
>>>> physical id : 0
>>>> siblings : 2
>>>> core id : 1
>>>> cpu cores : 2
>>>> apicid : 1
>>>> initial apicid : 1
>>>> fdiv_bug : no
>>>> f00f_bug : no
>>>> coma_bug : no
>>>> fpu : yes
>>>> fpu_exception : yes
>>>> cpuid level : 6
>>>> wp : yes
>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt lbrv svm_lock nrip_save pausefilter vmmcall
>>>> bogomips : 1999.83
>>>> clflush size : 64
>>>> cache_alignment : 64
>>>> address sizes : 36 bits physical, 48 bits virtual
>>>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>>>
>>>> root@apu1a:~/rate_limiters# ethtool -i eth0
>>>> driver: r8169
>>>> version: 2.3LK-NAPI
>>>> firmware-version: rtl_nic/rtl8168e-2.fw
>>>> bus-info: 0000:01:00.0
>>>> supports-statistics: yes
>>>> supports-test: no
>>>> supports-eeprom-access: no
>>>> supports-register-dump: yes
>>>> supports-priv-flags: no
>>>>
>>>
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
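
The buffer-scaling idea discussed in this message (size HTB's burst and cburst so they cover a fixed duration at the configured rate) can be sketched in shell. This is a minimal sketch under stated assumptions, not the sqm-scripts implementation: the helper names burst_bytes and htb_class_cmd are hypothetical, the 1600-byte floor (roughly one full-size frame) is an assumption, and the class command from the quoted script is only printed, not executed, so it can be inspected before being run as root.

```shell
#!/bin/bash
# Sketch only: size HTB's burst/cburst to cover a target duration at the
# shaped rate. Helper names and the default floor are illustrative assumptions.

# burst_bytes RATE_KBIT DURATION_MS [MIN_BYTES]
# kbit/s * ms = bits, so bytes = RATE_KBIT * DURATION_MS / 8.
burst_bytes() {
    local rate_kbit="$1"
    local duration_ms="$2"
    local min_bytes="${3:-1600}"   # assumed floor: about one full-size frame
    local bytes=$(( rate_kbit * duration_ms / 8 ))
    if [ "$bytes" -lt "$min_bytes" ]; then
        bytes=$min_bytes
    fi
    echo "$bytes"
}

# htb_class_cmd IFACE RATE_KBIT DURATION_MS
# Prints (does not run) the htb class command with burst/cburst added.
htb_class_cmd() {
    local iface="$1"
    local rate_kbit="$2"
    local duration_ms="$3"
    local b
    b=$(burst_bytes "$rate_kbit" "$duration_ms")
    echo "tc class add dev $iface parent 1: classid 1:1 htb" \
         "rate ${rate_kbit}kbit ceil ${rate_kbit}kbit burst ${b}b cburst ${b}b"
}

# Example: 200 Mbit/s shaped rate, 1 ms of buffering.
htb_class_cmd eth0 200000 1
```

At 200 Mbit/s, 1 ms corresponds to 25000 bytes, so the printed command carries "burst 25000b cburst 25000b". Whether such a value actually improves HTB's utilisation on the APU1 is exactly the open question raised above and would need to be measured.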