From: Sebastian Moeller
Date: Fri, 23 Aug 2013 21:56:02 +0200
Message-Id: <5BEF0C7C-C2F4-45A9-9FF2-E32A05B8D67B@gmx.de>
References: <56B261F1-2277-457C-9A38-FAB89818288F@gmx.de> <2148E2EF-A119-4499-BAC1-7E647C53F077@gmx.de> <03951E31-8F11-4FB8-9558-29EAAE3DAE4D@gmx.de> <9A9B094D-CA07-48B0-85FE-FA7C759FEDE3@gmx.de>
To: Dave Taht
Cc: Jesper Dangaard Brouer, "cerowrt-devel@lists.bufferbloat.net"
Subject: Re: [Cerowrt-devel] some kernel updates
List-Id: Development issues regarding the cerowrt test router project

Hi Dave,

I guess I found the culprit: once I added $ADSLL to the ingress() in simple.qos:

ingress() {

CEIL=$DOWNLINK
PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
BE_RATE=`expr $CEIL / 6`   # Min for best effort
BK_RATE=`expr $CEIL / 6`   # Min for background
BE_CEIL=`expr $CEIL - 64`  # A little slop at the top

LQ="quantum `get_mtu $IFACE`"

$TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
$TC qdisc add dev $IFACE handle ffff: ingress

$TC qdisc del dev $DEV root 2> /dev/null
$TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
$TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL

# I'd prefer to use a pre-nat filter but that causes permutation...

$TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
$TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
$TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`

diffserv $DEV

ifconfig $DEV up

# redirect all IP packets arriving in $IFACE to ifb0

$TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
  match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV

}

I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.

Best
	Sebastian

P.S.: I am not sure whether I want to tackle the PIE issue today...

On Aug 23, 2013, at 21:47 , Dave Taht wrote:

> quick note: running this script requires that you
>
> ifconfig ifb0 up
>
> at some point. In my case on cerowrt you took care of that already...
>
>
> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller wrote:
> Hi Dave,
>
> On Aug 23, 2013, at 07:13 , Dave Taht wrote:
>
> >
> > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller wrote:
> > Hi List, hi Jesper,
> >
> > So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
> > Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
> > I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
> >
> > It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>
> So I went for TC="logger tc" and used logread to harvest the output, as I could not find the echo output, but I guess that should not matter.
> So here is the result (slightly edited to get rid of the log timestamps and log level):
>
> tc qdisc del dev ge00 root
> tc qdisc add dev ge00 root handle 1: htb default 12
> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
> tc qdisc del dev ge00 handle ffff: ingress
> tc qdisc add dev ge00 handle ffff: ingress
> tc qdisc del dev ifb0 root
> tc qdisc add dev ifb0 root handle 1: htb default 12
> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>
> I notice that this only seems to show up for egress(), but looking at simple.qos, ingress() does not append ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>
> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>
>
> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> but got no output even though debugfs was already mounted…
>
> Best
> 	Sebastian
>
> >
> > Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware. I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
> >
> > It does.
> >
> > @Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
> >
> > Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
> >
> > So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
> >
> > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it.
> > 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
> >
> > http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> >
> > I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
> >
> > 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
> >
> > If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
> >
> > While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
> >
> > This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
> >
> > But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
> >
> > It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
> >
> > There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
> >
> > Multiple parties have the delusion that 20ms is "good enough".
> >
> > Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
> >
> > Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
> >
> > I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie. The problem is that ns2_codel still seems more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
> >
> > I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
> >
> > ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
> >
> > ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
> >
> > ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
> >
> > ... and find funding to get through the winter.
> >
> > There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
> >
> > While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
> >
> > Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
> >
> > tc -s qdisc show dev ge00
> > tc -s qdisc show dev ifb0
> >
> > would be useful info to have in general after each test.
> >
> > TIA.
> >
> > There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
> >
> > Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
> >
> > This is some of the stuff I know that needs fixing in userspace:
> >
> > * TODO readlink not found
> > * TODO netdev user missing
> > * TODO Wed Dec 5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> > * TODO [ 18.480468] Mirror/redirect action on
> > [ 18.539062] Failed to load ipt action
> > * upload and download are reversed in aqm
> > * BCP38
> > * Squash CS values
> > * Replace ntp
> > * Make ahcp client mode
> > * Drop more privs for polipo
> > * upnp
> > * priv separation
> > * Review FW rules
> > * dhcpv6 support
> > * uci-defaults/make-cert.sh uses a bad path for px5g
> > * Doesn't configure the web browser either
> >
> >
> > Best
> > 	Sebastian
> >
> >
> > --
> > Dave Täht
> >
> > Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
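
For readers following the thread, the "lie about the packet size" behind an ATM link layer adjustment can be sketched roughly as below. This is only an illustration under stated assumptions: atm_wire_size is a hypothetical helper that is not part of simple.qos, the overhead of 40 bytes is simply the value from the logged tc commands above, and the kernel's own accounting may differ in detail. The one fixed fact used is that ATM carries data in 53-byte cells holding 48 bytes of payload each, so every packet is rounded up to whole cells.

```shell
#!/bin/sh
# Hypothetical helper (not part of simple.qos): estimate how many bytes one
# packet occupies on an ATM link.
#   $1 = packet size in bytes
#   $2 = per-packet overhead in bytes (40 in the tc log quoted in this thread)
atm_wire_size() {
    payload=$(( $1 + $2 ))
    # Round up to a whole number of 48-byte cell payloads.
    cells=$(( (payload + 47) / 48 ))
    # Each cell takes 53 bytes on the wire (5-byte header + 48-byte payload).
    echo $(( cells * 53 ))
}

atm_wire_size 1500 40   # full-size packet
atm_wire_size 64 40     # small packet
```

With these inputs the helper prints 1749 and 159: a 1500 byte packet costs about 17% more than its nominal size, and a 64 byte packet costs well over twice as much, which is why a shaper that ignores the cell framing overruns the link mostly on small packets.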