From: Sebastian Moeller
To: Dave Taht
Cc: Jesper Dangaard Brouer, "cerowrt-devel@lists.bufferbloat.net"
Date: Fri, 23 Aug 2013 11:16:52 +0200
Subject: Re: [Cerowrt-devel] some kernel updates
Message-Id: <2147C955-0BA1-4C1F-B51B-E3F26EA28EAB@gmx.de>
References: <56B261F1-2277-457C-9A38-FAB89818288F@gmx.de> <2148E2EF-A119-4499-BAC1-7E647C53F077@gmx.de> <03951E31-8F11-4FB8-9558-29EAAE3DAE4D@gmx.de>

Hi Dave,

On Aug 23, 2013, at 07:13, Dave Taht wrote:

> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller wrote:
> Hi List, hi Jesper,
>
> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
> Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side, the tc_stab method still works as well as before (ping RTT around 40ms).
> I would like to humbly propose using the tc stab method in cerowrt to perform ATM link layer adjustments by default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>
> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.

I will do this once I am back home. But I did check "tc -d qdisc" and "tc -d class show dev ge00" and got:

> root@nacktmulle:~# tc -d class show dev ge00
> class htb 1:11 parent 1:1 leaf 110: prio 1 quantum 1500 rate 128000bit overhead 40 ceil 810000bit burst 2Kb/1 mpu 0b overhead 0b cburst 12953b/1 mpu 0b overhead 0b level 0
> class htb 1:1 root rate 2430Kbit overhead 40 ceil 2430Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 2Kb/1 mpu 0b overhead 0b level 7
> class htb 1:10 parent 1:1 prio 0 quantum 1500 rate 2430Kbit overhead 40 ceil 2430Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 2Kb/1 mpu 0b overhead 0b level 0
> class htb 1:13 parent 1:1 leaf 130: prio 3 quantum 1500 rate 405000bit overhead 40 ceil 2366Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 11958b/1 mpu 0b overhead 0b level 0
> class htb 1:12 parent 1:1 leaf 120: prio 2 quantum 1500 rate 405000bit overhead 40 ceil 2366Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 11958b/1 mpu 0b overhead 0b level 0
> class fq_codel 110:20e parent 110:
> class fq_codel 120:10 parent 120:
> root@nacktmulle:~# tc -d qdisc
> qdisc
fq_codel 0: dev se00 root refcnt 2 limit 1024p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
> qdisc htb 1: dev ge00 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 ver 3.17
> qdisc fq_codel 110: dev ge00 parent 1:11 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms
> qdisc fq_codel 120: dev ge00 parent 1:12 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms
> qdisc fq_codel 130: dev ge00 parent 1:13 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms
> qdisc ingress ffff: dev ge00 parent ffff:fff1 ----------------
> qdisc htb 1: dev ifb0 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 ver 3.17
> qdisc fq_codel 110: dev ifb0 parent 1:11 limit 1000p flows 1024 quantum 500 target 5.0ms interval 100.0ms ecn
> qdisc fq_codel 120: dev ifb0 parent 1:12 limit 1000p flows 1024 quantum 1500 target 5.0ms interval 100.0ms ecn
> qdisc fq_codel 130: dev ifb0 parent 1:13 limit 1000p flows 1024 quantum 1500 target 5.0ms interval 100.0ms ecn
> qdisc mq 0: dev sw00 root
> qdisc mq 0: dev gw01 root
> qdisc mq 0: dev gw00 root
> qdisc mq 0: dev sw10 root
> qdisc mq 0: dev gw11 root
> qdisc mq 0: dev gw10 root

So at least the configured overhead of 40 bytes shows up using htb_private. Unlike tc_stab, which reports the link layer in "tc -d qdisc", I never figured out whether htb ever reports the link layer option at all. Changing the overhead value in AQM changes the reported overhead in "tc -d class show dev ge00". That said, I will collect the tc output and post it here...

> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware. I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but I am not fully sure.
>
> It does.

You rock!
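For reference, the two ATM link layer methods compared above can be sketched roughly as follows. This is only an illustration, not the actual cerowrt AQM scripts: the interface name, rate and 40 byte overhead are taken from the output above, the handle and class ids are illustrative, and the TC variable defaults to "echo tc" so the commands are printed rather than executed (the override suggested earlier in this thread).

```shell
#!/bin/sh
# Dry run by default: print the tc commands instead of running them.
# Set TC=tc to apply them for real (needs root and a ge00 interface).
TC=${TC:-"echo tc"}

IFACE=ge00      # WAN interface, as in the output above
RATE=2430       # uplink rate in kbit/s, as reported for class 1:1
OVERHEAD=40     # per-packet overhead in bytes, as configured in AQM

# tc_stab: attach a size table to the root qdisc, so the packet size is
# already adjusted for ATM framing before HTB looks at it.
# $TC is left unquoted on purpose so "echo tc" splits into two words.
stab_method() {
    $TC qdisc add dev "$IFACE" root handle 1: \
        stab overhead "$OVERHEAD" linklayer atm htb default 12
    $TC class add dev "$IFACE" parent 1: classid 1:1 \
        htb rate "${RATE}kbit" ceil "${RATE}kbit"
}

# htb_private: plain root qdisc, and HTB itself is asked to account for
# the ATM link layer in its rate handling.
htb_private_method() {
    $TC qdisc add dev "$IFACE" root handle 1: htb default 12
    $TC class add dev "$IFACE" parent 1: classid 1:1 \
        htb rate "${RATE}kbit" ceil "${RATE}kbit" \
        overhead "$OVERHEAD" linklayer atm
}

stab_method
htb_private_method
```

Only one of the two functions would be used on a real router; both are called here so that a dry run shows the commands side by side.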
> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>
> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.

Oh sorry, I know that I am testing your WIP branch here, and I think it is great that you share this with us so we can test early and often. I just realized that I had no way of knowing which patches made it into 3.10.9-1...

> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it.

Thanks a lot!

> 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>
> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>
> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this...

Getting IPv6 working is my next toy project, once the atm issue is gone for good :)

> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days.

Oh, no, you are in the driver's seat here, you set the pace. I just got carried away by the thought that atm might be fixed and all that was needed was confirmation :)

> The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>
> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>
> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>
> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>
> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>
> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3.

Is there a method to this madness?

> The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.

So like 20 up and 20 down, totaling 40ms just from a bad target value...

> Multiple parties have the delusion that 20ms is "good enough".

Hmm, I would have thought that cisco, with its IP telephony products, of all companies would think that increasing the latency by a factor of 3 to 4 over the unloaded condition would be "sub-optimal".

> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250...
>
> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.

Ah, I can do this tonight and run a test on pie to see whether the RTT comes down by 40ms - 2*7ms = 26ms...
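The back-of-the-envelope estimate above can be written out as a small sketch. It assumes, as the estimate does, that pie's target delay adds directly to the RTT once per direction; the variable and function names are made up for illustration, and the numbers are the ones quoted in this thread.

```shell
#!/bin/sh
# Rough estimate only; every number below is quoted from this thread.
DEFAULT_TARGET_MS=20    # pie default target, applied per direction
TUNED_TARGET_MS=7       # what "target 8" reportedly ends up as
DIRECTIONS=2            # shaping happens on both upload and download
MEASURED_RTT_MS=110     # RRUL ping RTT seen with pie at its defaults

# Expected RTT reduction when going from the default to the tuned target.
expected_saving_ms() {
    echo $(( DIRECTIONS * DEFAULT_TARGET_MS - DIRECTIONS * TUNED_TARGET_MS ))
}

saving=$(expected_saving_ms)
echo "expected saving: ${saving} ms"
echo "expected RTT:    $(( MEASURED_RTT_MS - saving )) ms"
```

If the measured RTT drops by clearly less than this, the remaining delay presumably comes from something other than the target setting, such as the rounding issues mentioned above.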
>
> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything.

I agree, fq_codel looks like the winner (well, efq_codel and nfq_codel are indiscernible from fq_codel in my RRUL tests, but they too are fq_codel for the most part I guess).

> There are a few places where pie is a win over straight codel, notably on packet floods.

I am not set up in any way to test this.

> And it may well be easier to retrofit into existing hardware fast path designs.

Well, it seems superior to no AQM, so it looks like a decent stopgap measure until fq_codel can migrate to all routers :)

> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.

What do you consider to be a long RTT? From home I have a best case ping RTT to snapon of 180ms, so if this is sufficient I might be able to help. Would starting netperf on my router help you in testing? My bandwidth of 2430Kbit/s up and 15494Kbit/s down might be a bit measly. I will be taking my family on holiday next week, so there could be another remote test site if you want.

> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)

Or just testing over real long paths?

> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.

I think it actually is pretty usable even in its current CI state.

> ...
and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>
> ... and find funding to get through the winter.
>
> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.

Congrats!

> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>
> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same...

Aye, will do.

> tc -s qdisc show dev ge00
> tc -s qdisc show dev ifb0
>
> would be useful info to have in general after each test.

Agreed, that is basically what I do, but so far I never saved the results...

> TIA.
>
> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.

I might get around to testing those, but for the only small niche where I can offer testing, RRUL seems to work quite well.

> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.

Thanks a lot.

> This is some of the stuff I know that needs fixing in userspace:
>
> * TODO readlink not found
> * TODO netdev user missing
> * TODO Wed Dec 5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> * TODO [ 18.480468] Mirror/redirect action on
> [ 18.539062] Failed to load ipt action
> * upload and download are reversed in aqm

I think that is fixed, at least the rate I set in download is applied to the htb attached to ifb0 and the upload to ge00, which seems quite correct. Or are you concerned about the initial values that show up in the AQM GUI?
If the latter I can try to set the defaults in model/cbi/aqm.lua...

> * BCP38
> * Squash CS values
> * Replace ntp
> * Make ahcp client mode
> * Drop more privs for polipo
> * upnp
> * priv separation
> * Review FW rules
> * dhcpv6 support
> * uci-defaults/make-cert.sh uses a bad path for px5g
> * Doesn't configure the web browser either

I would love to see the openconnect client package included (https://dev.openwrt.org/browser/packages/net/openconnect/Makefile), but I might be the only one.

Thanks a lot & Best Regards
Sebastian

> Best
> Sebastian
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
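Saving the "tc -s qdisc show" output after each test, as requested above, could look roughly like this. It is a sketch under assumptions: the output directory, the file naming and the save_tc_stats helper are hypothetical, and TC again defaults to "echo tc" so that running it off the router only records the commands rather than real statistics.

```shell
#!/bin/sh
# Dry run by default; set TC=tc on the router to capture real statistics.
TC=${TC:-"echo tc"}
OUTDIR=${OUTDIR:-/tmp/rrul-stats}   # hypothetical output directory

# Save the qdisc statistics of both shaped interfaces for one test run.
# $1 is a free-form label for the run, used as the file name prefix.
save_tc_stats() {
    label=$1
    mkdir -p "$OUTDIR" || return 1
    for dev in ge00 ifb0; do
        # $TC is unquoted on purpose so "echo tc" splits into two words.
        $TC -s qdisc show dev "$dev" > "$OUTDIR/${label}_${dev}.txt"
    done
}

save_tc_stats "pie_target8"
```

Calling save_tc_stats with a different label after each RRUL run (for example one per qdisc and target setting) keeps the statistics of the individual runs apart.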