From: Fred Stratton <fredstratton@imap.cc>
Date: Sun, 25 Aug 2013 15:31:11 +0100
To: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] some kernel updates

Correction. Using 3.10.9-2

On 25 Aug 2013, at 15:26, Fred Stratton <fredstratton@imap.cc> wrote:

> Thank you.
>
> This is an initial response.
>
> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>
>> Hi Fred,
>>
>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>>
>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>>
>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
>>
>> Okay, how flaky is your link? What rate of errors do you see while testing? I am especially interested in CRC errors, and in the ES, SES and HEC counts, just to get an idea how flaky the line is...
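>> (In case it helps, here is a rough, untested sketch for harvesting those counters from the bridged HG612 mentioned below; the login sequence and the availability of the Broadcom "xdslcmd" CLI are assumptions on my part, so adjust to whatever the box actually offers:
>>
>> #! /bin/bash
>> # untested sketch: dump the DSL error counters via the modem's always-on
>> # telnet CLI; the address, password, and xdslcmd invocation are assumptions
>> MODEM=192.168.1.1
>> { sleep 1; echo "admin";                  # default password, adjust
>>   sleep 1; echo "xdslcmd info --stats";   # Broadcom DSL statistics dump
>>   sleep 2; echo "exit"; } | telnet ${MODEM} | grep -iE 'CRC|FEC|HEC|SES|ES'
>>
>> Logging that once per hour overnight would show whether the error rate scales with load.)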
>>
>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>>>>
>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>>
>> Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs; so this points to downlink errors, as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmissions, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC).)
>
> Uptime 100655
> downstream 12162 kbits/s
> CRC errors 10154
> FEC errors 464
> HEC errors 758
>
> upstream 1122 kbits/s
> no errors in period.
>
>> Could you perform the following test by any chance: start iPlayer and your typical downloads, and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to rule out the link layer adjustments as the cause of your issues.
>
> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>>
>>>> YouTube has no problems.
>>>>
>>>> I remain unclear whether the use of tc-stab and htb are mutually exclusive options, using the present stock interface.
>>
>> Well, depending on the version of cerowrt you use (<3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism), you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will only apply the overhead twice, worst case it would also do the link layer adjustments (LLA) twice.
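>> (To make the distinction concrete, here is roughly where the two mechanisms live on the tc command line; the rates and overhead below are illustrative, taken from my own link, not a recommendation:
>>
>> # tc_stab: a size table on the root qdisc lies to the kernel about each
>> # packet's on-wire size before HTB ever sees it
>> tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 htb default 12
>> # htb_private: the linklayer/overhead rate table hangs off each HTB class
>> tc class add dev ge00 parent 1: classid 1:1 htb rate 2430kbit ceil 2430kbit linklayer atm overhead 40
>>
>> so enabling both really would account for the ATM framing twice.)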
>
>> See initial comments.
>>
>>>> The current ISP connection is IPoA LLC.
>>>
>>> Correction - Bridged LLC.
>>
>> Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>>
>> #! /bin/bash
>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>>
>> # Telekom Tuebingen Moltkestrasse 6
>> TECH=ADSL2
>> # finding a proper target IP is somewhat of an art, just traceroute a remote site
>> # and find the nearest host reliably responding to pings, showing the smallest variation of ping times
>> TARGET=87.186.197.70 # T
>> DATESTR=`date +%Y%m%d_%H%M%S` # to allow multiple sequential records
>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>>
>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>> PINGPERIOD=0.01 # in seconds
>> PINGSPERSIZE=10000
>>
>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> SWEEPMINSIZE=16 # 64bit systems seem to require 16 bytes of payload to include a timestamp...
>> SWEEPMAXSIZE=116
>>
>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>>
>> i_sweep=0
>> i_size=0
>>
>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> do
>>     (( i_sweep++ ))
>>     echo "Current iteration: ${i_sweep}"
>>     # now loop from sweepmin to sweepmax
>>     i_size=${SWEEPMINSIZE}
>>     while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>>     do
>>         echo "${i_sweep}. repetition of ping size ${i_size}"
>>         ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>>         (( i_size++ ))
>>         # we need a sleep binary that allows non-integer times (GNU sleep is fine, as is the sleep of macosx 10.8.4)
>>         sleep ${PINGPERIOD}
>>     done
>> done
>>
>> #tail -f ${LOG}
>>
>> echo "Done... ($0)"
>>
>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard one-ping-per-second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave, but I have not tested that since the last major changes.)
>
> To follow at some point.
>>
>>>> Whatever byte value is used for tc-stab makes no change.
>>
>> I assume you are talking about the overhead? A missing link layer adjustment will eat between 50% and 10% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
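>> (For intuition on what is at stake: ATM/AAL5 sends every packet as an integer number of 53-byte cells carrying 48 payload bytes each, so the on-wire size jumps in 48-byte steps. A back-of-the-envelope sketch, with the usual caveat that the per-line overhead value has to be measured rather than assumed:
>>
>> # sketch: bytes actually sent on the wire for one packet over ATM/AAL5;
>> # OVERHEAD is the per-packet encapsulation overhead, e.g. 32 is commonly
>> # cited for bridged LLC/SNAP and 10 for PPPoA VC-MUX (verify for your line)
>> atm_wire_bytes() {
>>     local PKT=$1 OVERHEAD=$2
>>     local CELLS=$(( (PKT + OVERHEAD + 47) / 48 ))  # round up to whole cells
>>     echo $(( CELLS * 53 ))
>> }
>> atm_wire_bytes 1500 32   # -> 1696: 32 cells for a 1500-byte packet
>>
>> This quantization is exactly what the ping sweep above makes visible.)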
>
> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>>
>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>>
>> For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that, the HTB link layer adjustment did NOT work.
>
> Using 3.10.9-2
>
>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>>>>
>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, which has a Broadcom 6368 SoC.
>>>>
>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>>>
>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>>>
>>>> None of these changes affect the problematic uplink delay.
>>
>> So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay, netalyzr?
>
> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
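>> (A crude way to cross-check those services locally, as an untested sketch: saturate the uplink yourself and watch the RTT to the nearest reliably-pinging ISP-side hop; both hosts here are placeholders:
>>
>> HOP=87.186.197.70                                 # placeholder: pick your own first hop
>> ping -c 60 ${HOP} > rtt_idle.txt                  # baseline RTT, link idle
>> scp bigfile user@remote.example.net:/dev/null &   # load the uplink with a bulk upload
>> ping -c 60 ${HOP} > rtt_loaded.txt                # RTT while the upload saturates
>> wait
>>
>> The difference between the two average RTTs is the uplink's queueing delay under load.)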
>>
>>>> On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>
>>>>> Hi Dave,
>>>>>
>>>>> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>
>>>>>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>> Hi Dave,
>>>>>>
>>>>>> I guess I found the culprit:
>>>>>>
>>>>>> once I added $ADSLL to the ingress() in simple.qos:
>>>>>>
>>>>>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware of the extent to which the whole subsystem was busted!
>>>>>
>>>>> Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect, if I look at my initial test data, I actually see one of the hallmarks of a working LLA for the upstream. (The upstream goodput was reduced compared to the no-LLA test, caused by the LLA making the actually-sent packets larger, so fewer packets fit through the shaped link.) But since I was not expecting only half a working system, I overlooked that in the data.
>>>>> But looking at the latency of the ping RTT probes, it becomes quite clear that only doing link layer adjustments on the uplink is even worse than not doing it at all (because the latency is still almost as bad as without LLA, but the uplink bandwidth is reduced).
>>>>>
>>>>>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
>>>>>
>>>>> Oh, I can not complain about pay, I have a day job in a totally different field, so this is more of a hobby for me :)
>>>>>
>>>>>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity.
>>>>>>
>>>>>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (And RED was busted for 3 years.)
>>>>>>
>>>>>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broken on ipv6. (They use conntrack.)
>>>>>
>>>>> Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB-private implementation is not going to do them any good. Luckily that does not seem to matter now, as both methods perform identically, as they should. (Well, now Jesper's latest changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
>>>>>
>>>>>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on on finding sane codepoints on webrtc in the ietf…
>>>>>>
>>>>>> ingress() {
>>>>>>
>>>>>> CEIL=$DOWNLINK
>>>>>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
>>>>>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>>>>>> BK_RATE=`expr $CEIL / 6`   # Min for background
>>>>>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>>>>>>
>>>>>> LQ="quantum `get_mtu $IFACE`"
>>>>>>
>>>>>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>>>>>> $TC qdisc add dev $IFACE handle ffff: ingress
>>>>>>
>>>>>> $TC qdisc del dev $DEV root 2> /dev/null
>>>>>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>>>>>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>>>>>>
>>>>>> # I'd prefer to use a pre-nat filter but that causes permutation...
>>>>>>
>>>>>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>>>>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>>>>>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>>>>>>
>>>>>> diffserv $DEV
>>>>>>
>>>>>> ifconfig $DEV up
>>>>>>
>>>>>> # redirect all IP packets arriving in $IFACE to ifb0
>>>>>>
>>>>>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>>>>>>     match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>>>>>>
>>>>>> }
>>>>>>
>>>>>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>>>>>>
>>>>>> Best
>>>>>> Sebastian
>>>>>>
>>>>>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>>>>>>
>>>>>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>>
>>>>>>> quick note: running this script requires that you
>>>>>>>
>>>>>>> ifconfig ifb0 up
>>>>>>>
>>>>>>> at some point.
>>>>>>
>>>>>> In my case on cerowrt you took care of that already...
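>>>>>> (For anyone replaying this on a generic distribution kernel rather than cerowrt, ifb0 only exists once the module is loaded; a minimal sketch, with the module parameter as commonly documented:
>>>>>>
>>>>>> modprobe ifb numifbs=1      # create the ifb device
>>>>>> ip link set dev ifb0 up     # same effect as "ifconfig ifb0 up" above
>>>>>> )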
>>>>>>
>>>>>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>>>
>>>>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>>>> Hi List, hi Jesper,
>>>>>>>>
>>>>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments, to see whether the recent changes resurrected this feature.
>>>>>>>> Unfortunately the htb_private link layer adjustment still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side, the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>>>>> I would like to humbly propose using the tc stab method in cerowrt to perform ATM link layer adjustments as the default. To repeat myself: simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, this path should receive more scrutiny by virtue of having more users?
>>>>>>>>
>>>>>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>>>>>>>
>>>>>>> So I went for TC="logger tc" and used logread to harvest, as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>>>>>>
>>>>>>> tc qdisc del dev ge00 root
>>>>>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>>>>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>>>>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>>>>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>>>>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>>>>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>>>>>> tc qdisc del dev ge00 handle ffff: ingress
>>>>>>> tc qdisc add dev ge00 handle ffff: ingress
>>>>>>> tc qdisc del dev ifb0 root
>>>>>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>>>>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>>>>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>>>>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>>>>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>>>>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>>>>>>>
>>>>>>> I notice this only shows up for egress(); but looking at simple.qos, ingress() is not appending ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>>>>>>>
>>>>>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
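>>>>>>> (Side note: the quickest way to confirm whether the linklayer parameters actually took effect should be to read the configuration back with the detail flag, e.g.
>>>>>>>
>>>>>>> tc -d qdisc show dev ge00      # a stab table shows up on the root qdisc
>>>>>>> tc -d class show dev ge00      # htb classes should echo linklayer/overhead
>>>>>>>
>>>>>>> though I have not checked how much detail this prints on this particular kernel.)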
>>>>>>>
>>>>>>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>>>>>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>>>>>> but got no output even though debugfs was already mounted…
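>>>>>>> (For completeness: dynamic debug only emits anything if the kernel was built with CONFIG_DYNAMIC_DEBUG, and the messages land in the kernel log, so roughly:
>>>>>>>
>>>>>>> mount -t debugfs none /sys/kernel/debug 2> /dev/null   # harmless if already mounted
>>>>>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>>>>>> dmesg | tail     # the pr_debug output shows up here when the function runs
>>>>>>> )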
>>>>>>>
>>>>>>> Best
>>>>>>> Sebastian
>>>>>>>
>>>>>>>> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware. I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but am not fully sure.
>>>>>>>>
>>>>>>>> It does.
>>>>>>>>
>>>>>>>> @Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>>>>>>
>>>>>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>>>>>>
>>>>>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>>>>>>
>>>>>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested; I have this weird bug to figure out how to fix next:
>>>>>>>>
>>>>>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>>>>>>
>>>>>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>>>>>>
>>>>>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but had to try this....
>>>>>>>>
>>>>>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>>>>>>
>>>>>>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>>>>>>
>>>>>>>> This is with simple.qos, I imagine? Simplest should do better than that with pie. Judging from how its estimator works, I think it will do badly with multiple queues. But testing will tell...
>>>>>>>>
>>>>>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>>>>>>
>>>>>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>>>>>>
>>>>>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>>>>>>
>>>>>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>>>>>>
>>>>>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>>>>>>
>>>>>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config, which treats target 8 as target 8us.
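>>>>>>>> (Illustratively, such an edit would look like the line below when QDISC=pie; with any of the codel variants, the same bare "target 8" would instead be parsed as 8 microseconds, hence the breakage:
>>>>>>>>
>>>>>>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}` target 8
>>>>>>>> )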
>>>>>>>>
>>>>>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie. The problem is that ns2_codel still seems more effective on most workloads, and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast-path designs.
>>>>>>>>
>>>>>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>>>>>>
>>>>>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long-RTT emulation out of netem (some preliminary code is up at github).
>>>>>>>>
>>>>>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>>>>>>
>>>>>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>>>>>>
>>>>>>>> ... and find funding to get through the winter.
>>>>>>>>
>>>>>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>>>>>>
>>>>>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>>>>>>>>
>>>>>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>>>>>>>
>>>>>>>> tc -s qdisc show dev ge00
>>>>>>>> tc -s qdisc show dev ifb0
>>>>>>>>
>>>>>>>> would be useful info to have in general after each test.
>>>>>>>>
>>>>>>>> TIA.
>>>>>>>>
>>>>>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
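>>>>>>>> (For reference, these all run through netperf-wrapper; something like the following, with the server name as a placeholder and the exact flags to be checked against the installed version:
>>>>>>>>
>>>>>>>> netperf-wrapper -H netperf.example.net -l 60 rrul              # the full RRUL test
>>>>>>>> netperf-wrapper -H netperf.example.net -l 60 tcp_bidirectional
>>>>>>>> )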
>>>>>>>>
>>>>>>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>>>>>>
>>>>>>>> This is some of the stuff I know that needs fixing in userspace:
>>>>>>>>
>>>>>>>> * TODO readlink not found
>>>>>>>> * TODO netdev user missing
>>>>>>>> * TODO Wed Dec 5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>>>>>> * TODO [ 18.480468] Mirror/redirect action on
>>>>>>>>        [ 18.539062] Failed to load ipt action
>>>>>>>> * upload and download are reversed in aqm
>>>>>>>> * BCP38
>>>>>>>> * Squash CS values
>>>>>>>> * Replace ntp
>>>>>>>> * Make ahcp client mode
>>>>>>>> * Drop more privs for polipo
>>>>>>>> * upnp
>>>>>>>> * priv separation
>>>>>>>> * Review FW rules
>>>>>>>> * dhcpv6 support
>>>>>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>>>>>> * Doesn't configure the web browser either
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Sebastian
>>>>>>>>
>>>>>>>> --
>>>>>>>> Dave Täht
>>>>>>>>
>>>>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel