From: Sebastian Moeller
Date: Fri, 23 Aug 2013 12:15:12 +0200
To: Jesper Dangaard Brouer
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] some kernel updates

Hi Jesper,

On Aug 23, 2013, at 09:27, Jesper Dangaard Brouer wrote:

> On Thu, 22 Aug 2013 22:13:52 -0700
> Dave Taht wrote:
>
>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller wrote:
>>
>>> Hi List, hi Jesper,
>>>
>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
>>> adjustments and to see whether the recent changes resurrected this
>>> feature. Unfortunately the htb_private link layer adjustments are
>>> still broken (RRUL ping RTT against Toke's netperf host in Germany
>>> of ~80ms, the same as without link layer adjustments). On the bright
>>> side, the tc_stab method still works as well as before (ping RTT
>>> around 40ms).
>>> I would like to humbly propose using the tc stab method in cerowrt
>>> to perform ATM link layer adjustments by default. To repeat myself,
>>> simply telling the kernel a lie about the packet size seems more
>>> robust than fudging HTB's rate tables.
>
> After the (regression) commit 56b765b79 ("htb: improved accuracy at
> high rates"), the kernel no longer uses the rate tables.

See, I am quite a layman here, but spelunking through the tc and kernel
source code made me believe that the rate tables are still used (I might
have looked at too old versions of both repositories, though).

> My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling)
> does the ATM cell overhead calculation directly on the packet length,
> see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
> Thus, the cell calc should actually be more precise now... but see
> below.

Is there any way to make HTB report which link layer it assumes?
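Just to be sure we mean the same thing, here is my reading of that
calculation as an illustrative shell sketch (the packet sizes are
arbitrary, and "len" is assumed to already include any configured
per-packet overhead); it also shows why small packets suffer the most
from the encapsulation:

  # Illustrative: the ATM cell rounding done by psched_l2t_ns(), i.e.
  # DIV_ROUND_UP(len,48)*53 -- payload travels in 48-byte cell payloads,
  # and each cell occupies 53 bytes on the wire (48 payload + 5 header).
  for len in 100 200 1500; do
      cells=$(( (len + 47) / 48 ))   # DIV_ROUND_UP(len, 48)
      wire=$(( cells * 53 ))
      echo "$len bytes -> $cells cells -> $wire bytes on the wire" \
           "(+$(( (wire - len) * 100 / len ))%)"
  done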
>>> Especially since the kernel already fudges the packet size to
>>> account for the ethernet header and then some, so this path should
>>> receive more scrutiny by virtue of having more users?

> As you mention, the default kernel path (not tc stab) fudges the
> packet size for Ethernet headers, AND I made a mistake (back in
> approx 2006, sorry) so that the "overhead" cannot be a negative
> number.

Mmh, does this also apply to stab?

> Meaning that some ATM encap overheads simply cannot be configured
> correctly (as you need to subtract the ethernet header).

Yes, I see. Luckily PPPoA and IPoA seem quite rare, and setting the
overhead larger than it actually is is relatively benign, as it will
merely overestimate the packet size.

> (And it's quite problematic to change the kABI to allow for a
> negative overhead)

Again, I have no clue, but overhead seems to be an integer, not
unsigned, so why can it not be negative?

> Perhaps we should change to use "tc stab" for this reason. But I'm
> not sure "stab" does the right thing either, and its accuracy is also
> limited, as it's actually also table based.

But why should a table be problematic here? As long as we can assure
that the table covers the largest packet, we are golden. So either we do
the manly and stupid thing and go for 9000 byte jumbo packets for the
table size, or we assume that ATM users will for the most part at best
use baby jumbo frames (I think BT does this to allow a payload MTU of
1500 in spite of the PPPoE encapsulation overhead), but then we are
quite fine with the default size table maxMTU of 2048 bytes, no?

> We could easily change the kernel to perform the ATM cell overhead
> calc inside "stab", and we should also fix the GSO packet overhead
> problem.
> (for now remember to disable GSO packets when shaping)

Yeah, I stumbled over the fact that the stab mechanism does not honor
the kernel's earlier adjustments of the packet length (but I seem to be
unable to find the actual file and line where this is initially
handled). It would seem relatively easy to make stab take the earlier
adjustment into account. Regarding GSO, I assumed that GSO will not play
nicely with an AQM anyway, as a single large packet will hog too much
transfer time...
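To make concrete what I am proposing, the tc_stab variant would look
roughly like the sketch below (ge00 being the WAN interface on my
cerowrt router; the rate and the 40 bytes of overhead are placeholders
from my own PPPoE-over-ATM setup, not a general recommendation):

  # Sketch: HTB shaping with stab-based ATM link layer compensation.
  # "mtu 2048" is the default size table maximum mentioned above.
  tc qdisc add dev ge00 root handle 1: \
      stab linklayer atm overhead 40 mtu 2048 \
      htb default 10
  tc class add dev ge00 parent 1: classid 1:10 htb rate 2430kbit
  tc qdisc add dev ge00 parent 1:10 fq_codel

As far as I understand it, the nice property is that the size table
rewrites the packet length once at the root, so HTB and every qdisc
below it see the ATM-adjusted length.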
>> It's my hope that the atm code works but is misconfigured. You can
>> output the tc commands by overriding the TC variable with
>> TC="echo tc" and paste here.
>
> I also hope it is a misconfig. Please show us the config/script.

Will do this later. I would be delighted if it turns out to be just me
being stupid.

> I would appreciate a link to the scripts you are using... perhaps a
> git tree?

Unfortunately I have no git tree and no experience with git, and I do
not think I will be able to set something up quickly. But I use a
modified version of cerowrt's AQM scripts, which I will post later.

>>> Now, I have been testing this using Dave's most recent cerowrt alpha
>>> version with a 3.10.9 kernel on mips hardware. I think this kernel
>>> should contain all htb fixes, including commit 8a8e3d84b17
>>> (net_sched: restore "linklayer atm" handling), but am not fully sure.
>>
>> It does.
>
> It has not hit the stable tree yet, but DaveM promised he would pass
> it along.
>
> It does seem Dave Taht has my patch applied:
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch

Ah, good, so it should have worked.

>>> While I am not able to build kernels, it seems that I am able to
>>> quickly test whether link layer adjustments work or not. So I am
>>> happy to help where I can :)
>
> So, what is your lab setup that allows you to test this quickly?

Oh, Dave and Toke are the giants on whose shoulders I stand here (thanks
guys); all I bring to the table, basically, is the fact that I have an
ATM-carried ADSL2+ connection at home.

Anyway, my theory is that proper link layer adjustments should only show
up if not performing them would make my traffic exceed my link speed and
hence accumulate in the DSL modem's bloated buffers, leading to
measurable increases in latency. So I try to saturate both the up- and
downlink while measuring latency under different conditions. Since the
worst-case overhead of the ATM encapsulation approaches 50% (with the
best case being around 10%), I test the system while shaping to 95% of
the link rates, where I do expect to see an effect of the link layer
adjustments, and while shaping to 50%, where I do not expect to see an
effect. And basically, that seems to work.

Practically, I use Toke's netperf-wrapper project with the RRUL test,
run from my cerowrt router behind an ADSL2+ modem against a nearby
netperf server in Germany. The link layer adjustments are configured on
my cerowrt router using Dave's simple.qos script (a 3-band HTB shaper
with fq_codel on each leaf, taking my overhead of 40 bytes into account
and optionally the link layer).

It turns out that this test nicely saturates my link with 4 up and 4
down TCP flows and uses a train of ping probes at a 0.2 second period to
assess the latency induced by saturating the links. Now I shape down to
95% and 50% of the line rates and simply look at the ping RTT plot for
the different conditions. In my rig I see around 30ms ping RTT without
load, 80ms with full saturation and no link layer adjustments, and 40ms
with working link layer adjustments (hand in hand with slightly reduced
TCP goodput, just as one would expect). In my testing so far, activating
the HTB link layer adjustments yielded the same 80ms delay I get without
link layer adjustments. If I shape down to 50% of the link rates, then
HTB, stab and no link layer adjustments all yield a ping RTT of ~40ms;
still, with proper link layer adjustments the TCP goodput is reduced
even at 50% shaping. As Dave explained, with an unloaded swallow, ermm,
ping RTT of 30ms and fq_codel's target set to 5ms, the best case would
be 30ms + 2*5ms = 40ms, so I am pretty close to ideal with proper link
layer adjustments.

I guess it should be possible to simply use the reduction in goodput as
an easy indicator of whether the link layer adjustments work or not. But
to do this properly I would need to be able to control the size of the
sent packets, which I am not, at least not with RRUL. I am quite sure,
though, that real computer scientists could easily set something up to
test the goodput through a shaping device with differently sized packet
streams of the same bandwidth, but I digress.

On the other hand, I do not claim to be an expert in this field in any
way, and my measurement method might be flawed; if you think so, please
do not hesitate to let me know how I could improve it.

Best Regards
	Sebastian

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Sr. Network Kernel Developer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer