From: Sebastian Moeller
Date: Fri, 14 Nov 2014 15:01:40 +0100
To: Dave Täht
Cc: david.black@emc.com, "cerowrt-devel@lists.bufferbloat.net"
Subject: Re: [Cerowrt-devel] SQM: tracking some diffserv related internet drafts better

Hi Dave,

I probably do not understand the topic fully, but...

On Nov 13, 2014, at 18:26 , Dave Taht wrote:

> This appears to be close to finalization, or finalized:
>
> http://tools.ietf.org/html/draft-ietf-dart-dscp-rtp-10
>
> And this is complementary:
>
> http://tools.ietf.org/html/draft-ietf-tsvwg-rtcweb-qos-03

Oha, that's 15 priority levels (out of ~64 possible?)
right there for a browser to mark packets with, depending on media type.
Now, not all need to map to real queues, but that seems a lot, so I
would expect that in real life a bunch of those will map to the same
queues.

If I understand correctly, we already have problems getting decent AQM
implemented in core switching/routing equipment; how realistic is it to
expect that these devices implement differential packet drop
probabilities per diffserv marking? If the answer is "not realistic",
the last three DS bits become functionally equal… Also, CS1 for
audio/video? I thought this to be the scavenger class and hence not
suitable for anything but bulk background traffic if there is even the
slightest contention on the path… (On second thought, this will allow
turning a CS1 internet radio into a decent congestion monitor: if the
audio skips, you know the network is starting to develop issues…)

> While wading through all this is tedious, and much of the advice
> contradictory, there are a few things that could be done more right
> in the sqm system that I'd like to discuss. (feel free to pour a cup
> of coffee and read the drafts)
>
> -1) They still think the old style tos imm bit is obsolete. Sigh. Am I
> the last person that uses ssh or plays games?

Are we free in cerowrt/SQM to just ignore this and keep imm (CS2?)
above the best effort queue?

> 0) Key to this draft is expecting that the AF code points on a single
> 5-tuple not be re-ordered, which means dumping AF41 into a priority
> queue and AF42 into the BE queue is incorrect.

So what about sticking to the class selectors only in SQM?
If I understand correctly, we can match on the CS bits only and ignore
the other bits; I think each AFNx maps to the same CS(M) class…
Looking at section "4.2.2.3 Using the Class Selector PHB Requirements
for IP Precedence Compatibility" of
http://tools.ietf.org/html/rfc2474#page-11 seems to confirm that
interpretation…

> 1) SQM only prioritizes a few diffserv codepoints (just the ones for
> which I had tools doing classification, like ssh). Doing so with tc
> rules is very inefficient presently. I had basically planned on
> rolling a new tc and/or iptables filter to "do the right thing" to map
> into all 64 codepoints via a simple lookup table (as what is in the
> wifi code already), rather than use the existing mechanism... and
> hesitated as nobody had nailed down the definitions of each one.

Well, "tc filter" hurts us badly, as I figured out while implementing
filters that look into PPP-encapsulated packets to get to the TOS bits…
But in theory all tests for the individual code points can be turned
into a hash operation in tc filter, so that we only pay the price once
for each encapsulation type (IPv4, IPv6, IPv4 in PPP, IPv6 in PPP;
poking through the PPP layer costs a few additional ANDed match tests,
but I really, really hope that tc filter is smart enough to stop filter
processing on the first mismatch…). On my TODO list for SQM is to use
tc filter's hash functionality to process all code points in one
operation per packet. This should also allow/require mapping each of
the 64 diffserv markings to our queues, so that any
"ietf-recommendation-of-the-day" can be easily implemented by changing
our mapping table...

> That said, I have not measured recently the impact of the extra tc
> filters and iptables rules required.
As far as I can tell tc filter is costly: in a "non-scientific" test
with netperf-wrapper's RRUL test I saw the ICMP-CDF "robust-range"
(the delay span in which the CDF went from ~5% to 95%) increase from
10ms to 30ms. No idea about the iptables rules (well, the internet
seems to argue that iptables is much cheaper than tc filter).

> 1a) Certainly only doing AF42 in sqm is pretty wrong (that was left
> over from my test patches against mosh - mosh ran with AF42 for a
> while until they crashed a couple routers with it)

Why? We could just switch to stashing all CS4 packets into this queue
and be compliant again with the recommendation to treat packets in each
AFN set equally?

> The relevant lines are here:
>
> https://github.com/dtaht/ceropackages-3.10/blob/master/net/sqm-scripts/files/usr/lib/sqm/functions.sh#L411
>
> 1b) The cake code presently does it pretty wrong, which is eminently
> fixable.
>
> 1c) And given that the standards are settling, it might be time to
> start baking them into a new tc or iptables filter. This would be a
> small, interesting project for someone who wants to get their feet wet
> writing this sort of thing, and examples abound of how to do it.

So what I plan on doing until the end of the year is getting the hashed
tc filter set up for SQM; then implementing/testing different mappings
will be a piece of cake: just change 64 values and you are done...

> 2) A lot of these diffserv specs - notably all the AFxx codepoints -
> are all about variable drop probability. (Not that this concept has
> been proven to work in the real world) We don't do variable drop
> probability... and I haven't the slightest clue as to how to do it in
> fq_codel. But keeping variable diffserv codepoints in order on the
> same 5 tuple seems to be the way things are going.
> Still I have trouble folding these ideas into the 3 basic queue system
> fq_codel uses, it looks to me as most of the AF codepoints end up in
> the current best effort queue, as the priority queue is limited to 30%
> of the bandwidth by default.

Is this really relevant for the wider internet at all? As you argue
below (and as is argued in the drafts cited above), each network can do
whatever it likes with code points, so the relevant question, as I see
it, is not what we could do with the code points if we had all 6 bits
for us end to end, but rather how many and which bits actually survive
a trip over the open internet ;)

> 3) Squashing inbound dscp should still be the default option…

My interpretation of
http://tools.ietf.org/html/draft-ietf-dart-dscp-rtp-10 section 3.2's
"When DiffServ is used, the edge or boundary nodes of a network are
responsible for ensuring that all traffic entering that network
conforms to that network's policies for DSCP and PHB usage, and such
nodes may change DSCP markings on traffic to achieve that result." is
that anything goes, including remapping to all zeros, aka squashing.
http://tools.ietf.org/html/rfc2474 talks about a MUST to put CS6 and
CS7 into a higher priority class than CS0, but I really doubt that any
ISP will allow me to label all my traffic CS7 and will treat it
accordingly, so remapping to zero is okay, if not by standard then by
cause of reality ;).

> 4) My patch set to the wifi code for diffserv support disables the VO
> queue almost entirely in favor of punting things to the VI queue
> (which can aggregate), but I'm not sure if I handled AFxx
> appropriately.
>
> 5) So far as I know, no browser implements any of this stuff yet. So
> far as I know nobody actually deployed a router that tries to do smart
> things with this stuff yet.

I would love to know whether the proposed markings actually survive a
trip through the open internet at all.
I would like to argue that until that actually happens this is a nicely
academic discussion (cerowrt does a fine job already with its nice
fq_codel hierarchy, and if all the new fancy stuff will be wiped
directly by my ISP, I am not sure that implementing the proposal in SQM
is going to change anything, especially nothing that can be measured.
As they say, "measurement data or it did not happen" ;) )

> 6) I really wish there were more codepoints for background traffic
> than cs1.

But isn't that what AF1x is all about? I agree the range of the 6 DS
bits is not used to its fullest extent: rather than bits, treat it as a
number and do: current priority = (DS - 32), so we have a range from
-32 to 31 (or so), and simply require that higher values are not
treated with less priority than smaller numbers. (Heck, maybe
special-case CS0 to also mean zero for backward/reality compatibility
;) )

But most likely I have just misunderstood the whole issue…

Best Regards
        Sebastian

> --
> Dave Täht
>
> http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
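
For illustration only, here is a minimal sketch of the "one lookup per
packet" idea discussed above: a 64-entry table indexed by DSCP, filled
from the class selector bits alone so that all AFNx code points of one
class land in the same queue. The queue names and the policy in
CS_POLICY are assumptions made up for this example; they are not taken
from sqm-scripts, and this is plain Python, not a tc or iptables filter.

```python
# Hypothetical queue names for a three-tier setup (not from sqm-scripts).
PRIORITY = "priority"
BEST_EFFORT = "best_effort"
BACKGROUND = "background"

# Assumed example policy, one entry per class selector CS0..CS7.
# Changing the mapping means editing only this list.
CS_POLICY = [
    BEST_EFFORT,  # CS0: default
    BACKGROUND,   # CS1: scavenger, also covers AF11..AF13
    PRIORITY,     # CS2
    PRIORITY,     # CS3
    PRIORITY,     # CS4: also covers AF41..AF43
    PRIORITY,     # CS5: also covers EF
    PRIORITY,     # CS6
    PRIORITY,     # CS7
]


def dscp_from_tos(tos):
    """Return the 6-bit DSCP held in the upper bits of the TOS byte."""
    return (tos >> 2) & 0x3F


def class_selector(dscp):
    """Return the class selector index 0..7 (the top three DSCP bits)."""
    return (dscp >> 3) & 0x07


# Expand the 8-entry policy into the full 64-entry DSCP table once,
# so that classifying a packet is a single index operation.
DSCP_TO_QUEUE = [CS_POLICY[class_selector(d)] for d in range(64)]


def queue_for_tos(tos):
    """Look up the queue for a packet given its TOS / Traffic Class byte."""
    return DSCP_TO_QUEUE[dscp_from_tos(tos)]
```

With this table AF41 (DSCP 34) and AF42 (DSCP 36) share a queue, so
packets of one 5-tuple that differ only in drop precedence are not
re-ordered against each other. Trying a different recommendation means
overwriting individual entries of DSCP_TO_QUEUE.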