From: Sebastian Moeller
Date: Tue, 28 Apr 2020 22:33:28 +0200
To: Gorry Fairhurst
Cc: Luca Muscariello, "Holland, Jake", tsvwg IETF list, bloat
Subject: Re: [Bloat] [tsvwg] my backlogged comments on the ECT(1) interim call

> On Apr 28, 2020, at 21:26, Gorry Fairhurst wrote:
>
> This seems all interesting, but isn't this true of any network technology? If I use a UDP app with my own style of CC, I can just take all the capacity if I want.
>
> A solution could be to apply a circuit-breaker/policer in the network to perform admission control, but I don't see the link to L4S. Have I missed something?
Maybe that the purposefully shallow LL queue will make it especially easy to drive into overload-reactive pure dropping mode? The designed-in (lack of) burst tolerance* really amounts to not expecting bursts in the LL queue at all. Disruption will be possible with surprisingly small bursts directed at the LL queue (and since ECT(1) comes without admission control, this will be a target even script kiddies cannot miss), which will make the circuit-breaker/policer interventions a bit of a gamble: if you throttle all traffic, you basically compromise low latency, low loss, and throughput; but if you try to isolate the offenders, you are in FQ territory, and that is verboten for L4S.

But we have discussed the almost naive threat modelling of the L4S RFCs before. To which the response was: a user can always implement queue protection, and (D)DoS is possible even today, so L4S is not making the situation worse (which is a) a low bar to clear, and b) more than can be said about RTT-dependence).

Best Regards
	Sebastian

*) The dual-queue coupled AQM removed PIE's additional burst tolerance mode; I do not claim that that in itself is a problem, or that that mode would buy much more resilience, but it demonstrates a rather haphazard approach to engineering, IMHO. But see Pete's recent message about how L4S copes with bursty traffic.
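To put rough numbers on "surprisingly small" (my own back-of-envelope, assuming the ~1 ms L-queue delay target from the dual-queue drafts; treat the constants as estimates, not gospel):

    # How many full-size packets fit into the LL queue's delay budget?
    MTU = 1514          # bytes on the wire, roughly
    TARGET_S = 1e-3     # assumed ~1 ms L-queue delay target

    for mbps in (10, 100, 1000):
        budget = mbps * 1e6 / 8 * TARGET_S   # bytes queued at the target delay
        print(f"{mbps:5d} Mbit/s: ~{budget/1000:.1f} kB"
              f" = ~{budget/MTU:.0f} full-size packets")

At 10 Mbit/s that budget is less than one full-size packet, at 100 Mbit/s about eight; a very modest unresponsive burst is enough to push the LL queue into its drop regime.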
> Gorry
>
> On 28/04/2020 20:04, Luca Muscariello wrote:
>> Hi Jake,
>>
>> Thanks for the notes. Very useful.
>> The other issue with the meeting was that the virtual mic queue control channel was the WebEx Meeting chat, which does not exist in WebEx Teams. So I had to switch to Meetings and lost some pieces of the discussion.
>>
>> Yes, there might be a terminology difference. "Elastic" traffic is usually used in the sense of bandwidth sharing, not just to describe variable bit rates.
>>
>> The point is that there are incentives to cheat in L4S.
>>
>> There is a priority queue that my application can enter by providing ECT(1) as input.
>> Applications such as on-line meetings will have a relatively low and highly paced rate.
>>
>> This traffic is conformant to the dualQ L queue but is unresponsive to congestion notifications.
>>
>> This is especially true for FEC streams, which could be used to ameliorate media quality in the presence of losses (e.g. Wi-Fi) or increased jitter.
>>
>> That was one more point on why using ECT(1) as input assumes trust, or a blacklist after being caught.
>>
>> In both cases, ECT(1) as input is DoSable.
>>
>> On Tue, Apr 28, 2020 at 7:12 PM Holland, Jake wrote:
>> Hi Luca,
>>
>> To your point about the discussion being difficult to follow: I tried to capture the intent of everyone who commented while taking notes:
>>
>> https://etherpad.ietf.org:9009/p/notes-ietf-interim-2020-tsvwg-03
>>
>> I think this was intended to take the place of a need for everyone to re-send the same points to the list, but of course some of the most crucial points could probably use fleshing out with on-list follow-up.
>>
>> It got a bit rough in places because I was disconnected a few times and had to cut over to a local text file, and I may have failed to correctly understand or summarize some of the comments, so there is a chance I missed something, but I did my best to capture them all.
>>
>> I encourage people to review the comments and check whether they came out more or less correct, and to offer formatting and cleanup suggestions if there is a good way to make them easier to follow.
>>
>> I had timestamps at the beginning of each main point of discussion, with the intent that after the video is published it would be easier to go back and check precisely what was said. It looks like someone has been making cleanup edits that removed the first half of those so far, but my local text file still has most of them, and I can re-insert them if that seems useful.
>>
>> @Luca: during your comments in particular I think there might have been a disruption - I had a "first comment missed, please check video" placeholder, and I may have misunderstood the part about video elasticity. My interpretation at the time was that Stuart was claiming video is elastic in that it adjusts downward to avoid overflowing a loaded link, and that you were claiming it is not elastic in that it will not exceed a maximum rate; I summarized that as perhaps a semantic disagreement, but if you'd like to help clean it up, that would be useful.
>>
>> From this message, it sounds like the key point you were making was that it also will not go below a certain rate, and perhaps that quality can stay relatively good in spite of high network loss?
>>
>> Best regards,
>> Jake
>>
>> From: Luca Muscariello
>> Date: Tuesday, April 28, 2020 at 1:54 AM
>> To: Dave Taht
>> Cc: tsvwg IETF list, bloat
>> Subject: Re: [Bloat] my backlogged comments on the ECT(1) interim call
>>
>> Hi Dave and list members,
>>
>> It was difficult to follow the discussion at the meeting yesterday - who said what, in the first place.
>>
>> There have been a lot of non-technical comments of the form "this solution is better than that one, in my opinion". "Better" has often been used as when evaluating the taste of an ice cream: white chocolate vs. black chocolate. This took a significant amount of time at the meeting; I haven't learned much from that kind of discussion, and I do not think it helped to make much progress.
>>
>> If people can re-make their points on the list, it would help the debate.
>>
>> Another point a few raised is that we have to make a decision as fast as possible. I dismissed that argument entirely. Trading the resilience of the Internet for latency is entirely against the design principles of the Internet architecture itself. Risk analysis is something we should keep in mind when deploying any experiment, and it should be a substantial part of it.
>>
>> Someone claimed that on-line meeting traffic is elastic. This is not true, and I tried to clarify this. These applications (WebEx/Zoom) are low rate: a typical maximum upstream rate is 2 Mbps, and it is not elastic. These applications often have a stand-alone app that does not use the browser WebRTC stack (the stand-alone app typically works better).
>>
>> A client sends upstream one or two video qualities unless the video camera is switched off. In the presence of losses, FEC is used, but the traffic is still non-elastic.
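>> To illustrate "conformant but unresponsive", a toy model (mine, grossly simplified; class and parameter names are illustrative only):
>>
>>     class PacedMediaFlow:
>>         """A paced, fixed-rate media flow with FEC: smooth enough for
>>         the LL queue if it sets ECT(1), yet with no useful rate
>>         response to CE marks."""
>>         def __init__(self, rate_bps=2_000_000, fec_overhead=0.2):
>>             self.rate_bps = rate_bps          # typical meeting upstream cap
>>             self.fec_overhead = fec_overhead  # redundancy added under loss
>>
>>         def on_ce_mark(self):
>>             # A scalable (DCTCP-style) sender would reduce its rate here;
>>             # a fixed-rate conferencing app has nowhere to go, so: nothing.
>>             pass
>>
>>         def send_rate(self, loss_observed: bool) -> float:
>>             # Under loss, FEC makes the rate go *up*, not down.
>>             extra = self.fec_overhead if loss_observed else 0.0
>>             return self.rate_bps * (1 + extra)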
>> Someone claimed (at yesterday's meeting) that fairness is not an issue ("who cares", I heard!). Well, fairness can constitute a differentiation advantage between two companies that are commercializing on-line meeting products. Unless we at the IETF accept "law-of-the-jungle" behaviours from Internet application developers, we should be careful about making such claims. Any opportunity to cheat that brings a business advantage WILL be used.
>>
>> /Luca
>>
>> TL;DR
>>
>> To Dave: you asked several times what Cisco does on latency reduction in network equipment. I tend to be very shy when replying to these questions, as this is not vendor neutral. If the chairs think this is not appropriate for the list, please say so and I'll reply privately only.
>>
>> What I write below can be found in Cisco product data sheets and is not a trade secret. There are very good blog posts explaining the details. Not surprisingly, Cisco implements the state of the art on the topic, and it is totally feasible to do the right thing in software and hardware.
>>
>> Cisco implements AFD (one queue + a flow table), accompanied by a priority queue for flows that have a certain profile in rate and size. The concept is well known and well studied in the literature. AFD is safe and can serve a complex traffic mix well when accompanied by a priority queue. This prio queue should not be confused with a strict priority queue (e.g. EF in DiffServ). There are subtleties related to the DOCSIS shared medium which would take too long to describe here.
>>
>> This is available in Cisco CMTSes for the DOCSIS segment. Bottleneck traffic does not negatively impact non-bottlenecked traffic such as an on-line meeting like the WebEx call we had yesterday. It is safe from a network-neutrality point of view, and no application gets hurt.
>>
>> Cisco also implements AFD+prio in some DC switches, such as the Nexus 9k. There is a blog post by Tom Edsall online that explains pretty well how it works. This includes mechanisms such as pFabric to approximate SRPT (shortest remaining processing time) and minimize flow completion time for many DC workloads. The mix of the two brings FCT minimization AND latency minimization. This is silicon and scales at any speed. For those not familiar with these concepts, please search for the research work of Balaji Prabhakar and Rong Pan at Stanford.
>>
>> Wi-Fi: Cisco does airtime fairness in Aironet, and I think in the Meraki series too. The concept is similar to what is described above, but there are several queues, one per STA. Packets are enqueued into the access-category queue at dequeue time by the airtime packet scheduler.
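>> To make the AFD idea above concrete, a rough sketch (my reading of the published algorithm, not Cisco's implementation; in practice the fair rate is adapted from queue occupancy):
>>
>>     import random
>>
>>     class AFD:
>>         def __init__(self, fair_rate_bps):
>>             self.fair_rate = fair_rate_bps  # driven by queue state in practice
>>             self.flow_rate = {}             # flow id -> EWMA of arrival rate
>>
>>         def update(self, flow, pkt_bits, dt, alpha=0.1):
>>             inst = pkt_bits / dt
>>             prev = self.flow_rate.get(flow, inst)
>>             self.flow_rate[flow] = (1 - alpha) * prev + alpha * inst
>>
>>         def admit(self, flow) -> bool:
>>             r = self.flow_rate.get(flow, 0.0)
>>             if r <= self.fair_rate:
>>                 return True   # within fair share: never drop
>>             # Keep packets with probability fair/r, so the surviving
>>             # rate of an aggressive flow approximates the fair rate.
>>             return random.random() < self.fair_rate / r
>>
>> Low-rate, well-paced flows (the "certain profile in rate and size") bypass this logic and are admitted to the priority queue instead.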
>> On Mon, Apr 27, 2020 at 9:24 PM Dave Taht wrote:
>>
>> It looks like the majority of what I say below is not related to the fate of the "bit". The push to take the bit was strong with this one, and me... can't we deploy more of what we already got in places where it matters?
>>
>> ....
>>
>> so: A) PLEA: From 10 years now, of me working on bufferbloat, working on real end-user and wifi traffic and real networks....
>>
>> I would like folk here to stop benchmarking two flows that run for a long time and in one direction only... and thus exclusively in tcp congestion avoidance mode.
>>
>> Please. just. stop. Real traffic looks nothing like that. The internet looks nothing like that. The netops folk I know just roll their eyes at benchmarks like this that prove nothing, and tell me to go to ripe meetings instead. When y'all talk about "not looking foolish for not mandating ecn now", you've already lost that audience with benchmarks like these.
>>
>> Sure, set up background flow(s) like that, but then hit the result with a mix of far more normal traffic? Please? Networks are never used unidirectionally, and both directions congesting is frequent. To illustrate that problem...
>>
>> I have a really robust benchmark that we have used throughout the bufferbloat project that I would like everyone to run in their environments: the flent "rrul" test. Everybody on both sides has big enough testbeds set up that a few hours spent on doing that - and please add in asymmetric networks especially - and perusing the results ought to be enlightening to everyone as to the kind of problems real people have on real networks.
>>
>> Can the L4S and SCE folk run the rrul test some day soon? Please?
>>
>> I rather liked this benchmark that tested another traffic mix:
>>
>> https://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf
>>
>> Although it had many flaws (like not doing dns lookups), I wish it could be dusted off and used to compare this new-fangled ecn-enabled stuff with the kind of results you can get merely with packet loss and rtt awareness. It would be so great to be able to directly compare all these new algorithms against this benchmark.
>>
>> Adding in a non-ecn'd udp-based routing protocol on a heavily oversubscribed 100mbit link is also enlightening.
>>
>> I'd rather like to see that benchmark improved for a more modern home traffic mix, where it is projected there may be 30 devices on the network, on average, in a few years.
>>
>> If there is any one thing y'all can do to reduce my blood pressure and keep me engaged here whilst you debate the end of the internet as I understand it, it would be to run the rrul test as part of all your benchmarks.
>>
>> thank you.
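>> (For anyone who hasn't run it: a typical invocation looks like the line below, per the flent docs - the hostname is a placeholder, point it at a netperf server near your link:
>>
>>     flent rrul -p all_scaled -l 60 -H netperf.example.org \
>>           -t "baseline-no-aqm" -o rrul-baseline.png
>>
>> then repeat with whatever qdisc you are evaluating and compare the plots.)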
>> B) Stuart Cheshire regaled us with several anecdotes - one concerning his problems with comcast's 1Gbit/35mbit service being unusable, under load, for videoconferencing. This is true. The overbuffering at the CMTSes still has to be seen to be believed, at all rates. At lower rates it's possible to shape this with another device (which is what the entire SQM deployment does in self-defense, and why cake has a specific docsis ingress mode), but it is cpu intensive and presently requires x86 hardware to do well at rates above 500Mbit.
>>
>> So I wish the CMTS makers (Arris and Cisco) were in this room. Are they?
>>
>> (Stuart, if you'd like a box that can make your comcast link pleasurable under all workloads, whenever you get back to los gatos, I've got a few lying around. Was so happy to get a few ietfers this past week to apply what's off the shelf for end users today. :)
>>
>> C) I am glad bob said that L4S is finally looking at asymmetric networks, and starting to tackle ack-filtering and accecn issues there.
>>
>> But... I would have *started there*. Asymmetric access is the predominant form of all edge technologies.
>>
>> I would love to see flent rrul test results for 1gig/35mbit, 100/10, and 200/10 services in particular (from SCE also!). "Lifeline" service (11/2) would be good to have results on too. It would be especially good to have baseline comparison data from the measured, current deployment of the CMTSes at these rates: first with no queue management in play, then pie on the uplink, then fq_codel on the uplink, then this ecn stuff, and so on.
>>
>> D) The two CPE makers in the room have dismissed both fq and sce as being too difficult to implement. They did say that dualpi was actually implemented in software, not hardware.
>>
>> I would certainly like them to benchmark what they plan to offer in L4S vs what is already available in the edgerouter X, as one low-end example among thousands.
>>
>> I also have to note that, at higher speeds, all the buffering moves into the wifi, and the results are currently ugly. I imagine they are exploring how to fix their wifi stacks also? I wish more folk were using RVR + latency benchmarks like this one:
>>
>> http://flent-newark.bufferbloat.net/~d/Airtime%20based%20queue%20limit%20for%20FQ_CoDel%20in%20wireless%20interface.pdf
>>
>> Same goes for the LTE folk.
>>
>> E) Andrew McGregor mentioned how great it would be for a closeted musician to be able to play in real time with someone across town. That has been my goal for nearly 30 years now!! And although I rather enjoyed his participation in my last talk on the subject (https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/), conflating a need for ecn and l4s signalling for low-latency audio applications with what I actually said in that talk kind of hurt. I achieved "my 2ms fiber-based-guitarist to fiber-based-drummer dream" 4+ years back with fq_codel and diffserv - no ecn required, no changes to the specs, no mandating that packets be undroppable - and would like to rip the opus codec out of that mix one day.
>>
>> F) I agree with jana that changing the definition of RFC3168 to suit the RED algorithm (which is not pi or anything fancy) often present in network switches today, to suit dctcp, works. But you should say "configuring red to have l4s marking style" and document that.
>>
>> Sometimes I try to point out that many switches have a form of DRR in them, and it's helpful to use that in conjunction with whatever diffserv markings you trust in your network - and, as per the Mellanox example below, to segregate two red queues that way.
>>
>> To this day I wish someone would publish how much DCTCP-style signalling they use on a dc network relative to their other traffic.
>>
>> To this day I keep hoping that someone will publish a suitable set of RED parameters for a wide variety of switches and routers - for the most common switches and ethernet chips - for correct DCTCP usage. Mellanox's example (https://community.mellanox.com/s/article/howto-configure-ecn-on-mellanox-ethernet-switches--spectrum-x) is not dctcp specific, and from what I see there is no way to differentiate ECT(0) from ECT(1) in that switch. (?)
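>> As a strawman for the linux end of that wish: a DCTCP-style "mark everything above a shallow step at K" can be approximated with plain tc red by pinning min and max close together and marking with probability 1. The numbers below are illustrative for a 10gige port and untested - measure before trusting them:
>>
>>     tc qdisc add dev eth0 root red limit 400000 min 90000 max 91000 \
>>        avpkt 1500 burst 61 probability 1.0 bandwidth 10gbit ecn
>>
>> min ~= max gives the step; "ecn" marks rather than drops below the hard limit.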
>> I do keep trying to point out the size of the end-user ecn-enabled deployment, starting with the data I have from free.fr. Are we building a network for AIs or for people?
>>
>> G) Jana also made a point about 2 queues "being enough" (I might be mis-remembering the exact point). Mellanox's ethernet chips at 10gig expose 64 hardware queues, and some new intel hardware exposes 2000+. How do these queues work relative to these algorithms?
>>
>> We have generally found hw mq to be far less of a benefit than the manufacturers think, especially as regards lower latency or reduced cpu usage (as cache crossing is a bear). There is a lot of software work left to be done in this area; however, hw mq is needed to match queues to cpus (and tenants).
>>
>> Until sch_pie gained timestamping support recently, the rate estimator did not work correctly in a hw mq environment. Haven't looked over dualpi in this respect.
>>
>> --
>> Make Music, Not War
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-435-0729
>>
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>
> --
> G. Fairhurst, School of Engineering