From: Brian E Carpenter
To: "David P. Reed"
Cc: Luca Muscariello, Sebastian Moeller, "ecn-sane@lists.bufferbloat.net", tsvwg IETF list
Subject: Re: [Ecn-sane] [tsvwg] per-flow scheduling
Date: Sun, 23 Jun 2019 09:10:24 +1200
Message-ID: <85397b28-4a7a-e125-40b9-9cfce574260a@gmail.com>
In-Reply-To: <1561233009.95886420@apps.rackspace.com>
References: <350f8dd5-65d4-d2f3-4d65-784c0379f58c@bobbriscoe.net> <46D1ABD8-715D-44D2-B7A0-12FE2A9263FE@gmx.de> <835b1fb3-e8d5-c58c-e2f8-03d2b886af38@gmail.com> <1561233009.95886420@apps.rackspace.com>
List-Id: Discussion of explicit congestion notification's impact on the Internet

Just three or four small comments:

On 23-Jun-19 07:50, David P. Reed wrote:
> Two points:
>
> - Jerry Saltzer and I were the primary authors of the End-to-end argument paper, and the motivation was based on *my* work on the original TCP and IP protocols. Dave Clark got involved significantly later than all those decisions, which were basically complete when he got involved. (Jerry was my thesis supervisor, I was his student, and I operated largely independently, taking input from various others at MIT).
> I mention this because Dave understands the end-to-end arguments, but he understands (as we all did) that it was a design *principle* and not a perfectly strict rule. That said, it's a rule that has a strong foundational argument from modularity and evolvability in a context where the system has to work on a wide range of infrastructures (not all knowable in advance) and support a wide range of usage/application-areas (not all knowable in advance). Treating the paper as if it were "DDC" declaring a law is just wrong. He wasn't Moses and it is not written on tablets. Dave did have some "power" in his role of trying to achieve interoperability across diverse implementations. But his focus was primarily on interoperability, not other things. So ideas in the IP protocol like "TOS" which were largely placeholders for not-completely-worked-out concepts deferred to the future were left till later.

Yes, well understood, but he was in fact the link between the e2e paper and the differentiated services work. Although not a nominal author of the "two-bit" RFC, he was heavily involved in it, which is why I mentioned him. And he was very active in the IETF diffserv WG.

> - It is clear (at least to me) that from the point of view of the source of an IP datagram, the "handling" of that datagram within the network of networks can vary, and so that is why there is a TOS field - to specify an interoperable, meaningfully described per-packet indicator of differential handling. In regards to the end-to-end argument, that handling choice is a network function, *to the extent that it can completely be implemented in the network itself*.
>
> Congestion management, however, is not achievable entirely and only within the network. That's completely obvious: congestion happens when the source-destination flows exceed the capacity of the network of networks to satisfy all demands.
>
> The network can only implement *certain* general kinds of mechanisms that may be used by the endpoints to resolve congestion:
>
> 1) admission controls. These are implemented at the interface between the source entity and the network of networks. They tend to be impractical in the Internet context, because there is, by a fundamental and irreversible design choice made by Cerf and Kahn (and the rest of us), no central controller of the entire network of networks. This is to make evolvability and scalability work. 5G (not an Internet system) implies a central controller, as does SNA, LTE, and many other networks. The Internet is an overlay on top of such networks.
>
> 2) signalling congestion to the endpoints, which will respond by slowing their transmission rate (or explicitly re-routing transmission, or compressing their content) through the network to match capacity. This response is done *above* the IP layer, and has proven very practical. The function in the network is reduced to "congestion signalling", in a universally understandable meaningful mechanism: packet drops, ECN, packet-pair separation in arrival time, ... This limited function is essential within the network, because it is the state of the path(s) that is needed to implement the full function at the end points. So congestion signalling, like ECN, is implemented according to the end-to-end argument by carefully defining the network function to be the minimum necessary mechanism so that endpoints can control their rates.
>
> 3) automatic selection of routes for flows. It's perfectly fine to select different routes based on information in the IP header (the part that is intended to be read and understood by the network of networks). Now this is currently *rarely* done, due to the complexity of tracking more detailed routing information at the router level.
> But we had expected that eventually the Internet would be so well connected that there would be diverse routes with diverse capabilities. For example, the "Interplanetary Internet" works with datagrams, that can be implemented with IP, but not using TCP, which requires very low end-to-end latency. Thus, one would expect that TCP would not want any packets transferred over a path via Mars, or for that matter a geosynchronous satellite, even if the throughput would be higher.
>
> So one can imagine that eventually a "TOS" might say "send this packet preferably along a path that has at most 200 ms RTT, *even if that leads to congestion signalling*", while another TOS might say "send this packet over the most 'capacious' set of paths, ignoring RTT entirely". (These are just for illustration, but obviously something like this would work.)
>
> Note that TOS is really aimed at *route selection* preferences, and not queueing management of individual routers.

That may well have been the original intention, but it was hardly mentioned at all in the diffserv WG (which I co-chaired), and "QOS-based routing" was in very bad odour at that time.

> Queueing management to share a single queue on a path for multiple priorities of traffic is not very compatible with "end-to-end arguments". There are any number of reasons why this doesn't work well. I can go into them. Mainly these reasons are why "diffserv" has never been adopted -

Oh, but it has, in lots of local deployments of voice over IP for example. It's what I've taken to calling a limited domain protocol. What has not happened is Internet-wide deployment, because...

> it's NOT interoperable because the diversity of traffic between endpoints is hard to specify in a way that translates into the network mechanisms.
> Of course any queue can be managed in some algorithmic way with parameters, but the endpoints that want to specify an end-to-end goal don't have a way to understand the impact of those parameters on a specific queue that is currently congested.

Yes. And thanks for your insights.

   Brian

>
> Instead, the history of the Internet (and for that matter *all* networks, even Bell's voice systems) has focused on minimizing queueing delay to near zero throughout the network by whatever means it has at the endpoints or in the design. This is why we have AIMD's MD as a response to detection of congestion.
>
> Pragmatic networks (those that operate in the real world) do not choose to operate with shared links in a saturated state. That's known in the phone business as the Mother's Day problem. You want to have enough capacity for the rare near-overload to never result in congestion. Which means that the normal state of the network is very lightly loaded indeed, in order to minimize RTT. Consequently, focusing on somehow trying to optimize the utilization of the network to 100% is just a purely academic exercise. Since "priority" at the packet level within a queue only improves that case, it's just a focus of (bad) Ph.D. theses. (Good Ph.D. theses focus on actual real problems like getting the queues down to 1 packet or less by signalling the endpoints with information that allows them to do their job).
>
> So, in considering what goes in the IP layer, both its header and the mechanics of the network of networks, it is those things that actually have implementable meaning in the network of networks when processing the IP datagram. The rest is "content" because the network of networks doesn't need to see it.
>
> Thus, don't put anything in the IP header that belongs in the "content" part, just being a signal between end points.
> Some information used in the network of networks is also logically carried between endpoints.
>
> On Friday, June 21, 2019 4:37pm, "Brian E Carpenter" said:
>
>> Below...
>> On 21-Jun-19 21:33, Luca Muscariello wrote:
>> > + David Reed, as I'm not sure he's on the ecn-sane list.
>> >
>> > To me, it seems like a very religious position against per-flow queueing.
>> > BTW, I fail to see how this would violate (in a "profound" way) the e2e principle.
>> >
>> > When I read it (the e2e principle)
>> >
>> > Saltzer, J. H., D. P. Reed, and D. D. Clark (1981) "End-to-End Arguments in System Design".
>> > In: Proceedings of the Second International Conference on Distributed Computing Systems. Paris, France.
>> > April 8–10, 1981. IEEE Computer Society, pp. 509-512.
>> > (available on line for free).
>> >
>> > It seems very much like the application of Occam's razor to function placement in communication networks back in the 80s.
>> > I see no conflict between what is written in that paper and per-flow queueing today, even after almost 40 years.
>> >
>> > If that was the case, then all service differentiation techniques would violate the e2e principle in a "profound" way too,
>> > and dualQ too. A policer? A shaper? A priority queue?
>> >
>> > Luca
>>
>> Quoting RFC2638 (the "two-bit" RFC):
>>
>> >>> Both these
>> >>> proposals seek to define a single common mechanism that is used by
>> >>> interior network routers, pushing most of the complexity and state of
>> >>> differentiated services to the network edges.
>>
>> I can't help thinking that if DDC had felt this was against the E2E principle,
>> he would have kicked up a fuss when it was written.
>>
>> Bob's right, however, that there might be a tussle here.
>> If end-points are attempting to pace their packets to suit their own needs, and the network is policing packets to support both service differentiation and fairness, these may well be competing rather than collaborating behaviours. And there probably isn't anything we can do about it by twiddling with algorithms.
>>
>> Brian
>>
>> >
>> > On Fri, Jun 21, 2019 at 9:00 AM Sebastian Moeller wrote:
>> >
>> > > On Jun 19, 2019, at 16:12, Bob Briscoe wrote:
>> > >
>> > > Jake, all,
>> > >
>> > > You may not be aware of my long history of concern about how per-flow scheduling within endpoints and networks will limit the Internet in future. I find per-flow scheduling a violation of the e2e principle in such a profound way - the dynamic choice of the spacing between packets - that most people don't even associate it with the e2e principle.
>> >
>> > Maybe because it is not a violation of the e2e principle at all? My point is that with shared resources between the endpoints, the endpoints simply should have no expectancy that their choice of spacing between packets will be conserved. For the simple reason that it seems generally impossible to guarantee that inter-packet spacing is conserved (think "cross-traffic" at the bottleneck hop along the path and general bunching up of packets in the queue of a fast to slow transition*). I also would claim that the way L4S works (if it works) is to synchronize all active flows at the bottleneck which in turn means each sender has only a very small time window in which to transmit a packet for it to hit its "slot" in the bottleneck L4S scheduler, otherwise, L4S's low queueing delay guarantees will not work. In other words the senders have basically no say in the "spacing between packets", I fail to see how L4S improves upon FQ in that regard.
>> >
>> > IMHO having per-flow fairness as the default seems quite reasonable, endpoints can still throttle flows to their liking. Now per-flow fairness still can be "abused", so by itself it might not be sufficient, but neither is L4S as it has at best stochastic guarantees, as a single queue AQM (let's ignore the RFC3168 part of the AQM) there is the probability to send a throttling signal to a low bandwidth flow (fair enough, it is only a mild throttling signal, but still).
>> > But enough about my opinion, what is the ideal fairness measure in your mind, and what is realistically achievable over the internet?
>> >
>> >
>> > Best Regards
>> >         Sebastian
>> >
>> > >
>> > > I detected that you were talking about FQ in a way that might have assumed my concern with it was just about implementation complexity. If you (or anyone watching) are not aware of the architectural concerns with per-flow scheduling, I can enumerate them.
>> > >
>> > > I originally started working on what became L4S to prove that it was possible to separate out reducing queuing delay from throughput scheduling. When Koen and I started working together on this, we discovered we had identical concerns on this.
>> > >
>> > > Bob
>> > >
>> > > --
>> > > ________________________________________________________________
>> > > Bob Briscoe                               http://bobbriscoe.net/
>> > >
>> > > _______________________________________________
>> > > Ecn-sane mailing list
>> > > Ecn-sane@lists.bufferbloat.net
>> > > https://lists.bufferbloat.net/listinfo/ecn-sane
>> >
>>
>
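
A minimal sketch of the division of labour discussed in this thread, where the network only signals congestion (here via an ECN "Congestion Experienced" mark) and the endpoint responds with AIMD's multiplicative decrease. It is an illustration under stated assumptions, not something posted by any participant: the function names, the starting window, the +1 and x0.5 constants, and the idea that the sender sees one TOS octet per round trip are all hypothetical simplifications (in real TCP the receiver echoes the mark back to the sender). The split of the octet into a 6-bit DSCP and a 2-bit ECN field follows RFC 2474 and RFC 3168.

```python
from typing import List, Tuple

# Illustrative sketch only: names and constants are assumptions,
# not taken from the thread or from any particular TCP implementation.

ECN_MASK = 0b11  # low two bits of the former TOS octet (RFC 3168)
ECN_CE = 0b11    # "Congestion Experienced" codepoint


def split_tos_octet(tos: int) -> Tuple[int, int]:
    """Return (dscp, ecn) from the 8-bit TOS / Traffic Class octet."""
    return tos >> 2, tos & ECN_MASK


def aimd_update(cwnd: float, congestion_signalled: bool,
                increase: float = 1.0, decrease: float = 0.5,
                floor: float = 1.0) -> float:
    """One round trip of AIMD: additive increase, multiplicative decrease."""
    if congestion_signalled:
        # The network only signalled; the endpoint decides how to back off.
        return max(floor, cwnd * decrease)
    return cwnd + increase


def run(tos_per_rtt: List[int], cwnd: float = 10.0) -> List[float]:
    """Feed one observed TOS octet per round trip and record the window."""
    history = []
    for tos in tos_per_rtt:
        _, ecn = split_tos_octet(tos)
        cwnd = aimd_update(cwnd, ecn == ECN_CE)
        history.append(cwnd)
    return history
```

For example, run([0x02, 0x02, 0x03, 0x02]) grows the window from 10 to 12 over two unmarked round trips, halves it to 6 on the CE-marked one, then resumes additive increase.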