Date: Wed, 17 Jul 2019 18:18:37 -0400 (EDT)
From: "David P. Reed" <dpreed@deepplum.com>
To: "Sebastian Moeller"
Cc: "Bob Briscoe", ecn-sane@lists.bufferbloat.net, "tsvwg IETF list"
Subject: Re: [Ecn-sane] per-flow scheduling

I do want to toss in my personal observations about the "end-to-end argument" related to per-flow scheduling. (Such arguments are, of course, a class of arguments to which my name is attached. Not that I am a judge/jury of such questions...)

A core principle of the Internet design is to move function out of the network, including routers and middleboxes, if those functions

a) can be properly accomplished by the endpoints, and
b) are not relevant to all uses of the Internet transport fabric being used by the ends.

The rationale here has always seemed obvious to me. Like Bob Briscoe suggests, we were very wary of throwing features into the network that would preclude unanticipated future interoperability needs, new applications, and new technology in the infrastructure of the Internet as a whole.

So what are we talking about here (ignoring the fine points of SCE, some of which I think are debatable - especially the focus on TCP alone, since much traffic will likely move away from TCP in the near future)?

A second technical requirement (necessary invariant) of the Internet's transport is that the entire Internet depends on rigorously stopping queueing delay from building up anywhere except at the endpoints, where the ends can manage it. This is absolutely critical, though it is peculiar in that many engineers, especially those who work at the IP layer and below, have a mental model of routing as essentially being about building up queueing delay (in order to manage priority in some trivial way by building up the queue on purpose, apparently).

This second technical requirement cannot be satisfied merely by the endpoints. The reason is that the endpoints cannot know accurately which host-host paths share common queues.

This lack of a way to "cooperate" among independent users of a queue cannot be solved by a purely end-to-end solution. (Well, I suppose some genius might invent a way, but I have not seen one in my 36 years of closely watching the Internet in operation since it went live in 1983.)

So, what the end-to-end argument would tend to do here, in my opinion, is to provide the most minimal mechanism in the devices that are capable of building up a queue, in order to allow all the ends sharing that queue to do their job - which is to stop filling up the queue!

Only the endpoints can prevent filling up queues. And depending on the protocol, they may need to make very different, yet compatible, choices.

This is a question of design at the architectural level. And the future matters.

So there is an end-to-end argument to be made here, but it is a subtle one.

The basic mechanism for controlling queue depth has been, and remains, quite simple: dropping packets. This has two impacts: 1) immediately reducing queueing delay, and 2) signalling to endpoints that are paying attention that they have contributed to an overfull queue.

The optimum queueing delay in a steady state would always be one packet or less. Kleinrock has shown this in the last few years. Of course there aren't steady states. But we don't want a mechanism that can't converge to that steady state *quickly*, for all queues in the network.
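For intuition, here is a tiny numerical sketch in Python - purely illustrative, assuming the textbook M/M/1 queue and the classic "power" metric (power = throughput / delay); mu and rho are just the usual service rate and utilization - showing why the sweet spot sits at about one packet in the system:

    # Toy illustration of the "power" metric, power = throughput / delay,
    # for an M/M/1 queue with service rate mu.  Power peaks at utilization
    # rho = 0.5, where the mean number of packets in the system is exactly
    # 1 - the "one packet or less" operating point.
    mu = 1.0  # service rate in packets per unit time (arbitrary units)

    def power(rho):
        throughput = rho * mu                 # stable queue: throughput = arrival rate
        delay = 1.0 / (mu * (1.0 - rho))      # mean time in system for M/M/1
        return throughput / delay

    best_rho = max((i / 1000.0 for i in range(1, 1000)), key=power)
    mean_packets_in_system = best_rho / (1.0 - best_rho)
    print(f"power peaks at rho = {best_rho:.3f}, "
          f"mean packets in system = {mean_packets_in_system:.2f}")
    # -> power peaks at rho = 0.500, mean packets in system = 1.00

Real traffic is nothing like M/M/1, of course, but the shape of the tradeoff is the point: push utilization much beyond that operating point and delay grows far faster than throughput.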
Another issue is that endpoints are not aware of the fact that packets can take multiple paths to any destination. In the future, alternate path choices can be made by routers (when we get smarter routing algorithms based on traffic engineering).

So again, some minimal kind of information must be exposed to endpoints that will continue to communicate. Again, the routers must be able to help a wide variety of endpoints with different use cases to decide how to move queue buildup out of the network itself.

Now the decision made by the endpoints must be made in the context of information about fairness. Maybe this is what is not obvious.

The most obvious notion of fairness is equal shares among (source host, destination host) pairs. There are drawbacks to that, but the benefit is that it involves the IP layer alone, and it deals with lots of boundary cases, like the case where a single host opens a zillion TCP connections, or uses lots of UDP source ports or destinations, to somehow "cheat" by appearing to have "lots of flows".

Another way to deal with dividing up flows is to ignore higher-level protocol information entirely and put the flow identification in the IP layer. A 32-bit or 64-bit random number could be added as an "option" to IP to extend the flow space.

But that is not the most important thing today.

I write this to say:

1) Some kind of per-flow queueing, during the transient state where a queue is overloaded before packets are dropped, would provide much-needed information to the ends of every flow sharing a common queue.

2) Per-flow queueing, minimized to a very low level, using IP envelope address information (plus maybe UDP and TCP addresses, for those protocols, in an extended address-based flow definition) is totally compatible with end-to-end arguments, but ONLY if the decisions made are certain to drive queueing delay out of the router to the endpoints. (A rough sketch of the sort of mechanism I mean follows below.)
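To make point 2) concrete, here is a minimal sketch in Python - purely illustrative, not fq_codel or any deployed scheduler; the class name HostPairFQ, the PER_FLOW_LIMIT constant, and the pkt dictionary with "src_ip"/"dst_ip" fields are made up for the example - of per-flow queueing keyed only on the IP envelope, with a deliberately tiny per-flow backlog so that queueing delay is pushed back to the endpoints rather than stored in the router:

    # Illustrative per-flow queueing keyed on the IP envelope only.
    # Each (source host, destination host) pair gets its own tiny queue;
    # anything beyond a couple of packets per pair is dropped (or could be
    # ECN-marked), which is exactly the signal the endpoints need to back off.
    from collections import OrderedDict, deque

    PER_FLOW_LIMIT = 2  # packets: keep standing queues near "one packet or less"

    class HostPairFQ:
        def __init__(self):
            self.queues = OrderedDict()  # (src_ip, dst_ip) -> deque of packets

        def flow_key(self, pkt):
            # IP layer only: ports are deliberately ignored, so opening a
            # zillion TCP connections or spraying UDP ports buys nothing.
            return (pkt["src_ip"], pkt["dst_ip"])

        def enqueue(self, pkt):
            q = self.queues.setdefault(self.flow_key(pkt), deque())
            if len(q) >= PER_FLOW_LIMIT:
                return False   # drop: immediate delay relief plus a congestion signal
            q.append(pkt)
            return True

        def dequeue(self):
            if not self.queues:
                return None
            # Serve host pairs round-robin, one packet per turn.
            key, q = next(iter(self.queues.items()))
            pkt = q.popleft()
            if q:
                self.queues.move_to_end(key)   # this pair goes to the back of the round
            else:
                del self.queues[key]
            return pkt

Because ports are ignored, a host that opens a zillion TCP connections or sprays UDP ports gets no more than its host-pair share. Extending the key with transport addresses, or with a random per-flow number carried as an IP option as mentioned above, is a one-line change to flow_key(); the "keep every queue nearly empty" discipline stays the same.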
On Wednesday, July 17, 2019 5:33pm, "Sebastian Moeller" said:

> Dear Bob, dear IETF team,
>
>
>> On Jun 19, 2019, at 16:12, Bob Briscoe wrote:
>>
>> Jake, all,
>>
>> You may not be aware of my long history of concern about how per-flow scheduling
>> within endpoints and networks will limit the Internet in future. I find per-flow
>> scheduling a violation of the e2e principle in such a profound way - the dynamic
>> choice of the spacing between packets - that most people don't even associate it
>> with the e2e principle.
>
>     This does not rhyme well with the L4S stated advantage of allowing packet
> reordering (due to mandating RACK for all L4S tcp endpoints). Because surely
> changing the order of packets messes up the "the dynamic choice of the spacing
> between packets" in a significant way. IMHO it is either L4S is great because it
> will give intermediate hops more leeway to re-order packets, or "a sender's
> packet spacing" is sacred, please make up your mind which it is.
>
>>
>> I detected that you were talking about FQ in a way that might have assumed my
>> concern with it was just about implementation complexity. If you (or anyone
>> watching) is not aware of the architectural concerns with per-flow scheduling, I
>> can enumerate them.
>
>     Please do not hesitate to do so after your deserved holiday, and please state a
> superior alternative.
>
> Best Regards
>     Sebastian
>
>
>>
>> I originally started working on what became L4S to prove that it was possible to
>> separate out reducing queuing delay from throughput scheduling. When Koen and I
>> started working together on this, we discovered we had identical concerns on
>> this.
>>
>>
>> Bob
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/