From: "David P. Reed" <dpreed@deepplum.com>
To: "Brian E Carpenter" <brian.e.carpenter@gmail.com>
Cc: "Luca Muscariello", "Sebastian Moeller" <moeller0@gmx.de>, ecn-sane@lists.bufferbloat.net, tsvwg IETF list
Date: Sat, 22 Jun 2019 18:25:13 -0400 (EDT)
Message-ID: <1561242313.903247453@apps.rackspace.com>
Subject: Re: [Ecn-sane] [tsvwg] per-flow scheduling

Given the complexity of my broader comments, let me be clear that I have no problem with the broad concept of diffserv being compatible with the end-to-end arguments. I was trying to lay out what I think is a useful way to think about these kinds of issues within the Internet context.

Similarly, per-flow scheduling as an end-to-end concept (different flows defined by address pairs being jointly managed as entities) makes great sense, but it's really important to be clear that queue prioritization within a single queue at entry to a bottleneck link is a special-case mechanism, and not a general end-to-end concept at the IP datagram level, given the generality of IP as a network packet transport protocol.
It's really tied closely to routing, which isn't specified in any way by IP, other than "best efforts", a term that has become much better defined over the years (including the notion of dropping rather than storing packets, and the idea that successive IP datagrams should traverse roughly the same path in order to have stable congestion detection, ...).

Per-flow scheduling seems to work quite well in the cases where it applies, transparently below the IP datagram layer (that is, underneath the hourglass neck). IP effectively defines "flows", and it is reasonable to me that "best efforts" as a concept could include some notion of network-wide fairness among flows. Link-level "fairness" isn't a necessary precondition to network-level fairness.

On Saturday, June 22, 2019 5:10pm, "Brian E Carpenter" <brian.e.carpenter@gmail.com> said:

> Just three or four small comments:
>
> On 23-Jun-19 07:50, David P. Reed wrote:
> > Two points:
> >
> > - Jerry Saltzer and I were the primary authors of the end-to-end argument paper, and the motivation was based on *my* work on the original TCP and IP protocols. Dave Clark got involved significantly later than all those decisions, which were basically complete when he got involved. (Jerry was my thesis supervisor, I was his student, and I operated largely independently, taking input from various others at MIT.) I mention this because Dave understands the end-to-end arguments, but he understands (as we all did) that it was a design *principle* and not a perfectly strict rule. That said, it's a rule that has a strong foundational argument from modularity and evolvability in a context where the system has to work on a wide range of infrastructures (not all knowable in advance) and support a wide range of usage/application areas (not all knowable in advance). Treating the paper as if it were "DDC" declaring a law is just wrong.
> > He wasn't Moses and it is not written on tablets. Dave did have some "power" in his role of trying to achieve interoperability across diverse implementations. But his focus was primarily on interoperability, not other things. So ideas in the IP protocol like "TOS", which were largely placeholders for not-completely-worked-out concepts deferred to the future, were left till later.
>
> Yes, well understood, but he was in fact the link between the e2e paper and the differentiated services work. Although not a nominal author of the "two-bit" RFC, he was heavily involved in it, which is why I mentioned him. And he was very active in the IETF diffserv WG.
>
> > - It is clear (at least to me) that from the point of view of the source of an IP datagram, the "handling" of that datagram within the network of networks can vary, and so that is why there is a TOS field - to specify an interoperable, meaningfully described per-packet indicator of differential handling. In regard to the end-to-end argument, that handling choice is a network function, *to the extent that it can completely be implemented in the network itself*.
> >
> > Congestion management, however, is not achievable entirely and only within the network. That's completely obvious: congestion happens when the source-destination flows exceed the capacity of the network of networks to satisfy all demands.
> >
> > The network can only implement *certain* general kinds of mechanisms that may be used by the endpoints to resolve congestion:
> >
> > 1) Admission controls. These are implemented at the interface between the source entity and the network of networks. They tend to be impractical in the Internet context, because there is, by a fundamental and irreversible design choice made by Cerf and Kahn (and the rest of us), no central controller of the entire network of networks. This is to make evolvability and scalability work. 5G (not an Internet system) implies a central controller, as do SNA, LTE, and many other networks. The Internet is an overlay on top of such networks.
> >
> > 2) Signalling congestion to the endpoints, which will respond by slowing their transmission rate (or explicitly re-routing transmission, or compressing their content) through the network to match capacity. This response is done *above* the IP layer, and has proven very practical. The function in the network is reduced to "congestion signalling", in a universally understandable, meaningful mechanism: packet drops, ECN, packet-pair separation in arrival time, ... This limited function is essential within the network, because it is the state of the path(s) that is needed to implement the full function at the end points. So congestion signalling, like ECN, is implemented according to the end-to-end argument by carefully defining the network function to be the minimum necessary mechanism so that endpoints can control their rates.
> >
> > 3) Automatic selection of routes for flows. It's perfectly fine to select different routes based on information in the IP header (the part that is intended to be read and understood by the network of networks). Now this is currently *rarely* done, due to the complexity of tracking more detailed routing information at the router level. But we had expected that eventually the Internet would be so well connected that there would be diverse routes with diverse capabilities.
> > For example, the "Interplanetary Internet" works with datagrams that can be implemented with IP, but not using TCP, which requires very low end-to-end latency. Thus, one would expect that TCP would not want any packets transferred over a path via Mars, or for that matter a geosynchronous satellite, even if the throughput would be higher.
> >
> > So one can imagine that eventually a "TOS" might say "send this packet preferably along a path that has at most 200 ms RTT, *even if that leads to congestion signalling*", while another TOS might say "send this packet over the most capacious set of paths, ignoring RTT entirely". (These are just for illustration, but obviously something like this would work.)
> >
> > Note that TOS is really aimed at *route selection* preferences, and not queueing management of individual routers.
>
> That may well have been the original intention, but it was hardly mentioned at all in the diffserv WG (which I co-chaired), and "QOS-based routing" was in very bad odour at that time.
>
> > Queueing management to share a single queue on a path for multiple priorities of traffic is not very compatible with the end-to-end arguments. There are any number of reasons why this doesn't work well. I can go into them. Mainly these reasons are why "diffserv" has never been adopted -
>
> Oh, but it has, in lots of local deployments of voice over IP, for example. It's what I've taken to calling a limited-domain protocol. What has not happened is Internet-wide deployment, because...
>
> > it's NOT interoperable, because the diversity of traffic between endpoints is hard to specify in a way that translates into the network mechanisms.
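[For concreteness, an endpoint can already express a per-packet handling preference by setting the DS field (the old TOS byte) on a socket. A minimal sketch using the standard `IP_TOS` socket option; the EF code point here is just an example value, not a recommendation.]

```python
# Setting the DS field (formerly TOS) on a socket: how an endpoint asks the
# network for differential handling today. IP_TOS is the standard Linux/BSD
# socket option; the EF code point (46) below is just an example value.
import socket

EF_DSCP = 46              # Expedited Forwarding, commonly used for VoIP
tos_byte = EF_DSCP << 2   # DSCP occupies the upper six bits of the byte

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte)
print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # 184 on Linux
s.close()
```

[Whether any router on the path honours the marking is, of course, exactly the limited-domain deployment question being discussed.]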
> > Of course any queue can be managed in some algorithmic way with parameters, but the endpoints that want to specify an end-to-end goal don't have a way to understand the impact of those parameters on a specific queue that is currently congested.
>
> Yes. And thanks for your insights.
>
> Brian
>
> > Instead, the history of the Internet (and for that matter *all* networks, even Bell's voice systems) has focused on minimizing queueing delay to near zero throughout the network by whatever means it has at the endpoints or in the design. This is why we have AIMD's MD as a response to detection of congestion.
> >
> > Pragmatic networks (those that operate in the real world) do not choose to operate with shared links in a saturated state. That's known in the phone business as the Mother's Day problem. You want to have enough capacity for the rare near-overload to never result in congestion. Which means that the normal state of the network is very lightly loaded indeed, in order to minimize RTT. Consequently, focusing on somehow trying to optimize the utilization of the network to 100% is just a purely academic exercise. Since "priority" at the packet level within a queue only improves that case, it's just a focus of (bad) Ph.D. theses. (Good Ph.D. theses focus on actual real problems, like getting the queues down to 1 packet or less by signalling the endpoints with information that allows them to do their job.)
> >
> > So, in considering what goes in the IP layer, both its header and the mechanics of the network of networks, it is those things that actually have implementable meaning in the network of networks when processing the IP datagram. The rest is "content", because the network of networks doesn't need to see it.
> >
> > Thus, don't put anything in the IP header that belongs in the "content" part, just being a signal between end points. Some information used in the network of networks is also logically carried between endpoints.
> >
> > On Friday, June 21, 2019 4:37pm, "Brian E Carpenter" <brian.e.carpenter@gmail.com> said:
> >
> >> Below...
> >> On 21-Jun-19 21:33, Luca Muscariello wrote:
> >> > + David Reed, as I'm not sure he's on the ecn-sane list.
> >> >
> >> > To me, it seems like a very religious position against per-flow queueing.
> >> > BTW, I fail to see how this would violate (in a "profound" way) the e2e principle.
> >> >
> >> > When I read it (the e2e principle)
> >> >
> >> > Saltzer, J. H., D. P. Reed, and D. D. Clark (1981), "End-to-End Arguments in System Design". In: Proceedings of the Second International Conference on Distributed Computing Systems, Paris, France, April 8–10, 1981. IEEE Computer Society, pp. 509-512. (Available online for free.)
> >> >
> >> > It seems very much like the application of Occam's razor to function placement in communication networks back in the 80s.
> >> > I see no conflict between what is written in that paper and per-flow queueing today, even after almost 40 years.
> >> >
> >> > If that was the case, then all service differentiation techniques would violate the e2e principle in a "profound" way too, and dualQ too. A policer? A shaper? A priority queue?
> >> >
> >> > Luca
> >>
> >> Quoting RFC 2638 (the "two-bit" RFC):
> >>
> >> >>> Both these proposals seek to define a single common mechanism that is used by interior network routers, pushing most of the complexity and state of differentiated services to the network edges.
> >>
> >> I can't help thinking that if DDC had felt this was against the E2E principle, he would have kicked up a fuss when it was written.
> >>
> >> Bob's right, however, that there might be a tussle here. If end-points are attempting to pace their packets to suit their own needs, and the network is policing packets to support both service differentiation and fairness, these may well be competing rather than collaborating behaviours. And there probably isn't anything we can do about it by twiddling with algorithms.
> >>
> >> Brian
> >>
> >> > On Fri, Jun 21, 2019 at 9:00 AM Sebastian Moeller <moeller0@gmx.de> wrote:
> >> >
> >> > > On Jun 19, 2019, at 16:12, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> >> > >
> >> > > Jake, all,
> >> > >
> >> > > You may not be aware of my long history of concern about how per-flow scheduling within endpoints and networks will limit the Internet in future. I find per-flow scheduling a violation of the e2e principle in such a profound way - the dynamic choice of the spacing between packets - that most people don't even associate it with the e2e principle.
> >> >
> >> > Maybe because it is not a violation of the e2e principle at all? My point is that with shared resources between the endpoints, the endpoints simply should have no expectation that their choice of spacing between packets will be conserved. For the simple reason that it seems generally impossible to guarantee that inter-packet spacing is conserved (think "cross-traffic" at the bottleneck hop along the path, and the general bunching up of packets in the queue of a fast-to-slow transition*). I also would claim that the way L4S works (if it works) is to synchronize all active flows at the bottleneck, which in turn means each sender has only a very small time window in which to transmit a packet for it to hit its "slot" in the bottleneck L4S scheduler; otherwise, L4S's low queueing delay guarantees will not work.
> >> > In other words, the senders have basically no say in the "spacing between packets"; I fail to see how L4S improves upon FQ in that regard.
> >> >
> >> > IMHO having per-flow fairness as the default seems quite reasonable; endpoints can still throttle flows to their liking. Now per-flow fairness still can be "abused", so by itself it might not be sufficient, but neither is L4S: as it has at best stochastic guarantees, as a single-queue AQM (let's ignore the RFC 3168 part of the AQM) there is the probability of sending a throttling signal to a low-bandwidth flow (fair enough, it is only a mild throttling signal, but still).
> >> > But enough about my opinion: what is the ideal fairness measure in your mind, and what is realistically achievable over the internet?
> >> >
> >> > Best Regards
> >> > Sebastian
> >> >
> >> > > I detected that you were talking about FQ in a way that might have assumed my concern with it was just about implementation complexity. If you (or anyone watching) is not aware of the architectural concerns with per-flow scheduling, I can enumerate them.
> >> > >
> >> > > I originally started working on what became L4S to prove that it was possible to separate out reducing queuing delay from throughput scheduling. When Koen and I started working together on this, we discovered we had identical concerns on this.
> >> > >
> >> > > Bob
> >> > >
> >> > > --
> >> > > Bob Briscoe
> >> > > http://bobbriscoe.net/
> >> > >
> >> > > _______________________________________________
> >> > > Ecn-sane mailing list
> >> > > Ecn-sane@lists.bufferbloat.net
> >> > > https://lists.bufferbloat.net/listinfo/ecn-sane
> >> >
> >> > _______________________________________________
> >> > Ecn-sane mailing list
> >> > Ecn-sane@lists.bufferbloat.net
> >> > https://lists.bufferbloat.net/listinfo/ecn-sane

Given the complexity o= f my broader comments, let me be clear that I have no problem with the= broad concept of diffserv being compatible with the end-to-end arguments. = I was trying to lay out what I think is a useful way to think about these k= inds of issues within the Internet context.

=0A

&nbs= p;

=0A

Similarly, per-flow scheduling as an end-to-e= nd concept (different flows defined by address pairs being jointly managed = as entities) makes great sense, but it's really important to be clear that = queue prioritization within a single queue at entry to a bottleneck link is= a special case mechanism, and not a general end-to-end concept at the IP d= atagram level, given the generality of IP as a network packet transport pro= tocol. It's really tied closely to routing, which isn't specified in any wa= y by IP, other than "best efforts", a term that has become much more well d= efined over the years (including the notions of dropping rather than storin= g packets, the idea that successive IP datagrams should traverse roughly th= e same path in order to have stable congestion detection, ...).

=0A

 

=0A

Per-flow scheduling seems= to work quite well in the cases where it applies, transparently below the = IP datagram layer (that is, underneath the hourglass neck). IP effectively = defines "flows", and it is reasonable to me that "best efforts" as a concep= t could include some notion of network-wide fairness among flows. Link-leve= l "fairness" isn't a necessary precondition to network level fairness.

= =0A

 

=0A

On Saturday, June= 22, 2019 5:10pm, "Brian E Carpenter" <brian.e.carpenter@gmail.com> s= aid:

=0A
=0A

> Just three or four small comments:
>
> On 23-Jun= -19 07:50, David P. Reed wrote:
> > Two points:
> >> >  
> >
> > - Jerry Saltzer and I we= re the primary authors of the End-to-end argument
> paper, and the = motivation was based *my* work on the original TCP and IP
> protoco= ls. Dave Clark got involved significantly later than all those decisions,> which were basically complete when he got involved. (Jerry was my = thesis
> supervisor, I was his student, and I operated largely inde= pendently, taking input
> from various others at MIT). I mention th= is because Dave understands the
> end-to-end arguments, but he unde= rstands (as we all did) that it was a design
> *principle* and not = a perfectly strict rule. That said, it's a rule that has a
> strong= foundational argument from modularity and evolvability in a context where<= br />> the system has to work on a wide range of infrastructures (not al= l knowable in
> advance) and support a wide range of usage/applicat= ion-areas (not all knowable in
> advance). Treating the paper as if= it were "DDC" declaring a law is just wrong. He
> wasn't Moses and= it is not written on tablets. Dave
> > did have some "power" in= his role of trying to achieve interoperability
> across diverse im= plementations. But his focus was primarily on interoperability,
> n= ot other things. So ideas in the IP protocol like "TOS" which were largely<= br />> placeholders for not-completely-worked-out concepts deferred to t= he future were
> left till later.
>
> Yes, well un= derstood, but he was in fact the link between the e2e paper and the
&g= t; differentiated services work. Although not a nominal author of the "two-= bit" RFC,
> he was heavily involved in it, which is why I mentioned= him. And he was very
> active in the IETF diffserv WG.
> &= gt; - It is clear (at least to me) that from the point of view of the sourc= e of
> an IP datagram, the "handling" of that datagram within the n= etwork of networks can
> vary, and so that is why there is a TOS fi= eld - to specify an interoperable,
> meaningfully described per-pac= ket indicator of differential handling. In regards
> to the end-to-= end argument, that handling choice is a network function, *to the
>= extent that it can completely be implemented in the network itself*.
= > >
> > Congestion management, however, is not achievable = entirely and only within
> the network. That's completely obvious: = congestion happens when the
> source-destination flows exceed the c= apacity of the network of networks to satisfy
> all demands.
&= gt; >
> > The network can only implement *certain* general ki= nds of mechanisms that may
> be used by the endpoints to resolve co= ngestion:
> >
> > 1) admission controls. These are im= plemented at the interface between the
> source entity and the netw= ork of networks. They tend to be impractical in the
> Internet cont= ext, because there is, by a fundamental and irreversible design
> c= hoice made by Cerf and Kahn (and the rest of us), no central controller of = the
> entire network of networks. This is to make evolvability and = scalability work. 5G
> (not an Internet system) implies a central c= ontroller, as does SNA, LTE, and many
> other networks. The Interne= t is an overlay on top of such networks.
> >
> > 2) s= ignalling congestion to the endpoints, which will respond by slowing
&= gt; their transmission rate (or explicitly re-routing transmission, or comp= ressing
> their content) through the network to match capacity. Thi= s response is done
> *above* the IP layer, and has proven very prac= tical. The function in the network
> is reduced to "congestion sign= alling", in a universally understandable meaningful
> mechanism: pa= cket drops, ECN, packet-pair separation in arrival time, ... 
>= ; This limited function is essential within the network, because it is the = state of
> the path(s) that is needed to implement the full functio= n at the end points. So
> congestion signalling, like ECN, is imple= mented according to the end-to-end
> argument by carefully defining= the network function to be the minimum necessary
> mechanism so th= at endpoints can control their rates.
> >
> > 3) auto= matic selection of routes for flows. It's perfectly fine to select
>= ; different routes based on information in the IP header (the part that is = intended
> to be read and understood by the network of networks). N= ow this is currently
> *rarely* done, due to the complexity of trac= king more detailed routing information
> at the router level. But w= e had expected that eventually the Internet would be so
> well conn= ected that there would be diverse routes with diverse capabilities. For
> example, the "Interplanetary Internet" works with datagrams, that ca= n be
> implemented with IP, but not using TCP, which requires very = low end-to-end
> latency. Thus, one would expect that TCP would not= want any packets transferred
> over a path via Mars, or for that m= atter a geosynchronous satellite, even if the
> throughput would be= higher.
> >
> > So one can imagine that eventually a= "TOS" might say - send this packet
> preferably along a path that = has at most 200 ms. RTT, *even if that leads to
> congestion signal= ling*, while another TOS might say "send this path over the most
> = "capacious" set of paths, ignoring RTT entirely. (these are just for illust= ration,
> but obviously something like this woujld work).
>= >
> > Note that TOS is really aimed at *route selection* pre= ferences, and not
> queueing management of individual routers.
>
> That may well have been the original intention, but it was= hardly mentioned at all
> in the diffserv WG (which I co-chaired),= and "QOS-based routing" was in very bad
> odour at that time.
>  
> >
> > Queueing management to share a s= ingle queue on a path for multiple priorities
> of traffic is not v= ery compatible with "end-to-end arguments". There are any
> number = of reasons why this doesn't work well. I can go into them. Mainly these
> reasons are why "diffserv" has never been adopted -
>
= > Oh, but it has, in lots of local deployments of voice over IP for exam= ple. It's
> what I've taken to calling a limited domain protocol. W= hat has not happened is
> Internet-wide deployment, because...
>
> > it's NOT interoperable because the diversity of traff= ic between endpoints is
> hard to specify in a way that translates = into the network mechanisms. Of course
> any queue can be managed i= n some algorithmic way with parameters, but the
> endpoints that wa= nt to specify an end-to-end goal don't have a way to understand
> t= he impact of those parameters on a specific queue that is currently congest= ed.
>
> Yes. And thanks for your insights.
>
> Brian
>
> >
> >  
> ><= br />> > Instead, the history of the Internet (and for that matter *a= ll* networks,
> even Bell's voice systems) has focused on minimizin= g queueing delay to near zero
> throughout the network by whatever = means it has at the endpoints or in the design.
> This is why we ha= ve AIMD's MD as a response to detection of congestion.
> >
= > >  
> >
> > Pragmatic networks (those th= at operate in the real world) do not choose to
> operate with share= d links in a saturated state. That's known in the phone business
> = as the Mother's Day problem. You want to have enough capacity for the rare<= br />> near-overload to never result in congestion.  Which means th= at the normal
> state of the network is very lightly loaded indeed,= in order to minimize RTT.
> Consequently, focusing on somehow tryi= ng to optimize the utilization of the
> network to 100% is just a p= urely academic exercise. Since "priority" at the packet
> level wit= hin a queue only improves that case, it's just a focus of (bad) Ph.D.
= > theses. (Good Ph.D. theses focus on actual real problems like getting = the queues
> down to 1 packet or less by signalling the endpoints w= ith information that allows
> them to do their job).
> >=
> >  
> >
> > So, in considering wha= t goes in the IP layer, both its header and the
> mechanics of the = network of networks, it is those things that actually have
> implem= entable meaning in the network of networks when processing the IP datagram.=
> The rest is "content" because the network of networks doesn't ne= ed to see it.
> >
> >  
> >
>= > Thus, don't put anything in the IP header that belongs in the "conten= t" part,
> just being a signal between end points. Some information= used in the network of
> networks is also logically carried betwee= n endpoints.
> >
> >  
> >
> = >  
> >
> > On Friday, June 21, 2019 4:37pm, = "Brian E Carpenter"
> <brian.e.carpenter@gmail.com> said:
> >
> >> Below...
> >> On 21-Jun-19 21:= 33, Luca Muscariello wrote:
> >> > + David Reed, as I'm no= t sure he's on the ecn-sane list.
> >> >
> >>= ; > To me, it seems like a very religious position against per-flow
> >> queueing. 
> >> > BTW, I fail to see h= ow this would violate (in a "profound" way ) the
> e2e
> &g= t;> principle.
> >> >
> >> > When I re= ad it (the e2e principle)
> >> >
> >> > S= altzer, J. H., D. P. Reed, and D. D. Clark (1981) "End-to-End
> Arg= uments in
> >> System Design". 
> >> > = In: Proceedings of the Second International Conference on
> Distrib= uted
> >> Computing Systems. Paris, France. 
> &= gt;> > April 8=E2=80=9310, 1981. IEEE Computer Society, pp. 509-512.<= br />> >> > (available on line for free).
> >> &g= t;
> >> > It seems very much like the application of the O= ccam's razor to
> function
> >> placement in communic= ation networks back in the 80s.
> >> > I see no conflict b= etween what is written in that paper and per-flow
> queueing
&= gt; >> today, even after almost 40 years.
> >> >
> >> > If that was the case, then all service differentiation= techniques
> would
> >> violate the e2e principle in= a "profound" way too,
> >> > and dualQ too. A policer? A = shaper? A priority queue?
> >> >
> >> > L= uca
> >>
> >> Quoting RFC2638 (the "two-bit" RF= C):
> >>
> >> >>> Both these
>= >> >>> proposals seek to define a single common mechanism t= hat is
> used
> >> by
> >> >>>= interior network routers, pushing most of the complexity and
> sta= te
> >> of
> >> >>> differentiated ser= vices to the network edges.
> >>
> >> I can't h= elp thinking that if DDC had felt this was against the E2E
> princi= ple,
> >> he would have kicked up a fuss when it was written.=
> >>
> >> Bob's right, however, that there might be a tussle here. If end-points are
> >> attempting to pace their packets to suit their own needs, and the network is
> >> policing packets to support both service differentiation and fairness,
> >> these may well be competing rather than collaborating behaviours. And there
> >> probably isn't anything we can do about it by twiddling with algorithms.
>= ; >>
> >> Brian
> >>
> >>
> >> >
&g= t; >> > On Fri, Jun 21, 2019 at 9:00 AM Sebastian Moeller
>= ; <moeller0@gmx.de
> >> <mailto:moeller0@gmx.de>>= wrote:
> >> >
> >> >
> >> = >
> >> > > On Jun 19, 2019, at 16:12, Bob Briscoe &l= t;ietf@bobbriscoe.net
> >> <mailto:ietf@bobbriscoe.net>= > wrote:
> >> > >
> >> > > Jake, all,
> >> > >
> >> > > You may not be aware of my long history of concern about how per-flow
> >> > > scheduling within endpoints and networks will limit the Internet in future.
> >> > > I find per-flow scheduling a violation of the e2e principle in such a
> >> > > profound way - the dynamic choice of the spacing between packets - that
> >> > > most people don't even associate it with the e2e principle.
&g= t; >> >
> >> > Maybe because it is not a violation of the e2e principle at all? My point
> >> > is that with shared resources between the endpoints, the endpoints simply
> >> > should have no expectation that their choice of spacing between packets
> >> > will be conserved, for the simple reason that it seems generally impossible
> >> > to guarantee that inter-packet spacing is conserved (think "cross-traffic"
> >> > at the bottleneck hop along the path, and the general bunching up of packets
> >> > in the queue of a fast-to-slow transition*). I also would claim that the way
> >> > L4S works (if it works) is to synchronize all active flows at the bottleneck,
> >> > which in turn means each sender has only a very small time window in which
> >> > to transmit a packet for it to hit its "slot" in the bottleneck L4S scheduler;
> >> > otherwise, L4S's low queueing delay guarantees will not work. In other words,
> >> > the senders have basically no say in the "spacing between packets", so I fail
> >> > to see how L4S improves upon FQ in that regard.
> >> >
&= gt; >> >
> >> > IMHO, having per-flow fairness as the default seems quite reasonable;
> >> > endpoints can still throttle flows to their liking. Now, per-flow fairness
> >> > can still be "abused", so by itself it might not be sufficient, but neither
> >> > is L4S, as it has at best stochastic guarantees: as a single-queue AQM
> >> > (let's ignore the RFC 3168 part of the AQM) there is some probability of
> >> > sending a throttling signal to a low-bandwidth flow (fair enough, it is
> >> > only a mild throttling signal, but still).
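[Aside: the per-flow fairness being debated here is classically implemented with deficit round robin (DRR), which is also what FQ-AQMs like fq_codel build on. A minimal sketch of the mechanism follows; the flow names, packet sizes, and quantum below are invented for illustration, not taken from any real implementation.]

```python
# Minimal sketch of per-flow scheduling via deficit round robin (DRR).
# Illustrative only: the flows, packet sizes, and quantum are invented.
from collections import deque

def drr_round(queues, deficits, quantum):
    """Serve one DRR round: each backlogged flow earns `quantum` bytes of
    credit, then sends packets while the head packet fits in its credit."""
    sent = []
    for flow, q in queues.items():
        if not q:
            continue
        deficits[flow] += quantum
        while q and q[0] <= deficits[flow]:
            pkt = q.popleft()
            deficits[flow] -= pkt
            sent.append((flow, pkt))
        if not q:
            deficits[flow] = 0  # classic DRR: drop unused credit when a flow empties
    return sent

# Two flows share a bottleneck: bulk 1500-byte packets vs. small 300-byte ones.
queues = {"bulk": deque([1500] * 3), "voip": deque([300] * 3)}
deficits = {"bulk": 0, "voip": 0}
print(drr_round(queues, deficits, quantum=1500))
# -> [('bulk', 1500), ('voip', 300), ('voip', 300), ('voip', 300)]
```

[The point of the sketch: each flow gets roughly `quantum` bytes of service per round regardless of how it paces or sizes its packets, which is exactly the "no say in spacing" property under discussion.]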
> >> > But enough about my opinion: what is the ideal fairness measure in your
> >> > mind, and what is realistically achievable over the internet?
> >> >
> >> > Best Regards
> >> >         Sebastian
> >> >
= > >> >
> >> >
> >> >
>= ; >> > >
> >> > > I detected that you were talking about FQ in a way that might have
> >> > > assumed my concern with it was just about implementation complexity.
> >> > > If you (or anyone watching) are not aware of the architectural concerns
> >> > > with per-flow scheduling, I can enumerate them.
> >> > >
> >> > > I originally started working on what became L4S to prove that it was
> >> > > possible to separate out reducing queuing delay from throughput
> >> > > scheduling. When Koen and I started working together on this, we
> >> > > discovered we had identical concerns.
> >> > >
> >> > >
> >> > >
> >> > > Bob
> >> > >
> >> > >
> >> > > --
> >> > > ________________________________________________________________
> >> > > Bob Briscoe                               http://bobbriscoe.net/
> >> > >
> >> > > _______________________________________________
> >> > > Ecn-sane mailing list
> >> > > Ecn-sane@lists.bufferbloat.net
> >> > > https://lists.bufferbloat.net/listinfo/ecn-sane
> >> >
> >> > _______________________________________________
> >> > Ecn-sane mailing list
> >> > Ecn-sane@lists.bufferbloat.net
> >> > https://lists.bufferbloat.net/listinfo/ecn-sane
> >> >
> >>
> >
>