From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <brian.e.carpenter@gmail.com>
Received: from mail-pg1-x52c.google.com (mail-pg1-x52c.google.com
 [IPv6:2607:f8b0:4864:20::52c])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 274CF3B2A4
 for <ecn-sane@lists.bufferbloat.net>; Sat, 22 Jun 2019 17:10:31 -0400 (EDT)
Received: by mail-pg1-x52c.google.com with SMTP id f25so5012037pgv.10
 for <ecn-sane@lists.bufferbloat.net>; Sat, 22 Jun 2019 14:10:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-language:content-transfer-encoding;
 bh=AZbFg5jSjbobq/FPwP23vQfXtOGg+O3/R3fCZ4jfjIs=;
 b=ipHwpo4tU55+1z+ZELOUS8ojnmuCWaDLzL6F/vnCgU19bevtvumgyXlAwVPxid93Q4
 BYMsRZVa4jYHOyGoV2UP7kUUCpNxd77TG8xG5dxQqxkEH8JYx/7TOupmwrviUuHtbWKU
 tas9tU/aDtcclfhfWIiMKoRQLHe7bzq9rouwpgpT0gtYD5jg5Eesg3WDz7RF013RWMCo
 2K8tY/EwRC32+vdAex0gWxdhlSveStsAkBRu0uuiEjIAxj2f4MgnTxcG+cxY7/s3/GLd
 AubTeAJ9+ks2Pq0e92efPhhhKqUqjzCaLRQhkGBNusly+o23vakWc0g2UXprP+rtoVKn
 eJkw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=AZbFg5jSjbobq/FPwP23vQfXtOGg+O3/R3fCZ4jfjIs=;
 b=k2olal2G2PWHqyAl7nUjkPIXdaTq8Y7126H8v8zBcfXG6EuOpFSdAdlfvrLE6WTLP6
 d1vOpCS4wbLWP+tpOcNX/PQf++GWMC3y0a0csvsSl/5HEn7Yg36HOSIqv1Kc3LDx1i6r
 9Ir4MexNUrhl7L6jKWI+ivjPe3LwN3ShqpzmtFz+JRnuy3gAr5mQTsL34LfTOxenL82g
 6tpz7YNy3bcvPC9wI/5xfDcPFWOmtdgE6LaU8ZgCudeGzAq6w5cgnYsf79sh6hutn1oI
 LGCMrN4q71Jxw77p84OcUUAZ99fsvEdgTTfZnesdIZWMilTz0jhbXbZFyq35zTpbx/A/
 /IwA==
X-Gm-Message-State: APjAAAUidBhA7IQJL+YEKkVHF+a6j/+QFCBJMV8jWKyWq288Hr6l6DMr
 Uty+QYQl4lDY9PXa0TdNB3o=
X-Google-Smtp-Source: APXvYqx7VrQswkJCGYQ7V4S8zAwMqKyEbuzT0O0q23pA7g58GFvrxfg68nak8kpbDGnIFWGakKhHdw==
X-Received: by 2002:a17:90a:af8e:: with SMTP id
 w14mr15282371pjq.89.1561237830152; 
 Sat, 22 Jun 2019 14:10:30 -0700 (PDT)
Received: from [192.168.178.30] (32.23.255.123.dynamic.snap.net.nz.
 [123.255.23.32])
 by smtp.gmail.com with ESMTPSA id d4sm5706424pju.19.2019.06.22.14.10.26
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sat, 22 Jun 2019 14:10:29 -0700 (PDT)
To: "David P. Reed" <dpreed@deepplum.com>
Cc: Luca Muscariello <luca.muscariello@gmail.com>,
 Sebastian Moeller <moeller0@gmx.de>,
 "ecn-sane@lists.bufferbloat.net" <ecn-sane@lists.bufferbloat.net>,
 tsvwg IETF list <tsvwg@ietf.org>
References: <350f8dd5-65d4-d2f3-4d65-784c0379f58c@bobbriscoe.net>
 <46D1ABD8-715D-44D2-B7A0-12FE2A9263FE@gmx.de>
 <CAHx=1M4+sJBEe-wqCyuVyy=oDz7A+SG_ZxBbu_ZZDZiCHrX2uw@mail.gmail.com>
 <835b1fb3-e8d5-c58c-e2f8-03d2b886af38@gmail.com>
 <1561233009.95886420@apps.rackspace.com>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Message-ID: <85397b28-4a7a-e125-40b9-9cfce574260a@gmail.com>
Date: Sun, 23 Jun 2019 09:10:24 +1200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101
 Thunderbird/60.7.2
MIME-Version: 1.0
In-Reply-To: <1561233009.95886420@apps.rackspace.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
X-Mailman-Approved-At: Sat, 22 Jun 2019 21:12:25 -0400
Subject: Re: [Ecn-sane] [tsvwg]  per-flow scheduling
X-BeenThere: ecn-sane@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Discussion of explicit congestion notification's impact on the
 Internet <ecn-sane.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/ecn-sane>,
 <mailto:ecn-sane-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/ecn-sane>
List-Post: <mailto:ecn-sane@lists.bufferbloat.net>
List-Help: <mailto:ecn-sane-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/ecn-sane>,
 <mailto:ecn-sane-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Sat, 22 Jun 2019 21:10:31 -0000

Just three or four small comments:

On 23-Jun-19 07:50, David P. Reed wrote:
> Two points:
>=20
> =C2=A0
>=20
> - Jerry Saltzer and I were the primary authors of the End-to-end argume=
nt paper, and the motivation was based on *my* work on the original TCP =
and IP protocols. Dave Clark got involved significantly later than all =
those =
decisions, which were basically complete when he got involved. (Jerry was=
 my thesis supervisor, I was his student, and I operated largely independ=
ently, taking input from various others at MIT). I mention this because D=
ave understands the end-to-end arguments, but he understands (as we all d=
id) that it was a design *principle* and not a perfectly strict rule. Tha=
t said, it's a rule that has a strong foundational argument from modulari=
ty and evolvability in a context where the system has to work on a wide r=
ange of infrastructures (not all knowable in advance) and support a wide =
range of usage/application-areas (not all knowable in advance). Treating =
the paper as if it were "DDC" declaring a law is just wrong. He wasn't Mo=
ses and it is not written on tablets. Dave
> did have some "power" in his role of trying to achieve interoperability=
 across diverse implementations. But his focus was primarily on interoper=
ability, not other things. So ideas in the IP protocol like "TOS" which w=
ere largely placeholders for not-completely-worked-out concepts deferred =
to the future were left till later.

Yes, well understood, but he was in fact the link between the e2e paper a=
nd the differentiated services work. Although not a nominal author of the=
 "two-bit" RFC, he was heavily involved in it, which is why I mentioned h=
im. And he was very active in the IETF diffserv WG.=20
> - It is clear (at least to me) that from the point of view of the sourc=
e of an IP datagram, the "handling" of that datagram within the network o=
f networks can vary, and so that is why there is a TOS field - to specify=
 an interoperable, meaningfully described per-packet indicator of differe=
ntial handling. With regard to the end-to-end argument, that handling choi=
ce is a network function, *to the extent that it can completely be implem=
ented in the network itself*.
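
A minimal sketch of how that per-packet indicator is laid out today,
assuming the RFC 2474 and RFC 3168 redefinition of the old TOS octet
(six DSCP bits followed by two ECN bits). The function names are
illustrative and do not come from this thread.

```python
def split_tos_byte(tos_byte):
    """Split the 8-bit field into a (DSCP, ECN) pair."""
    # Upper six bits: DiffServ codepoint. Lower two bits: ECN field.
    return tos_byte >> 2, tos_byte & 0b11


def join_tos_byte(dscp, ecn):
    """Inverse of split_tos_byte."""
    return ((dscp & 0x3F) << 2) | (ecn & 0b11)


# 0xB8 carries DSCP 46 (Expedited Forwarding) with ECN bits 00.
print(split_tos_byte(0xB8))  # prints (46, 0)
# Same DSCP with both ECN bits set (Congestion Experienced).
print(hex(join_tos_byte(46, 0b11)))  # prints 0xbb
```

The split is lossless, so a node that rewrites only the ECN bits leaves
the DSCP untouched.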
>=20
> Congestion management, however, is not achievable entirely and only wit=
hin the network. That's completely obvious: congestion happens when the s=
ource-destination flows exceed the capacity of the network of networks to=
 satisfy all demands.
>=20
> The network can only implement *certain* general kinds of mechanisms th=
at may be used by the endpoints to resolve congestion:
>=20
> 1) admission controls. These are implemented at the interface between t=
he source entity and the network of networks. They tend to be impractical=
 in the Internet context, because there is, by a fundamental and irrevers=
ible design choice made by Cerf and Kahn (and the rest of us), no central=
 controller of the entire network of networks. This is to make evolvabili=
ty and scalability work. 5G (not an Internet system) implies a central co=
ntroller, as do SNA, LTE, and many other networks. The Internet is an o=
verlay on top of such networks.
>=20
> 2) signalling congestion to the endpoints, which will respond by slowin=
g their transmission rate (or explicitly re-routing transmission, or comp=
ressing their content) through the network to match capacity. This respon=
se is done *above* the IP layer, and has proven very practical. The funct=
ion in the network is reduced to "congestion signalling", in a universall=
y understandable meaningful mechanism: packet drops, ECN, packet-pair sep=
aration in arrival time, ...=C2=A0 This limited function is essential wit=
hin the network, because it is the state of the path(s) that is needed to=
 implement the full function at the end points. So congestion signalling,=
 like ECN, is implemented according to the end-to-end argument by careful=
ly defining the network function to be the minimum necessary mechanism so=
 that endpoints can control their rates.
>=20
> 3) automatic selection of routes for flows. It's perfectly fine to sele=
ct different routes based on information in the IP header (the part that =
is intended to be read and understood by the network of networks). Now th=
is is currently *rarely* done, due to the complexity of tracking more det=
ailed routing information at the router level. But we had expected that e=
ventually the Internet would be so well connected that there would be div=
erse routes with diverse capabilities. For example, the "Interplanetary I=
nternet" works with datagrams, that can be implemented with IP, but not u=
sing TCP, which requires very low end-to-end latency. Thus, one would exp=
ect that TCP would not want any packets transferred over a path via Mars,=
 or for that matter a geosynchronous satellite, even if the throughput wo=
uld be higher.
>=20
> So one can imagine that eventually a "TOS" might say - send this packet=
 preferably along a path that has at most 200 ms. RTT, *even if that lead=
s to congestion signalling*, while another TOS might say "send this packet =
over the most "capacious" set of paths, ignoring RTT entirely." (these are=
 just for illustration, but obviously something like this would work).
>=20
> Note that TOS is really aimed at *route selection* preferences, and not=
 queueing management of individual routers.

That may well have been the original intention, but it was hardly mention=
ed at all in the diffserv WG (which I co-chaired), and "QOS-based routing=
" was in very bad odour at that time.
 =C2=A0
>=20
> Queueing management to share a single queue on a path for multiple prio=
rities of traffic is not very compatible with "end-to-end arguments". The=
re are any number of reasons why this doesn't work well. I can go into th=
em. Mainly these reasons are why "diffserv" has never been adopted -=20

Oh, but it has, in lots of local deployments of voice over IP for example=
=2E It's what I've taken to calling a limited domain protocol. What has n=
ot happened is Internet-wide deployment, because...

> it's NOT interoperable because the diversity of traffic between endpoin=
ts is hard to specify in a way that translates into the network mechanism=
s. Of course any queue can be managed in some algorithmic way with parame=
ters, but the endpoints that want to specify an end-to-end goal don't hav=
e a way to understand the impact of those parameters on a specific queue =
that is currently congested.

Yes. And thanks for your insights.

   Brian

>=20
> =C2=A0
>=20
> Instead, the history of the Internet (and for that matter *all* network=
s, even Bell's voice systems) has focused on minimizing queueing delay to=
 near zero throughout the network by whatever means it has at the endpoin=
ts or in the design. This is why we have AIMD's MD as a response to detec=
tion of congestion.
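
A minimal sketch of the AIMD response mentioned above, assuming one
congestion signal per round trip and a window counted in packets. The
names and the 1.0 / 0.5 parameters are illustrative assumptions, not
taken from this thread.

```python
from functools import reduce


def aimd_step(cwnd, congested, increase, decrease):
    """Return the next congestion window, in packets."""
    if congested:
        # Multiplicative decrease, never dropping below one packet.
        return max(1.0, cwnd * decrease)
    # Additive increase while no congestion signal is seen.
    return cwnd + increase


def aimd_run(cwnd, signals, increase, decrease):
    """Fold per-RTT congestion signals into a final window."""
    return reduce(
        lambda w, hit: aimd_step(w, hit, increase, decrease),
        signals,
        cwnd,
    )


# Three quiet round trips, one congestion signal, one quiet round trip.
# Window goes 10 -> 11 -> 12 -> 13 -> 6.5 -> 7.5, so this prints 7.5.
print(aimd_run(10.0, [False, False, False, True, False], 1.0, 0.5))
```

The network's only role here is to supply the signal (a drop or an ECN
mark); the rate decision stays in the endpoint.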
>=20
> =C2=A0
>=20
> Pragmatic networks (those that operate in the real world) do not choose=
 to operate with shared links in a saturated state. That's known in the p=
hone business as the Mother's Day problem. You want to have enough capaci=
ty for the rare near-overload to never result in congestion.=C2=A0 Which =
means that the normal state of the network is very lightly loaded indeed,=
 in order to minimize RTT. Consequently, focusing on somehow trying to op=
timize the utilization of the network to 100% is just a purely academic e=
xercise. Since "priority" at the packet level within a queue only improve=
s that case, it's just a focus of (bad) Ph.D. theses. (Good Ph.D. theses =
focus on actual real problems like getting the queues down to 1 packet or=
 less by signalling the endpoints with information that allows them to do=
 their job).
>=20
> =C2=A0
>=20
> So, in considering what goes in the IP layer, both its header and the m=
echanics of the network of networks, it is those things that actually hav=
e implementable meaning in the network of networks when processing the IP=
 datagram. The rest is "content" because the network of networks doesn't =
need to see it.
>=20
> =C2=A0
>=20
> Thus, don't put anything in the IP header that belongs in the "content"=
 part, just being a signal between end points. Some information used in t=
he network of networks is also logically carried between endpoints.
>=20
> =C2=A0
>=20
> =C2=A0
>=20
> On Friday, June 21, 2019 4:37pm, "Brian E Carpenter" <brian.e.carpenter=
@gmail.com> said:
>=20
>> Below...
>> On 21-Jun-19 21:33, Luca Muscariello wrote:
>> > + David Reed, as I'm not sure he's on the ecn-sane list.
>> >
>> > To me, it seems like a very religious position against per-flow
>> queueing.=C2=A0
>> > BTW, I fail to see how this would violate (in a "profound" way ) the=
 e2e
>> principle.
>> >
>> > When I read it (the e2e principle)
>> >
>> > Saltzer, J. H., D. P. Reed, and D. D. Clark (1981) "End-to-End Argum=
ents in
>> System Design".=C2=A0
>> > In: Proceedings of the Second International Conference on Distribute=
d
>> Computing Systems. Paris, France.=C2=A0
>> > April 8=E2=80=9310, 1981. IEEE Computer Society, pp. 509-512.
>> > (available on line for free).
>> >
>> > It seems very much like the application of Occam's razor to func=
tion
>> placement in communication networks back in the 80s.
>> > I see no conflict between what is written in that paper and per-flow=
 queueing
>> today, even after almost 40 years.
>> >
>> > If that was the case, then all service differentiation techniques wo=
uld
>> violate the e2e principle in a "profound" way too,
>> > and dualQ too. A policer? A shaper? A priority queue?
>> >
>> > Luca
>>
>> Quoting RFC2638 (the "two-bit" RFC):
>>
>> >>> Both these
>> >>> proposals seek to define a single common mechanism that is used
>> by
>> >>> interior network routers, pushing most of the complexity and state=

>> of
>> >>> differentiated services to the network edges.
>>
>> I can't help thinking that if DDC had felt this was against the E2E pr=
inciple,
>> he would have kicked up a fuss when it was written.
>>
>> Bob's right, however, that there might be a tussle here. If end-points=
 are
>> attempting to pace their packets to suit their own needs, and the netw=
ork is
>> policing packets to support both service differentiation and fairness,=

>> these may well be competing rather than collaborating behaviours. And =
there
>> probably isn't anything we can do about it by twiddling with algorithm=
s.
>>
>> Brian
>>
>> >
>> > On Fri, Jun 21, 2019 at 9:00 AM Sebastian Moeller <moeller0@gmx.de
>> <mailto:moeller0@gmx.de>> wrote:
>> >
>> >
>> >
>> > > On Jun 19, 2019, at 16:12, Bob Briscoe <ietf@bobbriscoe.net
>> <mailto:ietf@bobbriscoe.net>> wrote:
>> > >
>> > > Jake, all,
>> > >
>> > > You may not be aware of my long history of concern about how
>> per-flow scheduling within endpoints and networks will limit the Inter=
net in
>> future. I find per-flow scheduling a violation of the e2e principle in=
 such a
>> profound way - the dynamic choice of the spacing between packets - tha=
t most
>> people don't even associate it with the e2e principle.
>> >
>> > Maybe because it is not a violation of the e2e principle at all? My =
point
>> is that with shared resources between the endpoints, the endpoints sim=
ply should
>> have no expectation that their choice of spacing between packets will b=
e conserved.
>> For the simple reason that it seems generally impossible to guarantee =
that
>> inter-packet spacing is conserved (think "cross-traffic" at the bottle=
neck hop
>> along the path and general bunching up of packets in the queue of a fa=
st to slow
>> transition*). I also would claim that the way L4S works (if it works) =
is to
>> synchronize all active flows at the bottleneck which in turn means eac=
h sender has
>> only a very small time window in which to transmit a packet for it to h=
it its
>> "slot" in the bottleneck L4S scheduler, otherwise, L4S's low queueing =
delay
>> guarantees will not work. In other words the senders have basically no=
 say in the
>> "spacing between packets", I fail to see how L4S improves upon FQ in t=
hat regard.
>> >
>> >
>> > =C2=A0IMHO having per-flow fairness as the default seems quite
>> reasonable, endpoints can still throttle flows to their liking. Now pe=
r-flow
>> fairness still can be "abused", so by itself it might not be sufficien=
t, but
>> neither is L4S as it has at best stochastic guarantees, as a single qu=
eue AQM
>> (let's ignore the RFC3168 part of the AQM) there is the probability to=
 send a
>> throttling signal to a low bandwidth flow (fair enough, it is only a =
mild
>> throttling signal, but still).
>> > But enough about my opinion, what is the ideal fairness measure in y=
our
>> mind, and what is realistically achievable over the internet?
>> >
>> >
>> > Best Regards
>> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 Sebastian
>> >
>> >
>> >
>> >
>> > >
>> > > I detected that you were talking about FQ in a way that might have=

>> assumed my concern with it was just about implementation complexity. I=
f you (or
>> anyone watching) is not aware of the architectural concerns with per-f=
low
>> scheduling, I can enumerate them.
>> > >
>> > > I originally started working on what became L4S to prove that it w=
as
>> possible to separate out reducing queuing delay from throughput schedu=
ling. When
>> Koen and I started working together on this, we discovered we had iden=
tical
>> concerns on this.
>> > >
>> > >
>> > >
>> > > Bob
>> > >
>> > >
>> > > --
>> > > ________________________________________________________________
>> > > Bob Briscoe=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=

>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0http://bobbrisc=
oe.net/
>> > >
>> > > _______________________________________________
>> > > Ecn-sane mailing list
>> > > Ecn-sane@lists.bufferbloat.net
>> <mailto:Ecn-sane@lists.bufferbloat.net>
>> > > https://lists.bufferbloat.net/listinfo/ecn-sane
>> >
>> >
>>
>>
>=20