From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 22 Jun 2019 18:09:37 -0400 (EDT)
From: "David P. Reed" <dpreed@deepplum.com>
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Brian E Carpenter", "ecn-sane@lists.bufferbloat.net", "tsvwg IETF list"
Message-ID: <1561241377.4026977@apps.rackspace.com>
In-Reply-To: <71EF351D-AFBF-4C92-B6B9-7FD695B68815@gmail.com>
Subject: Re: [Ecn-sane] [tsvwg] per-flow scheduling

Good point, but saturation is not congestion, in my view. The optimal state of a single bottleneck link is a queue of length <= 1, and that can be maintained even under full load by endpoint rate control. (Note that the "Little's Law" result in queueing theory is for a special case of queuing due to uncontrolled Poisson random arrivals. The traffic you refer to, downloading an update, is about as far from Poisson-random as possible: it's neither random nor Poisson.)

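For reference, and in my notation rather than anything quoted from the thread, the relation being invoked is

    L = \lambda W

where L is the average number of packets at the link (queue plus the one in service), \lambda is the average arrival rate, and W is the average time a packet spends there. Holding a fully loaded bottleneck (\lambda close to the service rate \mu) at L <= 1 therefore amounts to holding W near one service time, W ~ 1/\mu, which is exactly the job being handed to endpoint rate control.
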
But I take your point that there will be bottlenecked links that are special, and thus congestion control comes into play on those links.

Per-flow scheduling is appropriate on a shared link. However, the end-to-end argument would suggest that the network not try to divine which flows get preferred.

And beyond the end-to-end argument, there's a practical problem: since the ideal state of a shared link is to have no local backlog in the queue, the information needed to schedule "fairly" isn't in the queue backlog itself. If there is only one packet, what's to schedule?

In fact, what the ideal queueing discipline would do is send signals to the endpoints that provide information as to what each flow's appropriate share is, and/or how far its current share is from what's fair. A somewhat less optimal discipline to achieve fairness might also drop packets beyond the fair share. Dropping is the best way to signal the endpoint that it is trying to overuse the link.

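As a toy illustration of what "each flow's appropriate share" could mean, here is a minimal max-min fair-share computation over measured per-flow demand rates. It is only a sketch under my own assumptions - the function name, the bits-per-second units and the example numbers are invented for illustration - not a description of any particular router's algorithm.

    def max_min_fair_shares(demands_bps, capacity_bps):
        """Compute max-min fair shares for flows with the given demand rates.

        Flows demanding less than an equal split keep their demand; the
        leftover capacity is re-divided among the remaining, greedier flows.
        """
        shares = {}
        remaining = dict(demands_bps)
        spare = capacity_bps
        while remaining:
            equal_split = spare / len(remaining)
            satisfied = {f: d for f, d in remaining.items() if d <= equal_split}
            if not satisfied:
                for f in remaining:          # everyone wants more: split evenly
                    shares[f] = equal_split
                break
            for f, d in satisfied.items():
                shares[f] = d                # small flows keep what they ask for
                spare -= d
                del remaining[f]
        return shares

    # Example: a 100 Mbit/s link shared by one small flow and two bulk flows.
    print(max_min_fair_shares({"dns": 1e6, "bulk1": 80e6, "bulk2": 80e6}, 100e6))
    # {'dns': 1000000.0, 'bulk1': 49500000.0, 'bulk2': 49500000.0}

A discipline could then mark or drop a flow's packets whenever its measured rate exceeds the share this returns, which is the "drop packets beyond the fair share" option above.
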
Merely re-ordering the packets on a link is just not very effective at achieving fairness.

Now, what to do to determine the fair share if the average number of packets in a saturated link's queue is <= 1?

Well, presumably the flows have definable average rates, so good statistical estimators of the average rate of each flow exist. For example, a table with a moving average of the byte rate of each flow can be maintained, which provides wonderful information about the recent history of capacity used by each flow. This allows signalling of overuse to the endpoints of each flow, either by dropping packets or by some small number of bits (e.g. ECN marking). Underuse can also be signalled.

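A minimal sketch of such a table, purely to make the idea concrete; the flow key, the EWMA time constant and the signalling threshold are my own assumptions here, not a description of any deployed queueing discipline.

    import time
    from collections import defaultdict

    class FlowRateTable:
        """Exponentially weighted moving average of each flow's byte rate."""

        def __init__(self, tau_seconds=0.1):
            self.tau = tau_seconds          # time constant of the moving average
            self.rate = defaultdict(float)  # flow key -> bytes per second
            self.last_seen = {}             # flow key -> time of previous packet

        def update(self, flow, nbytes, now=None):
            now = time.monotonic() if now is None else now
            if flow not in self.last_seen:
                self.last_seen[flow] = now  # first packet: nothing to average yet
                return self.rate[flow]
            dt = max(now - self.last_seen[flow], 1e-6)
            alpha = min(dt / self.tau, 1.0)
            # Decay the old estimate and mix in this packet's instantaneous rate.
            self.rate[flow] = (1 - alpha) * self.rate[flow] + alpha * (nbytes / dt)
            self.last_seen[flow] = now
            return self.rate[flow]

    def should_signal(table, flow, fair_share_bps):
        """Mark (ECN) or drop when a flow's estimated rate exceeds its fair share."""
        return table.rate[flow] * 8 > fair_share_bps

Read the other way - an estimated rate well below the fair share - the same table is the "underuse" signal mentioned above.
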
We know (from pragmatic observation) that flows defined as source-host/dest-host pairs have relatively stable demand over many packets and many RTTs. More specific flows (say, defined using TCP and UDP ports, the "extended header" that maybe should have been included in the IP layer) are less stable in some protocol usages we see.

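Concretely, the two flow definitions being contrasted are just two choices of key for such a table. A sketch with hypothetical packet fields (the dictionary keys are mine, not any particular parser's):

    def host_pair_key(pkt):
        # Coarse flow: all traffic between the same two hosts shares one entry.
        return (pkt["src_ip"], pkt["dst_ip"])

    def five_tuple_key(pkt):
        # Fine-grained flow: one entry per transport connection, roughly what
        # fq_codel hashes by default.
        return (pkt["src_ip"], pkt["dst_ip"], pkt["proto"],
                pkt.get("src_port"), pkt.get("dst_port"))
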
However, what's clear is that something like fq_codel can be created that does not need to accumulate long queues on links to have an effect - obviously queues still result from transient overload, but one wants to avoid requiring queueing to build up in order to balance traffic fairly.

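The piece of CoDel that makes this possible is that it keys its decision on how long packets have been sitting in the queue rather than on how many are queued. Below is a stripped-down sketch of that idea; the 5 ms target and 100 ms interval are CoDel's published defaults, but the control law is deliberately simplified (the real algorithm also spaces out successive drops), so treat it as an illustration rather than the actual fq_codel code.

    import time

    TARGET = 0.005    # 5 ms: tolerable standing queue delay
    INTERVAL = 0.100  # 100 ms: how long the delay must persist before acting

    class SojournMarker:
        """Signal congestion when queue *delay*, not queue length, stays high."""

        def __init__(self):
            self.first_above = None  # when sojourn time first exceeded TARGET

        def on_dequeue(self, enqueue_time, now=None):
            """Return True if this packet should be ECN-marked or dropped."""
            now = time.monotonic() if now is None else now
            sojourn = now - enqueue_time
            if sojourn < TARGET:
                self.first_above = None  # queue is draining fast enough
                return False
            if self.first_above is None:
                self.first_above = now   # tolerate a brief burst
                return False
            return (now - self.first_above) >= INTERVAL
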
Ideally, in a situation where some definition of fairness needs to be implemented among flows competing for a limited resource, the queueing should build up at the source endpoint as much as possible, and yet the point where the conflict occurs may be deep within the network of networks. However, once we start being able to balance loads across many independent paths (rather than assuming that there is exactly one path per flow), the definition of "fairness" becomes FAR more complicated, because the competing flows may not be between a particular source and destination pair - instead the traffic in each flow can be managed across different sets of multipaths.

So the end-to-end approach would suggest moving most of the scheduling back to the endpoints of each flow, with the role of the routers being to extract information about the competing flows that are congesting the network and to forward those signals (via drops or marking) to the endpoints. That's because, in the end-to-end argument as it applies here, the router cannot do the entire function of managing congestion or priority.

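The endpoint half of that division of labor can be sketched as nothing more than a rate controller driven entirely by the signals the network forwards - additive increase while the path looks clear, multiplicative decrease on a drop or an ECN mark. The constants below are illustrative, not a real transport implementation.

    class EndpointRateController:
        """Adjust a flow's sending rate from congestion signals (loss or ECN mark).

        The network only supplies the signal; the endpoint owns the decision of
        how much to send, which is the scheduling that has moved to the edge.
        """

        def __init__(self, initial_bps=1e6, increase_bps=1e5, decrease_factor=0.5):
            self.rate_bps = initial_bps
            self.increase_bps = increase_bps
            self.decrease_factor = decrease_factor

        def on_ack(self, ecn_marked=False, lost=False):
            if ecn_marked or lost:
                self.rate_bps *= self.decrease_factor  # back off: we overused our share
            else:
                self.rate_bps += self.increase_bps     # probe for more of the share
            return self.rate_bps
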
The case of the "uplink" from a residential network is a common, but not general, case. Once a residence has multiple uplinks (say, in a mesh neighborhood network, or just in a complex wireless situation where multiple access points are reachable), the idea of localizing priority or congestion management in one box below the IP layer no longer works well at all.

On Saturday, June 22, 2019 4:47pm, "Jonathan Morton" <chromatix99@gmail.com> said:

> > On 22 Jun, 2019, at 10:50 pm, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Pragmatic networks (those that operate in the real world) do not choose to
> > operate with shared links in a saturated state. That's known in the phone
> > business as the Mother's Day problem. You want to have enough capacity for
> > the rare near-overload to never result in congestion.
>
> This is most likely true for core networks. However, I know of several
> real-world networks and individual links which, in practice, are regularly
> in a saturated and/or congested state.
>
> Indeed, the average Internet consumer's ADSL or VDSL last-mile link becomes
> saturated for a noticeable interval, every time his operating system or game
> vendor releases an update. In my case, I share a 3G/4G tower's airtime with
> whatever variable number of subscribers to the same network happen to be in
> the area on any given day; today, during midsummer weekend, that number is
> considerably inflated compared to normal, and my available link bandwidth is
> substantially impacted as a result, indicating congestion.
>
> I did not see anything in your argument specifically about per-flow
> scheduling for the simple purpose of fairly sharing capacity between flows
> and/or between subscribers, and minimising the impact of elephants on mice.
> Empirical evidence suggests that it makes the network run more smoothly.
> Does anyone have a concrete refutation?
>
> - Jonathan Morton