From: "David P. Reed" <dpreed@deepplum.com>
Date: Wed, 29 Sep 2021 15:34:22 -0400 (EDT)
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Bob Briscoe", "Mohit P. Tahiliani", "ECN-Sane", "Asad Sajjad Ahmed"
Subject: Re: [Ecn-sane] paper idea: praising smaller packets

Jonathan - I pretty much agree with most of what you say here. However, two things:

1) A router that has only one flow at a time traversing it is not a router. It's just a link that runs at memory speed in between two links. A degenerate case.

2) The start of my email - about the fact that each outbound link must be made to clear (with no queued traffic) within a copper or fiber speed circuit of the earth (great circle route) - is my criterion for NOT being 100% utilized. But it's a description that folds latency and capacity into a single measure. It's very close to 100% utilized. (This satisfies your concern about supporting low bit rates at the edges, but in a very different way.)
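(For concreteness, here is a rough back-of-envelope version of that criterion. The numbers are assumptions of mine - a 40,000 km great circle and propagation at roughly 2/3 c in fiber, giving about a 200 ms drain budget - not figures from the earlier email.)

# Sketch of the "clear within one great-circle time" criterion above.
# Assumed numbers: 40,000 km great circle, ~2e8 m/s signal velocity in fiber,
# so the standing queue must be drainable in roughly 200 ms.

GREAT_CIRCLE_M = 40_000_000
V_FIBER_M_PER_S = 2.0e8
BUDGET_S = GREAT_CIRCLE_M / V_FIBER_M_PER_S      # ~0.2 s

def max_standing_queue_bytes(link_rate_bps):
    """Largest backlog the link can still drain within one great-circle time."""
    return link_rate_bps * BUDGET_S / 8.0

if __name__ == "__main__":
    for rate_bps in (10e6, 100e6, 1e9):          # 10 Mb/s, 100 Mb/s, 1 Gb/s
        mb = max_standing_queue_bytes(rate_bps) / 1e6
        print(f"{rate_bps / 1e6:6.0f} Mb/s link: standing queue must stay under ~{mb:.2f} MB")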
The problem with links is that they can NEVER be utilized more than 100%. So utilization is a TERRIBLE metric for thinking about the problem.

I didn't mean this as a weird joke - I'm very serious. Utilization is just the wrong measure. And so is average end-to-end latency - averages are not meaningful in a fat-tailed traffic distribution, no matter how you compute them, and average latency is a very strange characterization anyway, since most paths actually carry no traffic, because each endpoint only uses a small subset of paths.

Once upon a time I thought that all links should be capped at an average utilization of 10% or 50%. But in fact that is a terrible measure too - averages are a bad metric, for the same reason.

Instead, operationally it is OK for a link to be "almost full", as long as the control protocols create openings frequently enough to mitigate latency issues.

(Side note: if you want to understand really deeply why "averages" are a terrible statistic for networking, I recommend Nassim Taleb's book on the pre-asymptotic behavior of random systems and the problem of applying statistical measures to systems that are not "in equilibrium" - https://arxiv.org/abs/2001.10488 . Seriously! It's tough sledding, sound math, and very enlightening. Much of what he says can be translated into the world of real networking and queueing. Sadly, most queueing theory doesn't touch on pre-asymptotic behavior, but instead assumes that the asymptotic behavior of a queueing system characterizes its normal behavior.)

(Some people try to say that network traffic is "fractal", which is actually unreasonable - most protocols behave highly deterministically, and there's no "self-similarity" inherent in end-to-end flow statistics, no power laws, ...)
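(To make the point about averages concrete - this is a toy illustration of mine, not data from any real network: with fat-tailed per-packet delays, the sample mean is set by a handful of extreme values, while percentiles describe what almost every packet actually experiences. The Pareto tail index and sample size below are arbitrary choices.)

import random

def pareto_delays_ms(n, alpha=1.1, scale_ms=1.0):
    """Draw n per-packet delays (ms) from a Pareto(alpha) distribution, minimum = scale_ms."""
    return [scale_ms * random.paretovariate(alpha) for _ in range(n)]

def percentile(samples, p):
    s = sorted(samples)
    return s[int(p / 100.0 * (len(s) - 1))]

if __name__ == "__main__":
    random.seed(7)
    for run in range(5):
        d = pareto_delays_ms(100_000)
        worst = sorted(d, reverse=True)[:100]    # the worst 0.1% of packets
        print(f"run {run}: mean {sum(d) / len(d):6.2f} ms  "
              f"median {percentile(d, 50):5.2f} ms  p99 {percentile(d, 99):6.1f} ms  "
              f"worst 0.1% carry {sum(worst) / sum(d):5.1%} of all delay")
    # Median and p99 stay in a narrow band from run to run; the mean wanders and sits
    # several times above the median, because it is dominated by rare, enormous samples.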
On Wednesday, September 29, 2021 6:36am, "Jonathan Morton" <chromatix99@gmail.com> said:

> > On 29 Sep, 2021, at 1:15 am, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Now, it is important as hell to avoid bullshit research programs that try to
> > "optimize" utilization of link capacity at 100%. Those research programs focus on
> > the absolute wrong measure - a proxy for "network capital cost" that is in fact
> > the wrong measure of any real network operator's cost structure. The cost of media
> > (wires, airtime, ...) is a tiny fraction of most network operations' cost in any
> > real business or institution. We don't optimize highways by maximizing the number
> > of cars on every stretch of highway, for obvious reasons, but also for non-obvious
> > reasons.
>
> I think it is important to distinguish between core/access networks and last-mile
> links. The technical distinction is in the level of statistical multiplexing -
> high in the former, low in the latter. The cost structure to the relevant user is
> also significantly different.
>
> I agree with your analysis when it comes to core/access networks with a high
> degree of statistical multiplexing. These networks should be built with enough
> capacity to service their expected load. When the actual load exceeds installed
> capacity for whatever reason, keeping latency low maintains network stability and,
> with a reasonable AQM, should not result in appreciably reduced goodput in
> practice.
>
> The relevant user's costs are primarily in the hardware installed at each end of
> the link (hence minimising complexity in this high-speed hardware is often seen as
> an important goal), and possibly in the actual volume of traffic transferred, not
> in the raw capacity of the medium. All the same, if the medium were cheap, why
> not just install more of it, rather than spending big on the hardware at each end?
> There's probably a good explanation for this that I'm not quite aware of.
> Perhaps it has to do with capital versus operational costs.
>
> On a last-mile link, the relevant user is a member of the household that the link
> leads to. He is rather likely to be *very* interested in getting the most goodput
> out of the capacity available to him, on those occasions when he happens to have a
> heavy workload for it. He's just bought a game on Steam, for example, and wants
> to minimise the time spent waiting for multiple gigabytes to download before he
> can enjoy his purchase. Assuming his ISP and the Steam CDN have built their
> networks wisely, his last-mile link will be the bottleneck for this task - and
> optimising goodput over it becomes *more* important the lower the link capacity is.
>
> A lot of people, for one reason or another, still have links below 50 Mbps, and
> sometimes *much* less than that. It's worth reminding the gigabit fibre crowd of
> that, once in a while.
>
> But he may not be the only member of the household interested in this particular
> link. My landlord, for example, may commonly have his wife, sister, mother, and
> four children at home at any given time, depending on the time of year. Some of
> the things they wish to do may be latency-sensitive, and they are also likely to
> be annoyed if throughput-sensitive tasks are unreasonably impaired. So the
> goodput of the Steam download is not the only metric of relevance, taken
> holistically. And it is certainly not correct to maximise utilisation of the
> link, as you can "utilise" the link with a whole lot of useless junk, yet make no
> progress whatsoever.
>
> Maximising an overall measure of network power, however, probably *does* make
> sense - in both contexts. The method of doing so is naturally different in each
> context:
>
> 1: In core/access networks, ensuring that demand is always met by capacity
> maximises useful throughput and minimises latency. This is the natural optimum
> for network power.
>
> 2: It is reasonable to assume that installing more capacity has an associated
> cost, which may exert downward pressure on capacity. In core/access networks
> where demand exceeds capacity, throughput is fixed at capacity, and network power
> is maximised by minimising delays. This assumes that no individual traffic's
> throughput is unreasonably impaired, compared to others, in the process; the
> "linear product-based fairness index" can be used to detect this:
>
> https://en.wikipedia.org/wiki/Fairness_measure#:~:text=Product-based%20Fairness%20Indices
>
> 3: In a last-mile link, network power is maximised by maximising the goodput of
> useful applications, ensuring that all applications have a "fair" share of
> available capacity (for some reasonable definition of "fair"), and keeping latency
> as low as reasonably practical while doing so. This is likely to be associated
> with high link utilisation when demand is heavy.
>
> > Operating at fully congested state - or designing TCP to essentially come
> > close to DDoS behaviour on a bottleneck to get a publishable paper - is missing
> > the point.
>
> When writing a statement like that, it's probably important to indicate what a
> "fully congested state" actually means. Some might take it to mean merely 100%
> link utilisation, which could actually be part of an optimal network power
> solution.
> From context, I assume you actually mean that the queues are driven to
> maximum depth and to the point of overflow - or beyond.
>
> - Jonathan Morton
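(A concrete footnote on Jonathan's "network power" framing and the fairness check he points to. This is a toy sketch of mine: the bottleneck is modelled as a textbook M/M/1 queue with made-up numbers, and Jain's fairness index stands in for the linear product-based index he links to. It only illustrates the shape of the tradeoff - power peaks well below saturation - and is not anyone's actual proposal.)

# Network power = throughput / delay. For an M/M/1-style bottleneck it peaks well
# below 100% utilization and collapses as the queue saturates. Jain's fairness
# index (a stand-in for the product-based index cited above) flags the case where
# one flow hogs the link.

def mm1_power(utilization, capacity_pps, service_time_s):
    """Throughput divided by mean time-in-system for an M/M/1 queue."""
    if not 0.0 < utilization < 1.0:
        return 0.0
    throughput = utilization * capacity_pps          # packets/s actually carried
    delay = service_time_s / (1.0 - utilization)     # M/M/1 mean sojourn time
    return throughput / delay

def jain_fairness(per_flow_rates):
    """1.0 = perfectly equal shares; approaches 1/n when one flow takes everything."""
    n = len(per_flow_rates)
    return sum(per_flow_rates) ** 2 / (n * sum(r * r for r in per_flow_rates))

if __name__ == "__main__":
    capacity_pps, service_time_s = 10_000.0, 1e-4    # assumed 10k pkt/s bottleneck
    for rho in (0.3, 0.5, 0.7, 0.9, 0.99):
        print(f"utilization {rho:4.2f}  power {mm1_power(rho, capacity_pps, service_time_s):14.0f}")
    print("fairness, equal shares: ", jain_fairness([5.0, 5.0, 5.0, 5.0]))    # -> 1.0
    print("fairness, one flow hogs:", jain_fairness([17.0, 1.0, 1.0, 1.0]))   # -> ~0.34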