Date: Wed, 29 Sep 2021 15:34:22 -0400 (EDT)
From: "David P. Reed"
To: "Jonathan Morton"
Cc: "Bob Briscoe" ,
"Mohit P. Tahiliani" ,
"ECN-Sane" ,
"Asad Sajjad Ahmed"
Subject: Re: [Ecn-sane] paper idea: praising smaller packets

Jonathan - I pretty much agree with most of what you say here. However, two things:

1) A router that has only one flow at a time traversing it is not a router. It's just a link that runs at memory speed in between two links - a degenerate case.

2) The start of my email - about the fact that each outbound link must be made to clear (with no queued traffic) within the time a copper- or fiber-speed signal takes to circle the earth (great circle route) - is my criterion for NOT being 100% utilized. But it's a description that focuses on latency and capacity in a single measure. It's very close to 100% utilized. (This satisfies your concern about supporting low bit rates at the edges, but in a very different way.)
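
To make that clearing criterion concrete, here's a back-of-the-envelope sketch (my own illustration; assumed numbers: ~40,075 km great circle, ~2e8 m/s signal speed in fiber, so roughly 200 ms around the earth):

    # Does a link's queued backlog drain within one "great circle time"?
    # Assumed constants, for illustration only.
    EARTH_GREAT_CIRCLE_M = 40_075_000   # meters
    FIBER_SIGNAL_SPEED = 2.0e8          # meters/second, about 2/3 of c

    def clears_in_great_circle_time(queue_bytes, link_rate_bps):
        drain_time = queue_bytes * 8 / link_rate_bps   # seconds to empty
        return drain_time <= EARTH_GREAT_CIRCLE_M / FIBER_SIGNAL_SPEED  # ~0.2 s

    # 1 MB queued on a 100 Mbit/s link drains in ~80 ms: passes.
    # 10 MB on the same link needs ~800 ms: the link is "too full".
    print(clears_in_great_circle_time(1_000_000, 100e6))    # True
    print(clears_in_great_circle_time(10_000_000, 100e6))   # False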

The problem with links is that they can NEVER be utilized more than 100%. So utilization is a TERRIBLE metric for thinking about the problem.

I didn't mean this as a weird joke - I'm very serious. Utilization is just the wrong measure. And so is average end-to-end latency - averages are not meaningful in a fat-tailed traffic distribution, no matter how you compute them, and average latency is a very strange characterization anyway, since most paths actually carry no traffic because each endpoint only uses a small subset of paths.
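
To illustrate the fat-tail point (a sketch of mine, not a measurement): sample "latencies" from a Pareto distribution with tail index just above 1, where the mean barely exists and is dominated by a few extreme samples:

    import random

    random.seed(1)
    alpha = 1.1   # heavy tail: infinite variance, barely-finite mean
    samples = sorted(random.paretovariate(alpha) for _ in range(100_000))

    mean = sum(samples) / len(samples)
    median = samples[len(samples) // 2]
    p99 = samples[int(0.99 * len(samples))]

    # The mean lands far above the median and wanders across seeds;
    # the median and p99 are comparatively stable summaries.
    print(f"median={median:.2f}  p99={p99:.2f}  mean={mean:.2f}")

Rerun it with a different seed and watch the mean move while the median and p99 stay put.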

Once upon a time I thought that all links should be capped at an average utilization of 10% or 50%. But in fact that is a terrible rule too - averages are a bad metric, for the same reason.

Instead, operationally it is OK for a link to be "almost full", as long as the control protocols create openings frequently enough to mitigate latency issues.

(Side note: If you want to understand really deeply why "averages" are a terrible statistic for networking, I recommend reading Nassim Taleb's book about the pre-asymptotic behavior of random systems and the problem of applying statistical measures to systems that are not "in equilibrium" - https://arxiv.org/abs/2001.10488 . Seriously! It's tough sledding, sound math, and very enlightening. Much of what he says can be translated into the world of real networking and queueing. Sadly, most queueing theory doesn't touch on pre-asymptotic behavior, but instead assumes that the asymptotic behavior of a queueing system characterizes the normal behavior.)

(Some people try to say that network traffic is "fractal", which is actually unreasonable - most protocols behave highly deterministically, and there's no "self-similarity" inherent in end-to-end flow statistics, no power laws, ...)

On Wednesday, September 29, 2021 6:36am, "Jonathan Morton" <chromatix99@gmail.com> said:

> > On 29 Sep, 2021, at 1:15 am, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Now, it is important as hell to avoid bullshit research programs that try to
> > "optimize" utilization of link capacity at 100%. Those research programs focus on
> > the absolute wrong measure - a proxy for "network capital cost" that is in fact
> > the wrong measure of any real network operator's cost structure. The cost of media
> > (wires, airtime, ...) is a tiny fraction of most network operations' cost in any
> > real business or institution. We don't optimize highways by maximizing the number
> > of cars on every stretch of highway, for obvious reasons, but also for non-obvious
> > reasons.
>
> I think it is important to distinguish between core/access networks and last-mile
> links. The technical distinction is in the level of statistical multiplexing -
> high in the former, low in the latter. The cost structure to the relevant user is
> also significantly different.
>
> I agree with your analysis when it comes to core/access networks with a high
> degree of statistical multiplexing. These networks should be built with enough
> capacity to service their expected load. When the actual load exceeds installed
> capacity for whatever reason, keeping latency low maintains network stability and,
> with a reasonable AQM, should not result in appreciably reduced goodput in
> practice.
>
> The relevant user's costs are primarily in the hardware installed at each end of
> the link (hence minimising complexity in this high-speed hardware is often seen as
> an important goal), and possibly in the actual volume of traffic transferred, not
> in the raw capacity of the medium. All the same, if the medium were cheap, why
> not just install more of it, rather than spending big on the hardware at each end?
> There's probably a good explanation for this that I'm not quite aware of.
> Perhaps it has to do with capital versus operational costs.
>
> On a last-mile link, the relevant user is a member of the household that the link
> leads to. He is rather likely to be *very* interested in getting the most goodput
> out of the capacity available to him, on those occasions when he happens to have a
> heavy workload for it. He's just bought a game on Steam, for example, and wants
> to minimise the time spent waiting for multiple gigabytes to download before he
> can enjoy his purchase. Assuming his ISP and the Steam CDN have built their
> networks wisely, his last-mile link will be the bottleneck for this task - and
> optimising goodput over it becomes *more* important the lower the link capacity
> is.
>
> A lot of people, for one reason or another, still have links below 50 Mbps, and
> sometimes *much* less than that. It's worth reminding the gigabit fibre crowd of
> that, once in a while.
>
> But he may not be the only member of the household interested in this particular
> link. My landlord, for example, may commonly have his wife, sister, mother, and
> four children at home at any given time, depending on the time of year. Some of
> the things they wish to do may be latency-sensitive, and they are also likely to
> be annoyed if throughput-sensitive tasks are unreasonably impaired. So the
> goodput of the Steam download is not the only metric of relevance, taken
> holistically. And it is certainly not correct to maximise utilisation of the
> link, as you can "utilise" the link with a whole lot of useless junk, yet make no
> progress whatsoever.
>
> Maximising an overall measure of network power, however, probably *does* make
> sense - in both contexts. The method of doing so is naturally different in each
> context:
>
> 1: In core/access networks, ensuring that demand is always met by capacity
> maximises useful throughput and minimises latency. This is the natural optimum
> for network power.
>
> 2: It is reasonable to assume that installing more capacity has an associated
> cost, which may exert downward pressure on capacity. In core/access networks
> where demand exceeds capacity, throughput is fixed at capacity, and network power
> is maximised by minimising delays. This assumes that no individual traffic's
> throughput is unreasonably impaired, compared to others, in the process; the
> "linear product-based fairness index" can be used to detect this:
>
> https://en.wikipedia.org/wiki/Fairness_measure#:~:text=Product-based%20Fairness%20Indices
>
> 3: In a last-mile link, network power is maximised by maximising the goodput of
> useful applications, ensuring that all applications have a "fair" share of
> available capacity (for some reasonable definition of "fair"), and keeping latency
> as low as reasonably practical while doing so. This is likely to be associated
> with high link utilisation when demand is heavy.
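
For concreteness, here's my rough sketch of the two measures above - "network power" in its usual throughput-over-delay form, and a product-based fairness index over per-flow throughputs. The formulas are my reading, not anything Jonathan specified:

    # Network power: useful throughput divided by delay; higher is better.
    def network_power(goodput_bps, delay_s):
        return goodput_bps / delay_s

    # Product-based fairness over per-flow throughputs: 1.0 when every
    # flow gets an equal share, falling toward 0 as the allocation skews
    # (geometric mean <= arithmetic mean, with equality iff all match).
    def product_fairness(throughputs):
        mean = sum(throughputs) / len(throughputs)
        prod = 1.0
        for x in throughputs:
            prod *= x / mean
        return prod

    print(product_fairness([10e6, 10e6, 10e6]))  # 1.0 - perfectly fair
    print(product_fairness([25e6, 4e6, 1e6]))    # 0.1 - badly skewed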

> > Operating at fully congested state - or designing TCP to essentially come
> > close to DDoS behaviour on a bottleneck to get a publishable paper - is missing
> > the point.
>
> When writing a statement like that, it's probably important to indicate what a
> "fully congested state" actually means. Some might take it to mean merely 100%
> link utilisation, which could actually be part of an optimal network power
> solution. From context, I assume you actually mean that the queues are driven to
> maximum depth and to the point of overflow - or beyond.
>
> - Jonathan Morton