From: "David P. Reed" <dpreed@deepplum.com>
Date: Wed, 29 Sep 2021 15:34:22 -0400 (EDT)
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Bob Briscoe", "Mohit P. Tahiliani", "ECN-Sane", "Asad Sajjad Ahmed"
Subject: Re: [Ecn-sane] paper idea: praising smaller packets

Jonathan - I pretty much agree with most of what you say here. However, two things:

1) A router that has only one flow at a time traversing it is not a router. It's just a link that runs at memory speed in between two links. A degenerate case.

2) The start of my email - about the fact that each outbound link must be made to clear (with no queued traffic) within a copper or fiber speed circuit of the earth (great circle route) - is my criterion for NOT being 100% utilized. But it's a description that folds latency and capacity into a single measure. It's very close to 100% utilized. (This satisfies your concern about supporting low bit rates at the edges, but in a very different way.)
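(For concreteness, here is a rough back-of-envelope version of that criterion. The numbers are assumptions of mine - a 40,000 km great circle and propagation at roughly 2/3 c in fiber, giving about a 200 ms drain budget - not figures from the earlier email.)

# Sketch of the "clear within one great-circle time" criterion above.
# Assumed numbers: 40,000 km great circle, ~2e8 m/s signal velocity in fiber,
# so the standing queue must be drainable in roughly 200 ms.

GREAT_CIRCLE_M = 40_000_000
V_FIBER_M_PER_S = 2.0e8
BUDGET_S = GREAT_CIRCLE_M / V_FIBER_M_PER_S      # ~0.2 s

def max_standing_queue_bytes(link_rate_bps):
    """Largest backlog the link can still drain within one great-circle time."""
    return link_rate_bps * BUDGET_S / 8.0

if __name__ == "__main__":
    for rate_bps in (10e6, 100e6, 1e9):          # 10 Mb/s, 100 Mb/s, 1 Gb/s
        mb = max_standing_queue_bytes(rate_bps) / 1e6
        print(f"{rate_bps / 1e6:6.0f} Mb/s link: standing queue must stay under ~{mb:.2f} MB")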
The problem with links is that they can NEVER be utilized more than 100%. So utilization is a TERRIBLE metric for thinking about the problem.

I didn't mean this as a weird joke - I'm very serious. Utilization is just the wrong measure. And so is average end-to-end latency - averages are not meaningful in a fat-tailed traffic distribution, no matter how you compute them, and average latency is a very strange characterization anyway, since most paths actually carry no traffic, because each endpoint only uses a small subset of paths.

Once upon a time I thought that all links should be capped at an average utilization of 10% or 50%. But in fact that is a terrible measure too - averages are a bad metric, for the same reason.

Instead, operationally it is OK for a link to be "almost full", as long as the control protocols create openings frequently enough to mitigate latency issues.

(Side note: if you want to understand really deeply why "averages" are a terrible statistic for networking, I recommend Nassim Taleb's book on the pre-asymptotic behavior of random systems and the problem of applying statistical measures to systems that are not "in equilibrium" - https://arxiv.org/abs/2001.10488 . Seriously! It's tough sledding, sound math, and very enlightening. Much of what he says can be translated into the world of real networking and queueing. Sadly, most queueing theory doesn't touch on pre-asymptotic behavior, but instead assumes that the asymptotic behavior of a queueing system characterizes its normal behavior.)

(Some people try to say that network traffic is "fractal", which is actually unreasonable - most protocols behave highly deterministically, and there's no "self-similarity" inherent in end-to-end flow statistics, no power laws, ...)
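(To make the point about averages concrete - this is a toy illustration of mine, not data from any real network: with fat-tailed per-packet delays, the sample mean is set by a handful of extreme values, while percentiles describe what almost every packet actually experiences. The Pareto tail index and sample size below are arbitrary choices.)

import random

def pareto_delays_ms(n, alpha=1.1, scale_ms=1.0):
    """Draw n per-packet delays (ms) from a Pareto(alpha) distribution, minimum = scale_ms."""
    return [scale_ms * random.paretovariate(alpha) for _ in range(n)]

def percentile(samples, p):
    s = sorted(samples)
    return s[int(p / 100.0 * (len(s) - 1))]

if __name__ == "__main__":
    random.seed(7)
    for run in range(5):
        d = pareto_delays_ms(100_000)
        worst = sorted(d, reverse=True)[:100]    # the worst 0.1% of packets
        print(f"run {run}: mean {sum(d) / len(d):6.2f} ms  "
              f"median {percentile(d, 50):5.2f} ms  p99 {percentile(d, 99):6.1f} ms  "
              f"worst 0.1% carry {sum(worst) / sum(d):5.1%} of all delay")
    # Median and p99 stay in a narrow band from run to run; the mean wanders and sits
    # several times above the median, because it is dominated by rare, enormous samples.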
On Wednesday, September 29, 2021 6:36am, "Jonathan Morton" <chromatix99@gmail.com> said:

> > On 29 Sep, 2021, at 1:15 am, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Now, it is important as hell to avoid bullshit research programs that try to
> > "optimize" utilization of link capacity at 100%. Those research programs focus on
> > the absolute wrong measure - a proxy for "network capital cost" that is in fact
> > the wrong measure of any real network operator's cost structure. The cost of media
> > (wires, airtime, ...) is a tiny fraction of most network operations' cost in any
> > real business or institution. We don't optimize highways by maximizing the number
> > of cars on every stretch of highway, for obvious reasons, but also for non-obvious
> > reasons.
>
> I think it is important to distinguish between core/access networks and last-mile
> links. The technical distinction is in the level of statistical multiplexing -
> high in the former, low in the latter. The cost structure to the relevant user is
> also significantly different.
>
> I agree with your analysis when it comes to core/access networks with a high
> degree of statistical multiplexing. These networks should be built with enough
> capacity to service their expected load. When the actual load exceeds installed
> capacity for whatever reason, keeping latency low maintains network stability and,
> with a reasonable AQM, should not result in appreciably reduced goodput in
> practice.
>
> The relevant user's costs are primarily in the hardware installed at each end of
> the link (hence minimising complexity in this high-speed hardware is often seen as
> an important goal), and possibly in the actual volume of traffic transferred, not
> in the raw capacity of the medium. All the same, if the medium were cheap, why
> not just install more of it, rather than spending big on the hardware at each end?
> There's probably a good explanation for this that I'm not quite aware of.
> Perhaps it has to do with capital versus operational costs.
>
> On a last-mile link, the relevant user is a member of the household that the link
> leads to. He is rather likely to be *very* interested in getting the most goodput
> out of the capacity available to him, on those occasions when he happens to have a
> heavy workload for it. He's just bought a game on Steam, for example, and wants
> to minimise the time spent waiting for multiple gigabytes to download before he
> can enjoy his purchase. Assuming his ISP and the Steam CDN have built their
> networks wisely, his last-mile link will be the bottleneck for this task - and
> optimising goodput over it becomes *more* important the lower the link capacity is.
>
> A lot of people, for one reason or another, still have links below 50 Mbps, and
> sometimes *much* less than that. It's worth reminding the gigabit fibre crowd of
> that, once in a while.
>
> But he may not be the only member of the household interested in this particular
> link. My landlord, for example, may commonly have his wife, sister, mother, and
> four children at home at any given time, depending on the time of year. Some of
> the things they wish to do may be latency-sensitive, and they are also likely to
> be annoyed if throughput-sensitive tasks are unreasonably impaired. So the
> goodput of the Steam download is not the only metric of relevance, taken
> holistically. And it is certainly not correct to maximise utilisation of the
> link, as you can "utilise" the link with a whole lot of useless junk, yet make no
> progress whatsoever.
>
> Maximising an overall measure of network power, however, probably *does* make
> sense - in both contexts. The method of doing so is naturally different in each
> context:
>
> 1: In core/access networks, ensuring that demand is always met by capacity
> maximises useful throughput and minimises latency. This is the natural optimum
> for network power.
>
> 2: It is reasonable to assume that installing more capacity has an associated
> cost, which may exert downward pressure on capacity. In core/access networks
> where demand exceeds capacity, throughput is fixed at capacity, and network power
> is maximised by minimising delays. This assumes that no individual traffic's
> throughput is unreasonably impaired, compared to others, in the process; the
> "linear product-based fairness index" can be used to detect this:
>
> https://en.wikipedia.org/wiki/Fairness_measure#:~:text=Product-based%20Fairness%20Indices
>
> 3: In a last-mile link, network power is maximised by maximising the goodput of
> useful applications, ensuring that all applications have a "fair" share of
> available capacity (for some reasonable definition of "fair"), and keeping latency
> as low as reasonably practical while doing so. This is likely to be associated
> with high link utilisation when demand is heavy.
>
> > Operating at fully congested state - or designing TCP to essentially come
> > close to DDoS behaviour on a bottleneck to get a publishable paper - is missing
> > the point.
>
> When writing a statement like that, it's probably important to indicate what a
> "fully congested state" actually means. Some might take it to mean merely 100%
> link utilisation, which could actually be part of an optimal network power
> solution.
> From context, I assume you actually mean that the queues are driven to
> maximum depth and to the point of overflow - or beyond.
>
> - Jonathan Morton
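(A concrete footnote on Jonathan's "network power" framing and the fairness check he points to. This is a toy sketch of mine: the bottleneck is modelled as a textbook M/M/1 queue with made-up numbers, and Jain's fairness index stands in for the linear product-based index he links to. It only illustrates the shape of the tradeoff - power peaks well below saturation - and is not anyone's actual proposal.)

# Network power = throughput / delay. For an M/M/1-style bottleneck it peaks well
# below 100% utilization and collapses as the queue saturates. Jain's fairness
# index (a stand-in for the product-based index cited above) flags the case where
# one flow hogs the link.

def mm1_power(utilization, capacity_pps, service_time_s):
    """Throughput divided by mean time-in-system for an M/M/1 queue."""
    if not 0.0 < utilization < 1.0:
        return 0.0
    throughput = utilization * capacity_pps          # packets/s actually carried
    delay = service_time_s / (1.0 - utilization)     # M/M/1 mean sojourn time
    return throughput / delay

def jain_fairness(per_flow_rates):
    """1.0 = perfectly equal shares; approaches 1/n when one flow takes everything."""
    n = len(per_flow_rates)
    return sum(per_flow_rates) ** 2 / (n * sum(r * r for r in per_flow_rates))

if __name__ == "__main__":
    capacity_pps, service_time_s = 10_000.0, 1e-4    # assumed 10k pkt/s bottleneck
    for rho in (0.3, 0.5, 0.7, 0.9, 0.99):
        print(f"utilization {rho:4.2f}  power {mm1_power(rho, capacity_pps, service_time_s):14.0f}")
    print("fairness, equal shares: ", jain_fairness([5.0, 5.0, 5.0, 5.0]))    # -> 1.0
    print("fairness, one flow hogs:", jain_fairness([17.0, 1.0, 1.0, 1.0]))   # -> ~0.34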