From: dpreed@reed.com
Date: Wed, 13 Dec 2017 13:08:14 -0500 (EST)
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Neil Davies" <neil.davies@pnsol.com>, cerowrt-devel@lists.bufferbloat.net, "bloat" <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] [Cerowrt-devel] DC behaviors today

Just to be clear, I have built and operated a whole range of network platforms, as well as diagnosing problems and planning deployments of systems that include digital packet delivery in real contexts where cost and performance matter, for nearly 40 years now. So this isn't just some kind of radical opinion, but hard-won knowledge across my entire career. I also have a very strong theoretical background in queueing theory and control theory -- enough to teach a graduate seminar, anyway.

That said, there are lots of folks out there who have opinions different from mine. But far too many (such as those who think big buffers are "good", who brought us bufferbloat) are not aware of how networks are really used or the practical effects of their poor models of usage.

If it comforts you to think that I am just stating an "opinion", which must be wrong because it is not the "conventional wisdom" in the circles where you travel, fine. You are entitled to dismiss any ideas you don't like. But I would suggest you get data about your assumptions.

I don't know if I'm being trolled, but a couple of comments on the recent comments:

1. Statistical multiplexing, viewed as an averaging/smoothing effect, is, in my personal opinion and experience measuring real network behavior, a theoretical phenomenon that is not real (e.g. "consider a spherical cow") but that is amenable to theoretical analysis. Such theoretical analysis can make some gross estimates, but it breaks down quickly. The same thing is true of common economic theory that models practical markets with linear models (linear systems of differential equations are common) and Gaussian probability distributions (Gaussians are easily analyzed, but wrong; you can read the popular books by Nassim Taleb for an entertaining and enlightening deeper understanding of the economic problems with such modeling).

One of the features well observed in real measurements of real systems is that packet flows are "fractal", which means that there is a self-similarity of rate variability at all time scales from micro to macro. As you look at smaller and smaller time scales, or larger and larger time scales, the packet request density per unit time never smooths out due to "averaging over sources". That is, there's no practical "statistical multiplexing" effect. There's also significant correlation among many packet arrivals - assuming they are statistically independent (which is required for the "law of large numbers" to apply) is often far from the real situation - flows that are assumed to be independent are usually strongly coupled.

The one exception where flows average out at a constant rate is when there is a "bottleneck". Then, there being no more capacity, the constant rate is forced, not by statistical averaging but by a very different process - one that is almost never desirable.

This is just what is observed in case after case. Designers may imagine that their networks have "smooth averaging" properties. There's a strong thread in networking literature that makes this pretty-much-always-false assumption the basis of protocol designs, thinking about "Quality of Service" and other sorts of things. You can teach graduate students about a reality that does not exist, and get papers accepted in conferences where the reviewers have been trained in the same tradition of unreal assumptions.

2. I work every day with "datacenter" networking and distributed systems on 10 GigE and faster Ethernet fabrics with switches and trunking. I see the packet flows driven by distributed computing in real systems. Whenever the sustained peak load on a switch path reaches 100%, that's not "good", that's not "efficient" resource usage. That is a situation where computing is experiencing huge wasted capacity due to network congestion that is dramatically slowing down the desired workload.

Again, this is because *real workloads* in distributed computation don't have smooth or averageable rates over interconnects. Latency is everything in that application too!

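A minimal sketch of that aggregation point, assuming synthetic light-tailed (Poisson) versus heavy-tailed (Pareto) per-source rates; the distributions and numbers below are illustrative assumptions, not measurements from any real network:

    import numpy as np

    rng = np.random.default_rng(0)

    def aggregate_cv(sample_one_source, n_sources, n_intervals=50_000):
        """Coefficient of variation (std/mean) of the summed per-interval load."""
        total = np.zeros(n_intervals)
        for _ in range(n_sources):
            total += sample_one_source(n_intervals)
        return total.std() / total.mean()

    light_tailed = lambda n: rng.poisson(10.0, n)               # textbook Poisson source
    heavy_tailed = lambda n: (1.0 + rng.pareto(1.2, n)) * 10.0  # Pareto tail, infinite variance

    for n in (1, 10, 100):
        print(f"{n:4d} sources   Poisson CV = {aggregate_cv(light_tailed, n):.3f}   "
              f"Pareto CV = {aggregate_cv(heavy_tailed, n):.3f}")

With the light-tailed sources the aggregate's relative variability falls roughly as 1/sqrt(N); with the heavy-tailed sources it barely falls at all. That is the "no practical statistical multiplexing" effect in miniature, at a single timescale - real self-similar traffic adds correlation across timescales on top of this, which the toy deliberately ignores.
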
Yes, because one buys switches from vendors who don't know how to build or operate a server or a database at all, you see vendors trying to demonstrate their amazing throughput, but the people who build these systems (me, for example) are not looking at throughput or statistical multiplexing at all! We use "throughput" as a proxy for "latency under load" (and it is a poor proxy, because vendors throw in big buffers, causing bufferbloat - see Arista Networks' attempts to justify their huge buffers as a "good thing", when it is just a case of something you have to design around by clocking the packets so they never accumulate in a buffer).

So, yes, the peak transfer rate matters, of course. And sometimes it is utilized for very good reason (when the latency of a file transfer as a whole is the latency that matters). But to be clear, just because as a user I want to download a Linux distro update as quickly as possible when it happens does NOT imply that the average load at any time scale is "statistically averaged" for residential networking. Quite the opposite! I buy Gigabit service to my house because I cannot predict when I will need it, but I almost never need it. My average rate (except once a month or so) is minuscule. This is true even though my house is a heavy user of Netflix.

The way that Gigabit residential service affects my "quality of service" is almost entirely that I get good "response time" to unpredictable demands. How quickly a Netflix stream can fill its play buffer is the measure. The data rate of any Netflix stream is, on average, much, much less than a Gigabit. Buffers in the network would ruin my Netflix experience, because the buffering is best done at the "edge", as the End-to-End argument usually suggests. It's certainly NOT because of statistical multiplexing.

So when you are tempted to talk about "statistical multiplexing" smoothing out traffic flow, take a pause and think about whether that really makes sense as a description of reality.

fq_codel is a good thing because it handles the awkward behavior at "peak load". It smooths out the impact of running out of resources. But that impact is still undesirable - if many Netflix flows are adding up to peak load, a new Netflix flow can't start very quickly. That results in terrible QoS from a Netflix user's point of view.


On Wednesday, December 13, 2017 11:41am, "Jonathan Morton" <chromatix99@gmail.com> said:

> Have you considered what this means for the economics of the operation of networks? What other industry that “moves things around” (i.e. logistical or similar systems) creates a solution in which they have 10x as much infrastructure as their peak requirement?

Ten times peak demand? No.

Ten times average demand estimated at time of deployment, and struggling badly with peak demand a decade later, yes. And this is the transportation industry, where a decade is a *short* time - like less than a year in telecoms.

- Jonathan Morton


On 13 Dec 2017 17:27, "Neil Davies" <neil.davies@pnsol.com> wrote:

On 12 Dec 2017, at 22:53, dpreed@reed.com wrote:

Luca's point tends to be correct - variable latency destroys the stability of flow control loops, which destroys throughput, even when there is sufficient capacity to handle the load.

This is an indirect result of Little's Lemma (which is strictly true only for Poisson arrivals, but almost any arrival process will have a similar interaction between latency and throughput).

Actually it is true for general arrival patterns (can’t lay my hands on the reference for the moment - but it was a while back that it was shown) - what this points to is an underlying conservation law: that “delay and loss” are conserved in a scheduling process. This comes out of the M/M/1/K/K queueing system and associated analysis.

There is a conservation law (and Kleinrock refers to this - at least in terms of delay - in 1965 - http://onlinelibrary.wiley.com/doi/10.1002/nav.3800120206/abstract) at work here.

All scheduling systems can do is “distribute” the resulting “delay and loss” differentially amongst the (instantaneous set of) competing streams.

Let me just repeat that - the “delay and loss” is a conserved quantity: scheduling can’t “destroy” it (it can influence higher-level protocol behaviour) but it cannot reduce the total amount of “delay and loss” that is being induced into the collective set of streams...

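For reference, a compact statement of the two results being invoked (written from memory, so treat it as a sketch rather than a citation; \lambda_i, \bar{x}_i and W_i below are the arrival rate, mean service time and mean wait of traffic class i):

    % Little's law: mean number in system = arrival rate x mean time in system.
    % It needs only stationarity, not Poisson arrivals.
    L = \lambda W

    % Kleinrock's conservation law for M/G/1 under any work-conserving,
    % non-preemptive discipline: the rho-weighted sum of mean waits is fixed.
    \sum_i \rho_i W_i = \frac{\rho\, W_0}{1 - \rho},
    \qquad \rho_i = \lambda_i \bar{x}_i, \quad \rho = \sum_i \rho_i, \quad
    W_0 = \sum_i \tfrac{1}{2}\, \lambda_i\, \overline{x_i^2}

So a scheduler can only move waiting time between classes of traffic; the weighted total is determined by the load, which is the “conserved quantity” point above.
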
However, the other reason I say what I say so strongly is this:

Rant on.

Peak/avg. load ratio always exceeds a factor of 10 or more, IRL. Only "benchmark setups" (or hot-rod races done for academic reasons or marketing reasons to claim some sort of "title") operate at peak supportable load any significant part of the time.

Have you considered what this means for the economics of the operation of networks? What other industry that “moves things around” (i.e. logistical or similar systems) creates a solution in which they have 10x as much infrastructure as their peak requirement?

The reason for this is not just "fat pipes are better", but because bitrate of the underlying medium is an insignificant fraction of systems operational and capital expense.

Agree that (if you are the incumbent that ‘owns’ the low-level transmission medium) this is true (though the costs of lighting a new lambda are not trivial) - but that is not the experience of anyone else in the digital supply chain.

SLAs are specified in "uptime" not "bits transported", and a clogged pipe is defined as down when latency exceeds a small number.

Do you have any evidence you can reference for an SLA that treats a few ms as “down”? Most of the SLAs I’ve had dealings with use averages over fairly long time periods (e.g. a month) - and there is no quality in averages.

Typical operating points of corporate networks where the users are happy are single-digit percentages of max load.

Or less - they also detest the costs that they have to pay the network providers to try and de-risk their applications. There is also the issue that, because they measure averages (over 5 min to 15 min), they completely fail to capture (for example) the 15 seconds when delay and jitter were high and the CEO’s video conference broke up.

This is also true of computer buses and memory controllers and storage interfaces IRL. Again, latency is the primary measure, and the system never focuses on operating points anywhere near max throughput.

Agreed - but wouldn’t it be nice if they could? I’ve worked on h/w systems where we have designed the system to run near its limits (the set-top box market is pretty cut-throat, and the closer to saturation you can run and still deliver the acceptable outcome, the cheaper the box and the greater the profit margin for the set-top box provider).

Rant off.


Cheers

Neil


On Tuesday, December 12, 2017 1:36pm, "Dave Taht" <dave@taht.net> said:

> Luca Muscariello <luca.muscariello@gmail.com> writes:
>
> > I think everything is about response time, even throughput.
> >
> > If we compare the time to transmit a single packet from A to B, including
> > propagation delay, transmission delay and queuing delay, to the time to
> > move a much larger amount of data from A to B, we use throughput in this
> > second case because it is a normalized quantity w.r.t. response time
> > (bytes over delivery time). For a single transmission we tend to use
> > latency. But in the end response time is what matters.
> >
> > Also, even instantaneous throughput is well defined only for a time scale
> > which has to be much larger than the min RTT (propagation + transmission
> > delays).
> >
> > Agree also that looking at video, latency and latency budgets are better
> > quantities than throughput. At least more accurate.
> >
> > On Fri, Dec 8, 2017 at 8:05 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> >
> > On Mon, 4 Dec 2017, dpreed@reed.com wrote:
> >
> > I suggest we stop talking about throughput, which has been the mistaken
> > idea about networking for 30-40 years.
> >
> > We need to talk both about latency and speed. Yes, speed is talked about
> > too much (relative to RTT), but it's not irrelevant.
> >
> > Speed of light in fiber means RTT is approx 1ms per 100km, so from
> > Stockholm to SFO my RTT is never going to be significantly below 85ms
> > (8625km great circle). It's currently twice that.

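[A quick back-of-the-envelope check of that 85ms figure - a sketch assuming light in fibre propagates at roughly 200,000 km/s, about two-thirds of c:

    c_fiber_km_per_s = 200_000   # assumed propagation speed in glass, ~2/3 c
    great_circle_km = 8625       # Stockholm-SFO figure from the mail above
    one_way_ms = great_circle_km / c_fiber_km_per_s * 1000
    print(f"one-way ~ {one_way_ms:.0f} ms, RTT ~ {2 * one_way_ms:.0f} ms")   # ~43 ms / ~86 ms

which lands right at the quoted ~85ms floor, before any routing detours or queueing are added.]
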
> > So we just have to accept that some services will never be deliverable
> > across the wider Internet, but have to be deployed closer to the customer
> > (as per your examples, some need 1ms RTT to work well), and we need lower
> > access latency and lower queuing delay. So yes, agreed.
> >
> > However, I am not going to concede that speed is "mistaken idea about
> > networking". No amount of smarter queuing is going to fix the problem if I
> > don't have enough throughput available to me that I need for my
> > application.
>
> In terms of the bell curve here, throughput has increased much more
> rapidly than latency has decreased, for most, and in an increasing
> majority of human-interactive cases (like video streaming), we often
> have enough throughput.
>
> And the age-old argument regarding "just have overcapacity, always"
> tends to work in these cases.
>
> I tend not to care as much about how long it takes for things that do
> not need R/T deadlines as humans and as steering wheels do.
>
> Propagation delay, while ultimately bound by the speed of light, is also
> affected by the wires wrapping indirectly around the earth - much slower
> than would be possible if we worked at it:
>
> https://arxiv.org/pdf/1505.03449.pdf
>
> Then there's inside the boxes themselves:
>
> A lot of my struggles of late have been to get latencies and adequate
> sampling techniques down below 3ms (my previous value for starting to
> reject things due to having too much noise) - and despite trying fairly
> hard, well... a process can't even sleep accurately much below 1ms, on
> bare metal linux. A dream of mine has been 8 channel high quality audio,
> with a video delay of not much more than 2.7ms for AR applications.
>
> For comparison, an idle quad core aarch64 and dual core x86_64:
>
> root@nanopineo2:~# irtt sleep
>
> Testing sleep accuracy...
>
> Sleep Duration   Mean Error    % Error
> 1ns              13.353µs      1335336.9
> 10ns             14.34µs       143409.5
> 100ns            13.343µs      13343.9
> 1µs              12.791µs      1279.2
> 10µs             148.661µs     1486.6
> 100µs            150.907µs     150.9
> 1ms              168.001µs     16.8
> 10ms             131.235µs     1.3
> 100ms            145.611µs     0.1
> 200ms            162.917µs     0.1
> 500ms            169.885µs     0.0
>
> d@nemesis:~$ irtt sleep
>
> Testing sleep accuracy...
>
> Sleep Duration   Mean Error    % Error
> 1ns              668ns         66831.9
> 10ns             672ns         6723.7
> 100ns            557ns         557.6
> 1µs              57.749µs      5774.9
> 10µs             63.063µs      630.6
> 100µs            67.737µs      67.7
> 1ms              153.978µs     15.4
> 10ms             169.709µs     1.7
> 100ms            186.685µs     0.2
> 200ms            176.859µs     0.1
> 500ms            177.271µs     0.0
>
> >
> > --
> > Mikael Abrahamsson   email: swmike@swm.pp.se