Just to be clear, I have built and operated a whole range of network platforms, as well as diagnosed problems and planned deployments of systems that include digital packet delivery in real contexts where cost and performance matter, for nearly 40 years now. So this isn't only some kind of radical opinion, but hard-won knowledge across my entire career. I also have a very strong theoretical background in queueing theory and control theory -- enough to teach a graduate seminar, anyway. That said, there are lots of folks out there who have opinions different from mine. But far too many (such as those who think big buffers are "good", who brought us bufferbloat) are not aware of how networks are really used or of the practical effects of their poor models of usage.
I don't know if I'm being trolled, but a couple of comments on the recent comments:

1. Statistical multiplexing, viewed as an averaging/smoothing effect, is, in my personal opinion and experience measuring real network behavior, a description of a theoretical phenomenon that is not real (e.g. "consider a spherical cow") but that is amenable to theoretical analysis. Such theoretical analysis can make some gross estimates, but it breaks down quickly. The same thing is true of common economic theory that models practical markets with linear models (linear systems of differential equations are common) and gaussian probability distributions (gaussians are easily analyzed, but wrong. You can read the popular books by Nassim Taleb for an entertaining and enlightening deeper understanding of the economic problems with such modeling).

One of the features well observed in real measurements of real systems is that packet flows are "fractal", which means that there is a self-similarity of rate variability at all time scales from micro to macro. As you look at smaller and smaller time scales, or larger and larger time scales, the packet request density per unit time never smooths out due to "averaging over sources". That is, there's no practical "statistical multiplexing" effect. There's also significant correlation among many packet arrivals - assuming they are statistically independent (which is required for the "law of large numbers" to apply) is often far from the real situation - flows that are assumed to be independent are usually strongly coupled.
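If you want to see the flavor of this for yourself, here is a toy sketch - purely synthetic sources, not a measurement of any real network - that aggregates exponential versus heavy-tailed (Pareto) per-interval demand and prints the coefficient of variation of the sum:

    # Toy illustration: aggregating independent sources only "smooths" traffic
    # when per-source demand is well-behaved. With heavy-tailed demand
    # (Pareto, shape < 2, a crude stand-in for real traffic) the relative
    # variability of the aggregate shrinks very slowly as sources are added.
    import numpy as np

    rng = np.random.default_rng(1)
    intervals = 5_000                      # time bins per source

    def cv_of_aggregate(draw, n_sources):
        """Coefficient of variation of the per-bin sum over n_sources."""
        total = draw((n_sources, intervals)).sum(axis=0)
        return total.std() / total.mean()

    exp_draw = lambda shape: rng.exponential(1.0, shape)
    pareto_draw = lambda shape: rng.pareto(1.2, shape) + 1.0  # infinite variance

    for n in (1, 10, 100, 1000):
        print(f"n={n:5d}  exponential CV={cv_of_aggregate(exp_draw, n):5.2f}  "
              f"heavy-tailed CV={cv_of_aggregate(pareto_draw, n):5.2f}")

With well-behaved sources the relative variability of the aggregate falls like 1/sqrt(n); with the heavy-tailed sources it falls far more slowly and erratically, which is the point about there being no practical smoothing.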
The one exception where flows average out at a constant rate is when there is a "bottleneck". Then, there being no more capacity, the constant rate is forced, not by statistical averaging but by a very different process - one that is almost never desirable.

This is just what is observed in case after case. Designers may imagine that their networks have "smooth averaging" properties. There's a strong thread in the networking literature that makes this pretty-much-always-false assumption the basis of protocol designs, thinking about "Quality of Service" and other sorts of things. You can teach graduate students about a reality that does not exist, and get papers accepted in conferences where the reviewers have been trained in the same tradition of unreal assumptions.

2. I work every day with "datacenter" networking and distributed systems on 10 GigE and faster Ethernet fabrics with switches and trunking. I see the packet flows driven by distributed computing in real systems. Whenever the sustained peak load on a switch path reaches 100%, that's not "good", that's not "efficient" resource usage. That is a situation where computing is experiencing huge wasted capacity due to network congestion that is dramatically slowing down the desired workload.

Again, this is because *real workloads* in distributed computation don't have smooth or averageable rates over interconnects. Latency is everything in that application too!

Yes, because one buys switches from vendors who don't know how to build or operate a server or a database at all, you see vendors trying to demonstrate their amazing throughput, but the people who build these systems (me, for example) are not looking at throughput or statistical multiplexing at all! We use "throughput" as a proxy for "latency under load" (and it is a poor proxy, because vendors throw in big buffers, causing bufferbloat - see Arista Networks' attempts to justify their huge buffers as a "good thing", when it is just a case of something you have to design around by clocking the packets so they never accumulate in a buffer).

So, yes, the peak transfer rate matters, of course. And sometimes it is utilized for very good reason (when the latency of a file transfer as a whole is the latency that matters). But to be clear, just because as a user I want to download a Linux distro update as quickly as possible when it happens does NOT imply that the average load at any time scale is "statistically averaged" for residential networking. Quite the opposite! I buy Gigabit service to my house because I cannot predict when I will need it, but I almost never need it. My average rate (except once a month or so) is minuscule. This is true even though my house is a heavy user of Netflix.

The way that Gigabit residential service affects my "quality of service" is almost entirely that I get good "response time" to unpredictable demands. How quickly a Netflix stream can fill its play buffer is the measure. The data rate of any Netflix stream is, on average, much, much less than a Gigabit. Buffers in the network would ruin my Netflix experience, because the buffering is best done at the "edge", as the End-to-End argument usually suggests. It's certainly NOT because of statistical multiplexing.
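To put rough numbers on that (illustrative figures only - assume a stream of about 15 Mbit/s and a 30-second play buffer on a 1 Gbit/s access link):

    # Illustrative arithmetic only; the stream rate and buffer size are assumptions.
    access_bps = 1e9            # 1 Gbit/s residential service
    stream_bps = 15e6           # ~15 Mbit/s video stream (assumed)
    buffer_seconds = 30         # client-side play buffer (assumed)

    utilization = stream_bps / access_bps
    fill_time = buffer_seconds * stream_bps / access_bps

    print(f"steady-state utilization: {utilization:.1%}")                   # ~1.5%
    print(f"time to fill the play buffer at line rate: {fill_time:.2f} s")  # ~0.45 s

Roughly 1.5% steady-state utilization, and under half a second to fill the play buffer at line rate - the Gigabit matters for response time, not for average load.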
So when you are tempted to talk about "statistical multiplexing" smoothing out traffic flow, take a pause and think about whether that really makes sense as a description of reality.

fq_codel is a good thing because it handles the awkward behavior at "peak load". It smooths out the impact of running out of resources. But that impact is still undesirable - if many Netflix flows are adding up to peak load, a new Netflix flow can't start very quickly. That results in terrible QoS from a Netflix user's point of view.
On Wednesday, December 13, 2017 11:41am, "Jonathan Morton" <chromatix99@gmail.com> said:

>> Have you considered what this means for the economics of the operation
>> of networks? What other industry that "moves things around" (i.e.
>> logistics or similar) creates a solution in which it has 10x as much
>> infrastructure as its peak requirement?
>
> Ten times peak demand? No.
>
> Ten times average demand estimated at time of deployment, and struggling
> badly with peak demand a decade later, yes. And this is the
> transportation industry, where a decade is a *short* time - like less
> than a year in telecoms.
>
> - Jonathan Morton
On 12 Dec 2017, at 22:53, dpreed@reed.com wrote:

> Luca's point tends to be correct - variable latency destroys the
> stability of flow control loops, which destroys throughput, even when
> there is sufficient capacity to handle the load.
>
> This is an indirect result of Little's Lemma (which is strictly true
> only for Poisson arrivals, but almost any arrival process will have a
> similar interaction between latency and throughput).

Actually it is true for general arrival patterns (can't lay my hands on the reference for the moment - but it was shown a while back). What this points to is an underlying conservation law: "delay and loss" are conserved in a scheduling process. This comes out of the M/M/1/K/K queueing system and associated analysis.

There is a conservation law at work here (and Kleinrock refers to this - at least in terms of delay - in 1965: http://onlinelibrary.wiley.com/doi/10.1002/nav.3800120206/abstract).

All a scheduling system can do is "distribute" the resulting "delay and loss" differentially amongst the (instantaneous set of) competing streams.

Let me just repeat that - "delay and loss" are a conserved quantity. Scheduling can't "destroy" it (it can influence higher-level protocol behaviour), but it cannot reduce the total amount of "delay and loss" being induced into the collective set of streams...
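(To make the latency/occupancy coupling in Little's law concrete - L = lambda * W - here is a toy calculation with made-up numbers:)

    # Little's law: L = lambda * W (average occupancy = arrival rate * average delay).
    # Toy numbers only: if a bottleneck carries 100k packets/s and the standing
    # queue plus in-flight data averages 500 packets, the average delay through
    # it is 5 ms - whatever the details of the arrival process.
    arrival_rate_pps = 100_000     # lambda, packets per second (assumed)
    avg_occupancy_pkts = 500       # L, average packets "in the system" (assumed)

    avg_delay_s = avg_occupancy_pkts / arrival_rate_pps   # W = L / lambda
    print(f"average delay: {avg_delay_s * 1e3:.1f} ms")   # -> 5.0 ms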
> However, the other reason I say what I say so strongly is this:
>
> Rant on.
>
> Peak/avg. load ratio always exceeds a factor of 10 or more, IRL. Only
> "benchmark setups" (or hot-rod races done for academic reasons or
> marketing reasons to claim some sort of "title") operate at peak
> supportable load any significant part of the time.

Have you considered what this means for the economics of the operation of networks? What other industry that "moves things around" (i.e. logistics or similar) creates a solution in which it has 10x as much infrastructure as its peak requirement?

> The reason for this is not just "fat pipes are better", but because
> bitrate of the underlying medium is an insignificant fraction of systems
> operational and capital expense.

Agreed that (if you are the incumbent that "owns" the low-level transmission medium) this is true (though the costs of lighting a new lambda are not trivial) - but that is not the experience of anyone else in the digital supply chain.

> SLAs are specified in "uptime", not "bits transported", and a clogged
> pipe is defined as down when latency exceeds a small number.

Do you have any evidence you can reference for an SLA that treats a few ms as "down"? Most of the SLAs I've had dealings with use averages over fairly long time periods (e.g. a month) - and there is no quality in averages.
> Typical operating points of corporate networks where the users are happy
> are single-digit percentages of max load.

Or less - they also detest the costs they have to pay the network providers to try and de-risk their applications. There is also the issue that, because they measure averages (over 5 min to 15 min), they completely fail to capture (for example) the 15 seconds when delay and jitter were high and the CEO's video conference broke up.
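(A quick back-of-the-envelope on why those long averaging windows hide exactly the events that matter - all of these numbers are assumed, purely for illustration:)

    # How a monthly average hides a 15-second bad episode (all numbers assumed).
    seconds_in_month = 30 * 24 * 3600
    bad_seconds = 15          # duration of the bad episode
    good_delay_ms = 20        # delay the rest of the month
    bad_delay_ms = 500        # delay during the episode

    avg_ms = (bad_seconds * bad_delay_ms +
              (seconds_in_month - bad_seconds) * good_delay_ms) / seconds_in_month
    print(f"monthly average delay: {avg_ms:.4f} ms")  # ~20.003 ms - the episode vanishes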
> This is also true of computer buses and memory controllers and storage
> interfaces IRL. Again, latency is the primary measure, and the system
> never focuses on operating points anywhere near max throughput.

Agreed - but wouldn't it be nice if they could? I've worked on h/w systems where we have designed the system to run near its limits (the set-top box market is pretty cut-throat, and the closer to saturation you can run while still delivering an acceptable outcome, the cheaper the box and the greater the profit margin for the set-top box provider).

> Rant off.

Cheers
Neil
> Luca Muscariello <luca.muscariello@gmail.com> writes:
>
> > I think everything is about response time, even throughput.
> >
> > If we compare the time to transmit a single packet from A to B,
> > including propagation delay, transmission delay and queuing delay, to
> > the time to move a much larger amount of data from A to B, we use
> > throughput in this second case because it is a normalized quantity
> > w.r.t. response time (bytes over delivery time). For a single
> > transmission we tend to use latency. But in the end response time is
> > what matters.
> >
> > Also, even instantaneous throughput is well defined only for a time
> > scale which has to be much larger than the min RTT (propagation +
> > transmission delays). Agree also that, looking at video, latency and
> > latency budgets are better quantities than throughput. At least more
> > accurate.
> >
> > On Fri, Dec 8, 2017 at 8:05 AM, Mikael Abrahamsson <swmike@swm.pp.se>
> > wrote:
> >
> > On Mon, 4 Dec 2017, dpreed@reed.com wrote:
> >
> > I suggest we stop talking about throughput, which has been the
> > mistaken idea about networking for 30-40 years.
> >
> > We need to talk both about latency and speed. Yes, speed is talked
> > about too much (relative to RTT), but it's not irrelevant.
> >
> > Speed of light in fiber means RTT is approx 1ms per 100km, so from
> > Stockholm to SFO my RTT is never going to be significantly below 85ms
> > (8625km great circle). It's currently twice that.
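(As a back-of-the-envelope check on that 85ms figure - a small sketch assuming light in fibre travels at roughly 2/3 of c, and ignoring that real routes are longer than the great circle:)

    # Rough RTT floor for an 8625 km great-circle path over fibre.
    # Assumes propagation at about 2/3 the vacuum speed of light, so this
    # is a lower bound; real paths and equipment only add to it.
    C_VACUUM_KM_S = 299_792
    C_FIBRE_KM_S = C_VACUUM_KM_S * 2 / 3

    path_km = 8625                     # Stockholm - SFO great circle
    one_way_s = path_km / C_FIBRE_KM_S
    print(f"one-way: {one_way_s * 1e3:.1f} ms, RTT: {2 * one_way_s * 1e3:.1f} ms")
    # -> roughly 43 ms one way, 86 ms RTT, i.e. the "1 ms of RTT per 100 km" rule of thumb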
> >
> > So we just have to accept that some services will never be deliverable
> > across the wider Internet, but have to be deployed closer to the
> > customer (as per your examples, some need 1ms RTT to work well), and we
> > need lower access latency and lower queuing delay. So yes, agreed.
> >
> > However, I am not going to concede that speed is a "mistaken idea about
> > networking". No amount of smarter queuing is going to fix the problem
> > if I don't have enough throughput available to me for my application.
>
> In terms of the bell curve here, throughput has increased much more
> rapidly than latency has decreased for most, and in an increasing
> majority of human-interactive cases (like video streaming), we often
> have enough throughput.
>
> And the age-old argument regarding "just have overcapacity, always"
> tends to work in these cases.
>
> I tend not to care as much about how long it takes for things that do
> not need R/T deadlines the way humans and steering wheels do.
>
> Propagation delay, while ultimately bound by the speed of light, is also
> affected by the wires wrapping indirectly around the earth - much slower
> than would be possible if we worked at it:
>
> https://arxiv.org/pdf/1505.03449.pdf
>
> Then there's inside the boxes themselves:
>
> A lot of my struggles of late have been to get latencies and adequate
> sampling techniques down below 3ms (my previous value for starting to
> reject things due to having too much noise) - and despite trying fairly
> hard, well... a process can't even sleep accurately much below 1ms, on
> bare metal Linux. A dream of mine has been 8-channel high quality audio,
> with a video delay of not much more than 2.7ms for AR applications.
>
> For comparison, an idle quad core aarch64 and dual core x86_64:
>
> root@nanopineo2:~# irtt sleep
>
> Testing sleep accuracy...
>
> Sleep Duration   Mean Error     % Error
>           1ns    13.353µs     1335336.9
>          10ns     14.34µs      143409.5
>         100ns    13.343µs       13343.9
>           1µs    12.791µs        1279.2
>          10µs   148.661µs        1486.6
>         100µs   150.907µs         150.9
>           1ms   168.001µs          16.8
>          10ms   131.235µs           1.3
>         100ms   145.611µs           0.1
>         200ms   162.917µs           0.1
>         500ms   169.885µs           0.0
>
> d@nemesis:~$ irtt sleep
>
> Testing sleep accuracy...
>
> Sleep Duration   Mean Error     % Error
>           1ns       668ns       66831.9
>          10ns       672ns        6723.7
>         100ns       557ns         557.6
>           1µs    57.749µs        5774.9
>          10µs    63.063µs         630.6
>         100µs    67.737µs          67.7
>           1ms   153.978µs          15.4
>          10ms   169.709µs           1.7
>         100ms   186.685µs           0.2
>         200ms   176.859µs           0.1
>         500ms   177.271µs           0.0
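(For anyone without irtt handy, a rough Python analogue of the same measurement - a hypothetical sketch, not part of irtt - looks like this:)

    # Rough analogue of "irtt sleep": request a series of sleep durations and
    # report how far the actual sleep overshoots the request, on average.
    import time

    def mean_sleep_error(duration_s, trials=50):
        total_err = 0.0
        for _ in range(trials):
            start = time.perf_counter()
            time.sleep(duration_s)
            total_err += (time.perf_counter() - start) - duration_s
        return total_err / trials

    for d in (1e-9, 1e-6, 1e-4, 1e-3, 1e-2, 0.1):
        err = mean_sleep_error(d)
        print(f"{d:>8.0e} s  mean error {err * 1e6:10.3f} us  ({100 * err / d:.1f} %)")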
> > --
> > Mikael Abrahamsson email: swmike@swm.pp.se
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat