From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from zimbra.cs.ucla.edu (zimbra.cs.ucla.edu [131.179.128.68])
    (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by lists.bufferbloat.net (Postfix) with ESMTPS id 9F2B13B2A4;
    Fri, 9 Jul 2021 19:01:59 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
    by zimbra.cs.ucla.edu (Postfix) with ESMTP id 8BA311600ED;
    Fri, 9 Jul 2021 16:01:58 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
    by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
    with ESMTP id cb3uF6xYFswL; Fri, 9 Jul 2021 16:01:53 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
    by zimbra.cs.ucla.edu (Postfix) with ESMTP id 4FF9F16006F;
    Fri, 9 Jul 2021 16:01:53 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
    by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
    with ESMTP id xf_8c04oU3P2; Fri, 9 Jul 2021 16:01:53 -0700 (PDT)
Received: from smtpclient.apple (unknown [172.27.177.54])
    by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id B19121600ED;
    Fri, 9 Jul 2021 16:01:52 -0700 (PDT)
Content-Type: multipart/alternative;
    boundary="Apple-Mail=_422EA467-017F-4539-8634-56F78D0A0508"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.100.0.2.22\))
From: Leonard Kleinrock
X-Priority: 3 (Normal)
In-Reply-To: <1625859083.09751240@apps.rackspace.com>
Date: Fri, 9 Jul 2021 16:01:52 -0700
Cc: Leonard Kleinrock, Luca Muscariello, starlink@lists.bufferbloat.net,
    Make-Wifi-fast, Bob McMahon, Cake List, codel@lists.bufferbloat.net,
    cerowrt-devel, bloat, Ben Greear
Message-Id: <8C38E940-8B97-4767-A39B-25F043AE0856@cs.ucla.edu>
References: <1625188609.32718319@apps.rackspace.com>
    <989de0c1-e06c-cda9-ebe6-1f33df8a4c24@candelatech.com>
    <1625773080.94974089@apps.rackspace.com>
    <1625859083.09751240@apps.rackspace.com>
To: "David P. Reed"
X-Mailer: Apple Mail (2.3654.100.0.2.22)
X-Mailman-Approved-At: Fri, 09 Jul 2021 19:11:19 -0400
Subject: Re: [Cerowrt-devel] Little's Law mea culpa, but not invalidating my main point
X-BeenThere: cerowrt-devel@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Development issues regarding the cerowrt test router project
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
List-Subscribe:
X-List-Received-Date: Fri, 09 Jul 2021 23:02:00 -0000

--Apple-Mail=_422EA467-017F-4539-8634-56F78D0A0508
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

David,

No question that non-stationarity and instability are what we often see in networks. And, non-stationarity and instability are both topics that lead to very complex analytical problems in queueing theory. You can find some results on the transient analysis in the queueing theory literature (including the second volume of my Queueing Systems book), but they are limited and hard. Nevertheless, the literature does contain some works on transient analysis of queueing systems as applied to network congestion control - again limited. On the other hand, as you said, control theory addresses stability head on and does offer some tools as well, but again, it is hairy.

Averages are only averages, but they can provide valuable information. For sure, latency can and does confound behavior. But, as you point out, it is the proliferation of control protocols that are, in some cases, deployed willy-nilly in networks without proper evaluation of their behavior that can lead to the nasty cycle of large transient latency, frantic repeating of web requests, protocols sending multiple copies, lack of awareness of true capacity or queue size or throughput, etc., all of which, as you articulate so well, create the chaos and frustration in the network.
Analyzing that is really difficult, and if we don’t measure and sense, we have no hope of understanding, controlling, or ameliorating such situations.

Len

> On Jul 9, 2021, at 12:31 PM, David P. Reed wrote:
>
> Len - I admit I made a mistake in challenging Little's Law as being based on Poisson processes. It is more general. But it tells you an "average" in its base form, and latency averages are not useful for end user applications.
>
> However, Little's Law does assume something that is not actually valid about the kind of distributions seen in the network, and in fact, it is NOT true that networks converge on Poisson arrival times.
>
> The key issue is well-described in the standard analysis of the M/M/1 queue (e.g. https://en.wikipedia.org/wiki/M/M/1_queue), which is done only for Poisson processes, and is also limited to "stable" systems. But networks are never stable when fully loaded. They get unstable and those instabilities persist for a long time in the network. Instability is at core the underlying *requirement* of the Internet's usage.
>
> So specifically: real networks, even large ones, and certainly the Internet today, are not asymptotic limits of sums of stationary stochastic arrival processes. Each external terminal of any real network has a real user there, running a real application, and the network is a complex graph. This makes it completely unlike a single queue. Even the links within a network carry a relatively small number of application flows. There's no ability to apply the Law of Large Numbers to the distributions, because any particular path contains only a small number of serialized flows with highly variable rates.
>
> Here's an example of what really happens in a real network (I've observed this in 5 different cities on ATT's cellular network, back when it was running Alcatel Lucent HSPA+ gear in those cities).
> But you can see this on any network where transient overload occurs, creating instability.
>
> At 7 AM, the data transmission of the network is roughly stable. That's because no links are overloaded within the network. Little's Law can tell you by observing the delay and throughput on any path that the average delay in the network is X.
>
> Continue sampling delay in the network as the day wears on. At about 10 AM, ping delay starts to soar into the multiple second range. No packets are lost. The peak ping time is about 4000 milliseconds - 4 seconds in most of the networks. This is in downtown, no radio errors are reported, no link errors.
> So it is all queueing delay.
>
> Now Little's law doesn't tell you much about average delay, because clearly *some* subpiece of the network is fully saturated. But what is interesting here is what is happening and where. You can't tell what is saturated, and in fact the entire network is quite unstable, because the peak is constantly varying and you don't know where the throughput is. All the packets are now arriving 4 seconds or so later.
>
> Why is the situation not worse than 4 seconds? Well, there are multiple things going on:
>
> 1) TCP may be doing a lot of retransmissions (not Poisson at all, and not random either: the arrival process is entirely deterministic in each source, based on the retransmission timeout), or it may not be.
>
> 2) Users are pissed off, because they clicked on a web page, and got nothing back. They retry on their screen, or they try another site. Meanwhile, the underlying TCP connection remains there, pumping the network full of more packets on that old path, which is still backed up with packets that haven't been delivered that are sitting in queues. The real arrival process is not Poisson at all, it's a deterministic, repeated retransmission plus a new attempt to connect to a new site.
>
> 3) When the users get a web page back eventually, it is filled with names of other pieces needed to display that web page, which causes some number (often as many as 100) of new pages to be fetched, ALL at the same time. Certainly not a stochastic process that will just obey the law of large numbers.
>
> All of these things are the result of initial instability, causing queues to build up.
>
> So what is the state of the system? Is it stable? Is it stochastic? Is it the sum of enough stochastic stable flows to average out to Poisson?
>
> The answer is clearly NO. Control theory (not queuing theory) suggests that this system is completely uncontrolled and unstable.
>
> So if the system is in this state, what does Little's Lemma tell us? What is the meaning of that highly variable 4 second delay on ping packets, in terms of average utilization of the network?
>
> We don't even know what all the users really might need, if the system hadn't become unstable, because some users have given up, and others are trying even harder, and new users are arriving.
>
> What we do know, because ATT (at my suggestion) reconfigured their system after blaming Apple Computer company for "bugs" in the original iPhone in public, is that simply *dropping* packets sitting in queues more than a couple milliseconds MADE THE USERS HAPPY. Apparently the required capacity was there all along!
>
> So I conclude that the 4 second delay was the largest delay users could barely tolerate before deciding the network was DOWN and going away. And that the backup was the accumulation of useless packets sitting in queues because none of the end systems were receiving congestion signals (which for the Internet stack begins with packet dropping).
>
> I should say that most operators, and especially ATT in this case, do not measure end-to-end latency. Instead they use Little's Lemma to query routers for their current throughput in bits per second, and calculate latency as if Little's Lemma applied. This results in reports to management that literally say:
>
>   The network is not dropping packets, utilization is near 100% on many of our switches and routers.
>
> And management responds, Hooray! Because utilization of 100% of their hardware is their investors' metric of maximizing profits. The hardware they are operating is fully utilized. No waste! And users are happy because no packets have been dropped!
>
> Hmm... what's wrong with this picture? I can see why Donovan, CTO, would accuse Apple of lousy software that was ruining iPhone user experience! His network was operating without ANY problems.
> So it must be Apple!
>
> Well, no. The entire problem, as we saw when ATT just changed to shorten egress queues and drop packets when the egress queues overflowed, was that ATT's network was amplifying instability, not at the link level, but at the network level.
>
> And queueing theory can help with that, but *intro queueing theory* cannot.
>
> And a big part of that problem is the pervasive belief that, at the network boundary, *Poisson arrival* is a reasonable model for use in all cases.
>
> On Friday, July 9, 2021 6:05am, "Luca Muscariello" said:
>
> For those who might be interested in Little's law
> there is a nice paper by John Little on the occasion
> of the 50th anniversary of the result.
> https://www.informs.org/Blogs/Operations-Research-Forum/Little-s-Law-as-Viewed-on-its-50th-Anniversary
> https://www.informs.org/content/download/255808/2414681/file/little_paper.pdf
>
> Nice read.
> Luca
>
> P.S.
> Who has not a copy of L. Kleinrock's books? I do have and am not ready to lend them!
> On Fri, Jul 9, 2021 at 11:01 AM Leonard Kleinrock wrote:
> David,
> I totally appreciate your attention to when analytical modeling works and when it does not. Let me clarify a few things from your note.
> First, Little's law (also known as Little’s lemma or, as I use in my book, Little’s result) does not assume Poisson arrivals - it is good for any arrival process and any service process and is an equality between time averages. It states that the time average of the number in a system (for a sample path w) is equal to the average arrival rate to the system multiplied by the time-averaged time in the system for that sample path. This is often written as N_TimeAvg = λ · T_TimeAvg. Moreover, if the system is also ergodic, then the time average equals the ensemble average and we often write it as N̄ = λ T̄. In any case, this requires neither Poisson arrivals nor exponential service times.
>
> Queueing theorists often do study the case of Poisson arrivals. True, it makes the analysis easier, yet there is a better reason it is often used, and that is because the sum of a large number of independent stationary renewal processes approaches a Poisson process. So nature often gives us Poisson arrivals.
> Best,
> Len
> On Jul 8, 2021, at 12:38 PM, David P. Reed wrote:
>
> I will tell you flat out that the arrival time distribution assumption made by Little's Lemma that allows "estimation of queue depth" is totally unreasonable on ANY Internet in practice.
>
> The assumption is a Poisson Arrival Process. In reality, traffic arrivals in real internet applications are extremely far from Poisson, and, of course, using TCP windowing, become highly intercorrelated with crossing traffic that shares the same queue.
>
> So, as I've tried to tell many, many net-heads (people who ignore applications layer behavior, like the people that think latency doesn't matter to end users, only throughput), end-to-end packet arrival times on a practical network are incredibly far from Poisson - and they are more like fractal probability distributions, very irregular at all scales of time.
>
> So, the idea that iperf can estimate queue depth by Little's Lemma by just measuring saturation of capacity of a path is bogus. The less Poisson, the worse the estimate gets, by a huge factor.
>
> Where does the Poisson assumption come from? Well, like many theorems, it is the simplest tractable closed form solution - it creates a simplified view, by being a "single-parameter" distribution (the parameter is called lambda for a Poisson distribution). And the analysis of a simple queue with Poisson arrival distribution and a static, fixed service time is the first interesting Queueing Theory example in most textbooks. It is suggestive of an interesting phenomenon, but it does NOT characterize any real system.
>
> It's the queueing theory equivalent of "First, we assume a spherical cow..." in doing an example in a freshman physics class.
>
> Unfortunately, most networking engineers understand neither queuing theory nor application networking usage in interactive applications. Which makes them arrogant. They assume all distributions are Poisson!
>
> On Tuesday, July 6, 2021 9:46am, "Ben Greear" said:
>
> > Hello,
> >
> > I am interested to hear wish lists for network testing features. We make test equipment, supporting lots of wifi stations and a distributed architecture, with built-in udp, tcp, ipv6, http, ... protocols, and open to creating/improving some of our automated tests.
> >=20 > > I know Dave has some test scripts already, so I'm not necessarily = looking to > > reimplement that, > > but more fishing for other/new ideas. > >=20 > > Thanks, > > Ben > >=20 > > On 7/2/21 4:28 PM, Bob McMahon wrote: > > > I think we need the language of math here. It seems like the = network > > power metric, introduced by Kleinrock and Jaffe in the late 70s, is = something > > useful. > > > Effective end/end queue depths per Little's law also seems useful. = Both are > > available in iperf 2 from a test perspective. Repurposing test = techniques to > > actual > > > traffic could be useful. Hence the question around what exact = telemetry > > is useful to apps making socket write() and read() calls. > > > > > > Bob > > > > > > On Fri, Jul 2, 2021 at 10:07 AM Dave Taht > > >> wrote: > > > > > > In terms of trying to find "Quality" I have tried to encourage = folk to > > > both read "zen and the art of motorcycle maintenance"[0], and = Deming's > > > work on "total quality management". > > > > > > My own slice at this network, computer and lifestyle "issue" is = aiming > > > for "imperceptible latency" in all things. [1]. There's a lot of > > > fallout from that in terms of not just addressing queuing delay, = but > > > caching, prefetching, and learning more about what a user really = needs > > > (as opposed to wants) to know via intelligent agents. > > > > > > [0] If you want to get depressed, read Pirsig's successor to = "zen...", > > > lila, which is in part about what happens when an engineer hits an > > > insoluble problem. > > > [1] https://www.internetsociety.org/events/latency2013/ = > > > > > > > > > > > > > > > On Thu, Jul 1, 2021 at 6:16 PM David P. Reed > > >> wrote: > > > > > > > > Well, nice that the folks doing the conference are willing to > > consider that quality of user experience has little to do with = signalling rate at > > the > > > physical layer or throughput of FTP transfers. 
> > > > > > > > > > > > > > > > But honestly, the fact that they call the problem "network = quality" > > suggests that they REALLY, REALLY don't understand the Internet = isn't the hardware > > or > > > the routers or even the routing algorithms *to its users*. > > > > > > > > > > > > > > > > By ignoring the diversity of applications now and in the future, > > and the fact that we DON'T KNOW what will be coming up, this = conference will > > likely fall > > > into the usual trap that net-heads fall into - optimizing for some > > imaginary reality that doesn't exist, and in fact will probably = never be what > > users > > > actually will do given the chance. > > > > > > > > > > > > > > > > I saw this issue in 1976 in the group developing the original > > Internet protocols - a desire to put *into the network* special = tricks to optimize > > ASR33 > > > logins to remote computers from terminal concentrators (aka remote > > login), bulk file transfers between file systems on different = time-sharing > > systems, and > > > "sessions" (virtual circuits) that required logins. And then = trying to > > exploit underlying "multicast" by building it into the IP layer, = because someone > > > thought that TV broadcast would be the dominant application. > > > > > > > > > > > > > > > > Frankly, to think of "quality" as something that can be = "provided" > > by "the network" misses the entire point of "end-to-end argument in = system > > design". > > > Quality is not a property defined or created by The Network. If = you want > > to talk about Quality, you need to talk about users - all the users = at all times, > > > now and into the future, and that's something you can't do if you = don't > > bother to include current and future users talking about what they = might expect > > to > > > experience that they don't experience. 
> > > > > > > > > > > > > > > > There was much fighting back in 1976 that basically involved > > "network experts" saying that the network was the place to "solve" = such issues as > > quality, > > > so applications could avoid having to solve such issues. > > > > > > > > > > > > > > > > What some of us managed to do was to argue that you can't = "solve" > > such issues. All you can do is provide a framework that enables = different uses to > > > *cooperate* in some way. > > > > > > > > > > > > > > > > Which is why the Internet drops packets rather than queueing = them, > > and why diffserv cannot work. > > > > > > > > (I know the latter is conftroversial, but at the moment, ALL of > > diffserv attempts to talk about end-to-end applicaiton specific = metrics, but > > never, ever > > > explains what the diffserv control points actually do w.r.t. what = the IP > > layer can actually control. So it is meaningless - another violation = of the > > > so-called end-to-end principle). > > > > > > > > > > > > > > > > Networks are about getting packets from here to there, = multiplexing > > the underlying resources. That's it. Quality is a whole different = thing. Quality > > can > > > be improved by end-to-end approaches, if the underlying network = provides > > some kind of thing that actually creates a way for end-to-end = applications to > > > affect queueing and routing decisions, and more importantly = getting > > "telemetry" from the network regarding what is actually going on = with the other > > > end-to-end users sharing the infrastructure. > > > > > > > > > > > > > > > > This conference won't talk about it this way. So don't waste = your > > time. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wednesday, June 30, 2021 8:12pm, "Dave Taht" > > = >> said: > > > > > > > > > The program committee members are *amazing*. 
Perhaps, finally, > > we can > > > > > move the bar for the internet's quality metrics past endless, > > blind > > > > > repetitions of speedtest. > > > > > > > > > > For complete details, please see: > > > > > https://www.iab.org/activities/workshops/network-quality/ = > > > > > > > > > > > > > Submissions Due: Monday 2nd August 2021, midnight AOE > > (Anywhere On Earth) > > > > > Invitations Issued by: Monday 16th August 2021 > > > > > > > > > > Workshop Date: This will be a virtual workshop, spread over > > three days: > > > > > > > > > > 1400-1800 UTC Tue 14th September 2021 > > > > > 1400-1800 UTC Wed 15th September 2021 > > > > > 1400-1800 UTC Thu 16th September 2021 > > > > > > > > > > Workshop co-chairs: Wes Hardaker, Evgeny Khorov, Omer Shapira > > > > > > > > > > The Program Committee members: > > > > > > > > > > Jari Arkko, Olivier Bonaventure, Vint Cerf, Stuart Cheshire, > > Sam > > > > > Crowford, Nick Feamster, Jim Gettys, Toke Hoiland-Jorgensen, > > Geoff > > > > > Huston, Cullen Jennings, Katarzyna Kosek-Szott, Mirja > > Kuehlewind, > > > > > Jason Livingood, Matt Mathias, Randall Meyer, Kathleen > > Nichols, > > > > > Christoph Paasch, Tommy Pauly, Greg White, Keith Winstein. > > > > > > > > > > Send Submissions to: network-quality-workshop-pc@iab.org = > > >. > > > > > > > > > > Position papers from academia, industry, the open source > > community and > > > > > others that focus on measurements, experiences, observations > > and > > > > > advice for the future are welcome. Papers that reflect > > experience > > > > > based on deployed services are especially welcome. The > > organizers > > > > > understand that specific actions taken by operators are > > unlikely to be > > > > > discussed in detail, so papers discussing general categories > > of > > > > > actions and issues without naming specific technologies, > > products, or > > > > > other players in the ecosystem are expected. 
Papers should not > > focus > > > > > on specific protocol solutions. > > > > > > > > > > The workshop will be by invitation only. Those wishing to > > attend > > > > > should submit a position paper to the address above; it may > > take the > > > > > form of an Internet-Draft. > > > > > > > > > > All inputs submitted and considered relevant will be published > > on the > > > > > workshop website. The organisers will decide whom to invite > > based on > > > > > the submissions received. Sessions will be organized according > > to > > > > > content, and not every accepted submission or invited attendee > > will > > > > > have an opportunity to present as the intent is to foster > > discussion > > > > > and not simply to have a sequence of presentations. > > > > > > > > > > Position papers from those not planning to attend the virtual > > sessions > > > > > themselves are also encouraged. A workshop report will be > > published > > > > > afterwards. > > > > > > > > > > Overview: > > > > > > > > > > "We believe that one of the major factors behind this lack of > > progress > > > > > is the popular perception that throughput is the often sole > > measure of > > > > > the quality of Internet connectivity. With such narrow focus, > > people > > > > > don=E2=80=99t consider questions such as: > > > > > > > > > > What is the latency under typical working conditions? > > > > > How reliable is the connectivity across longer time periods? > > > > > Does the network allow the use of a broad range of protocols? > > > > > What services can be run by clients of the network? > > > > > What kind of IPv4, NAT or IPv6 connectivity is offered, and > > are there firewalls? > > > > > What security mechanisms are available for local services, > > such as DNS? > > > > > To what degree are the privacy, confidentiality, integrity > > and > > > > > authenticity of user communications guarded? 
> > > > > > > > > > Improving these aspects of network quality will likely depend > > on > > > > > measurement and exposing metrics to all involved parties, > > including to > > > > > end users in a meaningful way. Such measurements and exposure > > of the > > > > > right metrics will allow service providers and network > > operators to > > > > > focus on the aspects that impacts the users=E2=80=99 = experience > > most and at > > > > > the same time empowers users to choose the Internet service > > that will > > > > > give them the best experience." > > > > > > > > > > > > > > > -- > > > > > Latest Podcast: > > > > > > > = https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/ = > > = > > > > > > > > > > > Dave T=C3=A4ht CTO, TekLibre, LLC > > > > > _______________________________________________ > > > > > Cerowrt-devel mailing list > > > > > Cerowrt-devel@lists.bufferbloat.net = > > > > > > > > https://lists.bufferbloat.net/listinfo/cerowrt-devel = > > > > > > > > > > > > > > > > > > > > -- > > > Latest Podcast: > > > = https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/ = > > = > > > > > > > Dave T=C3=A4ht CTO, TekLibre, LLC > > > _______________________________________________ > > > Make-wifi-fast mailing list > > > Make-wifi-fast@lists.bufferbloat.net = > > > > > > https://lists.bufferbloat.net/listinfo/make-wifi-fast = > > > > > > > > > > > > This electronic communication and the information and any files = transmitted > > with it, or attached to it, are confidential and are intended solely = for the use > > of > > > the individual or entity to whom it is addressed and may contain = information > > that is confidential, legally privileged, protected by privacy laws, = or otherwise > > > restricted from disclosure to anyone else. 
If you are not the = intended > > recipient or the person responsible for delivering the e-mail to the = intended > > recipient, > > > you are hereby notified that any use, copying, distributing, = dissemination, > > forwarding, printing, or copying of this e-mail is strictly = prohibited. If you > > > received this e-mail in error, please return the e-mail to the = sender, delete > > it from your computer, and destroy any printed copy of it. > > > > > > _______________________________________________ > > > Starlink mailing list > > > Starlink@lists.bufferbloat.net = > > > https://lists.bufferbloat.net/listinfo/starlink = > > > > >=20 > >=20 > > -- > > Ben Greear > > > Candela Technologies Inc http://www.candelatech.com = > > > _______________________________________________ > Starlink mailing list > Starlink@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/starlink = _________________________= ______________________ > Make-wifi-fast mailing list > Make-wifi-fast@lists.bufferbloat.net = > https://lists.bufferbloat.net/listinfo/make-wifi-fast = --Apple-Mail=_422EA467-017F-4539-8634-56F78D0A0508 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 David,

No question that non-stationarity and instability are what we often see in networks. And, non-stationarity and instability are both topics that lead to very complex analytical problems in queueing theory. You can find some results on the transient analysis in the queueing theory literature (including the second volume of my Queueing Systems book), but they are limited and hard. Nevertheless, the literature does contain some works on transient analysis of queueing systems as applied to network congestion control - again limited. On the other hand, as you said, control theory addresses stability head on and does offer some tools as well, but again, it is hairy.

Averages are only averages, but they can provide valuable information. For sure, latency can and does confound behavior. But, as you point out, it is the proliferation of control protocols that are, in some cases, deployed willy-nilly in networks without proper evaluation of their behavior that can lead to the nasty cycle of large transient latency, frantic repeating of web requests, protocols sending multiple copies, lack of awareness of true capacity or queue size or throughput, etc., all of which, as you articulate so well, create the chaos and frustration in the network. Analyzing that is really difficult, and if we don’t measure and sense, we have no hope of understanding, controlling, or ameliorating such situations.

Len

On Jul 9, 2021, at 12:31 PM, David P. Reed <dpreed@deepplum.com> wrote:

Len - I admit I made a mistake in challenging Little's Law as being based on Poisson processes. It is more general. But it tells you an "average" in its base form, and latency averages are not useful for end user applications.

 

However, Little's Law does assume something that is not actually valid about the kind of distributions seen in the network, and in fact, it is NOT true that networks converge on Poisson arrival times.

 

The key issue is well-described in the standard analysis of the M/M/1 queue (e.g. https://en.wikipedia.org/wiki/M/M/1_queue), which is done only for Poisson processes, and is also limited to "stable" systems. But networks are never stable when fully loaded. They get unstable and those instabilities persist for a long time in the network. Instability is at core the underlying *requirement* of the Internet's usage.
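A minimal Python sketch of the textbook M/M/1 result referred to here, under exactly the assumptions being criticized: Poisson arrivals at rate lambda, exponential service at rate mu, and a stable system (lambda < mu). The function name and the 1000 packets/s service rate are illustrative assumptions, not values from this thread.

```python
def mm1_mean_time_in_system(arrival_rate, service_rate):
    """Mean time in system W = 1 / (mu - lambda) for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        # The steady-state formula has no meaning at or beyond full load.
        raise ValueError("M/M/1 steady state requires arrival_rate < service_rate")
    return 1.0 / (service_rate - arrival_rate)


service_rate = 1000.0  # assumed: packets per second the link can serve
for utilization in (0.5, 0.9, 0.99, 0.999):
    delay = mm1_mean_time_in_system(utilization * service_rate, service_rate)
    print(f"utilization {utilization}: mean delay {delay * 1000:.1f} ms")
```

The formula grows without bound as utilization approaches 1 and is undefined at full load, which is why the steady-state analysis says nothing about a link that is actually saturated.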

 

So specifically: real networks, even large ones, and certainly the Internet today, are not asymptotic limits of sums of stationary stochastic arrival processes. Each external terminal of any real network has a real user there, running a real application, and the network is a complex graph. This makes it completely unlike a single queue. Even the links within a network carry a relatively small number of application flows. There's no ability to apply the Law of Large Numbers to the distributions, because any particular path contains only a small number of serialized flows with highly variable rates.

 

Here's an example of what really happens in a real network (I've observed this in 5 different cities on ATT's cellular network, back when it was running Alcatel Lucent HSPA+ gear in those cities).
But you can see this on any network where transient overload occurs, creating instability.

 

 

At 7 AM, the data transmission of the network is roughly stable. That's because no links are overloaded within the network. Little's Law can tell you by observing the delay and throughput on any path that the average delay in the network is X.
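As a hedged illustration of the relation being invoked, the sketch below checks Little's Law (L = lambda * W) on a single sample path with deliberately non-Poisson, bursty arrivals. The single FIFO server, the burst pattern, and the 0.15 s service time are all assumptions chosen for the example; they do not model any real network.

```python
def fifo_departures(arrivals, service_times):
    """Departure times for a single FIFO server, given sorted arrival times."""
    departures = []
    server_free_at = 0.0
    for arrival, service in zip(arrivals, service_times):
        start = max(arrival, server_free_at)
        server_free_at = start + service
        departures.append(server_free_at)
    return departures


def time_average_in_system(arrivals, departures, horizon):
    """Integrate the number-in-system step function over [0, horizon]."""
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, last_time = 0.0, 0, 0.0
    for time, change in events:
        area += in_system * (time - last_time)
        in_system += change
        last_time = time
    return area / horizon


# Deliberately non-Poisson input: a burst of 5 packets every second.
arrivals = [float(burst) for burst in range(20) for _ in range(5)]
service_times = [0.15] * len(arrivals)
departures = fifo_departures(arrivals, service_times)

horizon = departures[-1]                 # the system is empty again at this point
arrival_rate = len(arrivals) / horizon   # lambda
mean_delay = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)  # W
mean_in_system = time_average_in_system(arrivals, departures, horizon)         # L

print(f"L = {mean_in_system:.4f}, lambda * W = {arrival_rate * mean_delay:.4f}")
```

The two printed values agree because the law is an identity between time averages on the sample path. It says nothing about the tail of the delay distribution, which is the quantity end users feel.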

 

Continue sampling delay in the network as the day wears on. At about 10 AM, ping delay starts to soar into the multiple second range. No packets are lost. The peak ping time is about 4000 milliseconds - 4 seconds in most of the networks. This is in downtown, no radio errors are reported, no link errors.
So it is all queueing delay.
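To give a feel for what 4 seconds of pure queueing delay implies, here is a back-of-the-envelope sketch: the backlog ahead of a packet equals the delay multiplied by the rate at which the bottleneck drains. The 2 Mbit/s drain rate and the 1500-byte packet size are hypothetical values picked for illustration; the thread does not state the actual link rates.

```python
def backlog_bytes(queueing_delay_s, drain_rate_bps):
    """Bytes that must be queued ahead of a packet to delay it this long."""
    return queueing_delay_s * drain_rate_bps / 8.0


ASSUMED_DRAIN_RATE_BPS = 2_000_000  # hypothetical bottleneck rate, not from the thread
ASSUMED_PACKET_BYTES = 1500         # full-size packet, also an assumption

queued = backlog_bytes(4.0, ASSUMED_DRAIN_RATE_BPS)
print(f"{queued:.0f} bytes, roughly {queued / ASSUMED_PACKET_BYTES:.0f} full-size packets")
```

Under these assumed numbers the standing queue holds about a megabyte of data that arrives too late to be useful.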

 

Now Little's law doesn't tell you much about average delay, because clearly *some* subpiece of the network is fully saturated. But what is interesting here is what is happening and where. You can't tell what is saturated, and in fact the entire network is quite unstable, because the peak is constantly varying and you don't know where the throughput is. All the packets are now arriving 4 seconds or so later.

 

Why is the situation not worse than 4 seconds? Well, there are multiple things going on:

 

1) TCP may or may not be doing a lot of retransmissions. If it is, they are not Poisson at all, and not random either: the arrival process is entirely deterministic in each source, based on the retransmission timeout.

 

2) Users are pissed off, because they clicked on a web page, and got nothing back. They retry on their screen, or they try another site. Meanwhile, the underlying TCP connection remains there, pumping the network full of more packets on that old path, which is still backed up with packets that haven't been delivered that are sitting in queues. The real arrival process is not Poisson at all; it's a deterministic, repeated retransmission plus a new attempt to connect to a new site.

 

3) When the users eventually get a web page back, it is filled with names of other pieces needed to display that web page, which causes some number (often as many as 100) of new fetches, ALL at the same time. Certainly not a stochastic process that will just obey the law of large numbers.
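
To illustrate point 1, here is a sketch of the times at which one TCP sender would retransmit a single unacknowledged segment. It assumes the RFC 6298 defaults (initial RTO of 1 second, doubled after every timeout) and a hypothetical 60 second cap. Real stacks derive the RTO from measured round-trip times, so the numbers are only illustrative.

```python
def retransmit_times(initial_rto=1.0, max_rto=60.0, attempts=6):
    """Times (seconds after the first send) at which retransmissions fire."""
    times, now, rto = [], 0.0, initial_rto
    for _ in range(attempts):
        now += rto                    # the timer expires, the segment is resent
        times.append(now)
        rto = min(rto * 2, max_rto)   # exponential backoff, capped
    return times


print(retransmit_times())  # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
```

The schedule is fixed by the timer, not drawn from a distribution, and every sender that stalls at the same moment follows the same schedule.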

 

All of these things are the result of initial instability, causing queues to build up.

 

So what is the state of the system? Is it stable? Is it stochastic? Is it the sum of enough stochastic stable flows to average out to Poisson?

 

The answer is clearly NO. Control theory (not queueing theory) suggests that this system is completely uncontrolled and unstable.

 

So if the system is in this state, what does Little's Lemma tell us? What is the meaning of that highly variable 4 second delay on ping packets, in terms of average utilization of the network?

 

We don't even know what all the users really might have needed if the system hadn't become unstable, because some users have given up, others are trying even harder, and new users are arriving.

 

What we do know, because ATT (at my suggestion) reconfigured their system after publicly blaming Apple for "bugs" in the original iPhone, is that simply *dropping* packets that had been sitting in queues for more than a couple of milliseconds MADE THE USERS HAPPY. Apparently the required capacity was there all along!
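
A minimal Python sketch of that idea, assuming a FIFO egress queue that timestamps packets on arrival and discards any packet that has waited longer than a threshold by the time it reaches the head. The class name, the 2 ms default and the method names are hypothetical. This is not what ATT deployed and it is not CoDel or any other specific AQM, just the simplest form of "drop what has sat too long".

```python
from collections import deque


class StaleDropQueue:
    """FIFO queue that drops packets whose queueing time exceeds max_wait_s."""

    def __init__(self, max_wait_s: float = 0.002):
        self.max_wait_s = max_wait_s
        self._q = deque()  # entries are (enqueue_time, packet)
        self.dropped = 0

    def enqueue(self, packet, now: float) -> None:
        self._q.append((now, packet))

    def dequeue(self, now: float):
        """Return the next packet that is still fresh, or None if none is left."""
        while self._q:
            enqueued_at, packet = self._q.popleft()
            if now - enqueued_at <= self.max_wait_s:
                return packet
            self.dropped += 1  # the loss is the congestion signal to the sender
        return None


if __name__ == "__main__":
    q = StaleDropQueue(max_wait_s=0.002)
    q.enqueue("old", now=0.000)
    q.enqueue("fresh", now=0.009)
    print(q.dequeue(now=0.010), q.dropped)  # fresh 1
```

The stale packet is discarded instead of delaying everything behind it, and the sender sees a loss it can react to.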

 

So I conclude that the 4 second delay was the largest delay users could barely tolerate before deciding the network was DOWN and going away. And that the backup was the accumulation of useless packets sitting in queues because none of the end systems were receiving congestion signals (which for the Internet stack begin with packet dropping).

 

I should say that most operators, and especially ATT in this case, do not measure end-to-end latency. Instead they query routers for their current throughput in bits per second, and calculate latency as if Little's Lemma applied. This results in reports to management that literally say:

 

  The network is not dropping packets, utilization is near 100% on many of our switches and routers.

 

And management responds, Hooray! Because utilization of 100% of their hardware is their investors' metric of maximizing profits. The hardware they are operating is fully utilized. No waste! And users are happy because no packets have been dropped!

 

Hmm... what's wrong with this picture? I can see why Donovan, CTO, would accuse Apple of lousy software that was ruining iPhone user experience! His network was operating without ANY problems. So it must be Apple!

 

Well, no. The entire problem, as we saw when ATT just changed to shorten egress queues and drop packets when the egress queues overflowed, was that ATT's network was amplifying instability, not at the link level, but at the network level.

 

And queueing theory can help with that, but *intro queueing theory* cannot.

 

And a big part of that problem is the pervasive belief that, at the network boundary, *Poisson arrival* is a reasonable model for use in all cases.

On Friday, July 9, 2021 6:05am, "Luca Muscariello" <muscariello@ieee.org> said:

For those who might be interested in Little's law, there is a nice paper by John Little on the occasion of the 50th anniversary of the result.
 
Nice read. 
Luca 
 
P.S. 
Who does not have a copy of L. Kleinrock's books? I do, and I am not ready to lend them!

On Fri, Jul 9, 2021 at 11:01 AM Leonard Kleinrock <lk@cs.ucla.edu> wrote:
David,

I totally appreciate your attention to when analytical modeling works and when it does not. Let me clarify a few things from your note.

First, Little's law (also known as Little’s lemma or, as I use in my book, Little’s result) does not assume Poisson arrivals - it is good for any arrival process and any service process and is an equality between time averages. It states that the time average of the number in a system (for a sample path w) is equal to the average arrival rate to the system multiplied by the time-averaged time in the system for that sample path. This is often written as N_TimeAvg = λ·T_TimeAvg. Moreover, if the system is also ergodic, then the time average equals the ensemble average and we often write it as N̄ = λ T̄. In any case, this requires neither Poisson arrivals nor exponential service times.
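
The sample-path statement can be checked numerically. Below is a minimal Python sketch, assuming a single-server FIFO queue fed by deliberately non-Poisson traffic (hypothetical bursts of four arrivals at the same instant, with unequal service times). It integrates N(t) by sweeping arrival and departure events and compares the time average with λ·T over an interval that starts and ends with an empty system.

```python
def fifo_departures(arrivals, services):
    """Departure times for a single-server FIFO queue (arrivals sorted)."""
    departures, free_at = [], 0.0
    for a, s in zip(arrivals, services):
        free_at = max(a, free_at) + s
        departures.append(free_at)
    return departures


def littles_law_check(arrivals, services):
    """Return (N_TimeAvg, lam * T_TimeAvg) over [0, last departure]."""
    departures = fifo_departures(arrivals, services)
    horizon = departures[-1]
    # Sweep arrival (+1) and departure (-1) events to integrate N(t).
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, n, last_t = 0.0, 0, 0.0
    for t, delta in events:
        area += n * (t - last_t)
        n, last_t = n + delta, t
    lam = len(arrivals) / horizon
    t_avg = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)
    return area / horizon, lam * t_avg


if __name__ == "__main__":
    arrivals = [10.0 * b for b in range(5) for _ in range(4)]  # bursts of 4
    services = [1.0, 2.0, 0.5, 3.0] * 5
    n_avg, lam_t = littles_law_check(arrivals, services)
    print(f"N_TimeAvg = {n_avg:.6f}, lambda * T_TimeAvg = {lam_t:.6f}")
```

The two numbers agree even though nothing about the arrivals or the service times is Poisson or exponential.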
 
Queueing theorists often do study the case of Poisson arrivals. True, it makes the analysis easier, yet there is a better reason it is often used, and that is because the sum of a large number of independent stationary renewal processes approaches a Poisson process. So nature often gives us Poisson arrivals.
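
A rough numerical illustration of that limit, as a Python sketch. Each hypothetical source is a renewal process with inter-arrival times uniform on [0.5, 1.5], which is far from exponential. The code merges n such sources and reports the coefficient of variation (CV) of the merged inter-arrival gaps. A CV near 1 is what a Poisson process would give; it is a necessary sign, not a proof, and the seed and parameters are arbitrary.

```python
import random
import statistics


def superposed_gap_cv(n_sources: int, horizon: float = 50.0, seed: int = 1) -> float:
    """CV of inter-arrival gaps when n_sources renewal streams are merged."""
    rng = random.Random(seed)
    arrivals = []
    for _ in range(n_sources):
        t = rng.uniform(0.0, 1.5)        # random phase for each source
        while t < horizon:
            arrivals.append(t)
            t += rng.uniform(0.5, 1.5)   # non-exponential gaps (CV about 0.29)
    arrivals.sort()
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(f"{n:4d} sources: gap CV = {superposed_gap_cv(n):.2f}")
```

The convergence depends on the sources being many and independent, which is the assumption under dispute in this thread.
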
Best,
Len

On Jul 8, 2021, at 12:38 PM, David P. Reed <dpreed@deepplum.com> wrote:

I will tell you flat out that the arrival time distribution assumption made by Little's Lemma that allows "estimation of queue depth" is totally unreasonable on ANY Internet in practice.

 

The assumption is a Poisson Arrival Process. In reality, traffic arrivals in real internet applications are extremely far from Poisson, and, of course, using TCP windowing, become highly intercorrelated with crossing traffic that shares the same queue.

 

So, as I've tried to tell many, many net-heads (people who ignore applications layer behavior, like the people that think latency doesn't matter to end users, only throughput), end-to-end packet arrival times on a practical network are incredibly far from Poisson - and they are more like fractal probability distributions, very irregular at all scales of time.

 

So, the idea that iperf can estimate queue depth by Little's Lemma by just measuring saturation of capacity of a path is bogus. The less Poisson, the worse the estimate gets, by a huge factor.
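
A small Python sketch of how much the arrival pattern matters at the same utilization. It assumes a single FIFO server with a fixed service time of 1 and compares two cases at 80% load: Poisson arrivals, using the standard Pollaczek-Khinchine mean wait for M/D/1, and a hypothetical source that sends 20 packets at once every 25 time units, computed with the Lindley recursion. The burst size and period are illustration values only.

```python
def mean_wait_lindley(arrivals, service_time):
    """Mean waiting time in queue via the Lindley recursion (FIFO, one server)."""
    wait, total, prev = 0.0, 0.0, None
    for a in arrivals:
        if prev is not None:
            wait = max(0.0, wait + service_time - (a - prev))
        total += wait
        prev = a
    return total / len(arrivals)


def md1_mean_wait(lam, service_time):
    """Pollaczek-Khinchine mean wait for M/D/1: rho * s / (2 * (1 - rho))."""
    rho = lam * service_time
    return rho * service_time / (2.0 * (1.0 - rho))


if __name__ == "__main__":
    s = 1.0  # fixed service time
    # Hypothetical bursty source: 20 packets at once, every 25 time units.
    bursty = [25.0 * b for b in range(100) for _ in range(20)]
    print("utilization in both cases:", 20 * s / 25.0)
    print("mean wait, Poisson arrivals (M/D/1):", md1_mean_wait(0.8, s))
    print("mean wait, bursts of 20:", mean_wait_lindley(bursty, s))
```

Both cases show the same throughput and the same 80% utilization, yet the mean wait is about 2 time units in one and 9.5 in the other, so a throughput reading alone does not pin down the queue depth.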

 

 

Where does the Poisson assumption come from? Well, like many theorems, it is the simplest tractable closed form solution - it creates a simplified view, by being a "single-parameter" distribution (the parameter is called lambda for a Poisson distribution). And the analysis of a simple queue with Poisson arrival distribution and a static, fixed service time is the first interesting Queueing Theory example in most textbooks. It is suggestive of an interesting phenomenon, but it does NOT characterize any real system.

 

It's the queueing theory equivalent of "First, we assume a spherical cow..." in doing an example in a freshman physics class.

 

Unfortunately, most networking engineers understand neither queueing theory nor application networking usage in interactive applications. Which makes them arrogant. They assume all distributions are Poisson!

 

 

On Tuesday, July 6, 2021 9:46am, "Ben Greear" <greearb@candelatech.com> said:

> Hello,
>
> I am interested to hear wish lists for network testing features. We make test
> equipment, supporting lots of wifi stations and a distributed architecture,
> with built-in udp, tcp, ipv6, http, ... protocols, and open to
> creating/improving some of our automated tests.
>
> I know Dave has some test scripts already, so I'm not necessarily looking to
> reimplement that, but more fishing for other/new ideas.
>
> Thanks,
> Ben
>
> On 7/2/21 4:28 PM, Bob McMahon wrote:
> > I think we need the language of math here. It seems like the network
> > power metric, introduced by Kleinrock and Jaffe in the late 70s, is
> > something useful. Effective end/end queue depths per Little's law also
> > seem useful. Both are available in iperf 2 from a test perspective.
> > Repurposing test techniques to actual traffic could be useful. Hence the
> > question around what exact telemetry is useful to apps making socket
> > write() and read() calls.
> >
> > Bob
> >
> > On Fri, Jul 2, 2021 at 10:07 AM Dave Taht <dave.taht@gmail.com> wrote:
> >
> > In terms of trying to find "Quality" I have tried to encourage folk to
> > both read "zen and the art of motorcycle maintenance"[0], and Deming's
> > work on "total quality management".
> >
> > My own slice at this network, computer and lifestyle "issue" is aiming
> > for "imperceptible latency" in all things. [1]. There's a lot of
> > fallout from that in terms of not just addressing queuing delay, but
> > caching, prefetching, and learning more about what a user really needs
> > (as opposed to wants) to know via intelligent agents.
> >
> > [0] If you want to get depressed, read Pirsig's successor to "zen...",
> > lila, which is in part about what happens when an engineer hits an
> > insoluble problem.
> > [1] https://www.internetsociety.org/events/latency2013/
> >
> > On Thu, Jul 1, 2021 at 6:16 PM David P. Reed <dpreed@deepplum.com> wrote:
> > >
> > > Well, nice that the folks doing the conference are willing to consider
> > > that quality of user experience has little to do with signalling rate at
> > > the physical layer or throughput of FTP transfers.
> > >
> > > But honestly, the fact that they call the problem "network quality"
> > > suggests that they REALLY, REALLY don't understand the Internet isn't
> > > the hardware or the routers or even the routing algorithms *to its
> > > users*.
> > >
> > > By ignoring the diversity of applications now and in the future, and the
> > > fact that we DON'T KNOW what will be coming up, this conference will
> > > likely fall into the usual trap that net-heads fall into - optimizing
> > > for some imaginary reality that doesn't exist, and in fact will probably
> > > never be what users actually will do given the chance.
> > >
> > > I saw this issue in 1976 in the group developing the original Internet
> > > protocols - a desire to put *into the network* special tricks to
> > > optimize ASR33 logins to remote computers from terminal concentrators
> > > (aka remote login), bulk file transfers between file systems on
> > > different time-sharing systems, and "sessions" (virtual circuits) that
> > > required logins. And then trying to exploit underlying "multicast" by
> > > building it into the IP layer, because someone thought that TV broadcast
> > > would be the dominant application.
> > >
> > > Frankly, to think of "quality" as something that can be "provided" by
> > > "the network" misses the entire point of "end-to-end argument in system
> > > design". Quality is not a property defined or created by The Network. If
> > > you want to talk about Quality, you need to talk about users - all the
> > > users at all times, now and into the future, and that's something you
> > > can't do if you don't bother to include current and future users talking
> > > about what they might expect to experience that they don't experience.
> > >
> > > There was much fighting back in 1976 that basically involved "network
> > > experts" saying that the network was the place to "solve" such issues as
> > > quality, so applications could avoid having to solve such issues.
> > >
> > > What some of us managed to do was to argue that you can't "solve" such
> > > issues. All you can do is provide a framework that enables different
> > > uses to *cooperate* in some way.
> > >
> > > Which is why the Internet drops packets rather than queueing them, and
> > > why diffserv cannot work.
> > >
> > > (I know the latter is controversial, but at the moment, ALL of diffserv
> > > attempts to talk about end-to-end application specific metrics, but
> > > never, ever explains what the diffserv control points actually do w.r.t.
> > > what the IP layer can actually control. So it is meaningless - another
> > > violation of the so-called end-to-end principle).
> > >
> > > Networks are about getting packets from here to there, multiplexing the
> > > underlying resources. That's it. Quality is a whole different thing.
> > > Quality can be improved by end-to-end approaches, if the underlying
> > > network provides some kind of thing that actually creates a way for
> > > end-to-end applications to affect queueing and routing decisions, and
> > > more importantly getting "telemetry" from the network regarding what is
> > > actually going on with the other end-to-end users sharing the
> > > infrastructure.
> > >
> > > This conference won't talk about it this way. So don't waste your time.
> > >
> > > On Wednesday, June 30, 2021 8:12pm, "Dave Taht" <dave.taht@gmail.com> said:
> > >
> > > > The program committee members are *amazing*. Perhaps, finally, we can
> > > > move the bar for the internet's quality metrics past endless, blind
> > > > repetitions of speedtest.
> > > >
> > > > For complete details, please see:
> > > > https://www.iab.org/activities/workshops/network-quality/
> > > >
> > > > Submissions Due: Monday 2nd August 2021, midnight AOE (Anywhere On Earth)
> > > > Invitations Issued by: Monday 16th August 2021
> > > >
> > > > Workshop Date: This will be a virtual workshop, spread over three days:
> > > >
> > > > 1400-1800 UTC Tue 14th September 2021
> > > > 1400-1800 UTC Wed 15th September 2021
> > > > 1400-1800 UTC Thu 16th September 2021
> > > >
> > > > Workshop co-chairs: Wes Hardaker, Evgeny Khorov, Omer Shapira
> > > >
> > > > The Program Committee members:
> > > >
> > > > Jari Arkko, Olivier Bonaventure, Vint Cerf, Stuart Cheshire, Sam
> > > > Crowford, Nick Feamster, Jim Gettys, Toke Hoiland-Jorgensen, Geoff
> > > > Huston, Cullen Jennings, Katarzyna Kosek-Szott, Mirja Kuehlewind,
> > > > Jason Livingood, Matt Mathias, Randall Meyer, Kathleen Nichols,
> > > > Christoph Paasch, Tommy Pauly, Greg White, Keith Winstein.
> > > >
> > > > Send Submissions to: network-quality-workshop-pc@iab.org
> > > >
> > > > Position papers from academia, industry, the open source community and
> > > > others that focus on measurements, experiences, observations and
> > > > advice for the future are welcome. Papers that reflect experience
> > > > based on deployed services are especially welcome. The organizers
> > > > understand that specific actions taken by operators are unlikely to be
> > > > discussed in detail, so papers discussing general categories of
> > > > actions and issues without naming specific technologies, products, or
> > > > other players in the ecosystem are expected. Papers should not focus
> > > > on specific protocol solutions.
> > > >
> > > > The workshop will be by invitation only. Those wishing to attend
> > > > should submit a position paper to the address above; it may take the
> > > > form of an Internet-Draft.
> > > >
> > > > All inputs submitted and considered relevant will be published on the
> > > > workshop website. The organisers will decide whom to invite based on
> > > > the submissions received. Sessions will be organized according to
> > > > content, and not every accepted submission or invited attendee will
> > > > have an opportunity to present as the intent is to foster discussion
> > > > and not simply to have a sequence of presentations.
> > > >
> > > > Position papers from those not planning to attend the virtual sessions
> > > > themselves are also encouraged. A workshop report will be published
> > > > afterwards.
> > > >
> > > > Overview:
> > > >
> > > > "We believe that one of the major factors behind this lack of progress
> > > > is the popular perception that throughput is the often sole measure of
> > > > the quality of Internet connectivity. With such narrow focus, people
> > > > don’t consider questions such as:
> > > >
> > > > What is the latency under typical working conditions?
> > > > How reliable is the connectivity across longer time periods?
> > > > Does the network allow the use of a broad range of protocols?
> > > > What services can be run by clients of the network?
> > > > What kind of IPv4, NAT or IPv6 connectivity is offered, and are there
> > > > firewalls?
> > > > What security mechanisms are available for local services, such as DNS?
> > > > To what degree are the privacy, confidentiality, integrity and
> > > > authenticity of user communications guarded?
> > > >
> > > > Improving these aspects of network quality will likely depend on
> > > > measurement and exposing metrics to all involved parties, including to
> > > > end users in a meaningful way. Such measurements and exposure of the
> > > > right metrics will allow service providers and network operators to
> > > > focus on the aspects that impacts the users’ experience most and at
> > > > the same time empowers users to choose the Internet service that will
> > > > give them the best experience."
> > > >
> > > > --
> > > > Latest Podcast:
> > > > https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
> > > >
> > > > Dave Täht CTO, TekLibre, LLC
> > > > _______________________________________________
> > > > Cerowrt-devel mailing list
> > > > Cerowrt-devel@lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/cerowrt-devel
> >
> > _______________________________________________
> > Make-wifi-fast mailing list
> > Make-wifi-fast@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/make-wifi-fast
> >
> > _______________________________________________
> > Starlink mailing list
> > Starlink@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/starlink
> >
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>

= --Apple-Mail=_422EA467-017F-4539-8634-56F78D0A0508--