From: "David P. Reed" <dpreed@deepplum.com>
To: starlink@lists.bufferbloat.net
Date: Sat, 30 Jul 2022 17:12:01 -0400 (EDT)
Subject: [Starlink] Natural Internet Traffic matters more than the network architects want to think about

There's been a good discussion here triggered by my comments on "avoiding queueing delay" being a "good operating point" of the Internet. I have always thought that the research community interested in bettering networks ought to invest its time in modeling the "real need" of "real people" rather than creating yet another queueing model that misses what matters.

Queueing theory is a pretty nice foundation, but it isn't sufficient unto itself, because it is full of toy examples and narrowly relevant theorems. That's not a complaint about queueing theory itself, but a complaint about assuming it provides the answers in and of itself.

There's a related set of applications of queueing theory that also suffers from similar misuse - which I encountered long before network packet switching queueing theory - and that's computer OS scheduling of processes. There is no valid reason to believe that the load presented to a computer timesharing system can be modeled as Poisson arrivals. But in the 1970's, OS research was full of claims about optimizing time sharing schedulers that assumed each user was a Poisson process. I worked on a real time sharing system with real users, and we worked on the scheduler for that system to make it usable. (It was a 70-user Multics system with two processors and 784K 36-bit words of memory that were paged from a Librafile "drum" that had a 16 msec latency.)

I mention this because it became VERY obvious that NONE of the queueing theory theorems could tell us what to do to schedule user processes on that system. And the reason was simple - humans are NOT Poisson processes, nor are algorithms.

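To make the "not Poisson" point concrete, here is a minimal, purely illustrative simulation sketch (the service time, batch size, and rates below are arbitrary assumptions, not measurements): two arrival streams with the SAME average rate feed the same fixed-rate server, one Poisson, one bursty, and the waiting times come out wildly different.

# Compare Poisson vs bursty arrivals, same mean rate, one FIFO server.
import random
import statistics

random.seed(1)

SERVICE = 0.8          # fixed service time per job (illustrative)
MEAN_GAP = 1.0         # mean interarrival time for both streams
N = 50_000             # jobs per experiment

def poisson_arrivals(n):
    """Exponential interarrival gaps -> Poisson arrival process."""
    t, out = 0.0, []
    for _ in range(n):
        t += random.expovariate(1.0 / MEAN_GAP)
        out.append(t)
    return out

def bursty_arrivals(n, batch=20):
    """Same mean rate, but jobs arrive in back-to-back batches."""
    t, out = 0.0, []
    while len(out) < n:
        t += batch * MEAN_GAP          # long quiet gap...
        out.extend([t] * batch)        # ...then a burst of `batch` jobs at once
    return out[:n]

def waiting_times(arrivals):
    """Single FIFO server with deterministic service; per-job wait."""
    waits, free_at = [], 0.0
    for a in arrivals:
        start = max(a, free_at)
        waits.append(start - a)
        free_at = start + SERVICE
    return waits

for name, arr in [("poisson", poisson_arrivals(N)), ("bursty", bursty_arrivals(N))]:
    w = sorted(waiting_times(arr))
    print(f"{name:8s} mean wait {statistics.mean(w):6.2f}  "
          f"p99 wait {w[int(0.99 * len(w))]:6.2f}")

Same average load, an order of magnitude apart in delay - which is exactly why an "average arrival rate" tells you so little.
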
If you look at the whole Internet today, the primary load is the WWW, and now "conversational AV" [Zoom]. Not telephony, not video streaming, not voice.

And if you look at what happens when ANY user clicks on a link, and often even when some program running in the browser makes a decision at some scheduled time instant, a cascade of highly correlated events happens in unpredictable parts of the whole system.

This cascade of events doesn't "average" or "smooth". In fact, what happens actually changes the users' behavior in the future, a feedback loop that is really hard to predict in advance.

Also, the event cascade is highly parallelized across endpoints. A typical landing page on the web launches near-simultaneous requests for up to 100 different sets of data (cookies, advertising, and so on).

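A back-of-the-envelope sketch of why such a burst matters (the object size and access-link rate are assumptions chosen only for illustration): the responses all funnel through the user's access link, so the tail of the burst waits behind the queue the burst itself created.

# Self-induced queueing delay for a burst of near-simultaneous responses.
N_OBJECTS = 100
OBJECT_BYTES = 50 * 1024          # assume ~50 KB per response
LINK_BITS_PER_SEC = 20e6          # assume a 20 Mbit/s access link

total_bits = N_OBJECTS * OBJECT_BYTES * 8
burst_drain_time = total_bits / LINK_BITS_PER_SEC

print(f"one object alone:  {OBJECT_BYTES * 8 / LINK_BITS_PER_SEC * 1000:6.1f} ms")
print(f"last object in the burst waits up to {burst_drain_time * 1000:6.1f} ms "
      f"behind the queue the burst itself created")
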
This cascade is also getting much worse, because Google has decided to invent a non-TCP protocol called QUIC, which uses UDP packets for all kinds of concurrent activity.

Statistically, this is incredibly bursty (despite what Comcast researchers might share with academics).

Moreover, response times involve latencies adding up, because many packets can be emitted only after earlier packets have made many RTTs over the Internet.

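As a rough illustration (the RTT and the per-step round-trip counts below are assumed, not measured), even a single object fetched over a fresh connection pays several round trips before any payload moves, and each dependent wave of requests adds more:

# RTTs, not bandwidth, dominate response time for chained requests.
RTT_MS = 50.0                      # assumed round-trip time to the server

steps = {
    "DNS lookup": 1,
    "TCP handshake": 1,
    "TLS handshake": 2,            # classic full TLS 1.2-style handshake
    "HTTP request/response": 1,
}

first_fetch_ms = sum(steps.values()) * RTT_MS
print(f"first object: {first_fetch_ms:.0f} ms of pure round trips")

# A page that discovers resources in 3 dependent waves (HTML -> CSS/JS -> images)
# pays at least one more RTT per wave, even if everything else were perfect.
WAVES = 3
print(f"plus {WAVES} dependent waves: >= {first_fetch_ms + WAVES * RTT_MS:.0f} ms")
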
This is the actual reality of the Internet (or any corporate Intranet).

Yet, every user-driven action (retrieving a file of any size, or firing a bullet at an "enemy") wants to have predictable, very low response time overall - AS SEEN BY THE USERS.

What to do?

Well, in fact, a lot can be done. You can't tell the designers of applications to redesign their applications to be "less bursty" at a bottleneck link. Instead, what you do is make it so there are never any substantive queues anywhere in the network, 99.99% of the time. (Notice I'm talking about "availability" here, not "goodput" - 4 9's of availability is 99.99%, which is OK; 5 9's is 99.999%, which is great.)

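For scale, here is what those availability figures translate to over a year (straightforward arithmetic, nothing more):

# Time per year when queues (or outages) ARE allowed to intrude, by "nines".
MIN_PER_YEAR = 365.25 * 24 * 60

for nines, pct in [(3, 99.9), (4, 99.99), (5, 99.999)]:
    slack_min = MIN_PER_YEAR * (1 - pct / 100)
    print(f"{nines} nines ({pct}%): about {slack_min:7.1f} minutes/year not meeting the goal")
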
How is this done in a related but simpler environment? Well, I happen to be familiar with high-performance computing architecture in the data center - which also has networks in it. They are called "buses" - the memory buses, I/O buses, and so on. Ask yourself what the "average utilization" of the memory bus that connects CPU to DRAM actually is. Well, it's well under 10%. Well under. That is achieved by caching DRAM contents in the CPU, to achieve a very high hit ratio in the L3 cache. But the L3 cache isn't fast enough to feed 6-10 cores. So each group of cores has an L2 cache that caches what is in L3, to get a pretty high hit ratio, and so on. And the L1 cache can respond to CPU demand much of the time; even so, that path is too slow for the rate at which the processor can do adds, tests, etc., which is why each core runs multiple hyperthreads concurrently to hide the remaining latency.

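A rough sketch of the arithmetic behind "well under 10%" (the hit ratios are illustrative assumptions, not measurements of any particular CPU):

# Each cache level absorbs most of the references from the level above it,
# so only the misses of misses of misses ever touch the memory bus.
L1_HIT = 0.95
L2_HIT = 0.80   # of the L1 misses
L3_HIT = 0.70   # of the L2 misses

dram_fraction = (1 - L1_HIT) * (1 - L2_HIT) * (1 - L3_HIT)
print(f"fraction of CPU references that reach DRAM: {dram_fraction:.3%}")
# With these (made-up but plausible) numbers, only ~0.3% of references hit the
# memory bus - it is kept far from saturation by design, not by luck.
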
The insight that is common to the Internet and high-performance computing bus architectures is that *you can't design application loads to fit the network architecture*. Instead, you create very flexible, overprovisioned switching frameworks, and you don't spend your time on what happens when one link is saturated - because when you are at that operating point, your design has FAILED.

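Even the crudest single-queue delay formula - which, as argued above, is not a faithful model of real traffic - already shows why operating near saturation fails; real bursty traffic only makes it worse:

# Mean delay relative to an unloaded link, M/M/1-style: 1 / (1 - utilization).
for rho in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"utilization {rho:4.2f}: delay ~{1 / (1 - rho):6.1f}x the unloaded delay")
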
The Hotrodder Fallacy infects some of computer architecture, too. The Stream benchmark reports a throughput number from a completely useless test of the memory channel in a system. But in fact, no one buys systems or builds applications as if the Stream number matters very much.

Sadly, just as in high-performance computer architecture, it is very, very difficult to create a simple model of the workload on a complex, highly concurrent system.

And you can't get "data" that would let you design without looking at the real world - any measurement is corrupted by the limitations of a particular architecture and a particular workload at a particular time.

So, what I would suggest is taking a really good look at the real world.

No one is setting up any observation mechanisms for the potentially serious issues that will pop up when QUIC becomes the replacement for HTTP/1.1 and HTTP/2. Yet that is coming. They claim they have "congestion control". I have reviewed the spec. There are no simulations, no measurements, no thought about "time constants". There are no "flows", so "flow fairness" won't help balance and stabilize traffic among independent users.

But since we won't even have a way to LOOK at the behavior of the rate-limiting edges of the network with QUIC, we will be left wondering what the hell happened.

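One possible shape for such an observation mechanism, sketched only as an illustration (nothing here comes from the QUIC spec or from Starlink): timestamped UDP probes sent across the rate-limiting edge while it carries real traffic. The echo endpoint is an assumed service you would run on the far side of the link; the address below is a placeholder.

# Probe RTT across the access link; compare idle vs. under-load runs.
import socket
import struct
import time

ECHO_HOST = ("192.0.2.1", 9000)   # placeholder (TEST-NET-1), assumed UDP echo server
PROBES = 100
TIMEOUT_S = 1.0

def probe_rtts():
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(TIMEOUT_S)
        for seq in range(PROBES):
            sent = time.monotonic()
            s.sendto(struct.pack("!Id", seq, sent), ECHO_HOST)
            try:
                s.recvfrom(64)
            except socket.timeout:
                continue                      # lost probe - also worth recording
            rtts.append((time.monotonic() - sent) * 1000.0)
            time.sleep(0.05)                  # ~20 probes per second
    return rtts

if __name__ == "__main__":
    rtts = sorted(probe_rtts())
    if rtts:
        print(f"min {rtts[0]:.1f} ms  median {rtts[len(rtts)//2]:.1f} ms  "
              f"p95 {rtts[int(0.95 * len(rtts))]:.1f} ms")
        # Run once on an idle link and once while the link is busy; the growth
        # of the median/p95 over the idle minimum is queueing delay at the edge.
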
This is exactly what I think is happening in Starlink, too. There are no probe points that can show where the queueing delay is in the Starlink design. In fairness, they have been racing to get a product out for Musk to brag about, which encourages the Hotrodder Fallacy to be used for measurements instead of real Internet use.