From: Sebastian Moeller <moeller0@gmx.de>
Date: Mon, 16 Oct 2023 19:36:01 +0200
To: dickroy@alum.mit.edu, "Network Neutrality is back! Let´s make the technical aspects heard this time!"
Subject: Re: [NNagain] transit and peering costs projections

Hi Richard,

> On Oct 16, 2023, at 19:01, Dick Roy via Nnagain wrote:
> 
> Just an observation: ANY type of congestion control that changes application behavior in response to congestion, or predicted congestion (ECN), begs the question "How does throttling of application information exchange rate (aka behavior) affect the user experience and will the user tolerate it?"

[SM] The trade-off here is: if the application does not respond (or rather, if no application would respond), we would end up with congestion collapse, where no application would gain much of anything as the network busies itself trying to re-transmit dropped packets without making much headway... A simplistic application of game theory might imply that individual applications could try to game this, and generally that seems to be true, but we have remedies for that available...

> 
> Given any (complex and packet-switched) network topology of interconnected nodes and links, each with possibly a different capacity and characteristics, such as the internet today, IMO the two fundamental questions are:
> 
> 1) How can a given network be operated/configured so as to maximize aggregate throughput (i.e. achieve its theoretical capacity), and
> 2) What things in the network need to change to increase the throughput (aka parameters in the network with the largest Lagrange multipliers associated with them)?

[SM] The thing is, we generally know how to maximize (average) throughput: just add (over-)generous amounts of buffering. The problem is that this screws up the other important quality axis, latency... We ideally want low latency and, even more so, low latency variance (aka jitter) AND high throughput... It turns out, though, that above a certain throughput threshold* many users do not seem to care all that much for more throughput as long as interactive use cases are sufficiently responsive... but high responsiveness requires low latency and low jitter... This is actually a good thing, as it means we do not necessarily need to aim for 100% utilization (which almost requires deep buffers and hence results in compromised latency) but can get away with, say, 80-90%, where shallow buffers will do (or rather where buffer filling stays shallow; there is IMHO still value in having deep buffers for the rare events that need them).

*) This is not a hard physical law, so the exact threshold is not set in stone, but unless one has many parallel users, something in the 20-50 Mbps range is plenty, and that is only needed in the "loaded" direction; for pure consumers the upload can be thinner, for pure producers the download can be thinner.

> 
> I am not an expert in this field,

[SM] Nor am I, I come from the wet-ware side of things, so not even soft- or hard-ware ;)

> however it seems to me that answers to these questions would be useful, assuming they are not yet available!
> 
> Cheers,
> 
> RR
> 
> 
> -----Original Message-----
> From: Nnagain [mailto:nnagain-bounces@lists.bufferbloat.net] On Behalf Of rjmcmahon via Nnagain
> Sent: Sunday, October 15, 2023 1:39 PM
> To: Network Neutrality is back! Let´s make the technical aspects heard this time!
> Cc: rjmcmahon
> Subject: Re: [NNagain] transit and peering costs projections
> 
> Hi Jack,
> 
> Thanks again for sharing. It's very interesting to me.
> 
> Today, the networks are shifting from capacity constrained to latency constrained, as can be seen in the IX discussions about how the speed of light over fiber is too slow even between Houston & Dallas.
> 
> The mitigations against standing queues (which cause bloat today) are:
> 
> o) Shrink the e2e bottleneck queue so it will drop packets in a flow and TCP will respond to that "signal"
> o) Use some form of ECN marking where the network forwarding plane ultimately informs the TCP source state machine so it can slow down or pace effectively. This can be an earlier feedback signal and, if done well, can inform the sources to avoid bottleneck queuing. There are a couple of approaches with ECN. Comcast is trialing L4S now, which seems interesting to me as a WiFi test & measurement engineer. The jury is still out on this and measurements are needed.
> o) Mitigate source side bloat via TCP_NOTSENT_LOWAT
> 
> The QoS priority approach per congestion is orthogonal by my judgment, as it's typically not supported e2e; many networks will bleach DSCP markings. And it's really too late by my judgment.
> 
> Also, on clock sync, yes your generation did us both a service and disservice by getting rid of the PSTN TDM clock ;) So IP networking devices kinda ignored clock sync, which makes e2e one way delay (OWD) measurements impossible. Thankfully, the GPS atomic clock is now available mostly everywhere and many devices use TCXO oscillators, so it's possible to get clock sync and use oscillators that can minimize drift. I pay $14 for an RPi4 GPS chip with pulse per second as an example.
> 
> It seems silly to me that clocks aren't synced to the GPS atomic clock, even if by a proxy and even if only for measurement and monitoring.
> 
> Note: As Richard Roy will point out, there really is no such thing as synchronized clocks across geographies per general relativity - so those syncing clocks need to keep those effects in mind. I limited the iperf 2 timestamps to microsecond precision in hopes of avoiding those issues.
> 
> Note: With WiFi, a packet drop can occur because of an intermittent RF channel condition. TCP can't tell the difference between an RF drop vs a congested queue drop. That's another reason ECN markings from network devices may be better than dropped packets.
> 
> Note: I've added some iperf 2 test support around pacing, as that seems to be the direction the industry is heading as networks are less and less capacity strained and user quality of experience is being driven by tail latencies. One can also test with the Prague CCA for the L4S scenarios. (This is a fun project: https://www.l4sgear.com/ and fairly low cost)
> 
> --fq-rate n[kmgKMG]
> Set a rate to be used with fair-queuing based socket-level pacing, in bytes or bits per second. Only available on platforms supporting the SO_MAX_PACING_RATE socket option. (Note: Here the suffixes indicate bytes/sec or bits/sec per use of uppercase or lowercase, respectively)
> 
> --fq-rate-step n[kmgKMG]
> Set a step of rate to be used with fair-queuing based socket-level pacing, in bytes or bits per second. Step occurs every fq-rate-step-interval (defaults to one second)
> 
> --fq-rate-step-interval n
> Time in seconds before stepping the fq-rate
> 
> Bob
> 
> PS. Iperf 2 man page https://iperf2.sourceforge.io/iperf-manpage.html
> 
>> The "VGV User" (Voice, Gaming, Videoconferencing) cares a lot about latency. It's not just "rewarding" to have lower latencies; high latencies may make VGV unusable. Average (or "typical") latency as the FCC label proposes isn't a good metric to judge usability. A path which has high variance in latency can be unusable even if the average is quite low. Having your voice or video or gameplay "break up" every minute or so when latency spikes to 500 msec makes the "user experience" intolerable.
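
[SM] To put a rough number on Jack's point: a minimal sketch with made-up samples (95 packets at 20 ms plus 5 spikes at 500 ms; these values are mine, not measurements from anyone on this thread) shows how a mean can look perfectly fine while a high percentile reveals the interactivity problem:

    /* Illustrative only: why an average hides latency spikes.
     * The RTT samples below are invented for the example. */
    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b) {
        double d = *(const double *)a - *(const double *)b;
        return (d > 0) - (d < 0);
    }

    int main(void) {
        double rtt_ms[100];
        for (int i = 0; i < 95; i++) rtt_ms[i] = 20.0;    /* "good" samples */
        for (int i = 95; i < 100; i++) rtt_ms[i] = 500.0; /* occasional spikes */

        double sum = 0.0;
        for (int i = 0; i < 100; i++) sum += rtt_ms[i];
        qsort(rtt_ms, 100, sizeof(double), cmp);

        /* nearest-rank 99th percentile for n=100 is the 99th sorted value */
        printf("mean %.1f ms, p99 %.1f ms, max %.1f ms\n",
               sum / 100.0, rtt_ms[98], rtt_ms[99]);
        return 0;
    }

This prints a mean of 44 ms, which would look acceptable on a label, next to a p99 of 500 ms, which is what the user actually experiences every time a spike hits. That is essentially the argument for reporting latency under load at high percentiles rather than as an average.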
>> 
>> A few years ago, I ran some simple "ping" tests to help a friend who was trying to use a gaming app. My data was only for one specific path so it's anecdotal. What I saw was surprising - zero data loss, every datagram was delivered, but occasionally a datagram would take up to 30 seconds to arrive. I didn't have the ability to poke around inside, but I suspected it was an experience of "bufferbloat", enabled by the dramatic drop in price of memory over the decades.
>> 
>> It's been a long time since I was involved in operating any part of the Internet, so I don't know much about the inner workings today. Apologies for my ignorance....
>> 
>> There was a scenario in the early days of the Internet for which we struggled to find a technical solution. Imagine some node in the bowels of the network, with 3 connected "circuits" to some other nodes. On two of those inputs, traffic is arriving to be forwarded out the third circuit. The incoming flows are significantly more than the outgoing path can accept.
>> 
>> What happens? How is "backpressure" generated so that the incoming flows are reduced to the point that the outgoing circuit can handle the traffic?
>> 
>> About 45 years ago, while we were defining TCPV4, we struggled with this issue, but didn't find any consensus solutions. So "placeholder" mechanisms were defined in TCPV4, to be replaced as research continued and found a good solution.
>> 
>> In that "placeholder" scheme, the "Source Quench" (SQ) IP message was defined; it was to be sent by a switching node back toward the sender of any datagram that had to be discarded because there wasn't any place to put it.
>> 
>> In addition, the TOS (Type Of Service) and TTL (Time To Live) fields were defined in IP.
>> 
>> TOS would allow the sender to distinguish datagrams based on their needs. For example, we thought "Interactive" service might be needed for VGV traffic, where timeliness of delivery was most important. "Bulk" service might be useful for activities like file transfers, backups, et al. "Normal" service might now mean activities like using the Web.
>> 
>> The TTL field was an attempt to inform each switching node about the "expiration date" for a datagram. If a node somehow knew that a particular datagram was unlikely to reach its destination in time to be useful (such as a video datagram for a frame that has already been displayed), the node could, and should, discard that datagram to free up resources for useful traffic. Sadly we had no mechanisms for measuring delay, either in transit or in queuing, so TTL was defined in terms of "hops", which is not an accurate proxy for time. But it's all we had.
>> 
>> Part of the complexity was that the "flow control" mechanism of the Internet had put much of the mechanism in the users' computers' TCP implementations, rather than the switches which handle only IP. Without mechanisms in the users' computers, all a switch could do is order more circuits, and add more memory to the switches for queuing. Perhaps that led to "bufferbloat".
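
[SM] Interestingly, quite a bit of that "mechanism in the users' computers" now exists as ordinary socket options, which is what Bob's TCP_NOTSENT_LOWAT and --fq-rate/SO_MAX_PACING_RATE remarks above are about. A minimal Linux-flavoured sketch (the option names are the real ones; the file descriptor, the 128 KB watermark and the 25 Mbit/s pacing rate are arbitrary values I picked for illustration, not anything taken from iperf 2 itself):

    /* Sender-side bloat mitigation knobs on a connected TCP socket fd. */
    #include <stdio.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static int limit_sender_side_bloat(int fd) {
        /* Keep at most ~128 KB of not-yet-sent data queued in the kernel,
           so the application feels backpressure early. */
        int lowat = 128 * 1024;
        if (setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                       &lowat, sizeof(lowat)) < 0) {
            perror("TCP_NOTSENT_LOWAT");
            return -1;
        }

        /* Pace the flow to roughly 25 Mbit/s (the option takes bytes per
           second); honoured by the fq qdisc or TCP's internal pacing. */
        unsigned int pacing = 25 * 1000 * 1000 / 8;
        if (setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
                       &pacing, sizeof(pacing)) < 0) {
            perror("SO_MAX_PACING_RATE");
            return -1;
        }
        return 0;
    }

None of this helps with queues that build inside the network, of course, but it does keep the sending host from adding its own bloat on top.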
>> 
>> So TOS, SQ, and TTL were all placeholders, for some mechanism in a future release that would introduce a "real" form of Backpressure and the ability to handle different types of traffic. Meanwhile, these rudimentary mechanisms would provide some flow control. Hopefully the users' computers sending the flows would respond to the SQ backpressure, and switches would prioritize traffic using the TTL and TOS information.
>> 
>> But, being way out of touch, I don't know what actually happens today. Perhaps the current operators and current government watchers can answer?
>> 
>> 1/ How do current switches exert Backpressure to reduce competing traffic flows? Do they still send SQs?
>> 
>> 2/ How do the current and proposed government regulations treat the different needs of different types of traffic, e.g., "Bulk" versus "Interactive" versus "Normal"? Are Internet carriers permitted to treat traffic types differently? Are they permitted to charge different amounts for different types of service?
>> 
>> Jack Haverty
>> 
>> On 10/15/23 09:45, Dave Taht via Nnagain wrote:
>>> For starters I would like to apologize for cc-ing both nanog and my new nn list. (I will add sender filters)
>>> 
>>> A bit more below.
>>> 
>>> On Sun, Oct 15, 2023 at 9:32 AM Tom Beecher wrote:
>>>>> So for now, we'll keep paying for transit to get to the others (since it's about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>> 
>>>> There is often a chicken/egg scenario here with the economics. As an eyeball network, your costs to build out and connect to Dallas are greater than your transit cost, so you do that. Totally fair.
>>>> 
>>>> However, think about it from the content side. Say I want to build into Houston. I have to put routers in, and a bunch of cache servers, so I have capital outlay, plus opex for space, power, IX/backhaul/transit costs. That's not cheap, so there's a lot of calculations that go into it. Is there enough total eyeball traffic there to make it worth it? Is saving 8-10ms enough of a performance boost to justify the spend? What are the long term trends in that market? These answers are of course different for a company running their own CDN vs the commercial CDNs.
>>>> 
>>>> I don't work for Google and obviously don't speak for them, but I would suspect that they're happy to eat an 8-10ms performance hit to serve from Dallas, versus the amount of capital outlay to build out there right now.
>>> The three forms of traffic I care most about are voip, gaming, and videoconferencing, which are rewarding to have at lower latencies. When I was a kid, we had switched phone networks, and while the sound quality was poorer than today, the voice latency cross-town was just like "being there". Nowadays we see 500+ms latencies for this kind of traffic.
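
[SM] Since VoIP and gaming are exactly the traffic classes Jack's TOS field was meant for: the modern equivalent is DSCP, which an application can still set per socket even though, as Bob notes above, many networks bleach or re-mark it in transit. A minimal sketch (not from the thread; the EF marking and the UDP socket are just an example; on the local hop, e.g. WiFi WMM, such a marking can still buy interactive traffic a better queue):

    /* Mark a (UDP) socket's packets as EF (DSCP 46) via the legacy TOS byte. */
    #include <stdio.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    static int mark_interactive(int fd) {
        int tos = 46 << 2;  /* DSCP occupies the upper six bits of the TOS byte */
        if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) < 0) {
            perror("IP_TOS");
            return -1;
        }
        return 0;
    }

(For IPv6 the analogous option is IPV6_TCLASS.) Whether anything beyond the first hop honours the marking is, as discussed, a different question.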
>>> 
>>> As to how to make calls across town work that well again, cost-wise, I do not know, but the volume of traffic that would be better served by these interconnects is quite low, relative to the overall gains in lower latency experiences for them.
>>> 
>>> 
>>> 
>>>> On Sat, Oct 14, 2023 at 11:47 PM Tim Burke wrote:
>>>>> I would say that a 1Gbit IP transit in a carrier neutral DC can be had for a good bit less than $900 on the wholesale market.
>>>>> 
>>>>> Sadly, IXPs are seemingly turning into a pay to play game, with rates almost costing as much as transit in many cases after you factor in loop costs.
>>>>> 
>>>>> For example, in the Houston market (one of the largest and fastest growing regions in the US!), we do not have a major IX, so to get up to Dallas it's several thousand for a 100g wave, plus several thousand for a 100g port on one of those major IXes. Or, a better option, we can get a 100g flat internet transit for just a little bit more.
>>>>> 
>>>>> Fortunately, for us as an eyeball network, there are a good number of major content networks that are allowing for private peering in markets like Houston for just the cost of a cross connect and a QSFP if you're in the right DC, with Google and some others being the outliers.
>>>>> 
>>>>> So for now, we'll keep paying for transit to get to the others (since it's about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>>> 
>>>>> See y'all in San Diego this week,
>>>>> Tim
>>>>> 
>>>>> On Oct 14, 2023, at 18:04, Dave Taht wrote:
>>>>>> This set of trendlines was very interesting. Unfortunately the data stops in 2015. Does anyone have more recent data?
>>>>>> 
>>>>>> https://drpeering.net/white-papers/Internet-Transit-Pricing-Historical-And-Projected.php
>>>>>> 
>>>>>> I believe a gbit circuit that an ISP can resell still runs at about $900 - $1.4k (?) in the usa? How about elsewhere?
>>>>>> 
>>>>>> ...
>>>>>> 
>>>>>> I am under the impression that many IXPs remain very successful, states without them suffer, and I also find the concept of doing micro IXPs at the city level appealing, and now achievable with cheap gear. Finer grained cross connects between telco and ISP and IXP would lower latencies across town quite hugely...
>>>>>> 
>>>>>> PS I hear ARIN is planning on dropping the price for, and bundling, 3 BGP AS numbers at a time, as of the end of this year, also.
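
[SM] Just to put the "latencies across town" point and the 8-10ms Dallas detour into perspective, a back-of-the-envelope propagation check (the ~385 km Houston-Dallas route, the ~30 km metro path and the ~1.47 group index of fiber are my assumptions, not numbers from this thread):

    /* Rough fiber propagation delay; illustrative numbers only. */
    #include <stdio.h>

    int main(void) {
        const double c_km_s = 299792.458;          /* speed of light in vacuum */
        const double fiber_km_s = c_km_s / 1.47;   /* ~204,000 km/s in glass */
        const double houston_dallas_km = 385.0;    /* assumed route length */
        const double cross_town_km = 30.0;         /* assumed metro path */

        printf("Houston-Dallas RTT: %.1f ms\n", 2.0 * houston_dallas_km / fiber_km_s * 1000.0);
        printf("Cross-town RTT:     %.2f ms\n", 2.0 * cross_town_km / fiber_km_s * 1000.0);
        return 0;
    }

That is roughly 3.8 ms of unavoidable RTT for the Dallas hairpin versus a fraction of a millisecond for a local interconnect, broadly consistent with Tom's 8-10 ms once real routes and switching are added, and it illustrates why, once paths are that short, the queuing delay discussed earlier is what dominates what the user actually feels.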
>>>>>> 
>>>>>> -- 
>>>>>> Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html
>>>>>> Dave Täht CSO, LibreQos
> 
> _______________________________________________
> Nnagain mailing list
> Nnagain@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/nnagain