Thinking of networks not being fast enough... we wrote this years upon years ago at the Interop show. We shouldn't have been surprised, but we were - a lot of "the press" believed this:

https://www.cavebear.com/cb_catalog/techno/gaganet/

Here's the introduction snippet; the rest is at the link above:

   May 5, 1998: Las Vegas, Networld+Interop

   Today, the world's greatest collection of networking professionals gathered and constructed the first trans-relativistic network. The NOC Team used hyper-fiber to create the first network not limited by the speed of light. ... etc etc

        --karl--

On 10/15/23 1:39 PM, rjmcmahon via Nnagain wrote:
> Hi Jack,
>
> Thanks again for sharing. It's very interesting to me.
>
> Today, the networks are shifting from capacity constrained to latency constrained, as can be seen in the IX discussions about how the speed of light over fiber is too slow even between Houston & Dallas.
>
> The mitigations against standing queues (which cause bloat today) are:
>
> o) Shrink the e2e bottleneck queue so it will drop packets in a flow and TCP will respond to that "signal"
> o) Use some form of ECN marking where the network forwarding plane ultimately informs the TCP source state machine so it can slow down or pace effectively. This can be an earlier feedback signal and, if done well, can inform the sources to avoid bottleneck queuing. There are a couple of approaches with ECN. Comcast is trialing L4S now, which seems interesting to me as a WiFi test & measurement engineer. The jury is still out on this and measurements are needed.
> o) Mitigate source-side bloat via TCP_NOTSENT_LOWAT
>
> The QoS priority approach to congestion is orthogonal by my judgment, as it's typically not supported e2e; many networks will bleach DSCP markings. And it's really too late by my judgment.
>
> Also, on clock sync: yes, your generation did us both a service and a disservice by getting rid of the PSTN TDM clock ;) So IP networking devices kinda ignored clock sync, which makes e2e one-way delay (OWD) measurements impossible. Thankfully, the GPS atomic clock is now available mostly everywhere, and many devices use TCXO oscillators, so it's possible to get clock sync and use oscillators that can minimize drift. I pay $14 for an RPi4 GPS chip with pulse per second, as an example.
>
> It seems silly to me that clocks aren't synced to the GPS atomic clock, even if by a proxy, and even if only for measurement and monitoring.
>
> Note: As Richard Roy will point out, there really is no such thing as synchronized clocks across geographies per general relativity - so those syncing clocks need to keep those effects in mind. I limited the iperf 2 timestamps to microsecond precision in hopes of avoiding those issues.
>
> Note: With WiFi, a packet drop can occur because of an intermittent RF channel condition. TCP can't tell the difference between an RF drop and a congested-queue drop. That's another reason ECN markings from network devices may be better than dropped packets.
>
> Note: I've added some iperf 2 test support around pacing, as that seems to be the direction the industry is heading: networks are less and less capacity strained, and user quality of experience is being driven by tail latencies. One can also test with the Prague CCA for the L4S scenarios. (This is a fun project: https://www.l4sgear.com/ and fairly low cost)
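>
> For anyone curious what these knobs look like at the socket level, here is a minimal sketch - not iperf 2's actual code, just the standard Linux socket options named above, with placeholder values for the descriptor, rate, and low-water mark:
>
>     /* Sketch only: source-side pacing plus a cap on unsent data.
>      * The rate and low-water values are illustrative placeholders. */
>     #include <stdint.h>
>     #include <netinet/in.h>
>     #include <netinet/tcp.h>   /* TCP_NOTSENT_LOWAT */
>     #include <sys/socket.h>    /* SO_MAX_PACING_RATE */
>
>     int configure_sender(int fd)
>     {
>         /* Ask the stack (or the fq qdisc) to pace this socket at
>          * ~12.5 MByte/s, i.e. 100 Mbit/s - the knob that --fq-rate
>          * below drives. */
>         uint32_t pacing_rate = 12500000;   /* bytes per second */
>         if (setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
>                        &pacing_rate, sizeof(pacing_rate)) < 0)
>             return -1;
>
>         /* Report the socket writable only while unsent data is under
>          * 128 KB, so the application doesn't pile up a deep send-side
>          * queue (the TCP_NOTSENT_LOWAT mitigation above). */
>         int notsent_lowat = 128 * 1024;    /* bytes */
>         if (setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
>                        &notsent_lowat, sizeof(notsent_lowat)) < 0)
>             return -1;
>
>         return 0;
>     }
>
> Pair this with poll()/select() or non-blocking writes so the application actually honors the low-water mark rather than queuing blindly into the kernel.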
>
> --fq-rate n[kmgKMG]
> Set a rate to be used with fair-queuing based socket-level pacing, in bytes or bits per second. Only available on platforms supporting the SO_MAX_PACING_RATE socket option. (Note: Here the suffixes indicate bytes/sec or bits/sec per use of uppercase or lowercase, respectively)
>
> --fq-rate-step n[kmgKMG]
> Set a step of rate to be used with fair-queuing based socket-level pacing, in bytes or bits per second. Step occurs every fq-rate-step-interval (defaults to one second)
>
> --fq-rate-step-interval n
> Time in seconds before stepping the fq-rate
>
> Bob
>
> PS. Iperf 2 man page https://iperf2.sourceforge.io/iperf-manpage.html
> git clone https://rjmcmahon@git.code.sf.net/p/iperf2/code iperf2-code
>
>> The "VGV User" (Voice, Gaming, Videoconferencing) cares a lot about latency. It's not just "rewarding" to have lower latencies; high latencies may make VGV unusable. Average (or "typical") latency as the FCC label proposes isn't a good metric to judge usability. A path which has high variance in latency can be unusable even if the average is quite low. Having your voice or video or gameplay "break up" every minute or so when latency spikes to 500 msec makes the "user experience" intolerable.
>>
>> A few years ago, I ran some simple "ping" tests to help a friend who was trying to use a gaming app. My data was only for one specific path so it's anecdotal. What I saw was surprising - zero data loss, every datagram was delivered, but occasionally a datagram would take up to 30 seconds to arrive. I didn't have the ability to poke around inside, but I suspected it was an experience of "bufferbloat", enabled by the dramatic drop in price of memory over the decades.
>>
>> It's been a long time since I was involved in operating any part of the Internet, so I don't know much about the inner workings today. Apologies for my ignorance....
>>
>> There was a scenario in the early days of the Internet for which we struggled to find a technical solution. Imagine some node in the bowels of the network, with 3 connected "circuits" to some other nodes. On two of those inputs, traffic is arriving to be forwarded out the third circuit. The incoming flows are significantly more than the outgoing path can accept.
>>
>> What happens? How is "backpressure" generated so that the incoming flows are reduced to the point that the outgoing circuit can handle the traffic?
>>
>> About 45 years ago, while we were defining TCPV4, we struggled with this issue, but didn't find any consensus solutions. So "placeholder" mechanisms were defined in TCPV4, to be replaced as research continued and found a good solution.
>>
>> In that "placeholder" scheme, the "Source Quench" (SQ) IP message was defined; it was to be sent by a switching node back toward the sender of any datagram that had to be discarded because there wasn't any place to put it.
>>
>> In addition, the TOS (Type Of Service) and TTL (Time To Live) fields were defined in IP.
>>
>> TOS would allow the sender to distinguish datagrams based on their needs. For example, we thought "Interactive" service might be needed for VGV traffic, where timeliness of delivery was most important. "Bulk" service might be useful for activities like file transfers, backups, et al. "Normal" service might now mean activities like using the Web.
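>>
>> (For reference: the old TOS byte lives on today as the DSCP field, and an application can still ask for a class with a single socket option. A rough sketch - the function name and the EF value below are only illustrative, and nothing guarantees the network will honor the marking:)
>>
>>     /* Sketch: request an "interactive"-style class by setting DSCP EF
>>      * (46) in the former TOS byte. Many networks re-mark or bleach it. */
>>     #include <netinet/in.h>     /* IPPROTO_IP, IP_TOS */
>>     #include <sys/socket.h>
>>
>>     int mark_interactive(int fd)
>>     {
>>         int tos = 46 << 2;      /* DSCP sits in the upper six bits of the byte */
>>         return setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
>>     }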
>>
>> The TTL field was an attempt to inform each switching node about the "expiration date" for a datagram. If a node somehow knew that a particular datagram was unlikely to reach its destination in time to be useful (such as a video datagram for a frame that has already been displayed), the node could, and should, discard that datagram to free up resources for useful traffic. Sadly we had no mechanisms for measuring delay, either in transit or in queuing, so TTL was defined in terms of "hops", which is not an accurate proxy for time. But it's all we had.
>>
>> Part of the complexity was that the "flow control" mechanism of the Internet had put much of the mechanism in the users' computers' TCP implementations, rather than the switches which handle only IP. Without mechanisms in the users' computers, all a switch could do is order more circuits, and add more memory to the switches for queuing. Perhaps that led to "bufferbloat".
>>
>> So TOS, SQ, and TTL were all placeholders, for some mechanism in a future release that would introduce a "real" form of Backpressure and the ability to handle different types of traffic. Meanwhile, these rudimentary mechanisms would provide some flow control. Hopefully the users' computers sending the flows would respond to the SQ backpressure, and switches would prioritize traffic using the TTL and TOS information.
>>
>> But, being way out of touch, I don't know what actually happens today. Perhaps the current operators and current government watchers can answer?
>>
>> 1/ How do current switches exert Backpressure to reduce competing traffic flows? Do they still send SQs?
>>
>> 2/ How do the current and proposed government regulations treat the different needs of different types of traffic, e.g., "Bulk" versus "Interactive" versus "Normal"? Are Internet carriers permitted to treat traffic types differently? Are they permitted to charge different amounts for different types of service?
>>
>> Jack Haverty
>>
>> On 10/15/23 09:45, Dave Taht via Nnagain wrote:
>>> For starters I would like to apologize for cc-ing both nanog and my new nn list. (I will add sender filters)
>>>
>>> A bit more below.
>>>
>>> On Sun, Oct 15, 2023 at 9:32 AM Tom Beecher wrote:
>>>>> So for now, we'll keep paying for transit to get to the others (since it’s about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>>
>>>> There is often a chicken/egg scenario here with the economics. As an eyeball network, your costs to build out and connect to Dallas are greater than your transit cost, so you do that. Totally fair.
>>>>
>>>> However, think about it from the content side. Say I want to build into Houston. I have to put routers in, and a bunch of cache servers, so I have capital outlay, plus opex for space, power, IX/backhaul/transit costs. That's not cheap, so there's a lot of calculations that go into it. Is there enough total eyeball traffic there to make it worth it? Is saving 8-10ms enough of a performance boost to justify the spend? What are the long term trends in that market? These answers are of course different for a company running their own CDN vs the commercial CDNs.
>>>>
>>>> I don't work for Google and obviously don't speak for them, but I would suspect that they're happy to eat an 8-10ms performance hit to serve from Dallas, versus the amount of capital outlay to build out there right now.
>>>
>>> The three forms of traffic I care most about are voip, gaming, and videoconferencing, which are rewarding to have at lower latencies. When I was a kid, we had switched phone networks, and while the sound quality was poorer than today, the voice latency cross-town was just like "being there". Nowadays we see 500+ms latencies for this kind of traffic.
>>>
>>> As to how to make calls across town work that well again, cost-wise, I do not know, but the volume of traffic that would be better served by these interconnects is quite low relative to the overall gains in lower-latency experiences for them.
>>>
>>>> On Sat, Oct 14, 2023 at 11:47 PM Tim Burke wrote:
>>>>> I would say that a 1Gbit IP transit in a carrier neutral DC can be had for a good bit less than $900 on the wholesale market.
>>>>>
>>>>> Sadly, IXPs are seemingly turning into a pay to play game, with rates almost costing as much as transit in many cases after you factor in loop costs.
>>>>>
>>>>> For example, in the Houston market (one of the largest and fastest growing regions in the US!), we do not have a major IX, so to get up to Dallas it’s several thousand for a 100g wave, plus several thousand for a 100g port on one of those major IXes. Or, a better option, we can get a 100g flat internet transit for just a little bit more.
>>>>>
>>>>> Fortunately, for us as an eyeball network, there are a good number of major content networks that are allowing for private peering in markets like Houston for just the cost of a cross connect and a QSFP if you’re in the right DC, with Google and some others being the outliers.
>>>>>
>>>>> So for now, we'll keep paying for transit to get to the others (since it’s about as much as transporting IXP from Dallas), and hoping someone at Google finally sees Houston as more than a third rate city hanging off of Dallas. Or… someone finally brings a worthwhile IX to Houston that gets us more than peering to Kansas City. Yeah, I think the former is more likely. 😊
>>>>>
>>>>> See y’all in San Diego this week,
>>>>> Tim
>>>>>
>>>>> On Oct 14, 2023, at 18:04, Dave Taht wrote:
>>>>>> This set of trendlines was very interesting. Unfortunately the data stops in 2015. Does anyone have more recent data?
>>>>>>
>>>>>> https://drpeering.net/white-papers/Internet-Transit-Pricing-Historical-And-Projected.php
>>>>>>
>>>>>> I believe a gbit circuit that an ISP can resell still runs at about $900 - $1.4k (?) in the usa? How about elsewhere?
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> I am under the impression that many IXPs remain very successful, states without them suffer, and I also find the concept of doing micro IXPs at the city level appealing, and now achievable with cheap gear. Finer grained cross connects between telco and ISP and IXP would lower latencies across town quite hugely...
>>>>>>
>>>>>> PS I hear ARIN is planning on dropping the price for, and bundling, 3 BGP AS numbers at a time, as of the end of this year, also.
>>>>>>
>>>>>> --
>>>>>> Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html
>>>>>> Dave Täht CSO, LibreQos
>
> _______________________________________________
> Nnagain mailing list
> Nnagain@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/nnagain