From: Sebastian Moeller
Date: Thu, 12 Jan 2023 09:14:32 +0100
To: rjmcmahon
Cc: "Rodney W. Grimes", Rpm, mike.reynolds@netforecast.com, "David P. Reed", libreqos, Dave Taht via Starlink, bloat
Subject: Re: [Rpm] [Starlink] Researchers Seeking Probe Volunteers in USA
Message-Id: <44D8ADEB-8F89-4477-BCBD-C597A888A83E@gmx.de>
References: <202301111832.30BIWevV030127@gndrsh.dnsmgr.net>
List-Id: revolutions per minute - a new metric for measuring responsiveness

Hi Bob,

> On Jan 11, 2023, at 21:09, rjmcmahon
wrote:
>
> Iperf 2 is designed to measure network i/o. Note: It doesn't have to move large amounts of data. It can support data profiles that don't drive TCP's CCA as an example.
>
> Two things I've been asked for and avoided:
>
> 1) Integrate clock sync into iperf's test traffic

[SM] This I understand, measurement conditions can be unsuited for tight time synchronization...

> 2) Measure and output CPU usages

[SM] This one puzzles me. As far as I understand, the only way to properly diagnose network issues is to rule out other things like CPU overload that can have symptoms similar to network issues. As an example, the cake qdisc will, if CPU cycles become tight, first increase its internal queueing and jitter (not consciously, it is just an observation that once cake does not get access to the CPU as timely as it wants, queuing latency and variability increase) and then later also show reduced throughput, so similar things to what can happen along an e2e network path for completely different reasons, e.g. lower level retransmissions or a variable rate link. So I would think that checking the CPU load, at least coarsely, would be within the scope of network testing tools, no?

Regards
Sebastian

> I think both of these are outside the scope of a tool designed to test network i/o over sockets, rather these should be developed & validated independently of a network i/o tool.
>
> Clock error really isn't about amount/frequency of traffic but rather getting a periodic high-quality reference. I tend to use GPS pulse per second to lock the local system oscillator to. As David says, most every modern handheld computer has the GPS chips to do this already. So to me it seems more of a policy choice between data center operators and device mfgs and less of a technical issue.
>
> Bob
>> Hello,
>> Yall can call me crazy if you want.. but...
>> see below [RWG]
>>> Hi Bob,
>>> > On Jan 9, 2023, at 20:13, rjmcmahon via Starlink wrote:
>>> >
>>> > My biggest barrier is the lack of clock sync by the devices, i.e. very limited support for PTP in data centers and in end devices. This limits the ability to measure one way delays (OWD) and most assume that OWD is 1/2 of RTT, which typically is a mistake. We know this intuitively with airplane flight times or even car commute times where the one way time is not 1/2 a round trip time. Google maps & directions provide a time estimate for the one way link. It doesn't compute a round trip and divide by two.
>>> >
>>> > For those that can get clock sync working, the iperf 2 --trip-times option is useful.
>>> [SM] +1; and yet even with unsynchronized clocks one can try to measure how latency changes under load and that can be done per direction. Sure this is far inferior to real reliably measured OWDs, but if life/the internet deals you lemons....
>> [RWG] iperf2/iperf3, etc are already moving large amounts of data
>> back and forth, for that matter any rate test, why not abuse some of
>> that data and add the fundamental NTP clock sync data and
>> bidirectionally pass each other's concept of "current time". IIRC (it's
>> been 25 years since I worked on NTP at this level) you *should* be
>> able to get a fairly accurate clock delta between each end, and then
>> use that info and time stamps in the data stream to compute OWD's.
>> You need to put 4 time stamps in the packet, and with that you can
>> compute "offset".
>>> >
>>> > --trip-times
>>> > enable the measurement of end to end write to read latencies (client and server clocks must be synchronized)
>> [RWG] --clock-skew
>> enable the measurement of the wall clock difference between sender and receiver
>>> [SM] Sweet!
>>> Regards
>>> Sebastian
>>> >
>>> > Bob
>>> >> I have many kvetches about the new latency under load tests being
>>> >> designed and distributed over the past year.
>>> >> I am delighted! that they
>>> >> are happening, but most really need third party evaluation, and
>>> >> calibration, and a solid explanation of what network pathologies they
>>> >> do and don't cover. Also a RED team attitude towards them, as well as
>>> >> thinking hard about what you are not measuring (operations research).
>>> >> I actually rather love the new cloudflare speedtest, because it tests
>>> >> a single TCP connection, rather than dozens, and at the same time folk
>>> >> are complaining that it doesn't find the actual "speed!". yet... the
>>> >> test itself more closely emulates a user experience than speedtest.net
>>> >> does. I am personally pretty convinced that the fewer numbers of flows
>>> >> that a web page opens improves the likelihood of a good user
>>> >> experience, but lack data on it.
>>> >> To try to tackle the evaluation and calibration part, I've reached out
>>> >> to all the new test designers in the hope that we could get together
>>> >> and produce a report of what each new test is actually doing. I've
>>> >> tweeted, linked in, emailed, and spammed every measurement list I know
>>> >> of, and only to some response, please reach out to other test designer
>>> >> folks and have them join the rpm email list?
>>> >> My principal kvetches in the new tests so far are:
>>> >> 0) None of the tests last long enough.
>>> >> Ideally there should be a mode where they at least run to "time of
>>> >> first loss", or periodically, just run longer than the
>>> >> industry-stupid^H^H^H^H^H^Hstandard 20 seconds. There be dragons
>>> >> there! It's really bad science to optimize the internet for 20
>>> >> seconds. It's like optimizing a car, to handle well, for just 20
>>> >> seconds.
>>> >> 1) Not testing up + down + ping at the same time
>>> >> None of the new tests actually test the same thing that the infamous
>>> >> rrul test does - all the others still test up, then down, and ping.
>>> >> It was/remains my hope that the simpler parts of the flent test suite -
>>> >> such as the tcp_up_squarewave tests, the rrul test, and the rtt_fair
>>> >> tests would provide calibration to the test designers.
>>> >> we've got zillions of flent results in the archive published here:
>>> >> https://blog.cerowrt.org/post/found_in_flent/
>>> >> ps. Misinformation about iperf 2 impacts my ability to do this.
>>> >
>>> >> The new tests have all added up + ping and down + ping, but not up +
>>> >> down + ping. Why??
>>> >> The behaviors of what happens in that case are really non-intuitive, I
>>> >> know, but... it's just one more phase to add to any one of those new
>>> >> tests. I'd be deliriously happy if someone(s) new to the field
>>> >> started doing that, even optionally, and boggled at how it defeated
>>> >> their assumptions.
>>> >> Among other things that would show...
>>> >> It's the home router industry's dirty secret that darn few "gigabit"
>>> >> home routers can actually forward in both directions at a gigabit. I'd
>>> >> like to smash that perception thoroughly, but given our starting point
>>> >> is a gigabit router was a "gigabit switch" - and historically been
>>> >> something that couldn't even forward at 200Mbit - we have a long way
>>> >> to go there.
>>> >> Only in the past year have non-x86 home routers appeared that could
>>> >> actually do a gbit in both directions.
>>> >> 2) Few are actually testing within-stream latency
>>> >> Apple's rpm project is making a stab in that direction. It looks
>>> >> highly likely, that with a little more work, crusader and
>>> >> go-responsiveness can finally start sampling the tcp RTT, loss and
>>> >> markings, more directly. As for the rest... sampling TCP_INFO on
>>> >> windows, and Linux, at least, always appeared simple to me, but I'm
>>> >> discovering how hard it is by delving deep into the rust behind
>>> >> crusader.
>>> >> the goresponsiveness thing is also IMHO running WAY too many streams
>>> >> at the same time, I guess motivated by an attempt to have the test
>>> >> complete quickly?
>>> >> B) To try and tackle the validation problem:
>>> >> ps. Misinformation about iperf 2 impacts my ability to do this.
>>> >
>>> >> In the libreqos.io project we've established a testbed where tests can
>>> >> be plunked through various ISP plan network emulations. It's here:
>>> >> https://payne.taht.net (run bandwidth test for what's currently hooked
>>> >> up)
>>> >> We could rather use an AS number and at least a ipv4/24 and ipv6/48 to
>>> >> leverage with that, so I don't have to nat the various emulations.
>>> >> (and funding, anyone got funding?) Or, as the code is GPLv2 licensed,
>>> >> to see more test designers setup a testbed like this to calibrate
>>> >> their own stuff.
>>> >> Presently we're able to test:
>>> >> flent
>>> >> netperf
>>> >> iperf2
>>> >> iperf3
>>> >> speedtest-cli
>>> >> crusader
>>> >> the broadband forum udp based test:
>>> >> https://github.com/BroadbandForum/obudpst
>>> >> trexx
>>> >> There's also a virtual machine setup that we can remotely drive a web
>>> >> browser from (but I didn't want to nat the results to the world) to
>>> >> test other web services.
>>> >> _______________________________________________
>>> >> Rpm mailing list
>>> >> Rpm@lists.bufferbloat.net
>>> >> https://lists.bufferbloat.net/listinfo/rpm
>>> > _______________________________________________
>>> > Starlink mailing list
>>> > Starlink@lists.bufferbloat.net
>>> > https://lists.bufferbloat.net/listinfo/starlink
>>> _______________________________________________
>>> Starlink mailing list
>>> Starlink@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/starlink
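
For reference, the four-timestamp idea raised in the [RWG] comments can be sketched roughly as below. This is only an illustration of the standard NTP offset and delay arithmetic, not anything iperf 2 implements; the names Exchange, best_offset, owd_client_to_server and owd_server_to_client are hypothetical, and the timestamps are made up. A single exchange cannot see path asymmetry (the offset formula assumes both directions take equally long), so the sketch assumes the offset is taken from the exchange with the smallest round trip, ideally on an idle link, and then applied to exchanges measured under load.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Exchange:
    """One request/reply exchange carrying four NTP-style timestamps (seconds)."""
    t1: float  # request sent, read from the client clock
    t2: float  # request received, read from the server clock
    t3: float  # reply sent, read from the server clock
    t4: float  # reply received, read from the client clock


def clock_offset(x: Exchange) -> float:
    """Estimated (server clock - client clock), assuming symmetric path delay."""
    return ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0


def round_trip_delay(x: Exchange) -> float:
    """Round-trip network time with the server's processing time removed."""
    return (x.t4 - x.t1) - (x.t3 - x.t2)


def best_offset(exchanges: Iterable[Exchange]) -> float:
    """Take the offset from the exchange with the smallest round-trip delay,
    since that one is the least likely to be distorted by queueing."""
    return clock_offset(min(exchanges, key=round_trip_delay))


def owd_client_to_server(x: Exchange, offset: float) -> float:
    """One-way delay client -> server, with t2 mapped onto the client clock."""
    return (x.t2 - offset) - x.t1


def owd_server_to_client(x: Exchange, offset: float) -> float:
    """One-way delay server -> client, with t3 mapped onto the client clock."""
    return x.t4 - (x.t3 - offset)


# Made-up numbers: the server clock runs 5 s ahead of the client clock.
# Idle link: 10 ms each way. Loaded link: 50 ms up, 10 ms down.
idle = Exchange(t1=100.000, t2=105.010, t3=105.011, t4=100.021)
loaded = Exchange(t1=200.000, t2=205.050, t3=205.051, t4=200.061)

offset = best_offset([idle, loaded])
up = owd_client_to_server(loaded, offset)
down = owd_server_to_client(loaded, offset)
print(f"offset={offset:.3f}s up={up * 1000:.1f}ms down={down * 1000:.1f}ms")
```

The error in the offset is bounded by half the round-trip delay of the exchange it was taken from, so if the idle path is itself asymmetric, the per-direction numbers inherit that bias. The change in each direction between idle and loaded is still meaningful, which matches the [SM] point about measuring latency changes under load per direction.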