-----Original Message-----
From: Sebastian Moeller [mailto:moeller0@gmx.de]
Sent: Thursday, January 12, 2023 11:45 PM
To:
Cc: mike.reynolds@netforecast.com; 'libreqos'; '
Subject: RE: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
Hi RR
On 12 January 2023 22:57:32 CET, Dick Roy <
>FYI.
>
>https://www.fiercewireless.com/tech/cbrs-based-fwa-beats-starlink-performance-madden
>
[SM] He is so close:
[RR] Which is why I posted the link :-) I knew you'd latch on to his thread!
'Speed tests don’t tell us much about the capacity of the network, or the reliability of the network, or the true latency with larger packet sizes. Packet loss testing can help to fill in key missing information to give the end customer the smooth experience they’re looking for.'
and
'Packets received over 250 ms latency are considered too late to be useful for video conferencing.'
He actually reports both loss numbers and delay > 250 ms, so in spite of arguing that loss is the relevant metric he already dips his toes into the latency issue... I wonder whether his view will refine over time now that he has apparently moved from a link with 8% packet loss to one with a more sane 0.1% loss rate (no idea how he measured loss rate, though, or latency). I guess this shows that there is no single solution for all links; it really matters where one starts, and which of throughput, delay, and loss is the most painful and hence the dimension in need of a fix first.
Regards
Sebastian
>
>Nothing earth-shaking :-)
>
>RR
>
> _____
>
>From: Starlink [mailto:starlink-bounces@lists.bufferbloat.net] On Behalf Of Robert McMahon via Starlink
>Sent: Thursday, January 12, 2023 9:50 AM
>To: Sebastian Moeller
>Cc: Dave Taht via Starlink; mike.reynolds@netforecast.com; libreqos; David P. Reed; Rpm; bloat
>Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
>
>Hi Sebastian,
>
>You make a good point. What I did was issue a warning if the tool found it was being CPU limited vs i/o limited. This indicates the i/o test is likely inaccurate from an i/o perspective, and the results are suspect. It does this crudely by comparing the cpu thread doing stats against the traffic threads doing i/o, to see which thread is waiting on the others. There is no attempt to assess the cpu load itself. So it's designed with a singular purpose of making sure i/o threads only block on the syscalls of write and read.
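>
>A minimal sketch of that idea (illustrative only, not iperf 2's actual logic; the function name and the 0.5 threshold are assumptions): time how much of the test a non-blocking sender spends waiting for the socket to become writable.
>
>    import select, socket, time
>
>    def classify_bottleneck(sock, payload=b"x" * 65536, duration=2.0):
>        # Non-blocking sender: time spent in select() is network wait;
>        # everything else is CPU work (copies, syscalls, bookkeeping).
>        sock.setblocking(False)
>        waiting = 0.0
>        start = time.monotonic()
>        while time.monotonic() - start < duration:
>            try:
>                sock.send(payload)
>            except BlockingIOError:
>                t0 = time.monotonic()
>                select.select([], [sock], [], 0.1)  # wait until writable
>                waiting += time.monotonic() - t0
>        frac = waiting / (time.monotonic() - start)
>        # Mostly waiting on the network => the i/o result is meaningful;
>        # rarely waiting => the sender's CPU may be the real bottleneck.
>        return "i/o limited" if frac > 0.5 else "possibly CPU limited"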
>
>I probably should revisit this both in design and implementation. Thanks for bringing it up and all input is truly appreciated.
>
>Bob
>
>On Jan 12, 2023, at 12:14 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>
>Hi Bob,
>
> On Jan 11, 2023, at 21:09, rjmcmahon <rjmcmahon@rjmcmahon.com> wrote:
>
> Iperf 2 is designed to measure network i/o. Note: It doesn't have to move large amounts of data. It can support data profiles that don't drive TCP's CCA, as an example.
>
> Two things I've been asked for and avoided:
>
> 1) Integrate clock sync into iperf's test traffic
>
> [SM] This I understand, measurement conditions can be unsuited for tight time synchronization...
>
> 2) Measure and output CPU usages
>
> [SM] This one puzzles me, as far as I understand the only way to properly diagnose network issues is to rule out other things, like CPU overload, that can have symptoms similar to network issues. As an example, the cake qdisc will, if CPU cycles become tight, first increase its internal queueing and jitter (not consciously, it is just an observation that once cake does not get access to the CPU as timely as it wants, queueing latency and variability increase) and then later also show reduced throughput - similar things can happen along an e2e network path for completely different reasons, e.g. lower-level retransmissions or a variable-rate link. So I would think that checking the CPU load, at least coarsely, would be within the scope of network testing tools, no?
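>
> Such a coarse check is cheap to add; a minimal sketch, assuming Linux and its /proc/stat layout (run_network_test is a hypothetical stand-in for the measurement itself):
>
>     def cpu_times():
>         # First line of /proc/stat: "cpu user nice system idle iowait irq softirq steal ..."
>         with open("/proc/stat") as f:
>             fields = [int(x) for x in f.readline().split()[1:]]
>         idle = fields[3] + fields[4]   # idle + iowait
>         return idle, sum(fields)
>
>     idle0, total0 = cpu_times()
>     run_network_test()                 # hypothetical: the test under scrutiny
>     idle1, total1 = cpu_times()
>     busy = 100.0 * (1.0 - (idle1 - idle0) / (total1 - total0))
>     print(f"CPU busy during test: {busy:.1f}% (treat results warily if high)")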
>
>Regards
> Sebastian
>
> I think both of these are outside the scope of a tool designed to test network i/o over sockets, rather these should be developed & validated independently of a network i/o tool.
>
> Clock error really isn't about amount/frequency of traffic but rather getting a periodic high-quality reference. I tend to use GPS pulse per second to lock the local system oscillator to. As David says, most every modern handheld computer has the GPS chips to do this already. So to me it seems more of a policy choice between data center operators and device mfgs and less of a technical issue.
>
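> For reference, locking a Linux system clock to a GPS pulse per second can look roughly like this chrony configuration (the device path, offsets, and the gpsd SHM source are assumptions about the local setup):
>
>     # /etc/chrony.conf excerpt
>     # The NMEA/SHM source (e.g. from gpsd) labels the seconds; the PPS
>     # edge from /dev/pps0 supplies the precise tick.
>     refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
>     refclock PPS /dev/pps0 lock NMEA refid PPS
>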
> Bob
>
> Hello,
>
> Y'all can call me crazy if you want... but... see below [RWG]
> Hi Bob,
> On Jan 9, 2023, at 20:13, rjmcmahon via Starlink <starlink@lists.bufferbloat.net> wrote:
>
> My biggest barrier is the lack of clock sync by the devices, i.e. very limited support for PTP in data centers and in end devices. This limits the ability to measure one way delays (OWD), and most assume that OWD is 1/2 an RTT, which typically is a mistake. We know this intuitively with airplane flight times or even car commute times, where the one way time is not 1/2 a round trip time. Google maps & directions provide a time estimate for the one way link. It doesn't compute a round trip and divide by two.
>
> For those that can get clock sync working, the iperf 2 --trip-times option is useful.
> [SM] +1; and yet even with unsynchronized clocks one can try to measure how latency changes under load, and that can be done per direction. Sure, this is far inferior to real, reliably measured OWDs, but if life/the internet deals you lemons....
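> (To spell out why that works: with an unknown but constant clock offset theta between the hosts, any measured one way delay is OWD_meas = OWD_true + theta, so theta cancels in the difference OWD_meas(loaded) - OWD_meas(idle). The per-direction latency increase under load is therefore measurable even when the absolute OWD is not.)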
> [RWG] iperf2/iperf3, etc. are already moving large amounts of data back and forth - for that matter, any rate test is - so why not abuse some of that data and add the fundamental NTP clock sync data, and bidirectionally pass each other's concept of "current time"? IIRC (it's been 25 years since I worked on NTP at this level) you *should* be able to get a fairly accurate clock delta between each end, and then use that info and time stamps in the data stream to compute OWDs. You need to put 4 time stamps in the packet, and with that you can compute "offset".
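>
> The classic four-timestamp arithmetic, as a sketch (variable names are illustrative):
>
>     # t1: client transmit, t2: server receive,
>     # t3: server transmit, t4: client receive (all in seconds).
>     def ntp_offset_delay(t1, t2, t3, t4):
>         offset = ((t2 - t1) + (t3 - t4)) / 2.0  # server clock minus client clock
>         delay = (t4 - t1) - (t3 - t2)           # time actually spent on the wire
>         return offset, delay
>
>     # With the offset in hand, OWDs fall out of the same four stamps:
>     #   owd_forward = (t2 - t1) - offset
>     #   owd_reverse = (t4 - t3) + offset
>     # Caveat: the offset estimate itself assumes symmetric paths, which
>     # is the RTT/2 assumption in disguise, so treat it as approximate.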
>
> --trip-times
>   enable the measurement of end to end write to read latencies (client and server clocks must be synchronized)
> [RWG] --clock-skew
>   enable the measurement of the wall clock difference between sender and receiver
> [SM] Sweet!
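>
> For reference, a --trip-times run might look like this (the server name is a placeholder, and the flag needs a recent iperf 2 plus synchronized clocks):
>
>     # server side
>     iperf -s -e
>     # client side: enhanced output plus one way write-to-read latencies
>     iperf -c server.example.com -e --trip-times -i 1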
>
> Regards
> Sebastian
>
> Bob
>
> I have many kvetches about the new latency under load tests being designed and distributed over the past year. I am delighted! that they are happening, but most really need third party evaluation, and calibration, and a solid explanation of what network pathologies they do and don't cover. Also a RED team attitude towards them, as well as thinking hard about what you are not measuring (operations research).
>
> I actually rather love the new cloudflare speedtest, because it tests a single TCP connection, rather than dozens, and at the same time folk are complaining that it doesn't find the actual "speed!". Yet... the test itself more closely emulates a user experience than speedtest.net does. I am personally pretty convinced that the fewer flows a web page opens, the better the likelihood of a good user experience, but I lack data on it.
>
> To try to tackle the evaluation and calibration part, I've reached out to all the new test designers in the hope that we could get together and produce a report of what each new test is actually doing. I've tweeted, linked in, emailed, and spammed every measurement list I know of, with only some response. Please reach out to other test designer folks and have them join the rpm email list?
>
> My principal kvetches in the new tests so far are:
>
> 0) None of the tests last long enough.
>
> Ideally there should be a mode where they at least run to "time of first loss", or periodically, just run longer than the industry-stupid^H^H^H^H^H^Hstandard 20 seconds. There be dragons there! It's really bad science to optimize the internet for 20 seconds. It's like optimizing a car, to handle well, for just 20 seconds.
>
> 1) Not testing up + down + ping at the same time
>
> None of the new tests actually test the same thing that the infamous rrul test does - all the others still test up, then down, and ping. It was/remains my hope that the simpler parts of the flent test suite - such as the tcp_up_squarewave tests, the rrul test, and the rtt_fair tests - would provide calibration to the test designers.
>
> We've got zillions of flent results in the archive published here:
> https://blog.cerowrt.org/post/found_in_flent/
>
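> For anyone wanting to reproduce that calibration, a typical rrul run against a netperf server looks something like this (the hostname is a placeholder):
>
>     # 60 second rrul: 4 flows up, 4 flows down, plus latency probes, all at once
>     flent rrul -l 60 -H netperf.example.com -t "my-link" -o rrul.png
>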
> ps. Misinformation about iperf 2 impacts my ability to do this.
>
> The new tests have all added up + ping and down + ping, but not up + down + ping. Why??
>
> The behaviors of what happens in that case are really non-intuitive, I know, but... it's just one more phase to add to any one of those new tests. I'd be deliriously happy if someone(s) new to the field started doing that, even optionally, and boggled at how it defeated their assumptions.
>
> Among other things that would show...
>
> It's the home router industry's dirty secret that darn few "gigabit" home routers can actually forward in both directions at a gigabit. I'd like to smash that perception thoroughly, but given that our starting point for a "gigabit router" was a "gigabit switch" - historically something that couldn't even forward at 200Mbit - we have a long way to go there.
>
> Only in the past year have non-x86 home routers appeared that could actually do a gbit in both directions.
>
> 2) Few are actually testing within-stream latency
>
> Apple's rpm project is making a stab in that direction. It looks highly likely that, with a little more work, crusader and go-responsiveness can finally start sampling the tcp RTT, loss and markings more directly. As for the rest... sampling TCP_INFO on windows, and Linux, at least, always appeared simple to me, but I'm discovering how hard it is by delving deep into the rust behind crusader.
>
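> The Linux side really is short from userspace; a minimal sketch (the host is a placeholder, and the struct offsets match the common struct tcp_info layout but should be verified against <linux/tcp.h> for a given kernel):
>
>     import socket, struct
>
>     sock = socket.create_connection(("example.com", 80))  # placeholder host
>     raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
>     # tcpi_rtt / tcpi_rttvar are u32 microsecond fields at offsets 68/72.
>     rtt_us, rttvar_us = struct.unpack_from("II", raw, 68)
>     print(f"smoothed RTT: {rtt_us} us +/- {rttvar_us} us")
>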
> The goresponsiveness thing is also IMHO running WAY too many streams at the same time, I guess motivated by an attempt to have the test complete quickly?
>
> B) To try and tackle the validation problem:
>
> In the libreqos.io project we've established a testbed where tests can be plunked through various ISP plan network emulations. It's here:
> https://payne.taht.net (run the bandwidth test for what's currently hooked up)
>
> We could rather use an AS number and at least an ipv4/24 and an ipv6/48 to leverage with that, so I don't have to nat the various emulations. (And funding, anyone got funding?) Or, as the code is GPLv2 licensed, we'd love to see more test designers set up a testbed like this to calibrate their own stuff.
>
> Presently we're able to test:
>
> flent
> netperf
> iperf2
> iperf3
> speedtest-cli
> crusader
> the broadband forum udp based test: https://github.com/BroadbandForum/obudpst
> trexx
>
> There's also a virtual machine setup that we can remotely drive a web browser from (but I didn't want to nat the results to the world) to test other web services.
>
> _____
>
> Rpm mailing list
> Rpm@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/rpm
>
> _____
>
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.