From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rjmcmahon@rjmcmahon.com>
Received: from bobcat.rjmcmahon.com (bobcat.rjmcmahon.com [45.33.58.123])
 (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 7FBA63B2A4;
 Thu, 12 Jan 2023 12:50:00 -0500 (EST)
Received: from [192.168.1.95] (c-69-181-111-171.hsd1.ca.comcast.net
 [69.181.111.171])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by bobcat.rjmcmahon.com (Postfix) with ESMTPSA id 7E7A31B252;
 Thu, 12 Jan 2023 09:49:59 -0800 (PST)
DKIM-Filter: OpenDKIM Filter v2.11.0 bobcat.rjmcmahon.com 7E7A31B252
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rjmcmahon.com;
 s=bobcat; t=1673545799;
 bh=g6HNaCWaEfqkLGioPY3DoyhGseB5lOOL1S1ekkVaExA=;
 h=In-Reply-To:References:Subject:From:Date:To:CC:From;
 b=MhqigmPeMmeOHv2KYEzaiN4/KSwVrkx/mgkVZbLz0O7tt/1cmDu29qYL+OzLaj7tI
 6Vnag++AdmVZROtWZ/Sb8o4ELvCIg5jNlXmBrzto13fHZUKoIIrIHDB0mCOVispceI
 t2ik3VQvD6sZuCzvAVoSYQMZ6hkg7YY7862BLlYo=
In-Reply-To: <44D8ADEB-8F89-4477-BCBD-C597A888A83E@gmx.de>
References: <202301111832.30BIWevV030127@gndrsh.dnsmgr.net>
 <d9f54596c8a5393f9917dbd2d684fb9e@rjmcmahon.com>
 <44D8ADEB-8F89-4477-BCBD-C597A888A83E@gmx.de>
X-Referenced-Uid: 0000fa36567702d5
Thread-Topic: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in USA
X-Blue-Identity: !l=253&o=43&fo=11800&pl=211&po=0&qs=PREFIX&f=HTML&m=!%3AODY4NDIxODAtZDYzYS00ZmFiLTk1N2EtZjE0NWVlYzg4ZGQ1%3ASU5CT1g%3D%3AMDAwMGZhMzY1Njc3MDJkNQ%3D%3D%3AANSWERED&p=211&q=SHOW
X-Is-Generated-Message-Id: true
User-Agent: Android
MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="----BEBC4X4QA19Y90RNCVLLJMIFKXLVN9"
Content-Transfer-Encoding: 7bit
From: Robert McMahon <rjmcmahon@rjmcmahon.com>
Date: Thu, 12 Jan 2023 09:49:59 -0800
To: Sebastian Moeller <moeller0@gmx.de>
CC: "Rodney W. Grimes" <starlink@gndrsh.dnsmgr.net>,
 Rpm <rpm@lists.bufferbloat.net>, 
 mike.reynolds@netforecast.com,"David P. Reed" <dpreed@deepplum.com>,
 libreqos <libreqos@lists.bufferbloat.net>,
 Dave Taht via Starlink <starlink@lists.bufferbloat.net>, 
 bloat <bloat@lists.bufferbloat.net>
Message-ID: <752678c9-9fdf-4abe-9915-564e3989f4d8@rjmcmahon.com>
Subject: Re: [Bloat] [Starlink] [Rpm] Researchers Seeking Probe Volunteers
	in USA
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Thu, 12 Jan 2023 17:50:00 -0000

------BEBC4X4QA19Y90RNCVLLJMIFKXLVN9
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
 charset=UTF-8

Hi Sebastian,

You make a good point. What I did was issue a warning if the tool found
it was CPU limited rather than i/o limited. That indicates the test is
likely inaccurate from an i/o perspective, so the results are suspect.
The check is crude: it compares the CPU thread doing stats against the
traffic threads doing i/o to see which thread is waiting on the others.
There is no attempt to assess the CPU load itself. It is designed with
the singular purpose of making sure the i/o threads only block on the
write and read syscalls.
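
To make that concrete, here is a rough sketch of the comparison. It is
not the iperf 2 code; the function name and the two wait-time counters
are made up for illustration, and it assumes each side can total up how
long it sat blocked on the other.

```python
def classify_bottleneck(traffic_wait_s, reporter_wait_s):
    """Say which side of the tool was waiting on the other.

    traffic_wait_s: seconds the traffic threads spent blocked on the
        stats thread instead of in read/write (hypothetical counter).
    reporter_wait_s: seconds the stats thread spent idle, waiting for
        the traffic threads to hand it samples (hypothetical counter).
    """
    if traffic_wait_s < 0 or reporter_wait_s < 0:
        raise ValueError("wait times must be non-negative")
    # Traffic threads stalled behind the stats thread: the run was held
    # back by CPU, so the i/o numbers are suspect.
    if traffic_wait_s > reporter_wait_s:
        return "cpu-limited"
    # Otherwise the stats thread kept up and the traffic threads were
    # free to block only on their socket calls.
    return "io-limited"


if __name__ == "__main__":
    verdict = classify_bottleneck(2.0, 0.1)
    if verdict == "cpu-limited":
        print("WARNING: test looks CPU limited, results are suspect")
```

A real check would likely want some tolerance rather than a strict
comparison; that is left out to keep the sketch short.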

I probably should revisit this in both design and implementation. Thanks
for bringing it up; all input is truly appreciated.

Bob

On Jan 12, 2023, at 12:14 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>Hi Bob,
>
>
>> On Jan 11, 2023, at 21:09, rjmcm=
ahon <rjmcmahon@rjmcmahon=2Ecom> wrote:
>> 
>> Iperf 2 is designed to measu=
re network i/o=2E Note: It doesn't have to
>move large amounts of data=2E I=
t can support data profiles that don't
>drive TCP's CCA as an example=2E
>>=
 
>> Two things I've been asked for and avoided:
>> 
>> 1) Integrate clock =
sync into iperf's test traffic
>
>	[SM] This I understand, measurement cond=
itions can be unsuited for
>tight time synchronization=2E=2E=2E
>
>
>> 2) M=
easure and output CPU usages
>
>	[SM] This one puzzles me, as far as I unde=
rstand the only way to
>properly diagnose network issues is to rule out oth=
er things like CPU
>overload that can have symptoms similar to network issu=
es=2E As an
>example, the cake qdisc will, if CPU cycles become tight, first increase
>its internal queueing and jitter (not consciously, it is just an
>observation that once cake does not get access to the CPU as timely as
>it wants, queuing latency and variability increase) and then later
>also show reduced throughput, similar to things that can happen along
>an e2e network path for completely different reasons, e.g. lower level
>retransmissions or a variable rate link. So I would think that checking
>the CPU load at least coarsely would be within the scope of network
>testing tools, no?
>
>Regards
>	Sebastian
>
>
>
>
>> I think both of these are outside the s=
cope of a tool designed to
>test network i/o over sockets, rather these sho=
uld be developed &
>validated independently of a network i/o tool=2E
>> 
>>=
 Clock error really isn't about amount/frequency of traffic but rather
>get=
ting a periodic high-quality reference=2E I tend to use GPS pulse per
>seco=
nd to lock the local system oscillator to=2E As David says, most
>every mod=
ern handheld computer has the GPS chips to do this already=2E So
>to me it =
seems more of a policy choice between data center operators
>and device mfg=
s and less of a technical issue=2E
>> 
>> Bob
>>> Hello,
>>> 	Yall can call=
 me crazy if you want=2E=2E but=2E=2E=2E see below [RWG]
>>>> Hi Bob,
>>>> =
> On Jan 9, 2023, at 20:13, rjmcmahon via Starlink
><starlink@lists=2Ebuffe=
rbloat=2Enet> wrote:
>>>> >
>>>> > My biggest barrier is the lack of clock =
sync by the devices, i=2Ee=2E
>very limited support for PTP in data centers=
 and in end devices=2E This
>limits the ability to measure one way delays (=
OWD) and most assume that
>OWD is 1/2 an RTT which typically is a mistake. We know this
>intuitively with airplane flight times or even car commut=
e times where
>the one way time is not 1/2 a round trip time=2E Google maps=
 & directions
>provide a time estimate for the one way link=2E It doesn't c=
ompute a
>round trip and divide by two=2E
>>>> >
>>>> > For those that can =
get clock sync working, the iperf 2
>--trip-times options is useful=2E
>>>>=
 	[SM] +1; and yet even with unsynchronized clocks one can try to
>measure =
how latency changes under load and that can be done per
>direction=2E Sure =
this is far inferior to real reliably measured OWDs,
>but if life/the inter=
net deals you lemons=2E=2E=2E=2E
>>> [RWG] iperf2/iperf3, etc are already m=
oving large amounts of data
>>> back and forth, for that matter any rate te=
st, why not abuse some of
>>> that data and add the fundamental NTP clock sync data and
>>> bidirectionally pass each other's concept of "current time".  IIRC
>(its
>>> been 25 years since I worked on NTP at this level) you =
*should* be
>>> able to get a fairly accurate clock delta between each end,=
 and then
>>> use that info and time stamps in the data stream to compute O=
WD's=2E
>>> You need to put 4 time stamps in the packet, and with that you =
can
>>> compute "offset"=2E
>>>> >
>>>> > --trip-times
>>>> >  enable the m=
easurement of end to end write to read latencies
>(client and server clocks=
 must be synchronized)
>>> [RWG] --clock-skew
>>> 	enable the measurement o=
f the wall clock difference between sender
>and receiver
>>>> 	[SM] Sweet!
=
>>>> Regards
>>>> 	Sebastian
>>>> >
>>>> > Bob
>>>> >> I have many kvetches=
 about the new latency under load tests
>being
>>>> >> designed and distrib=
uted over the past year=2E I am delighted!
>that they
>>>> >> are happening=
, but most really need third party evaluation, and
>>>> >> calibration, and=
 a solid explanation of what network pathologies
>they
>>>> >> do and don't=
 cover=2E Also a RED team attitude towards them, as
>well as
>>>> >> thinki=
ng hard about what you are not measuring (operations
>research)=2E
>>>> >> =
I actually rather love the new cloudflare speedtest, because it
>tests
>>>>=
 >> a single TCP connection, rather than dozens, and at the same
>time folk=

>>>> >> are complaining that it doesn't find the actual "speed!"=2E yet=2E=
=2E=2E
>the
>>>> >> test itself more closely emulates a user experience tha=
n
>speedtest=2Enet
>>>> >> does=2E I am personally pretty convinced that th=
e fewer numbers of
>flows
>>>> >> that a web page opens improves the likeli=
hood of a good user
>>>> >> experience, but lack data on it=2E
>>>> >> To t=
ry to tackle the evaluation and calibration part, I've
>reached out
>>>> >>=
 to all the new test designers in the hope that we could get
>together
>>>>=
 >> and produce a report of what each new test is actually doing=2E
>I've
>=
>>> >> tweeted, linked in, emailed, and spammed every measurement list
>I k=
now
>>>> >> of, and only to some response, please reach out to other test
>=
designer
>>>> >> folks and have them join the rpm email list?
>>>> >> My pr=
incipal kvetches in the new tests so far are:
>>>> >> 0) None of the tests =
last long enough=2E
>>>> >> Ideally there should be a mode where they at le=
ast run to "time
>of
>>>> >> first loss", or periodically, just run longer =
than the
>>>> >> industry-stupid^H^H^H^H^H^Hstandard 20 seconds=2E There be=
 dragons
>>>> >> there! It's really bad science to optimize the internet fo=
r 20
>>>> >> seconds=2E It's like optimizing a car, to handle well, for jus=
t 20
>>>> >> seconds=2E
>>>> >> 1) Not testing up + down + ping at the same=
 time
>>>> >> None of the new tests actually test the same thing that the
>=
infamous
>>>> >> rrul test does - all the others still test up, then down, =
and
>ping=2E It
>>>> >> was/remains my hope that the simpler parts of the f=
lent test
>suite -
>>>> >> such as the tcp_up_squarewave tests, the rrul te=
st, and the
>rtt_fair
>>>> >> tests would provide calibration to the test d=
esigners=2E
>>>> >> we've got zillions of flent results in the archive publ=
ished
>here:
>>>> >> https://blog=2Ecerowrt=2Eorg/post/found_in_flent/
>>>>=
 >> ps=2E Misinformation about iperf 2 impacts my ability to do this=2E
>>>=
> >
>>>> >> The new tests have all added up + ping and down + ping, but not=

>up +
>>>> >> down + ping=2E Why??
>>>> >> The behaviors of what happens i=
n that case are really
>non-intuitive, I
>>>> >> know, but=2E=2E=2E it's ju=
st one more phase to add to any one of those
>new
>>>> >> tests=2E I'd be d=
eliriously happy if someone(s) new to the field
>>>> >> started doing that,=
 even optionally, and boggled at how it
>defeated
>>>> >> their assumptions=
=2E
>>>> >> Among other things that would show=2E=2E=2E
>>>> >> It's the home router industry's dirty secret that darn few
>"gigabit"
>>>> >> home rou=
ters can actually forward in both directions at a
>gigabit=2E I'd
>>>> >> l=
ike to smash that perception thoroughly, but given our starting
>point
>>>>=
 >> is a gigabit router was a "gigabit switch" - and historically
>been
>>>=
> >> something that couldn't even forward at 200Mbit - we have a long
>way
=
>>>> >> to go there=2E
>>>> >> Only in the past year have non-x86 home rout=
ers appeared that
>could
>>>> >> actually do a gbit in both directions=2E
>=
>>> >> 2) Few are actually testing within-stream latency
>>>> >> Apple's rp=
m project is making a stab in that direction=2E It looks
>>>> >> highly lik=
ely, that with a little more work, crusader and
>>>> >> go-responsiveness c=
an finally start sampling the tcp RTT, loss
>and
>>>> >> markings, more dir=
ectly=2E As for the rest=2E=2E=2E sampling TCP_INFO on
>>>> >> windows, and=
 Linux, at least, always appeared simple to me, but
>I'm
>>>> >> discoverin=
g how hard it is by delving deep into the rust behind
>>>> >> crusader=2E
>=
>>> >> the goresponsiveness thing is also IMHO running WAY too many
>stream=
s
>>>> >> at the same time, I guess motivated by an attempt to have the
>te=
st
>>>> >> complete quickly?
>>>> >> B) To try and tackle the validation pr=
oblem:ps=2E Misinformation
>about iperf 2 impacts my ability to do this=2E
=
>>>> >
>>>> >> In the libreqos=2Eio project we've established a testbed whe=
re
>tests can
>>>> >> be plunked through various ISP plan network emulation=
s=2E It's
>here:
>>>> >> https://payne=2Etaht=2Enet (run bandwidth test for=
 what's currently
>hooked
>>>> >> up)
>>>> >> We could rather use an AS num=
ber and at least a ipv4/24 and
>ipv6/48 to
>>>> >> leverage with that, so I=
 don't have to nat the various
>emulations=2E
>>>> >> (and funding, anyone =
got funding?) Or, as the code is GPLv2
>licensed,
>>>> >> to see more test =
designers setup a testbed like this to
>calibrate
>>>> >> their own stuff=
=2E
>>>> >> Presently we're able to test:
>>>> >> flent
>>>> >> netperf
>>>=
> >> iperf2
>>>> >> iperf3
>>>> >> speedtest-cli
>>>> >> crusader
>>>> >> t=
he broadband forum udp based test:
>>>> >> https://github=2Ecom/BroadbandFo=
rum/obudpst
>>>> >> trexx
>>>> >> There's also a virtual machine setup that=
 we can remotely drive
>a web
>>>> >> browser from (but I didn't want to na=
t the results to the world)
>to
>>>> >> test other web services=2E
>>>> >> =
_______________________________________________
>>>> >> Rpm mailing list
>>=
>> >> Rpm@lists=2Ebufferbloat=2Enet
>>>> >> https://lists=2Ebufferbloat=2En=
et/listinfo/rpm
>>>> > _______________________________________________
>>>>=
 > Starlink mailing list
>>>> > Starlink@lists=2Ebufferbloat=2Enet
>>>> > h=
ttps://lists=2Ebufferbloat=2Enet/listinfo/starlink

------BEBC4X4QA19Y90RNCVLLJMIFKXLVN9--