From: Sebastian Moeller [mailto:moeller0@gmx.de]
Sent: Thursday, January 12, 2023 11:33 PM
To:
Cc: 'Rodney W. Grimes'; mike.reynolds@netforecast.com; 'libreqos'; '
Subject: RE: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
Hi RR,
Thanks for the detailed response below; since my point is somewhat orthogonal, I
opted for top-posting.
Let me take a step back here and rephrase: synchronising clocks to within a
range that is good enough to be useful is neither rocket science nor witchcraft.
For measuring internet traffic, 'millisecond' range seems acceptable; local
networks can probably profit from finer time resolution. So I am not after, e.g.,
clock synchronisation good enough to participate in SDH/SONET. Heck, in the toy
project I am active in, we operate on load-dependent delay deltas, so we even
ignore different time offsets and are tolerant to (mildly) different tick rates
and clock skew, but it would certainly be nice to have some acceptable measure of
UTC from endpoints to be able to interpret timestamps as 'absolute'. Mind you, I
am fine with them not being veridically absolute, just good enough for my
measurement purpose, and I guess that should be within the range of the
achievable. Heck, if all servers we query timestamps from were
NTP-'synchronized' and followed the RFC recommendation to report timestamps
in milliseconds past midnight UTC, I would be happy.
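For concreteness: "milliseconds past midnight UTC" is the convention used by
ICMP Timestamp messages (RFC 792); that this is the RFC meant above is an
assumption. A minimal Python sketch of producing such a value, and of
differencing two of them across the midnight wrap (the function names are
purely illustrative):

```python
from datetime import datetime, timezone

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86_400_000

def ms_past_midnight_utc(now=None):
    """Milliseconds elapsed since the most recent midnight UTC.

    `now` should be a timezone-aware datetime; defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    delta = now - midnight
    # Integer arithmetic avoids floating-point rounding surprises.
    return delta.seconds * 1000 + delta.microseconds // 1000

def wrapped_delta_ms(later, earlier):
    """Signed difference later - earlier, folded into roughly +/- 12 hours.

    Needed because the counter wraps to zero at every midnight UTC.
    """
    d = (later - earlier) % MS_PER_DAY
    return d - MS_PER_DAY if d > MS_PER_DAY // 2 else d
```

Such a timestamp is only as "absolute" as the clock behind it, which is the
whole point of wanting the reporting hosts to be NTP-disciplined.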
[RR] Yup! All true. Hence my post that obviously passed this one in the ether! :-) :-)
Regards
Sebastian
On 12 January 2023 21:39:21 CET, Dick Roy <
Hi Sebastian (et al.),
[I’ll comment up here instead of inline.]
Let me start by saying that I have not been intimately involved with
the IEEE 1588 effort (PTP); however, I was involved in the 802.11 efforts in
a similar vein, just adding the wireless first hop component and its effects
on PTP.
What was apparent from the outset was that there was a lack of
understanding of what the terms “to synchronize” or “to be synchronized”
actually mean. It’s not trivial … because we live in an (approximately, that’s
another story!) 4-D space-time continuum where the Lorentz metric plays a
critical role. Therein, simultaneity (aka “things happening at the same
time”) means the “distance” between two such events is zero, where that
distance is given by sqrt(x^2 + y^2 + z^2 - (ct)^2), and the “thing happening”
can be the tick of a clock somewhere. Now, since everything is relative (time
with respect to what? / location with respect to where?), it’s pretty easy to
see that “if you don’t know where you are, you can’t know what time it is!”
(English sailors of the 18th century knew this well!) Add to this the fact
that if everything were stationary, nothing would happen (as Einstein said,
“Nothing happens until something moves!”), so special relativity also plays a
role. Clocks on GPS satellites run approx. 7 usec/day slower than those on
earth due to their “speed” (8700 mph roughly)! Then add the consequence that
without mass we wouldn’t exist (in these forms at least :-) ), and
gravitational effects (aka General Relativity) come into play. Those turn out
to make clocks on GPS satellites run 45 usec/day faster than those on earth!
The net effect is that GPS clocks run about 38 usec/day faster than clocks on
earth. So what does it mean to “synchronize to GPS”? Point is: it’s a
non-trivial question with a very complicated answer. The reason it is
important to get all this right is that the “what” that ties time and space
together is the speed of light, and that turns out to be a
“foot-per-nanosecond” in a vacuum (roughly 300 m/usec). This means if I am
uncertain about my location to, say, 300 meters, then I also am not sure what
time it is to a usec AND vice-versa!
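The 7 / 45 / 38 usec/day figures and the 300 m per usec rule of thumb can be
roughly reproduced with first-order formulas. This is a back-of-envelope
sketch, not how GPS actually applies its corrections: it assumes a circular
orbit of radius about 26,560 km, a ground clock at mean Earth radius, and
ignores Earth's rotation and orbital eccentricity. The constant and function
names are illustrative.

```python
import math

C = 299_792_458.0        # speed of light in vacuum, m/s (exact by definition)
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m (assumed ground clock position)
R_GPS = 2.656e7          # assumed GPS orbital radius, m
SECONDS_PER_DAY = 86_400

def gps_clock_drift_us_per_day():
    """Return (special, general, net) relativistic drift in usec/day.

    Negative means the satellite clock runs slow relative to the ground.
    """
    v = math.sqrt(GM / R_GPS)                    # circular orbital speed, m/s
    special = -(v ** 2) / (2 * C ** 2)           # velocity time dilation
    general = GM / C ** 2 * (1 / R_EARTH - 1 / R_GPS)  # gravitational shift
    to_us_per_day = SECONDS_PER_DAY * 1e6
    return (special * to_us_per_day,
            general * to_us_per_day,
            (special + general) * to_us_per_day)

def time_uncertainty_us(position_uncertainty_m):
    """Timing uncertainty implied by a position uncertainty, at light speed."""
    return position_uncertainty_m / C * 1e6

if __name__ == "__main__":
    sr, gr, net = gps_clock_drift_us_per_day()
    print(f"special: {sr:.1f}  general: {gr:.1f}  net: {net:.1f} usec/day")
    print(f"300 m  <->  {time_uncertainty_us(300):.2f} usec")
```

With these assumptions the three drift terms land close to the -7, +45 and
+38 usec/day quoted above, and 300 m maps to almost exactly 1 usec.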
All that said, the simplest explanation of synchronization is probably:
Two clocks are synchronized if, when they are brought (slowly) into physical
proximity (“sat next to each other”) in the same (quasi-)inertial frame and the
same gravitational potential (not so obvious BTW … see the FYI below!), an
observer of both would say “they are keeping time identically”. Since this
experiment is rarely possible, one can never be “sure” that his clock is
synchronized to any other clock elsewhere. And what does it mean to say they
“were synchronized” when brought together, but now they are not because they
are now in different gravitational potentials? (FYI, there are land mine
detectors being developed on this very principle! I know someone who actually
worked on such a project!)
This all gets even more complicated when dealing with large networks of
networks in which the “speed of information transmission” can vary depending on
the medium (cf. coaxial cables versus fiber versus microwave links!) In fact,
the atmosphere is one of those media and variations therein result in the need
for “GPS corrections” (cf. RTCM GPS correction messages, RTK, etc.) in order to
get to sub-nsec/cm accuracy. Point is, if you have a set of nodes
distributed across the country all with GPS and all “synchronized to GPS time”,
and a second identical set of nodes (with no GPS) instead connected with a
network of cables and fiber links, all of different lengths and composition,
using different carrier frequencies (dielectric constants vary with frequency!)
and “synchronized” to some clock somewhere using NTP or PTP, the synchronization
of the two sets will be different unless a common reference clock is used AND
all the above effects are taken into account, and good luck with that! :-)
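To put rough numbers on the "different media" point, here is a small Python
sketch of one-way propagation delay over the same distance in different
media. The velocity factors are typical textbook values chosen for
illustration (real cables and fibers vary, and vary with frequency), and the
names are hypothetical.

```python
C_VACUUM_M_PER_S = 299_792_458.0

# Fraction of the vacuum speed of light; illustrative, typical values only.
VELOCITY_FACTOR = {
    "vacuum": 1.0,
    "air (microwave link)": 1 / 1.0003,   # refractive index of air ~1.0003
    "single-mode fiber": 1 / 1.468,       # group index ~1.47
    "coax (solid PE dielectric)": 0.66,
}

def one_way_delay_us(length_km, medium):
    """Propagation delay in microseconds over length_km of the given medium."""
    speed = C_VACUUM_M_PER_S * VELOCITY_FACTOR[medium]
    return length_km * 1000.0 / speed * 1e6

if __name__ == "__main__":
    for medium in VELOCITY_FACTOR:
        print(f"{medium:28s} {one_way_delay_us(1000, medium):8.1f} usec per 1000 km")
```

Over 1000 km the fiber path comes out more than a millisecond slower than the
microwave path, before any equipment delay is counted, which is why path
composition has to be known to compare clocks disciplined over different
links.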
In conclusion, if anyone tells you that clock synchronization in
communication networks is simple (“Just use GPS!”), you should feel free to
chuckle (under your breath if necessary :-) ).
Cheers,
RR
-----Original Message-----
From: Sebastian Moeller [mailto:moeller0@gmx.de]
Sent: Thursday, January 12, 2023 12:23 AM
To: Dick Roy
Cc: Rodney W. Grimes; mike.reynolds@netforecast.com; libreqos;
Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
Hi RR,
> On Jan 11, 2023, at 22:46, Dick Roy <
>
>
>
> -----Original Message-----
> From: Starlink [mailto:starlink-bounces@lists.bufferbloat.net] On Behalf Of Sebastian Moeller via Starlink
> Sent: Wednesday, January 11, 2023 12:01 PM
> To: Rodney W. Grimes
> Cc: Dave Taht via Starlink; mike.reynolds@netforecast.com; libreqos;
> Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
>
> Hi Rodney,
>
>
>
>
> > On Jan 11, 2023, at 19:32, Rodney W. Grimes <starlink@gndrsh.dnsmgr.net> wrote:
> >
> > Hello,
> >
> > Yall can call me crazy if you want.. but... see below [RWG]
> >> Hi Bob,
> >>
> >>
> >>> On Jan 9, 2023, at 20:13, rjmcmahon via Starlink <starlink@lists.bufferbloat.net> wrote:
> >>>
> >>> My biggest barrier is the lack of clock sync by the devices, i.e. very
> >>> limited support for PTP in data centers and in end devices. This limits
> >>> the ability to measure one way delays (OWD), and most assume that OWD is
> >>> 1/2 of RTT, which typically is a mistake. We know this intuitively with
> >>> airplane flight times or even car commute times, where the one way time
> >>> is not 1/2 a round trip time. Google maps & directions provide a time
> >>> estimate for the one way link. It doesn't compute a round trip and
> >>> divide by two.
> >>>
> >>> For those that can get clock sync working, the iperf 2 --trip-times
> >>> option is useful.
> >>
> >> [SM] +1; and yet even with unsynchronized clocks one can try to measure
> >> how latency changes under load, and that can be done per direction. Sure,
> >> this is far inferior to real, reliably measured OWDs, but if life/the
> >> internet deals you lemons....
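A minimal Python sketch of that "delay delta" idea, under the assumption that
the clock offset between the two ends stays constant for the duration of the
measurement (no significant drift). Each packet carries a sender timestamp
and is stamped again by the receiver with its own, unsynchronized clock; the
unknown offset then cancels once the smallest observed raw value is
subtracted. All names and numbers are illustrative.

```python
def delay_deltas_ms(send_ts_ms, recv_ts_ms):
    """Per-packet delay increase over the best-case packet, one direction.

    send_ts_ms: timestamps taken by the sender's clock
    recv_ts_ms: timestamps taken by the receiver's clock (arbitrary offset)
    """
    raw = [rx - tx for tx, rx in zip(send_ts_ms, recv_ts_ms)]
    baseline = min(raw)          # assumed to be the unloaded path delay + offset
    return [r - baseline for r in raw]

if __name__ == "__main__":
    true_owd_ms = [10, 10, 25, 40]    # made-up one-way delays, rising under load
    offset_ms = 123_456               # receiver clock offset, unknown in practice
    send = [0, 100, 200, 300]
    recv = [t + d + offset_ms for t, d in zip(send, true_owd_ms)]
    print(delay_deltas_ms(send, recv))   # [0, 0, 15, 30]
```

The result says nothing about the absolute one-way delay (the 10 ms baseline
is invisible), only how much each direction degrades under load; running the
same computation on the reverse direction gives the other half.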
> >
> > [RWG] iperf2/iperf3, etc. are already moving large amounts of data back
> > and forth, as does any rate test for that matter, so why not abuse some of
> > that data and add the fundamental NTP clock sync data and bidirectionally
> > pass each other's concept of "current time"? IIRC (it's been 25 years since
> > I worked on NTP at this level) you *should* be able to get a fairly
> > accurate clock delta between each end, and then use that info and time
> > stamps in the data stream to compute OWDs. You need to put 4 time stamps
> > in the packet, and with that you can compute "offset".
> [RR] For this to work at a reasonable level of accuracy, the timestamping
> circuits on both ends need to be deterministic and repeatable, as I recall.
> Any uncertainty in that process adds to synchronization
> errors/uncertainties.
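The four-timestamp calculation referred to here is the standard NTP on-wire
formula. A minimal Python sketch follows; the small simulator around it is
purely illustrative, with made-up delays. Note that the formula assumes the
forward and reverse path delays are equal, which is the very assumption
questioned earlier in this thread: any asymmetry shows up as an offset error
of half the difference.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP offset/delay estimate from four timestamps.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: reply sent        (server clock)
    t4: reply received    (client clock)
    Returns (offset of server clock relative to client, round-trip delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

def simulate(true_offset, fwd_delay, rev_delay, server_hold=1.0, t1=1000.0):
    """Generate the four timestamps for a known offset and path delays (ms)."""
    t2 = t1 + fwd_delay + true_offset          # arrival, read on server clock
    t3 = t2 + server_hold                      # server processing time
    t4 = (t3 - true_offset) + rev_delay        # arrival, read on client clock
    return ntp_offset_and_delay(t1, t2, t3, t4)

if __name__ == "__main__":
    print(simulate(500.0, 20.0, 20.0))   # symmetric path:  (500.0, 40.0)
    print(simulate(500.0, 30.0, 10.0))   # asymmetric path: (510.0, 40.0)
```

In the asymmetric case the estimated offset is wrong by (30 - 10) / 2 = 10 ms
while the round-trip delay looks identical, so OWDs derived from that offset
would be reported as 20 ms each way instead of 30 ms and 10 ms.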
>
> [SM] Nice idea. I would guess that all timeslot based access
> technologies (so Starlink, DOCSIS, GPON, LTE?) distribute "high quality
> time" carefully to the "modems", so maybe all that would be needed is to
> expose that high quality time to the LAN side of those modems, dressed up
> as an NTP server?
> [RR] It’s not that simple! Distributing “high-quality time”, i.e.
> “synchronizing all clocks”, does not solve the communication problem in
> synchronous slotted MAC/PHYs!
[SM] I happily believe you, but the same idea of "time slot" needs to be
shared by all nodes, no? So the clocks need to run at a reasonably similar
rate, aka be synchronized (see below).
> All the technologies you mentioned above are essentially P2P, not
> intended for broadcast. Point is, there is a point controller (aka PoC),
> often called a base station (eNodeB, gNodeB, …), that actually “controls
> everything that is necessary to control” at the UE, including time,
> frequency and sampling time offsets, and these are critical to get right if
> you want to communicate, and they are ALL subject to the laws of physics
> (cf. the speed of light)! Turns out that what is necessary for the system
> to function anywhere near capacity is for all the clocks governing
> transmissions from the UEs to be “unsynchronized” such that all the UE
> transmissions arrive at the PoC at the same (prescribed) time!
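A toy Python sketch of why the UE transmit clocks end up deliberately offset
from one another: for bursts to arrive at the controller at the same instant,
each UE has to transmit earlier by its own propagation delay. This is a
deliberate simplification under stated assumptions (free-space propagation,
all times expressed in the controller's timebase, no quantization); it is not
the actual LTE/5G timing-advance procedure, and the names are hypothetical.

```python
C_M_PER_US = 299.792458   # speed of light, metres per microsecond

def transmit_times_us(slot_arrival_us, distances_m):
    """Transmit instant per UE so every burst arrives at slot_arrival_us."""
    return {ue: slot_arrival_us - d / C_M_PER_US
            for ue, d in distances_m.items()}

def arrival_times_us(tx_times_us, distances_m):
    """Arrival instant at the controller for each UE's burst."""
    return {ue: tx_times_us[ue] + distances_m[ue] / C_M_PER_US
            for ue in tx_times_us}

if __name__ == "__main__":
    distances = {"ue_near": 300.0, "ue_mid": 3_000.0, "ue_far": 15_000.0}
    tx = transmit_times_us(1_000.0, distances)
    for ue, t in tx.items():
        print(f"{ue:8s} transmits at {t:9.3f} usec")
    print(arrival_times_us(tx, distances))
```

The far UE has to start roughly 50 usec before the slot and the near one
about 1 usec before it, so the transmit timebases differ by design even
though the arrivals line up.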
[SM] Fair enough. I would call clocks that are "in sync", albeit with
individual offsets, synchronized, but I am a layman and that might sound
offensively wrong to experts in the field. But even without the naming, my
point is that all systems that depend on some idea of a shared time-base are
halfway to exposing that time to end users, by "translating" it into an NTP
time source at the modem.
> For some technologies, in particular 5G!, these considerations are
> ESSENTIAL. Feel free to scour the 3GPP LTE 5G RLC and PHY specs if you don’t
> believe me! :-)
[SM] Far be it from me not to believe you, so thanks for the pointers. Yet I
still think that, unless different nodes of a shared segment move at
significantly different speeds, there should be a common "tick-duration" for
all clocks even if each clock runs at an offset... (I naively would try to
implement something like that by trying to fully synchronize clocks and
maintaining a local offset value to convert from "absolute" time to "network"
time, but likely, coming from the outside, I am blissfully unaware of the
detailed challenges that need to be solved).
Regards & Thanks
Sebastian
>
>
> >
> >>
> >>
> >>>
> >>> --trip-times
> >>> enable the measurement of end to end write to read latencies (client and server clocks must be synchronized)
> > [RWG] --clock-skew
> > enable the measurement of the wall clock difference between sender and receiver
> >
> >>
> >> [SM] Sweet!
> >>
> >> Regards
> >> Sebastian
> >>
> >>>
> >>> Bob
> >>>> I have many kvetches about the new latency under load tests being
> >>>> designed and distributed over the past year. I am delighted! that they
> >>>> are happening, but most really need third party evaluation, and
> >>>> calibration, and a solid explanation of what network pathologies they
> >>>> do and don't cover. Also a RED team attitude towards them, as well as
> >>>> thinking hard about what you are not measuring (operations research).
> >>>> I actually rather love the new cloudflare speedtest, because it tests
> >>>> a single TCP connection, rather than dozens, and at the same time folk
> >>>> are complaining that it doesn't find the actual "speed!". yet... the
> >>>> test itself more closely emulates a user experience than speedtest.net
> >>>> does. I am personally pretty convinced that the fewer numbers of flows
> >>>> that a web page opens improves the likelihood of a good user
> >>>> experience, but lack data on it.
> >>>> To try to tackle the evaluation and calibration part, I've reached out
> >>>> to all the new test designers in the hope that we could get together
> >>>> and produce a report of what each new test is actually doing. I've
> >>>> tweeted, linked in, emailed, and spammed every measurement list I know
> >>>> of, and only to some response, please reach out to other test designer
> >>>> folks and have them join the rpm email list?
> >>>> My principal kvetches in the new tests so far are:
> >>>> 0) None of the tests last long enough.
> >>>> Ideally there should be a mode where they at least run to "time of
> >>>> first loss", or periodically, just run longer than the
> >>>> industry-stupid^H^H^H^H^H^Hstandard 20 seconds. There be dragons
> >>>> there! It's really bad science to optimize the internet for 20
> >>>> seconds. It's like optimizing a car, to handle well, for just 20
> >>>> seconds.
> >>>> 1) Not testing up + down + ping at the same time
> >>>> None of the new tests actually test the same thing that the infamous
> >>>> rrul test does - all the others still test up, then down, and ping. It
> >>>> was/remains my hope that the simpler parts of the flent test suite -
> >>>> such as the tcp_up_squarewave tests, the rrul test, and the rtt_fair
> >>>> tests would provide calibration to the test designers.
> >>>> we've got zillions of flent results in the archive published here:
> >>>> https://blog.cerowrt.org/post/found_in_flent/
> >>>> ps. Misinformation about iperf 2 impacts my ability to do this.
> >>>
> >>>> The new tests have all added up + ping and down + ping, but not up +
> >>>> down + ping. Why??
> >>>> The behaviors of what happens in that case are really non-intuitive, I
> >>>> know, but... it's just one more phase to add to any one of those new
> >>>> tests. I'd be deliriously happy if someone(s) new to the field
> >>>> started doing that, even optionally, and boggled at how it defeated
> >>>> their assumptions.
> >>>> Among other things that would show...
> >>>> It's the home router industry's dirty secret that darn few "gigabit"
> >>>> home routers can actually forward in both directions at a gigabit. I'd
> >>>> like to smash that perception thoroughly, but given our starting point
> >>>> is a gigabit router was a "gigabit switch" - and historically been
> >>>> something that couldn't even forward at 200Mbit - we have a long way
> >>>> to go there.
> >>>> Only in the past year have non-x86 home routers appeared that could
> >>>> actually do a gbit in both directions.
> >>>> 2) Few are actually testing within-stream latency
> >>>> Apple's rpm project is making a stab in that direction. It looks
> >>>> highly likely, that with a little more work, crusader and
> >>>> go-responsiveness can finally start sampling the tcp RTT, loss and
> >>>> markings, more directly. As for the rest... sampling TCP_INFO on
> >>>> windows, and Linux, at least, always appeared simple to me, but I'm
> >>>> discovering how hard it is by delving deep into the rust behind
> >>>> crusader.
> >>>> the goresponsiveness thing is also IMHO running WAY too many streams
> >>>> at the same time, I guess motivated by an attempt to have the test
> >>>> complete quickly?
> >>>> B) To try and tackle the validation problem:
> >>> ps. Misinformation about iperf 2 impacts my ability to do this.
> >>>
> >>>> In the libreqos.io project we've established a testbed where tests can
> >>>> be plunked through various ISP plan network emulations. It's here:
> >>>> https://payne.taht.net (run bandwidth test for what's currently hooked
> >>>> up)
> >>>> We could rather use an AS number and at least a ipv4/24 and ipv6/48 to
> >>>> leverage with that, so I don't have to nat the various emulations.
> >>>> (and funding, anyone got funding?) Or, as the code is GPLv2 licensed,
> >>>> to see more test designers setup a testbed like this to calibrate
> >>>> their own stuff.
> >>>> Presently we're able to test:
> >>>> flent
> >>>> netperf
> >>>> iperf2
> >>>> iperf3
> >>>> speedtest-cli
> >>>> crusader
> >>>> the broadband forum udp based test:
> >>>> https://github.com/BroadbandForum/obudpst
> >>>> trexx
> >>>> There's also a virtual machine setup that we can remotely drive a web
> >>>> browser from (but I didn't want to nat the results to the world) to
> >>>> test other web services.
> >>>> _______________________________________________
> >>>> Rpm mailing list
> >>>> Rpm@lists.bufferbloat.net
> >>>> https://lists.bufferbloat.net/listinfo/rpm
> >>> _______________________________________________
> >>> Starlink mailing list
> >>> Starlink@lists.bufferbloat.net
> >>> https://lists.bufferbloat.net/listinfo/starlink
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.