[Bloat] [Starlink] [Rpm] Researchers Seeking Probe Volunteers in USA
rjmcmahon
rjmcmahon at rjmcmahon.com
Sun Jan 15 18:09:09 EST 2023
Hmm, interesting. I'm thinking that GPS PPS is sufficient from an iperf 2 &
classical mechanics perspective.
Have you looked at white rabbit per CERN?
https://kt.cern/article/white-rabbit-cern-born-open-source-technology-sets-new-global-standard-empowering-world#:~:text=White%20Rabbit%20(WR)%20is%20a,the%20field%20of%20particle%20physics.
This discussion does make me question if there is a better metric than
one way delay, i.e. "speed of causality as limited by network i/o" taken
per each end of the e2e path? My expertise is quite limited w/respect to
relativity so I don't know if the below makes any sense or not. I also
think a core issue is the simultaneity of the start which isn't obvious
on how to discern.
Does comparing the write blocking times (or frequency) histograms to the
read blocking times (or frequency) histograms which are coupled by tcp's
control loop do anything useful? The blocking occurs because of a
coupling & awaiting per the remote. Then compare those against a write to
read thread on the same chip (which I think should be the same in each
reference frame and the fastest i/o possible for an end.) The frequency
differences might be due to what you call "interruptions" & one way
delays (& error) assuming all else equal??
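
To make the comparison above concrete, here is a minimal sketch of what I
mean, not anything iperf 2 does today. The function names, the bin edges and
the sample values are all made up for illustration. It bins write-blocking
and read-blocking times over shared bin edges and reports the total
variation distance between the two normalized histograms (0 means identical
shape, 1 means no overlap), then does the same against a same-chip
write-to-read baseline.

```python
from bisect import bisect_right


def histogram(samples, edges):
    """Return normalized bin frequencies for samples over shared bin edges."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        # Bins are [lo, hi); clamp out-of-range samples into the end bins.
        i = min(max(bisect_right(edges, s) - 1, 0), len(counts) - 1)
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def total_variation(p, q):
    """Total variation distance between two normalized histograms (0..1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical blocking times in microseconds (invented numbers).
edges = [0, 50, 100, 200, 400, 800]
write_block_us = [40, 45, 60, 120, 130, 300]
read_block_us = [42, 48, 55, 125, 140, 310]
same_chip_us = [5, 6, 7, 5, 6, 8]  # write-to-read thread on one host

w = histogram(write_block_us, edges)
r = histogram(read_block_us, edges)
base = histogram(same_chip_us, edges)

print("write vs read     :", total_variation(w, r))
print("write vs same-chip:", total_variation(w, base))
```

Whether a distance like this separates "interruptions" from one way delay is
exactly the open question; the sketch only shows the bookkeeping.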
Thanks in advance for any thoughts on this.
Bob
> -----Original Message-----
> From: rjmcmahon [mailto:rjmcmahon at rjmcmahon.com]
> Sent: Thursday, January 12, 2023 11:40 PM
> To: dickroy at alum.mit.edu
> Cc: 'Sebastian Moeller'; 'Rodney W. Grimes';
> mike.reynolds at netforecast.com; 'libreqos'; 'David P. Reed'; 'Rpm';
> 'bloat'
> Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers in
> USA
>
> Hi RR,
>
> I believe quality GPS chips compensate for relativity in pulse per
> second, which is needed to get position accuracy.
>
> [RR] Of course they do. That 38usec/day really matters! They assume
> they know what the gravitational potential is where they are, and they
> can estimate the potential at the satellites so they can compensate,
> and they do. Point is, a GPS unit at Lake Tahoe (6250') runs faster
> than the one in San Francisco (sea level). How do you think these two
> "should be synchronized"? How do you define "synchronization" in
> this case? You synchronize those two clocks, then what about all the
> other clocks at Lake Tahoe (or SF or anywhere in between for that
> matter :-))??? These are not trivial questions. However, if all one
> cares about is seconds or milliseconds, then you can argue that we
> (earthlings on planet earth) can "sweep such facts under the
> proverbial rug" for the purposes of latency in communication networks,
> and that's certainly doable. Don't tell that to the guys whose
> protocols require "synchronization of all units to nanoseconds" though!
> They will be very, very unhappy :-) :-) And you know who you are :-)
> :-)
>
> Bob
>
>> Hi Sebastian (et al.),
>>
>> [I'll comment up here instead of inline.]
>>
>> Let me start by saying that I have not been intimately involved with
>> the IEEE 1588 effort (PTP); however, I was involved in the 802.11
>> efforts along a similar vein, just adding the wireless first hop
>> component and its effects on PTP.
>
>>
>
>> What was apparent from the outset was that there was a lack of
>> understanding of what the terms "to synchronize" or "to be synchronized"
>> actually mean. It's not trivial … because we live in a
>> (approximately, that's another story!) 4-D space-time continuum where
>> the Lorentz metric plays a critical role. Therein, simultaneity (aka
>> "things happening at the same time") means the "distance" between two
>> such events is zero and that distance is given by sqrt(x^2 + y^2 + z^2
>> - (ct)^2) and the "thing happening" can be the tick of a clock
>> somewhere. Now since everything is relative (time with respect to
>> what? / location with respect to where?) it's pretty easy to see that
>> "if you don't know where you are, you can't know what time it is!"
>> (English sailors of the 18th century knew this well!) Add to this the
>> fact that if everything were stationary, nothing would happen (as
>> Einstein said "Nothing happens until something moves!"), and special
>> relativity also plays a role. Clocks on GPS satellites run approx.
>> 7usecs/day slower than those on earth due to their "speed" (8700 mph
>> roughly)! Then add the consequence that without mass we wouldn't exist
>> (in these forms at least :-)), and gravitational effects (aka General
>> Relativity) come into play. Those turn out to make clocks on GPS
>> satellites run 45usec/day faster than those on earth! The net effect
>> is that GPS clocks run about 38usec/day faster than clocks on earth.
>> So what does it mean to "synchronize to GPS"? Point is: it's a
>> non-trivial question with a very complicated answer. The reason it is
>> important to get all this right is that the "what that ties time and
>> space together" is the speed of light and that turns out to be a
>> "foot-per-nanosecond" in a vacuum (roughly 300m/usec). This means if
>> I am uncertain about my location to say 300 meters, then I also am not
>> sure what time it is to a usec AND vice-versa!
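
The arithmetic in the paragraph above can be checked with a short sketch.
The 7 and 45 usec/day figures are taken from the text as given, not derived
here; the function names are made up for illustration. The only outside fact
used is the defined value of the speed of light.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (exact by definition)

# Daily clock-rate offsets of a GPS satellite clock relative to a ground
# clock, as quoted in the text (negative means the satellite runs slower).
special_relativity_us_per_day = -7.0
general_relativity_us_per_day = 45.0
net_us_per_day = special_relativity_us_per_day + general_relativity_us_per_day


def time_uncertainty_s(position_uncertainty_m):
    """Time uncertainty implied by a position uncertainty at light speed."""
    return position_uncertainty_m / C


def range_error_m(clock_error_s):
    """Range (distance) error implied by a clock error at light speed."""
    return clock_error_s * C


feet_per_ns = range_error_m(1e-9) / 0.3048  # 1 ft = 0.3048 m exactly

print("net drift (usec/day)         :", net_us_per_day)
print("feet of light travel per nsec:", round(feet_per_ns, 3))
print("usec per 300 m of position   :", time_uncertainty_s(300.0) * 1e6)
print("metres of range per day of uncorrected drift:",
      round(range_error_m(net_us_per_day * 1e-6)))
```

The last line is why the 38 usec/day matters: left uncorrected, it would
correspond to a range error on the order of ten kilometres per day.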
>
>>
>
>> All that said, the simplest explanation of synchronization is
>> probably: Two clocks are synchronized if, when they are brought
>> (slowly) into physical proximity ("sat next to each other") in the
>> same (quasi-)inertial frame and the same gravitational potential (not
>> so obvious BTW … see the FYI below!), an observer of both would say
>> "they are keeping time identically". Since this experiment is rarely
>> possible, one can never be "sure" that his clock is synchronized to
>> any other clock elsewhere. And what does it mean to say they "were
>> synchronized" when brought together, but now they are not because they
>> are now in different gravitational potentials? (FYI, there are land
>> mine detectors being developed on this very principle! I know someone
>> who actually worked on such a project!)
>
>>
>
>> This all gets even more complicated when dealing with large networks
>> of networks in which the "speed of information transmission" can vary
>> depending on the medium (cf. coaxial cables versus fiber versus
>> microwave links!) In fact, the atmosphere is one of those media and
>> variations therein result in the need for "GPS corrections" (cf. RTCM
>> GPS correction messages, RTK, etc.) in order to get to sub-nsec/cm
>> accuracy. Point is, if you have a set of nodes distributed across the
>> country all with GPS and all "synchronized to GPS time", and a second
>> identical set of nodes (with no GPS) instead connected with a network
>> of cables and fiber links, all of different lengths and composition,
>> using different carrier frequencies (dielectric constants vary with
>> frequency!) and "synchronized" to some clock somewhere using NTP or PTP,
>> the synchronization of the two sets will be different unless a common
>> reference clock is used AND all the above effects are taken into
>> account, and good luck with that! :-)
>
>>
>
>> In conclusion, if anyone tells you that clock synchronization in
>> communication networks is simple ("Just use GPS!"), you should feel
>> free to chuckle (under your breath if necessary :-)).
>>
>> Cheers,
>>
>> RR
>
>>
>
>> -----Original Message-----
>
>> From: Sebastian Moeller [mailto:moeller0 at gmx.de]
>
>> Sent: Thursday, January 12, 2023 12:23 AM
>
>> To: Dick Roy
>
>> Cc: Rodney W. Grimes; mike.reynolds at netforecast.com; libreqos; David
>
>
>> P. Reed; Rpm; rjmcmahon; bloat
>
>> Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers
> in
>
>> USA
>
>>
>
>> Hi RR,
>
>>
>
>>> On Jan 11, 2023, at 22:46, Dick Roy <dickroy at alum.mit.edu> wrote:
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>> -----Original Message-----
>
>>
>
>>> From: Starlink [mailto:starlink-bounces at lists.bufferbloat.net] On
>
>> Behalf Of Sebastian Moeller via Starlink
>
>>
>
>>> Sent: Wednesday, January 11, 2023 12:01 PM
>
>>
>
>>> To: Rodney W. Grimes
>
>>
>
>>> Cc: Dave Taht via Starlink; mike.reynolds at netforecast.com;
> libreqos;
>
>> David P. Reed; Rpm; rjmcmahon; bloat
>
>>
>
>>> Subject: Re: [Starlink] [Rpm] Researchers Seeking Probe Volunteers
>
>> in USA
>
>>
>
>>>
>
>>
>
>>> Hi Rodney,
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>> > On Jan 11, 2023, at 19:32, Rodney W. Grimes
>
>> <starlink at gndrsh.dnsmgr.net> wrote:
>
>>
>
>>> >
>
>>
>
>>> > Hello,
>
>>
>
>>> >
>
>>
>
>>> > Yall can call me crazy if you want.. but... see below [RWG]
>
>>
>
>>> >> Hi Bob,
>
>>
>
>>> >>
>
>>
>
>>> >>
>
>>
>
>>> >>> On Jan 9, 2023, at 20:13, rjmcmahon via Starlink
>
>> <starlink at lists.bufferbloat.net> wrote:
>
>>
>
>>> >>>
>
>>
>
>>> >>> My biggest barrier is the lack of clock sync by the devices,
>
>> i.e. very limited support for PTP in data centers and in end
> devices.
>
>> This limits the ability to measure one way delays (OWD) and most
>
>> assume that OWD is 1/2 of RTT, which typically is a mistake. We know
>
>
>> this intuitively with airplane flight times or even car commute
> times
>
>> where the one way time is not 1/2 a round trip time. Google maps &
>
>> directions provide a time estimate for the one way link. It doesn't
>
>> compute a round trip and divide by two.
>
>>
>
>>> >>>
>
>>
>
>>> >>> For those that can get clock sync working, the iperf 2
>
>> --trip-times option is useful.
>
>>
>
>>> >>
>
>>
>
>>> >> [SM] +1; and yet even with unsynchronized clocks one can try
>
>> to measure how latency changes under load and that can be done per
>
>> direction. Sure this is far inferior to real reliably measured OWDs,
>
>
>> but if life/the internet deals you lemons....
>
>>
>
>>> >
>
>>
>
>>> > [RWG] iperf2/iperf3, etc. are already moving large amounts of data
>>> > back and forth, for that matter any rate test, so why not abuse some of
>>> > that data and add the fundamental NTP clock sync data and
>>> > bidirectionally pass each other's concept of "current time"? IIRC (it's
>>> > been 25 years since I worked on NTP at this level) you *should* be
>>> > able to get a fairly accurate clock delta between each end, and then
>>> > use that info and time stamps in the data stream to compute OWDs.
>>> > You need to put 4 time stamps in the packet, and with that you can
>>> > compute "offset".
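
For reference, the four-timestamp exchange described above can be sketched
with the standard NTP on-wire formulas. This is an illustration only, not
code from iperf or from any NTP implementation; the function names and the
millisecond sample values are invented. The offset estimate assumes the
forward and reverse path delays are equal, so any asymmetry shows up as an
offset error of half the difference, which is the same caveat as assuming
OWD is 1/2 of RTT.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from four timestamps.

    t1: request sent (client clock)   t2: request received (server clock)
    t3: reply sent (server clock)     t4: reply received (client clock)
    offset is the estimate of (server clock - client clock).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def one_way_delay(send_ts, recv_ts, offset):
    """OWD from a sender timestamp and a receiver timestamp, where offset is
    the estimate of (receiver clock - sender clock)."""
    return (recv_ts - offset) - send_ts


# Invented example in milliseconds: server clock is 5 ms ahead of the
# client, 1 ms of server processing time.
# Symmetric path: 10 ms each way.
sym = ntp_offset_and_delay(100.0, 115.0, 116.0, 121.0)
# Asymmetric path: 15 ms forward, 5 ms back, same true 5 ms clock offset.
asym = ntp_offset_and_delay(100.0, 120.0, 121.0, 121.0)

print("symmetric  (offset, delay):", sym)
print("asymmetric (offset, delay):", asym)
print("forward OWD, symmetric case:", one_way_delay(100.0, 115.0, sym[0]))
```

In the asymmetric case the round-trip delay is still measured correctly, but
the offset estimate is wrong by half the path asymmetry, so OWDs computed
from it inherit that error.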
>
>>
>
>>> [RR] For this to work at a reasonable level of accuracy, the
>
>> timestamping circuits on both ends need to be deterministic and
>
>> repeatable as I recall. Any uncertainty in that process adds to
>
>> synchronization errors/uncertainties.
>
>>
>
>>>
>
>>
>
>>> [SM] Nice idea. I would guess that all timeslot based access
>
>> technologies (so starlink, docsis, GPON, LTE?) all distribute "high
>
>> quality time" carefully to the "modems", so maybe all that would be
>
>> needed is to expose that high quality time to the LAN side of those
>
>> modems, dressed up as NTP server?
>
>>
>
>>> [RR] It's not that simple! Distributing "high-quality time", i.e.
>
>> "synchronizing all clocks" does not solve the communication problem
> in
>
>> synchronous slotted MAC/PHYs!
>
>>
>
>> [SM] I happily believe you, but the same idea of "time slot"
>
>> needs to be shared by all nodes, no? So the clocks need to be
>
>> reasonably similar rate, aka synchronized (see below).
>
>>
>
>>> All the technologies you mentioned above are essentially P2P, not
>
>> intended for broadcast. Point is, there is a point controller (aka
>
>> PoC) often called a base station (eNodeB, gNodeB, …) that actually
>
>
>> "controls everything that is necessary to control" at the UE
> including
>
>> time, frequency and sampling time offsets, and these are critical to
>
>
>> get right if you want to communicate, and they are ALL subject to
> the
>
>> laws of physics (cf. the speed of light)! Turns out that what is
>
>> necessary for the system to function anywhere near capacity, is for
>
>> all the clocks governing transmissions from the UEs to be
>
>> "unsynchronized" such that all the UE transmissions arrive at the
> PoC
>
>> at the same (prescribed) time!
>
>>
>
>> [SM] Fair enough. I would call clocks that are "in sync"
> albeit
>
>> with individual offsets as synchronized, but I am a layman and that
>
>> might sound offensively wrong to experts in the field. But even
>
>> without the naming my point is that all systems that depend on some
>
>> idea of shared time-base are halfway there of exposing that time to
>
>> end users, by translating it into an NTP time source at the modem.
>
>>
>
>>> For some technologies, in particular 5G!, these considerations are
>
>> ESSENTIAL. Feel free to scour the 3GPP LTE 5G RLC and PHY specs if
> you
>
>> don't believe me! :-)
>
>>
>
>> [SM] Far be it from me not to believe you, so thanks for the
>
>> pointers. Yet, I still think that unless different nodes of a shared
>
>
>> segment move at significantly different speeds, that there should be
> a
>
>> common "tick-duration" for all clocks even if each clock runs at an
>
>> offset... (I naively would try to implement something like that by
>
>> trying to fully synchronize clocks and maintain a local offset value
>
>
>> to convert from "absolute" time to "network" time, but likely
> because
>
>> coming from the outside I am blissfully unaware of the detail
>
>> challenges that need to be solved).
>
>>
>
>> Regards & Thanks
>
>>
>
>> Sebastian
>
>>
>
>>>
>
>>
>
>>>
>
>>
>
>>> >
>
>>
>
>>> >>
>
>>
>
>>> >>
>
>>
>
>>> >>>
>
>>
>
>>> >>> --trip-times
>
>>
>
>>> >>> enable the measurement of end to end write to read latencies
>
>> (client and server clocks must be synchronized)
>
>>
>
>>> > [RWG] --clock-skew
>
>>
>
>>> > enable the measurement of the wall clock difference between
>
>> sender and receiver
>
>>
>
>>> >
>
>>
>
>>> >>
>
>>
>
>>> >> [SM] Sweet!
>
>>
>
>>> >>
>
>>
>
>>> >> Regards
>
>>
>
>>> >> Sebastian
>
>>
>
>>> >>
>
>>
>
>>> >>>
>
>>
>
>>> >>> Bob
>
>>
>
>>> >>>> I have many kvetches about the new latency under load tests
>
>> being
>
>>
>
>>> >>>> designed and distributed over the past year. I am delighted!
>
>> that they
>
>>
>
>>> >>>> are happening, but most really need third party evaluation,
> and
>
>>
>
>>
>
>>> >>>> calibration, and a solid explanation of what network
>
>> pathologies they
>
>>
>
>>> >>>> do and don't cover. Also a RED team attitude towards them, as
>
>> well as
>
>>
>
>>> >>>> thinking hard about what you are not measuring (operations
>
>> research).
>
>>
>
>>> >>>> I actually rather love the new cloudflare speedtest, because
> it
>
>> tests
>
>>
>
>>> >>>> a single TCP connection, rather than dozens, and at the same
>
>> time folk
>
>>
>
>>> >>>> are complaining that it doesn't find the actual "speed!".
>
>> yet... the
>
>>
>
>>> >>>> test itself more closely emulates a user experience than
>
>> speedtest.net
>
>>
>
>>> >>>> does. I am personally pretty convinced that the fewer numbers
>
>> of flows
>
>>
>
>>> >>>> that a web page opens improves the likelihood of a good user
>
>>
>
>>> >>>> experience, but lack data on it.
>
>>
>
>>> >>>> To try to tackle the evaluation and calibration part, I've
>
>> reached out
>
>>
>
>>> >>>> to all the new test designers in the hope that we could get
>
>> together
>
>>
>
>>> >>>> and produce a report of what each new test is actually doing.
>
>> I've
>
>>
>
>>> >>>> tweeted, linked in, emailed, and spammed every measurement
> list
>
>> I know
>
>>
>
>>> >>>> of, and only to some response, please reach out to other test
>
>> designer
>
>>
>
>>> >>>> folks and have them join the rpm email list?
>
>>
>
>>> >>>> My principal kvetches in the new tests so far are:
>
>>
>
>>> >>>> 0) None of the tests last long enough.
>
>>
>
>>> >>>> Ideally there should be a mode where they at least run to
> "time
>
>> of
>
>>
>
>>> >>>> first loss", or periodically, just run longer than the
>
>>
>
>>> >>>> industry-stupid^H^H^H^H^H^Hstandard 20 seconds. There be
>
>> dragons
>
>>
>
>>> >>>> there! It's really bad science to optimize the internet for 20
>
>
>>
>
>>> >>>> seconds. It's like optimizing a car, to handle well, for just
>
>> 20
>
>>
>
>>> >>>> seconds.
>
>>
>
>>> >>>> 1) Not testing up + down + ping at the same time
>
>>
>
>>> >>>> None of the new tests actually test the same thing that the
>
>> infamous
>
>>
>
>>> >>>> rrul test does - all the others still test up, then down, and
>
>> ping. It
>
>>
>
>>> >>>> was/remains my hope that the simpler parts of the flent test
>
>> suite -
>
>>
>
>>> >>>> such as the tcp_up_squarewave tests, the rrul test, and the
>
>> rtt_fair
>
>>
>
>>> >>>> tests would provide calibration to the test designers.
>
>>
>
>>> >>>> we've got zillions of flent results in the archive published
>
>> here:
>
>>
>
>>> >>>> https://blog.cerowrt.org/post/found_in_flent/
>
>>
>
>>> >>>> ps. Misinformation about iperf 2 impacts my ability to do
> this.
>
>>
>
>>
>
>>> >>>
>
>>
>
>>> >>>> The new tests have all added up + ping and down + ping, but
> not
>
>> up +
>
>>
>
>>> >>>> down + ping. Why??
>
>>
>
>>> >>>> The behaviors of what happens in that case are really
>
>> non-intuitive, I
>
>>
>
>>> >>>> know, but... it's just one more phase to add to any one of
>
>> those new
>
>>
>
>>> >>>> tests. I'd be deliriously happy if someone(s) new to the field
>
>
>>
>
>>> >>>> started doing that, even optionally, and boggled at how it
>
>> defeated
>
>>
>
>>> >>>> their assumptions.
>
>>
>
>>> >>>> Among other things that would show...
>
>>
>
>>> >>>> It's the home router industry's dirty secret that darn few
>
>> "gigabit"
>
>>
>
>>> >>>> home routers can actually forward in both directions at a
>
>> gigabit. I'd
>
>>
>
>>> >>>> like to smash that perception thoroughly, but given our
>
>> starting point
>
>>
>
>>> >>>> is a gigabit router was a "gigabit switch" - and historically
>
>> been
>
>>
>
>>> >>>> something that couldn't even forward at 200Mbit - we have a
>
>> long way
>
>>
>
>>> >>>> to go there.
>
>>
>
>>> >>>> Only in the past year have non-x86 home routers appeared that
>
>> could
>
>>
>
>>> >>>> actually do a gbit in both directions.
>
>>
>
>>> >>>> 2) Few are actually testing within-stream latency
>
>>
>
>>> >>>> Apple's rpm project is making a stab in that direction. It
>
>> looks
>
>>
>
>>> >>>> highly likely, that with a little more work, crusader and
>
>>
>
>>> >>>> go-responsiveness can finally start sampling the tcp RTT, loss
>
>
>> and
>
>>
>
>>> >>>> markings, more directly. As for the rest... sampling TCP_INFO
>
>> on
>
>>
>
>>> >>>> windows, and Linux, at least, always appeared simple to me,
> but
>
>> I'm
>
>>
>
>>> >>>> discovering how hard it is by delving deep into the rust
> behind
>
>>
>
>>
>
>>> >>>> crusader.
>
>>
>
>>> >>>> the goresponsiveness thing is also IMHO running WAY too many
>
>> streams
>
>>
>
>>> >>>> at the same time, I guess motivated by an attempt to have the
>
>> test
>
>>
>
>>> >>>> complete quickly?
>
>>
>
>>> >>>> B) To try and tackle the validation problem:
>>> >>>> ps. Misinformation about iperf 2 impacts my ability to do this.
>
>>
>
>>> >>>
>
>>
>
>>> >>>> In the libreqos.io project we've established a testbed where
>
>> tests can
>
>>
>
>>> >>>> be plunked through various ISP plan network emulations. It's
>
>> here:
>
>>
>
>>> >>>> https://payne.taht.net (run bandwidth test for what's
> currently
>
>> hooked
>
>>
>
>>> >>>> up)
>
>>
>
>>> >>>> We could rather use an AS number and at least a ipv4/24 and
>
>> ipv6/48 to
>
>>
>
>>> >>>> leverage with that, so I don't have to nat the various
>
>> emulations.
>
>>
>
>>> >>>> (and funding, anyone got funding?) Or, as the code is GPLv2
>
>> licensed,
>
>>
>
>>> >>>> to see more test designers setup a testbed like this to
>
>> calibrate
>
>>
>
>>> >>>> their own stuff.
>
>>
>
>>> >>>> Presently we're able to test:
>
>>
>
>>> >>>> flent
>
>>
>
>>> >>>> netperf
>
>>
>
>>> >>>> iperf2
>
>>
>
>>> >>>> iperf3
>
>>
>
>>> >>>> speedtest-cli
>
>>
>
>>> >>>> crusader
>
>>
>
>>> >>>> the broadband forum udp based test:
>
>>
>
>>> >>>> https://github.com/BroadbandForum/obudpst
>
>>
>
>>> >>>> trexx
>
>>
>
>>> >>>> There's also a virtual machine setup that we can remotely
> drive
>
>> a web
>
>>
>
>>> >>>> browser from (but I didn't want to nat the results to the world) to
>>> >>>> test other web services.
>
>>
>
>>> >>>> _______________________________________________
>
>>
>
>>> >>>> Rpm mailing list
>
>>
>
>>> >>>> Rpm at lists.bufferbloat.net
>
>>
>
>>> >>>> https://lists.bufferbloat.net/listinfo/rpm
>
>>
>
>>> >>> _______________________________________________
>
>>
>
>>> >>> Starlink mailing list
>
>>
>
>>> >>> Starlink at lists.bufferbloat.net
>
>>
>
>>> >>> https://lists.bufferbloat.net/listinfo/starlink