Preliminary results of using GPS to look for clock skew
Dave Taht
dave.taht at gmail.com
Wed Sep 21 20:18:01 EDT 2011
Eric:
It is comforting to know that ntp is working well in your case, and that,
using GPS, we now have a verifiable means, with decent error bars, of
independently checking ntp's algos!
Two ideas here:
1) Run the router WITHOUT ntp enabled at all
(and/or testing against CLOCK_REALTIME)
It would be good to know how much the base clock drift is, without
correction.
2) Rerun all tests under load (example: netperf -l 3600 -H the_router)
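
For idea 1, one rough way to turn a run of GPS-vs-system-clock offsets
into a drift figure is a least-squares slope, expressed in parts per
million. A minimal sketch in Python; the function name and the
(elapsed seconds, offset seconds) sample format are my own assumptions,
not something gpsd or ntp emits directly:

```python
def drift_ppm(samples):
    """Estimate clock drift in parts per million.

    samples: list of (elapsed_s, offset_s) pairs, where elapsed_s is the
    reference (e.g. GPS) time since the start of the run and offset_s is
    system time minus reference time at that moment. Hypothetical format.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    # Ordinary least-squares slope of offset against elapsed time.
    num = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one instant")
    return num / den * 1e6


# Made-up run: the system clock gains 5 ms every 100 s.
example = [(0.0, 0.000), (100.0, 0.005), (200.0, 0.010), (300.0, 0.015)]
ppm = drift_ppm(example)
```

A slope fit is used rather than (last - first) / duration so that a
single noisy fix at either end does not dominate the answer.
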
The follow-on to these tests is to actually start collecting and parsing
ntp rawstats statistics, which can easily be turned on and collected
on the router. The problem at the moment is getting those stats
somewhere, coupled with the need to periodically delete the statistics
files, and really only the former...
The bufferbloat signal (if it exists) is in the noise that ntp is
currently (successfully in your case) rejecting.
There are a couple of rawstats parsers floating about: I have part of
one, Hal has another. I committed a major overdesign
sin in mine by wanting to put it all into a postgres db, ran into
major data representation problems (time in postgres is different from
time inside of ntp), and put the work aside (it's on github in the
same pieces I left it in).
To enable rawstats collection on the router, modify /etc/ntp.conf to contain:
statsdir /tmp
statistics rawstats
filegen rawstats file rawstats type day enable
and restart ntp.
On a system protected by apparmor, like Ubuntu, it's mildly trickier,
as you need to add a line like
whereverthelogdiris/rawstats* rwl
to /etc/apparmor.d/usr.sbin.ntpd
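
Once rawstats is being written, each line can be reduced to the usual
NTP offset and delay. A minimal sketch in Python, assuming the layout
described in the ntpd monitoring docs (MJD day, seconds past midnight,
two addresses, then the origin, receive, transmit and destination
timestamps in seconds since 1900); newer ntpd versions append further
fields, which this ignores. The names are hypothetical and the sample
line is made up, not real router output:

```python
from decimal import Decimal
from typing import NamedTuple


class RawSample(NamedTuple):
    mjd: int          # modified Julian day of the record
    secs: Decimal     # seconds past midnight UTC
    source: str       # first address field
    dest: str         # second address field
    org: Decimal      # T1: origin timestamp
    rec: Decimal      # T2: receive timestamp
    xmt: Decimal      # T3: transmit timestamp
    dst: Decimal      # T4: destination timestamp

    @property
    def offset(self):
        # Standard NTP clock offset: ((T2 - T1) + (T3 - T4)) / 2
        return ((self.rec - self.org) + (self.xmt - self.dst)) / 2

    @property
    def delay(self):
        # Standard NTP round-trip delay: (T4 - T1) - (T3 - T2)
        return (self.dst - self.org) - (self.xmt - self.rec)


def parse_rawstats_line(line):
    """Parse one rawstats line; Decimal keeps the nanosecond digits."""
    f = line.split()
    if len(f) < 8:
        raise ValueError("short rawstats line: %r" % line)
    return RawSample(int(f[0]), Decimal(f[1]), f[2], f[3],
                     *(Decimal(x) for x in f[4:8]))


# Made-up line for illustration only.
line = ("55825 1234.567 192.0.2.1 192.0.2.2 "
        "3525552000.100000000 3525552000.160000000 "
        "3525552000.161000000 3525552000.201000000")
sample = parse_rawstats_line(line)
```

The raw per-packet delay is the interesting column here, since it is
the noise that ntp's filters would otherwise throw away.
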
The final bit of the CBBD is to actually collect port numbers, so that
stuff on ephemeral ports is known to be from
natted devices and stuff on 123...
but I'm getting way ahead of myself here.
On Wed, Sep 21, 2011 at 4:02 PM, Eric Raymond <esr at snark.thyrsus.com> wrote:
> As a step toward implementation of Dave's Cosmic Background
> Bufferbloat Detector, I took on the job of employing GPS as a reliable
> local time source to check whether an NTP-assisted system clock has
> significant skew from GPS.
>
> Towards this end, I've just spent a couple of days rebuilding and
> improving the latency-profiling machinery in GPSD. What this allows
> me to do is collate the following pieces of data:
>
> * The GPS's time of a fix in properly leapsecond-corrected UTC (Call this T)
>
> * The NTP-corrected system time at which the first burst of
> fix data from the device wakes up gpsd's select(2). Call this
> S (Start of reporting cycle)
>
> * When the daemon has received and processed the entire burst of packets
> constituting a reporting cycle and is about to ship JSON to the client.
> Call this E (End of reporting cycle).
>
> * When the client receives the report. (R = receipt time).
>
> What's interesting is to plot the deltas S - T, E - S, and R - E as a
> stacked-impulse graph. I can do this at will now. More importantly,
> so can anyone with my software and GPSD. We have a working GPSD port to
> CeroWRT, so our test routers can be instrumented.
>
> The entire height R-T approximates the entire fix latency. I say
> "approximates" because, of course, T is on a different timebase than
> R, S, and E, and NTP's 10ms fuzz is an issue. But S-T gives us an
> idea of the atomic-clock-vs-system-clock time.
>
> There's nothing surprising about my setup - FIOS to an N600 running a
> recent OpenWRT build to a 2.66GHz Intel Core 2 Duo machine running Ubuntu
> 10.04 LTS.
>
> Over runs of 20-100 fixes, S-T (the clock-skew figure) ranges from 66
> to 76 milliseconds. Total latency R-T is steady at about 0.38
> seconds. R-E vanishes - it's close to 0.9 milliseconds. Total fix
> latency is completely dominated by E-S, the time for gpsd to receive
> and process the fix data from the GPS. And I'm not seeing a lot of
> bursty variation in these measurements.
>
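
The arithmetic behind those figures can be sketched as follows. This is
a minimal Python illustration with hypothetical names; it takes the
last slice as R - E so that the three slices sum to R - T, and the
timestamps are invented to resemble the numbers reported above, not
taken from a real run:

```python
def latency_breakdown(t, s, e, r):
    """Split one fix's latency into stacked slices (all in seconds).

    t: GPS time of the fix (UTC)
    s: system time when the first data of the cycle reaches gpsd
    e: system time when gpsd is about to ship the report
    r: system time when the client receives the report
    """
    return {
        "skew": s - t,        # S - T: clock-skew figure
        "processing": e - s,  # E - S: receive and process the burst
        "delivery": r - e,    # R - E: daemon-to-client hop
        "total": r - t,       # R - T: approximate total fix latency
    }


# Invented timestamps: ~71 ms skew, ~0.9 ms delivery, 0.38 s total.
fix = latency_breakdown(t=1000.000, s=1000.071, e=1000.3791, r=1000.380)
share = fix["processing"] / fix["total"]
```

With those inputs E - S accounts for roughly four fifths of the total.
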
> I'm going to investigate further. In particular, I'm going to try to
> either measure or compute how much of E-S is I/O time. My suspicion is that
> almost all of it is, and that gpsd's actual computation time is negligible
> by comparison.
>
> But preliminary indications are that NTP is in fact doing a pretty good
> job of keeping my system clock conditioned. I'm not seeing bufferbloat
> effects on time service.
>
> However, I'd welcome having my code and assumptions checked. This kind
> of profiling is tricky work, with large vulnerabilities to small
> mistakes in detail. That's the main reason I'm describing these
> results as 'preliminary'; some skeptical review is needed to solidify
> and verify them.
> --
> <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>
> Let us hope our weapons are never needed --but do not forget what
> the common people knew when they demanded the Bill of Rights: An
> armed citizenry is the first defense, the best defense, and the
> final defense against tyranny.
> If guns are outlawed, only the government will have guns. Only
> the police, the secret police, the military, the hired servants of
> our rulers. Only the government -- and a few outlaws. I intend to
> be among the outlaws.
> -- Edward Abbey, "Abbey's Road", 1979
>
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com