Preliminary results of using GPS to look for clock skew

Dave Taht dave.taht at gmail.com
Wed Sep 21 20:18:01 EDT 2011


Eric:

It is comforting to know that ntp is working well in your case, and, using GPS,
we have a verifiable means with decent error bars of checking against
ntp's algos independently!

Two ideas here:

1) Run the router WITHOUT ntp enabled at all
   (and/or test against CLOCK_REALTIME).
   It would be good to know how much the base clock drifts without
   correction.

2) Rerun all tests under load (example: netperf -l 3600 -H the_router)
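
For idea 1, the uncorrected drift can be estimated from a series of
offset readings. This is only a rough sketch: it assumes you already
have (elapsed seconds, system-minus-GPS offset in seconds) pairs from
somewhere, and the function name and input format are made up for
illustration.

```python
def drift_ppm(samples):
    """Least-squares slope of offset vs. elapsed time, in parts per million.

    samples: iterable of (elapsed_seconds, offset_seconds) pairs, where
    offset is system clock minus reference (e.g. GPS) time. This input
    format is an assumption, not something any tool emits as-is.
    """
    pts = list(samples)
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in pts) / n
    mean_o = sum(o for _, o in pts) / n
    # Spread of the time axis; zero means all samples share one instant.
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("samples must span more than one instant")
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in pts)
    # Slope is seconds of offset gained per second of elapsed time.
    return sxy / sxx * 1e6


# Synthetic example: a clock gaining 50 microseconds per second.
samples = [(t, 0.001 + 50e-6 * t) for t in range(0, 3600, 60)]
print(round(drift_ppm(samples), 3))
```

A straight-line fit is the simplest possible model; under load (idea 2)
the residuals around that line are probably more interesting than the
slope itself.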

The follow-on test to this is to actually start collecting and parsing
ntp rawstats statistics, which can easily be turned on and collected
on the router. Getting those stats somewhere, coupled with the need to
periodically delete the statistics files, is the problem at the
moment, and really only the former...
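
The deletion half is the easy part. A minimal sketch, assuming the
daily files are named rawstats.YYYYMMDD (which is what a "type day"
filegen produces) and that file mtime is a good enough measure of age;
prune_rawstats is a hypothetical helper, not part of ntp:

```python
import glob
import os
import time


def prune_rawstats(statsdir, max_age_days=7, now=None):
    """Delete rawstats.* files older than max_age_days; return removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # "rawstats.*" deliberately skips a bare "rawstats" file, if present.
    for path in sorted(glob.glob(os.path.join(statsdir, "rawstats.*"))):
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(path)
    return removed
```

Something like prune_rawstats("/tmp") from a daily cron job would do,
once the files have been copied off the router.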

The bufferbloat signal (if it exists) is in the noise that ntp is
currently (successfully in your case) rejecting.

There are a couple of rawstats parsers floating about: I have part of
one, Hal has another. I committed a major overdesign
sin in mine by wanting to put it all into a postgres db, ran into
major data representation problems (time in postgres is different from
time inside of ntp), and put the work aside (it's on github, in the
same pieces I left it in).

To enable rawstats collection on the router, modify /etc/ntp.conf to contain:

statsdir /tmp
statistics rawstats
filegen rawstats file rawstats type day enable

and restart ntp
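
Once the files start accumulating, each line can be picked apart
roughly like this. This is a sketch based on my reading of the
documented rawstats layout (MJD, seconds past midnight, source address,
destination address, then the origin, receive, transmit and destination
timestamps); it is not either of the parsers mentioned above. Some ntpd
versions append extra fields, which are ignored here. RawSample and
parse_rawstats_line are made-up names.

```python
from decimal import Decimal
from typing import NamedTuple


class RawSample(NamedTuple):
    mjd: int        # modified Julian day of the log entry
    secs: Decimal   # seconds past midnight UTC
    src: str        # source address as logged
    dst: str        # destination address as logged
    t1: Decimal     # origin timestamp
    t2: Decimal     # receive timestamp
    t3: Decimal     # transmit timestamp
    t4: Decimal     # destination timestamp

    @property
    def offset(self):
        # Standard NTP on-wire clock offset estimate.
        return ((self.t2 - self.t1) + (self.t3 - self.t4)) / 2

    @property
    def delay(self):
        # Round-trip delay with the far end's processing time removed.
        return (self.t4 - self.t1) - (self.t3 - self.t2)


def parse_rawstats_line(line):
    f = line.split()
    if len(f) < 8:
        raise ValueError("expected at least 8 fields, got %d" % len(f))
    # Decimal keeps the nanosecond digits that a float would round off.
    return RawSample(int(f[0]), Decimal(f[1]), f[2], f[3],
                     Decimal(f[4]), Decimal(f[5]),
                     Decimal(f[6]), Decimal(f[7]))
```

Keeping the four timestamps as Decimal sidesteps at least some of the
representation trouble: nothing is converted to a database or float
time type until you decide what precision you actually need.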

On a system protected by apparmor - like ubuntu - it's mildly trickier,
as you need to add a

whereverthelogdiris/rawstats* rwl,

line (note the trailing comma) to /etc/apparmor.d/usr.sbin.ntpd

The final bit of the CBBD is to actually collect port numbers - so
stuff on ephemeral ports is known to be from NATted devices and stuff
on 123...

but I'm getting way ahead of myself here.


On Wed, Sep 21, 2011 at 4:02 PM, Eric Raymond <esr at snark.thyrsus.com> wrote:
> As a step toward implementation of Dave's Cosmic Background
> Bufferbloat Detector, I took on the job of employing GPS as a reliable
> local time source to check whether an NTP-assisted system clock has
> significant skew from GPS.
>
> Towards this end, I've just spent a couple of days rebuilding and
> improving the latency-profiling machinery in GPSD.  What this allows
> me to do is collate the following pieces of data:
>
> * The GPS's time of a fix in properly leapsecond-corrected UTC (Call this T)
>
> * The NTP-corrected system time at which the first burst of
>  fix data from the device wakes up gpsd's select(2).  Call this
>  S (Start of reporting cycle)
>
> * When the daemon has received and processed the entire burst of packets
>  constituting a reporting cycle and is about to ship JSON to the client.
>  Call this E (End of reporting cycle).
>
> * When the client receives the report.  (R = receipt time).
>
> What's interesting is to plot the deltas S - T, E - S, and R - E as a
> stacked-impulse graph.  I can do this at will now.  More importantly,
> so can anyone with my software and GPSD.  We have a working GPSD port to
> CeroWRT, so our test routers can be instrumented.
>
> The entire height R-T approximates the entire fix latency. I say
> "approximates" because, of course, T is on a different timebase than
> R, S, and E, and NTP's 10ms fuzz is an issue.  But S-T gives us an
> idea of the atomic-clock-vs-system-clock time.
>
> There's nothing surprising about my setup - FIOS to an N600 running a
> recent OpenWRT build to a 2.66GHz Intel Core 2 Duo machine running Ubuntu
> 10.04 LTS.
>
> Over runs of 20-100 fixes, S-T (the clock-skew figure) ranges from 66
> to 76 milliseconds.  Total latency R-T is steady at about 0.38
> seconds.  R-E vanishes - it's close to 0.9 milliseconds.  Total fix
> latency is completely dominated by E-S, the time for gpsd to receive
> and process the fix data from the GPS.  And I'm not seeing a lot of
> bursty variation in these measurements.
>
> I'm going to investigate further.  In particular, I'm going to try to
> either measure or compute how much of E-S is I/O time.  My suspicion is that
> almost all of it is, and that gpsd's actual computation time is negligible
> by comparison.
>
> But preliminary indications are that NTP is in fact doing a pretty good
> job of keeping my system clock conditioned. I'm not seeing bufferbloat
> effects on time service.
>
> However, I'd welcome having my code and assumptions checked. This kind
> of profiling is tricky work, with large vulnerabilities to small
> mistakes in detail. That's the main reason I'm describing these
> results as 'preliminary'; some skeptical review is needed to solidify
> and verify them.
> --
>                <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>
> Let us hope our weapons are never needed --but do not forget what
> the common people knew when they demanded the Bill of Rights: An
> armed citizenry is the first defense, the best defense, and the
> final defense against tyranny.
>   If guns are outlawed, only the government will have guns. Only
> the police, the secret police, the military, the hired servants of
> our rulers.  Only the government -- and a few outlaws.  I intend to
> be among the outlaws.
>        -- Edward Abbey, "Abbey's Road", 1979
>
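
For anyone wanting to check the arithmetic in Eric's breakdown, here is
a small sketch of how the four timestamps stack up. It is not gpsd's
profiling code; FixTiming and breakdown are hypothetical names, and the
last segment is taken as R - E so that the three pieces sum to the
total R - T.

```python
from typing import NamedTuple


class FixTiming(NamedTuple):
    T: float  # GPS time of fix (UTC seconds)
    S: float  # system time when the first fix data wakes the daemon
    E: float  # system time when the daemon is ready to ship the report
    R: float  # system time when the client receives the report


def breakdown(f):
    """Split total fix latency R - T into three stacked segments (seconds)."""
    return {
        "skew": f.S - f.T,        # mixes timebases: GPS vs. system clock
        "processing": f.E - f.S,  # receive and process the burst
        "delivery": f.R - f.E,    # daemon-to-client hop
        "total": f.R - f.T,
    }


# Made-up numbers of the same order as the ones reported above.
fix = FixTiming(T=100.000, S=100.070, E=100.379, R=100.380)
for name, value in breakdown(fix).items():
    print("%-10s %.3f s" % (name, value))
```

Only the "skew" segment crosses timebases, so it is the one to watch
when rerunning with ntp disabled or under load.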



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com



More information about the Bloat-devel mailing list