Historic archive of defunct list bloat-devel@lists.bufferbloat.net
From: Eric Raymond <esr@thyrsus.com>
To: Dave Taht <dave.taht@gmail.com>
Cc: Eric Raymond <esr@snark.thyrsus.com>,
	Hal Murray <hmurray@megapathdsl.net>,
	bloat-devel@lists.bufferbloat.net
Subject: Re: Preliminary results of using GPS to look for clock skew
Date: Wed, 21 Sep 2011 22:11:38 -0400
Message-ID: <20110922021137.GB21302@thyrsus.com>
In-Reply-To: <CAA93jw6cdO9ou8JpnRtQw51jHtcuBC5J41Xg3iu6bRPs3MsVdA@mail.gmail.com>

Dave Taht <dave.taht@gmail.com>:
> It is comforting to know that ntp is working well in your case, and, using
> GPS, we have a verifiable means with decent error bars of checking against
> ntp's algos independently!

Yup.  It'll get better as I refine my profiling and gain more insight
into the numbers.  My next task is to compute a lower bound for RS-232
transmission time and subtract that from E-S so we know how much of the
dominant component in fix latency is processing time.
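
A rough sketch of that lower-bound arithmetic, under stated assumptions:
asynchronous framing with one start bit, 8N1 by default, and an 82-byte
payload at 4800 baud chosen only as an illustrative figure (neither number
comes from my measurements).  The function name is made up for this example.

```python
def rs232_min_seconds(nbytes, baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Lower bound on wire time for nbytes of asynchronous serial data.

    Each character costs one start bit plus the data, parity and stop
    bits, so the line cannot deliver the payload any faster than this.
    """
    bits_per_char = 1 + data_bits + parity_bits + stop_bits
    return nbytes * bits_per_char / baud


if __name__ == "__main__":
    # Illustrative values only: an 82-byte report at a few common speeds.
    for baud in (4800, 9600, 38400):
        ms = rs232_min_seconds(82, baud) * 1000.0
        print("%6d baud: %.2f ms minimum" % (baud, ms))
```

Whatever remains of E-S after subtracting this figure is an upper bound on
processing time, since the real wire time can only be longer.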

Er, for other bloat-dev members: I should have said up front that I've
volunteered to be the bufferbloat project's go-to guy on reliable time
sources for network performance profiling.  This is a completely
natural extension of the work I've been doing on GPSD since 2005.  GPS
gives us atomic-clock time with $40 hardware (provided we're below 60
degrees N or S latitude and can string an antenna somewhere with a
decent skyview).  I know almost everything there is to know about
extracting data from these sensors, and what I don't know my two senior
lieutenants on the GPSD project *do* know.

> Two ideas here:
> 
> 1) Run the router WITHOUT ntp enabled at all
>    (and/or testing against CLOCK_REALTIME)
>    It would be good to know how much the base clock drift is, without
> correction.

One of the things I don't know, and need to understand, is what the
relationships are among the different realtime clocks. The clock_gettime(3)
manual page is not hugely helpful.  It says:

       CLOCK_REALTIME
              System-wide real-time clock.  Setting this clock requires appro‐
              priate privileges.

       CLOCK_MONOTONIC
              Clock  that  cannot  be  set and represents monotonic time since
              some unspecified starting point.

       CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
              Similar to CLOCK_MONOTONIC, but provides access to a  raw  hard‐
              ware-based time that is not subject to NTP adjustments.

       CLOCK_PROCESS_CPUTIME_ID
              High-resolution per-process timer from the CPU.

       CLOCK_THREAD_CPUTIME_ID
              Thread-specific CPU-time clock.

Er, so what exactly is the relationship between the CLOCK_REALTIME clock
and the time(2) clock?  Are they the same?  If they're different, how 
are they different?

It says the CLOCK_MONOTONIC clock isn't settable, but the CLOCK_MONOTONIC_RAW
text implies that the former may get NTP adjustments.  And it doesn't specify
whether the per-process timers are NTP-corrected...I'd guess not, but who's
to know from the above.

Can anyone point me to better documentation on these facilities?
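
Failing documentation, the question can at least be probed empirically.
This is a minimal sketch, assuming Linux and Python 3.7 or later; the
function name is invented for illustration.  It samples the clocks twice and
reports how fast CLOCK_REALTIME and CLOCK_MONOTONIC run relative to
CLOCK_MONOTONIC_RAW, in parts per million.  It does not settle how time(2)
relates to CLOCK_REALTIME.

```python
import time


def clock_rates_ppm(interval=1.0):
    """Rate of REALTIME and MONOTONIC relative to MONOTONIC_RAW, in ppm.

    The clocks are read one after another, not simultaneously, so short
    intervals give noisy answers; use a long interval for real work.
    """
    # CLOCK_MONOTONIC_RAW is Linux-specific; fall back if it is missing,
    # in which case the "monotonic" figure is trivially near zero.
    raw_id = getattr(time, "CLOCK_MONOTONIC_RAW", time.CLOCK_MONOTONIC)
    ids = {
        "realtime": time.CLOCK_REALTIME,
        "monotonic": time.CLOCK_MONOTONIC,
        "raw": raw_id,
    }
    start = {name: time.clock_gettime_ns(cid) for name, cid in ids.items()}
    time.sleep(interval)
    end = {name: time.clock_gettime_ns(cid) for name, cid in ids.items()}
    elapsed = {name: end[name] - start[name] for name in ids}
    ref = elapsed["raw"]
    return {
        name: (elapsed[name] - ref) / ref * 1e6
        for name in ("realtime", "monotonic")
    }


if __name__ == "__main__":
    for name, ppm in clock_rates_ppm(10.0).items():
        print("%-9s runs %+.2f ppm relative to raw" % (name, ppm))
```

If the monotonic figure is persistently nonzero while ntpd is running and
collapses toward zero with ntpd stopped, that would suggest CLOCK_MONOTONIC
is being slewed.  A step applied to CLOCK_REALTIME during the interval will
show up as a huge realtime figure.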

> 2) ReRun all tests under load (example: netperf -l 3600 -H the_router)

I'll do this, for completeness, but I predict it's not going to make
any measurable difference.  The indications so far are that neither of
the means of time delivery I have available to check are compute-bound
or disk-I/O bound at any point in their delivery chains.  

So I think they're just going to shrug off any load short of
machine-thrashing-its-guts-out.  But part of the point of what I'm
doing is that soon we'll have the test tools to know for *sure* that's
true.

> The followon test to this is to actually start collecting and parsing
> ntp rawstats statistics, which can be easily turned on and collected
> on the router. It's getting-those-stats-somewhere coupled with the
> need to periodically delete these statistic files that's a problem at
> the moment, and really only the former...
> 
> The bufferbloat signal (if it exists) is in the noise that ntp is
> currently (successfully in your case) rejecting.
> 
> There are a couple rawstats parsers floating about, I have part of
> one, hal has another. I committed a major overdesign
> sin in mine by wanting to put it all into a postgres db, ran into
> major data representation problems (time on postgres is different than
> time inside of ntp), and put the work aside (it's on github in the
> same pieces I left it in)
> 
> To enable rawstats collection on the router, modify /etc/ntp.conf to contain:
> 
> statsdir /tmp
> statistics rawstats
> filegen rawstats file rawstats type day enable
> 
> and restart ntp
> 
> on a system protected by apparmor - like ubuntu - it's mildly trickier
> as you need to add a
> 
> whereverthelogdiris/rawstats* rwl
> 
> to the /etc/apparmor.d/usr.sbin.ntpd
> 
> The final bit of the cbbd is to actually collect port numbers - so
> stuff on ephemeral ports is known to be from
> natted devices and stuff on 123
> 
> but I'm getting way ahead of myself here.

Agreed.  I also think you're complicating life unnecessarily. 

If we need rawstats in a form for real-time monitoring, why not modify
NTP to optionally multicast them and avoid all this going to disk?  I
have good relations with the NTP guys, and they wouldn't be likely to
resist a feature request with a network-health-monitoring use case
even if we didn't. Let's *use* that zorch for something, rather than
fielding a fragile pile of hacks.
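
Whichever transport carries them, the records themselves are cheap to
digest.  A minimal sketch, assuming the rawstats layout as I understand it
from the NTP documentation: MJD, seconds past midnight, peer address, local
address, then the originate, receive, transmit and destination timestamps
in NTP-era seconds.  Newer ntpd versions may append further fields, which
this ignores.  The sample line is fabricated for illustration, and the
function name is mine.

```python
from decimal import Decimal

# Fabricated example record; not taken from any real log.
SAMPLE = ("55825 7897.123 192.0.2.1 192.0.2.10 "
          "3525000000.000000000 3525000000.012000000 "
          "3525000000.012500000 3525000000.020500000")


def parse_rawstats_line(line):
    """Parse one rawstats record and derive clock offset and delay.

    Decimal is used because NTP timestamps carry more significant
    digits than a double can hold without rounding.
    """
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("short rawstats record: %r" % line)
    t1, t2, t3, t4 = (Decimal(f) for f in fields[4:8])
    return {
        "mjd": int(fields[0]),
        "secs": Decimal(fields[1]),
        "peer": fields[2],
        "local": fields[3],
        # Standard NTP on-wire calculations.
        "offset": ((t2 - t1) + (t3 - t4)) / 2,
        "delay": (t4 - t1) - (t3 - t2),
    }


if __name__ == "__main__":
    rec = parse_rawstats_line(SAMPLE)
    print("peer %s offset %s s delay %s s"
          % (rec["peer"], rec["offset"], rec["delay"]))
```

These are the raw per-packet figures before ntpd's filtering, which is
where the bufferbloat signal would have to be if it exists.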


-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
