General list for discussing Bufferbloat
From: Hal Murray <hmurray@megapathdsl.net>
To: Dave Taht <dave.taht@gmail.com>
Cc: Hal Murray <hmurray@megapathdsl.net>,
	bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Graph of bloat
Date: Thu, 09 Jul 2015 03:07:23 -0700
Message-ID: <20150709100723.CB767406057@ip-64-139-1-69.sjc.megapath.net>
In-Reply-To: Message from Dave Taht <dave.taht@gmail.com> of "Wed, 08 Jul 2015 08:55:41 PDT." <CAA93jw6jvDPBLhhjKZW-+0RDyRfZfDg_2deQdEe5F-BHiSKi8A@mail.gmail.com>

There are several parts to this discussion.

Leap seconds are ugly.  The basic problem is that POSIX pretends they don't 
exist.  That's a carryover from the early days when computer time keeping 
didn't have to worry about them.  They weren't introduced until 1972.  There 
should be a second labeled 23:59:60 but most systems just set the clock back 
a second and repeat 23:59:59, and all sorts of systems get in trouble when 
time goes backwards.

They don't impact daily life like leap years do, so we don't teach kids about them when they learn about leap years.  Most people don't even know they exist, and that includes most programmers.  An additional complication is that they are unpredictable, so you can't wire simple conversions into a chunk of code that gets copied around.

Google decided that it was simpler to "smear" their clocks rather than chase down and fix the bugs in all their code.
  Time, technology and leaping seconds 
  http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.html
The downside is that all their clocks are off by up to 1/2 second.  If you don't need accurate time for legal reasons like stock market trading, their approach is probably a good one.  Their internal clocks will all agree with each other, but they won't agree with outside systems that aren't playing the smearing game.

The blog above describes the smear using cosine - no sharp corners.  The graph shows a linear smear.
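To make the difference concrete, here is a minimal sketch comparing the two ramp shapes.  The cosine formula is the one I recall from the blog post; the function names and the 20 hour window are assumptions for illustration, not anything taken from Google's servers.

```python
import math

WINDOW = 20 * 3600  # smear window in seconds (assumed: 20 hours)


def cosine_smear(t, w=WINDOW):
    """Fraction of the leap second applied t seconds into the window.

    Raised-cosine ramp: the slope is zero at both ends, so no sharp corners.
    """
    return (1.0 - math.cos(math.pi * t / w)) / 2.0


def linear_smear(t, w=WINDOW):
    """Straight-line ramp: constant slope, with a corner at each end."""
    return t / w


if __name__ == "__main__":
    for hours in (0, 5, 10, 15, 20):
        t = hours * 3600
        print(f"{hours:2d} h  cosine {cosine_smear(t):.3f}  "
              f"linear {linear_smear(t):.3f}")
```

Both ramps start at 0, cross 1/2 at the middle of the window and end at 1.  They differ only in how quickly the clock rate changes near the ends.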


> Does ntp adjust system time backward based on getting nearly all it's
> samples with well over a 1/2 second of induced delay? 

The idea with smearing is to avoid having to set the clock back.  The reference time on that graph is UTC.  If your server was using only Google's NTP servers, it would follow that ramp, inserting the leap second over 20 hours rather than all at once by setting the clock back.  That's the whole point of the smear.  You lie to all your NTP clients and they all follow the same lie.

All that has nothing to do with bloat.  It's just background for why I was making the graph.

--------

Now for NTP...

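After the typical NTP client-server exchange, the client has 4 timestamps: send and receive for the packets going in each direction.  If you look at things in the right way, you have N equations and N+1 unknowns: the timestamps give you two independent differences, but there are three unknowns (the clock offset and the transit time in each direction).  You need one more equation to sort things out.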

If you assume that the clocks on both ends are accurate, you can compute the network transit times in both directions.

NTP makes the assumption that the network delays are symmetric.  Without bloat, that's generally reasonable.  It does screw up on long links with asymmetric routing.  If you watch NTP servers over a long distance, you can see steps when the routing changes.  On the scale of bloat, those errors are minor.  If you had a fast link rather than my slow DSL link, they would be significant by comparison.
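Here is a small sketch of that arithmetic.  The offset and delay formulas are the standard NTP on-wire calculation; the simulate() helper and all the numbers fed to it are made up for illustration.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP calculation from the four timestamps.

    t1: client send     (client clock)
    t2: server receive  (server clock)
    t3: server send     (server clock)
    t4: client receive  (client clock)
    The offset is only right if the two one-way delays are equal.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate(true_offset, up, down, server_hold=0.001, t1=100.0):
    """Hypothetical exchange: server clock = client clock + true_offset."""
    t2 = t1 + up + true_offset            # arrives after 'up' seconds
    t3 = t2 + server_hold                 # server turnaround time
    t4 = (t3 - true_offset) + down        # back on the client's clock
    return offset_and_delay(t1, t2, t3, t4)


if __name__ == "__main__":
    # Symmetric path: 20 ms each way, clocks actually in sync.
    print(simulate(0.0, up=0.020, down=0.020))
    # Bloated download: an extra 500 ms of queueing on the way down.
    print(simulate(0.0, up=0.020, down=0.520))
```

The measured offset comes out as the true offset plus (up - down) / 2, so half a second of one-way queueing looks like a quarter second of clock error.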

ntpd remembers the last 8 samples to each server.  It only uses the one with the lowest round trip time, assuming that the others hit some sort of queueing delay.  That filters out occasional bursts of interference or even bloat.  It doesn't work for sustained bloat.
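A toy version of that filter shows both cases.  This is only a sketch of the idea, not ntpd's actual clock filter code; the class name and the sample values are invented.

```python
from collections import deque


class MinDelayFilter:
    """Keep the last `size` (offset, delay) samples; trust the lowest delay."""

    def __init__(self, size=8):
        self.samples = deque(maxlen=size)  # oldest sample falls off the end

    def add(self, offset, delay):
        # Store delay first so min() compares on round trip time.
        self.samples.append((delay, offset))

    def best(self):
        """Return (offset, delay) of the lowest-delay sample on hand."""
        delay, offset = min(self.samples)
        return offset, delay


if __name__ == "__main__":
    f = MinDelayFilter()
    # Occasional burst: a few bloated samples among clean ones.
    for offset, delay in [(0.001, 0.040), (-0.250, 0.540), (-0.240, 0.520),
                          (0.002, 0.041), (-0.260, 0.560)]:
        f.add(offset, delay)
    print("burst:    ", f.best())   # a clean sample still wins

    # Sustained bloat: 8 bad samples in a row push out all the clean ones.
    for _ in range(8):
        f.add(-0.250, 0.540)
    print("sustained:", f.best())   # only bloated samples are left
```

Once every slot in the window holds a bloated sample, the lowest delay is still a bad one, which is why this only helps with short bursts.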

The huff-n-puff filter can be used for sustained bloat - better to coast than get confused.  But there needs to be some limit on how long to wait before assuming the current timings are valid because the network has been reconfigured.  If your bloat lasts long enough, ntpd will get confused.


In addition to getting the time correct, ntpd is also trying to calibrate the clock frequency so the future time will be more accurate (if the current time is good).  That's the "drift".  Without that correction, the clock will drift farther from the true time the longer you wait.

Ballpark numbers for the errors in crystals are tens of PPM (parts per million).  One PPM is roughly a second over 2 weeks (about 86 ms per day), so an uncorrected clock is likely to drift seconds per day.  I have one system that's off by 138 PPM.  (The drift can also correct for minor errors in software.)
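The arithmetic behind those numbers, as a quick sketch (the function name is just for illustration):

```python
SECONDS_PER_DAY = 86400


def drift_seconds(ppm, elapsed_seconds):
    """Accumulated clock error for a constant frequency error in PPM."""
    return ppm * 1e-6 * elapsed_seconds


if __name__ == "__main__":
    print(f"1 PPM over 1 day:     {drift_seconds(1, SECONDS_PER_DAY):.4f} s")
    print(f"1 PPM over 2 weeks:   {drift_seconds(1, 14 * SECONDS_PER_DAY):.2f} s")
    print(f"30 PPM over 1 day:    {drift_seconds(30, SECONDS_PER_DAY):.2f} s")
    print(f"138 PPM over 1 day:   {drift_seconds(138, SECONDS_PER_DAY):.2f} s")
```

So 1 PPM is about 1.2 seconds in 2 weeks, and the 138 PPM system would be off by nearly 12 seconds per day if left uncorrected.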

Normally, ntpd is just making minor corrections.  It does that by slewing the clock, that is by fudging the clock frequency so the clock will "drift" in the desired direction.  That takes a long time to make large corrections.  ntpd will normally step the clock if the correction is over 128 ms.
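To put a rough number on "a long time", here is a sketch.  The 128 ms threshold is from the paragraph above.  The 500 PPM maximum slew rate is what I remember from the ntpd documentation, so treat it as an assumption, and the result as a lower bound, since the real loop does not run at the maximum rate the whole time.

```python
STEP_THRESHOLD_S = 0.128   # ntpd's default step threshold (from the text)
MAX_SLEW_PPM = 500.0       # assumed maximum slew rate


def action(correction_s):
    """What ntpd would normally do with a correction of this size."""
    return "step" if abs(correction_s) > STEP_THRESHOLD_S else "slew"


def min_slew_time(correction_s, slew_ppm=MAX_SLEW_PPM):
    """Shortest time (seconds) to slew away a correction at a constant rate."""
    return abs(correction_s) / (slew_ppm * 1e-6)


if __name__ == "__main__":
    for correction in (0.010, 0.128, 0.260, 1.0):
        print(f"{correction:6.3f} s: {action(correction)}, "
              f"at least {min_slew_time(correction):.0f} s to slew")
```

At that rate, even a correction right at the threshold takes several minutes to slew out, and a full second takes over half an hour.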

But stepping the clock backwards is what causes most of the problems.  ntpd has a command line switch to tell it not to step, and another to allow one step at startup time...  There are no simple answers.

--------

> Judging from that graphic... I don't think huff and puff was designed for
> the bufferbloated era! so the question remains, in hal's tests, did ntp
> adjust the clock backwards? 

No.  The system that collected that data was getting time from a good local GPS clock.  It helps to have a place to stand if you want to collect time data.

Here is a typical pattern from a system using the pool without any huff-n-puff while I did a big download.
 8 Jul 22:02:17 ntpd[26705]: 0.0.0.0 061c 0c clock_step -0.259747 s
 8 Jul 23:06:24 ntpd[26705]: 0.0.0.0 061c 0c clock_step +0.274448 s


-- 
These are my opinions.  I hate spam.



