From: "Toke Høiland-Jørgensen" <toke@toke.dk>
To: "Eggert, Lars" <lars@netapp.com>
Cc: Midori Kato <katoon@sfc.wide.ad.jp>,
"bloat-devel@lists.bufferbloat.net"
<bloat-devel@lists.bufferbloat.net>
Subject: Re: One-way delay measurement for netperf-wrapper
Date: Fri, 29 Nov 2013 13:04:41 +0000
Message-ID: <87haavjr4m.fsf@toke.dk>
In-Reply-To: <5392347C-BC88-4836-BB53-F48523210237@netapp.com> (Lars Eggert's message of "Fri, 29 Nov 2013 10:20:44 +0000")
"Eggert, Lars" <lars@netapp.com> writes:
> we tried this too. The TCP timestamps are too coarse-grained for
> datacenter latency measurements, I think under at least Linux and
> FreeBSD they get rounded up to 1ms or something. (Midori, do you
> remember the exact value?)
Right. Now that you mention it, I do seem to recall having read that
Linux uses clock ticks (tied to the kernel HZ value, i.e. between 100
and 1000 Hz depending on configuration) as timestamp units. I suppose
FreeBSD is similar.
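To put numbers on that, here is a quick back-of-the-envelope sketch. It assumes (as recalled above, not verified) that one timestamp unit equals one kernel clock tick; the function name is just for illustration.

```python
def tick_resolution_ms(hz):
    """Duration of one clock tick in milliseconds for a given tick rate."""
    return 1000.0 / hz


# Assumption: one TCP timestamp unit corresponds to one kernel tick.
for hz in (250, 1000):
    print(f"HZ={hz}: {tick_resolution_ms(hz):.1f} ms per tick")
```

Either way the resolution is a millisecond or coarser, which fits the observation that these timestamps are too coarse for datacenter latencies.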
> No, but the sender and receiver can agree to embed them every X bytes
> in the stream. Yeah, sometimes that timestamp may be transmitted in
> two segments, but I guess that should be OK?
Right, so a protocol might be something like this (I'm still envisioning
this in the context of the netperf TCP_STREAM / TCP_MAERTS tests):
1. Insert a sufficiently accurate timestamp into the TCP bandwidth
measurement stream every X bytes (or maybe every X milliseconds?).
2. On the receiver side, look for these timestamps and each time one is
received, calculate the delay (also in a sufficiently accurate, i.e.
sub-millisecond, unit). Echo this calculated delay back to the
sender, probably with a fresh timestamp attached.
3. The sender receives the delay measurements and either outputs them
straight away, or holds on to them until the end of the test and
normalises them to be deltas against the minimum observed delay.
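The three steps above can be sketched roughly as follows. This is only an illustration, not netperf code: the 8-byte nanosecond wire format, the names `sender_chunks`, `Receiver` and `normalise`, and the fake clocks in the demo are all assumptions, and the echo back to the sender in step 2 is left out for brevity.

```python
import struct
import time

STAMP = struct.Struct("!Q")  # 64-bit nanosecond timestamp, network byte order


def sender_chunks(payload, interval, clock=time.time_ns):
    """Yield payload blocks, each full block followed by a fresh timestamp."""
    for off in range(0, len(payload), interval):
        block = payload[off:off + interval]
        yield block
        if len(block) == interval:
            yield STAMP.pack(clock())


class Receiver:
    """Buffer incoming segments and compute one delay per embedded timestamp."""

    def __init__(self, interval, clock=time.time_ns):
        self.interval = interval
        self.clock = clock
        self.buf = bytearray()
        self.delays_ns = []

    def feed(self, segment):
        self.buf += segment
        frame = self.interval + STAMP.size
        # A timestamp may straddle two segments, so wait for a whole frame.
        while len(self.buf) >= frame:
            (sent_ns,) = STAMP.unpack_from(self.buf, self.interval)
            self.delays_ns.append(self.clock() - sent_ns)
            del self.buf[:frame]


def normalise(delays_ns):
    """Express each delay as a delta against the minimum observed delay."""
    base = min(delays_ns)
    return [d - base for d in delays_ns]


# Demo with fake clocks so the result is deterministic.
send_clock = iter([1000, 2000, 3000]).__next__
recv_clock = iter([1500, 2700, 3400]).__next__
stream = b"".join(sender_chunks(bytes(30), 10, clock=send_clock))
rx = Receiver(10, clock=recv_clock)
for i in range(0, len(stream), 7):  # 7-byte "segments" split the timestamps
    rx.feed(stream[i:i + 7])
print(normalise(rx.delays_ns))  # [100, 300, 0]
```

Because both sides agree on the interval, the receiver finds each timestamp by position alone, so no marker bytes are needed in the stream.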
Now, some possible issues with this:
- Are we measuring the right thing? This will measure the time it takes
a message to get from the application level on one side to the
application level on the other. A lot of things apart from queueing
latency could affect this; the most obvious is packet loss and
retransmission, which I suppose will give some spurious results.
Doing the measurement with UDP packets would alleviate this, but
then we're back to not being in-stream...
- As for point 3, not normalising the result and just outputting the
computed delay as-is means that the numbers will be meaningless
without very accurately synchronised clocks. On the other hand, not
processing the numbers before outputting them will allow people who
*do* have synchronised clocks to do something useful with them.
Perhaps a --assume-sync-clocks parameter?
- Echoing back the delay measurements generates traffic of its own,
which matters mostly when running bidirectional measurements. Is
that traffic significant? A solution could be
for the receiver to hold on to all the measurements until the end of
the test and then send them back on the control connection.
- Is clock drift something to worry about over the timescales of these
tests?
https://www.usenix.org/legacy/events/iptps10/tech/slides/cohen.pdf
seems to suggest it shouldn't be, as long as the tests only run for at
most a few minutes.
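To make the clock-related points above concrete, here is a small sketch of how a constant clock offset and a slow drift would show up in the computed delays, and what the proposed switch would change. The `--assume-sync-clocks` option is only the hypothetical parameter suggested above, not an existing netperf flag, and the offset, drift rate and sample values are made-up assumptions.

```python
import argparse


def measured_delay_us(true_delay_us, elapsed_s, offset_us, drift_ppm):
    """Delay computed from two clocks that differ by an offset plus drift."""
    # 1 ppm of drift adds 1 microsecond of error per second of test time.
    return true_delay_us + offset_us + drift_ppm * elapsed_s


def postprocess(delays_us, assume_sync_clocks=False):
    """Return raw delays if clocks are trusted, else deltas against the minimum."""
    if assume_sync_clocks:
        return list(delays_us)
    base = min(delays_us)
    return [d - base for d in delays_us]


parser = argparse.ArgumentParser(description="hypothetical delay post-processor")
parser.add_argument("--assume-sync-clocks", action="store_true")
args = parser.parse_args([])  # pass ["--assume-sync-clocks"] to keep raw values

# Assumed figures: 3 s clock offset, 1 ppm drift, samples at 0, 60 and 120 s.
samples = [(200, 0), (500, 60), (200, 120)]  # (true delay in us, elapsed s)
raw = [measured_delay_us(d, t, 3_000_000, 1) for d, t in samples]
print(raw)                                         # [3000200, 3000560, 3000320]
print(postprocess(raw, args.assume_sync_clocks))   # [0, 360, 120]
```

With these assumed numbers the raw values are dominated by the offset, normalising removes it, and what remains of the drift is 120 us after two minutes. Whether that is negligible depends on the actual drift rate and on how small the delays of interest are.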
> http://e2epi.internet2.edu/thrulay/ is the original. There are several
> variants, but I think they also have been abandoned:
Thanks. From what I can tell, the measurement here basically works by
something akin to the above: for TCP, the timestamp is just echoed back
by the receiver, so roundtrip time is measured. For UDP, the receiver
calculates the delay, so presumably clock synchronisation is a
prerequisite.
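The difference between the two modes can be shown with a toy model. This is not thrulay's actual code; the clock functions and the 5 ms offset are assumptions made up for illustration.

```python
OFFSET_NS = 5_000_000  # assumed: receiver clock runs 5 ms ahead of the sender


def sender_clock(true_ns):
    """Sender's view of time (taken as the reference here)."""
    return true_ns


def receiver_clock(true_ns):
    """Receiver's view of time, shifted by the assumed offset."""
    return true_ns + OFFSET_NS


def rtt_ns(sent_ns, echo_received_ns):
    """Round-trip time: both readings come from the sender's clock."""
    return echo_received_ns - sent_ns


def one_way_delay_ns(sent_ns, received_ns):
    """One-way delay: the two readings come from different clocks."""
    return received_ns - sent_ns


# Packet leaves at true time 0, arrives after 300 us, echo is back at 600 us.
sent = sender_clock(0)
arrived = receiver_clock(300_000)
echoed = sender_clock(600_000)
print(rtt_ns(sent, echoed))             # 600000
print(one_way_delay_ns(sent, arrived))  # 5300000
```

The round-trip figure never sees the offset, while the one-way figure absorbs it in full, which is why the UDP mode would need synchronised clocks.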
So anyway, thoughts? Is the above something worth pursuing?
-Toke