From: Neil Davies
Date: Sun, 14 Sep 2014 15:31:31 +0100
To: Jonathan Morton
Cc: Hal Murray, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Measuring Latency
Message-Id: <1AF70E51-A60F-47C1-AF90-9B1E6030227C@pnsol.com>
References: <20140913194126.5B0D1406062@ip-64-139-1-69.sjc.megapath.net> <20418644-AB62-43AE-A09E-5F85ED42DBF4@gmx.de>

Gents,

This is not actually true - you can measure one-way delays without completely accurately synchronised clocks (they have to be reasonably precise, not accurate) - see the CERN thesis at http://goo.gl/ss6EBq

It is possible, with appropriate measurements, to construct arguments that
make marketeers salivate (or the appropriate metaphor) - you can compare the relative effects of technology, location and instantaneous congestion. See slideshare at http://goo.gl/6vytmD

Neil

On 14 Sep 2014, at 00:32, Jonathan Morton wrote:

>>>> When reading it, it strikes me that you don't directly tell them what to
>>>> do; e.g. add a latency test during upload and download. ...
>>>
>>> Does round trip latency have enough info, or do you need to know how much is
>>> contributed by each direction?
>>
>> RTT is fine, uni-directional transfer time would be too good to be true ;).
>
> To expand on this: to measure one-way delay, you would need finely synchronised clocks (to within a couple of ms) at both ends. The average consumer doesn't have that sort of setup - not even if they happen to use NTP. So it's not a goal worth pursuing at this level - save it for scientific investigations, where the necessary effort and equipment can be made available.
>
>>> If I gave you a large collection of latency data from a test run, how do you
>>> reduce it to something simple that a marketer could compare with the results
>>> from another test run?
>>
>> I believe the added latency under load would be a marketable number, but we had a discussion in the past where it was argued that marketing wants a number which increases with goodness, so larger = better, something the raw difference is not going to deliver…
>
> The obvious solution is to report the inverse value in Hz, a figure of merit that gamers are familiar with (if not exactly in this context).
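A minimal sketch of that inversion in Python (the figures are only illustrative, not from any real tester):

```python
def response_hz(latency_ms):
    """Invert a latency reading (in ms) into a 'response frequency' in Hz."""
    return 1000.0 / latency_ms

print(f"{response_hz(500):.1f} Hz")   # badly bloated link under load -> 2.0 Hz
print(f"{response_hz(100):.1f} Hz")   # same link, conservatively shaped -> 10.0 Hz
```

The larger number now corresponds to the better experience, which is exactly the property marketing was asking for.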
>
> For example, I occasionally play World of Tanks, which has a latency meter (in ms) in the top corner, and I can roughly invert that in my head - if I set my shaper too optimistically, I get something like 500ms if something is updating in the background, but this goes down immediately to 100ms once I adjust the setting to a more conservative value - it's a 3G connection, so it's not going to get much better than that. The corresponding inverted readings would be 2Hz (where I miss most of my shots) and 10Hz (where I can carry the team). It's probably worth reporting to one decimal place.
>
> WoT isn't exactly the "twitchiest" of online games, either - have you any idea how long it takes to aim a tank gun? Even so, when some tanks can move at over 30km/h, a half-second difference in position is a whole tank length, so with the slower response I no longer know *where* or *when* to fire, unless the target is a sitting duck. Even though my framerate is at 30Hz or more and appears smooth, my performance as a player is dependent on the Internet's response frequency, because that is lower.
>
>
> So here's the outline of a step-by-step methodology:
>
> - Prepare space for a list of latency measurements. Each measurement needs to be tagged with information about what else was going on at the same time, i.e. idle/upload/download/both. Some latency probes may be lost, and this fact should also be recorded on a per-tag basis.
>
> - Start taking latency measurements, tagging them as idle to begin with. Keep on doing so continuously, changing the tag as required, until several seconds after the bandwidth measurements are complete.
>
> - Latency measurements should be taken sufficiently frequently (several times a second is okay) that there will be at least a hundred samples with each tag, and the frequency of sampling should not change during the test.
> Each probe must be tagged with a unique ID, so that losses or re-ordering of packets can be detected and don't confuse the measurement.
>
> - The latency probes should use UDP, not ICMP. They should also use the same Diffserv/TOS tag as the bandwidth measurement traffic; the default "best-effort" tag is fine.
>
> - To satisfy the above requirements, the latency tester must *not* wait for a previous reply to return before sending the next probe. It should send at regular intervals based on wall-clock time. But don't send so many probes that they themselves clog the link.
>
> - Once several seconds of "idle" samples are recorded, start the download test. Change the tag to "download" at this point.
>
> - The download test is complete when all the data sent by the server has reached the client. Change the tag back to "idle" at this moment.
>
> - Repeat the previous two steps for the upload measurement, using the "upload" tag.
>
> - Repeat again, but perform upload and download tests at the same time (predicting, if necessary, that the bandwidth in each direction should be similar to that previously measured), and use the "both" tag. Uploads and downloads tend to interfere with each other when the loaded response frequency is poor, so don't simply assume that the results will be the same as in the individual tests - *measure* it.
>
> - Once a few seconds of "idle" samples have been taken, stop measuring and start analysis.
>
> - Separate the list of latency samples by tag, and sort the four resulting lists in ascending order.
>
> - In each list, find the sample nearest 98% of the way through the list. This is the "98th percentile", a good way of finding the "highest" value while discarding irrelevant outliers. The highest latency is the one that users will notice. Typical poor results: idle 50ms, download 250ms, upload 500ms, both 800ms.
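A minimal Python sketch of such a prober, assuming a cooperating UDP echo reflector (faked here on loopback so the example is self-contained); embedding the send timestamp in the payload is a convenience of this sketch, not part of the proposal:

```python
import socket
import struct
import threading
import time

def run_probes(target, count=50, interval=0.02):
    """Send UDP probes on a fixed wall-clock schedule, never waiting for
    the previous reply; collect RTTs keyed by sequence number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))      # bound so the receiver thread can listen
    sock.settimeout(1.0)
    rtts = {}                        # seq -> round-trip time in seconds

    def receiver():
        while len(rtts) < count:
            try:
                data, _ = sock.recvfrom(64)
            except socket.timeout:
                break                # remaining probes count as lost
            seq, sent_at = struct.unpack("!Id", data)
            rtts[seq] = time.monotonic() - sent_at

    rx = threading.Thread(target=receiver)
    rx.start()
    next_send = time.monotonic()
    for seq in range(count):         # unique ID per probe
        sock.sendto(struct.pack("!Id", seq, time.monotonic()), target)
        next_send += interval        # wall-clock pacing, not reply-driven
        time.sleep(max(0.0, next_send - time.monotonic()))
    rx.join()
    return rtts, count

# A loopback echo responder stands in for the real reflector here.
echo = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
echo.bind(("127.0.0.1", 0))

def serve():
    while True:
        data, addr = echo.recvfrom(64)
        echo.sendto(data, addr)

threading.Thread(target=serve, daemon=True).start()
rtts, sent = run_probes(echo.getsockname())
print(f"received {len(rtts)}/{sent} replies")
```

Per-tag loss then falls out directly: probes whose sequence number never comes back are counted against the tag that was current when they were sent.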
>
> - Correct the 98th-percentile latencies for packet loss by multiplying each one by the number of probes *sent* with the appropriate tag, and then dividing it by the number of probes *received* with that tag. It is not necessary to report packet loss in any other way, *except* for the "idle" tag.
>
> - Convert the corrected 98th-percentile latencies into "response frequencies" by dividing one second by them. The typical figures above would become: idle 20.0 Hz, download 4.0 Hz, upload 2.0 Hz, both 1.25 Hz - assuming there was no packet loss. These figures are comparable in meaning and importance to "frames per second" figures in games.
>
> - Report these response frequencies, to a precision of at least one decimal place, alongside, and with equal visual importance to, the bandwidth figures. For example:
>
> IDLE:     Response 20.0 Hz  Packet loss 0.00 %
> DOWNLOAD: Response  4.0 Hz  Bandwidth 20.00 Mbps
> UPLOAD:   Response  2.0 Hz  Bandwidth  2.56 Mbps
> BIDIRECT: Response  1.3 Hz  Bandwidth 15.23 / 2.35 Mbps
>
> - Improving the response figures in the loaded condition will probably also improve the *bidirectional* bandwidth figures as a side-effect, while having a minimal effect on the *unidirectional* figures. A connection with such qualities can legitimately be described as supporting multiple activities at the same time. A connection with the example figures shown above can *not*.
>
>
> The next trick is getting across the importance of acknowledging that more than one person in the average household wants to use the Internet at the same time these days, and they often want to do different things from each other. In this case, the simplest valid argument probably has a lot going for it.
>
> An illustration might help to get the concept across.
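The percentile, loss-correction and inversion stages above can be condensed into a short sketch (the `response_report` helper is hypothetical; the inputs are the typical figures quoted earlier, with no loss assumed):

```python
def response_report(samples_ms, sent, received):
    """98th-percentile latency, corrected for loss, inverted to Hz.

    samples_ms: latency samples for one tag, in milliseconds;
    sent/received: probe counts for that tag, used for loss correction.
    """
    ordered = sorted(samples_ms)
    p98 = ordered[min(len(ordered) - 1, round(0.98 * (len(ordered) - 1)))]
    corrected_ms = p98 * sent / received   # scale up for lost probes
    return 1000.0 / corrected_ms           # response frequency in Hz

# Typical poor figures from the text, assuming sent == received (no loss):
for tag, ms in [("IDLE", 50), ("DOWNLOAD", 250), ("UPLOAD", 500), ("BOTH", 800)]:
    hz = response_report([ms] * 100, sent=100, received=100)
    print(f"{tag:9s} Response {hz:5.2f} Hz")
```

Note that any loss makes the corrected latency larger and the reported frequency smaller, so a lossy link cannot hide behind a low percentile figure.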
> A household with four users in different rooms: father in the study downloading a video, mother in the kitchen on the (VoIP) phone, son in his bedroom playing a game, daughter in her bedroom uploading photos. All via a single black-box modem and connection. Captions should emphasise that mother and son both need low latency (a high response frequency), while father and daughter need high bandwidth (in opposite directions!), and that they're doing all these things at the same time.
>
> - Jonathan Morton
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat