From: Neil Davies
Date: Sun, 14 Sep 2014 15:31:31 +0100
To: Jonathan Morton
Cc: Hal Murray, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Measuring Latency
Message-Id: <1AF70E51-A60F-47C1-AF90-9B1E6030227C@pnsol.com>
References: <20140913194126.5B0D1406062@ip-64-139-1-69.sjc.megapath.net> <20418644-AB62-43AE-A09E-5F85ED42DBF4@gmx.de>

Gents,

This is not actually true - you can measure one-way delays without completely accurately synchronised clocks (they have to be reasonably precise, not accurate) - see the CERN thesis at http://goo.gl/ss6EBq

It is possible, with appropriate measurements, to construct arguments that
make marketeers salivate (or the appropriate metaphor) - you can compare the relative effects of technology, location and instantaneous congestion. See slideshare at http://goo.gl/6vytmD

Neil

On 14 Sep 2014, at 00:32, Jonathan Morton wrote:

>>>> When reading it, it strikes me that you don't directly tell them what to
>>>> do; e.g. add a latency test during upload and download. ...
>>>
>>> Does round trip latency have enough info, or do you need to know how much is
>>> contributed by each direction?
>>
>> RTT is fine, uni-directional transfer time would be too good to be true ;).
>
> To expand on this: to measure one-way delay, you would need finely synchronised clocks (to within a couple of ms) at both ends. The average consumer doesn't have that sort of setup - not even if they happen to use NTP. So it's not a goal worth pursuing at this level - save it for scientific investigations, where the necessary effort and equipment can be made available.
>
>>> If I gave you a large collection of latency data from a test run, how do you
>>> reduce it to something simple that a marketer could compare with the results
>>> from another test run?
>>
>> I believe the added latency under load would be a marketable number, but we had a discussion in the past where it was argued that marketing wants a number which increases with goodness, so larger = better, something the raw difference is not going to deliver…
>
> The obvious solution is to report the inverse value in Hz, a figure of merit that gamers are familiar with (if not exactly in this context).
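A minimal sketch of that inversion in Python (the figures are only illustrative, not from any real tester):

```python
def response_hz(latency_ms):
    """Invert a latency reading (in ms) into a 'response frequency' in Hz."""
    return 1000.0 / latency_ms

print(f"{response_hz(500):.1f} Hz")   # badly bloated link under load -> 2.0 Hz
print(f"{response_hz(100):.1f} Hz")   # same link, conservatively shaped -> 10.0 Hz
```

The larger number now corresponds to the better experience, which is exactly the property marketing was asking for.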
>
> For example, I occasionally play World of Tanks, which has a latency meter (in ms) in the top corner, and I can roughly invert that in my head - if I set my shaper too optimistically, I get something like 500ms if something is updating in the background, but this goes down immediately to 100ms once I adjust the setting to a more conservative value - it's a 3G connection, so it's not going to get much better than that. The corresponding inverted readings would be 2Hz (where I miss most of my shots) and 10Hz (where I can carry the team). It's probably worth reporting to one decimal place.
>
> WoT isn't exactly the "twitchiest" of online games, either - have you any idea how long it takes to aim a tank gun? Even so, when some tanks can move at over 30km/h, a half-second difference in position is a whole tank length, so with the slower response I no longer know *where* or *when* to fire, unless the target is a sitting duck. Even though my framerate is at 30Hz or more and appears smooth, my performance as a player is dependent on the Internet's response frequency, because that is lower.
>
>
> So here's the outline of a step-by-step methodology:
>
> - Prepare space for a list of latency measurements. Each measurement needs to be tagged with information about what else was going on at the same time, i.e. idle/upload/download/both. Some latency probes may be lost, and this fact should also be recorded on a per-tag basis.
>
> - Start taking latency measurements, tagging them as idle to begin with. Keep on doing so continuously, changing the tag as required, until several seconds after the bandwidth measurements are complete.
>
> - Latency measurements should be taken sufficiently frequently (several times a second is okay) that there will be at least a hundred samples with each tag, and the frequency of sampling should not change during the test.
> Each probe must be tagged with a unique ID, so that losses or re-ordering of packets can be detected and don't confuse the measurement.
>
> - The latency probes should use UDP, not ICMP. They should also use the same Diffserv/TOS tag as the bandwidth measurement traffic; the default "best-effort" tag is fine.
>
> - To satisfy the above requirements, the latency tester must *not* wait for a previous reply to return before sending the next probe. It should send at regular intervals based on wall-clock time. But don't send so many probes that they themselves clog the link.
>
> - Once several seconds of "idle" samples are recorded, start the download test. Change the tag to "download" at this point.
>
> - The download test is complete when all the data sent by the server has reached the client. Change the tag back to "idle" at this moment.
>
> - Repeat the previous two steps for the upload measurement, using the "upload" tag.
>
> - Repeat again, but perform upload and download tests at the same time (predicting, if necessary, that the bandwidth in each direction should be similar to that previously measured), and use the "both" tag. Uploads and downloads tend to interfere with each other when the loaded response frequency is poor, so don't simply assume that the results will be the same as in the individual tests - *measure* it.
>
> - Once a few seconds of "idle" samples have been taken, stop measuring and start analysis.
>
> - Separate the list of latency samples by tag, and sort the four resulting lists in ascending order.
>
> - In each list, find the sample nearest 98% of the way through the list. This is the "98th percentile", a good way of finding the "highest" value while discarding irrelevant outliers. The highest latency is the one that users will notice. Typical poor results: idle 50ms, download 250ms, upload 500ms, both 800ms.
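A minimal Python sketch of such a prober, assuming a cooperating UDP echo reflector (faked here on loopback so the example is self-contained); embedding the send timestamp in the payload is a convenience of this sketch, not part of the proposal:

```python
import socket
import struct
import threading
import time

def run_probes(target, count=50, interval=0.02):
    """Send UDP probes on a fixed wall-clock schedule, never waiting for
    the previous reply; collect RTTs keyed by sequence number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))      # bound so the receiver thread can listen
    sock.settimeout(1.0)
    rtts = {}                        # seq -> round-trip time in seconds

    def receiver():
        while len(rtts) < count:
            try:
                data, _ = sock.recvfrom(64)
            except socket.timeout:
                break                # remaining probes count as lost
            seq, sent_at = struct.unpack("!Id", data)
            rtts[seq] = time.monotonic() - sent_at

    rx = threading.Thread(target=receiver)
    rx.start()
    next_send = time.monotonic()
    for seq in range(count):         # unique ID per probe
        sock.sendto(struct.pack("!Id", seq, time.monotonic()), target)
        next_send += interval        # wall-clock pacing, not reply-driven
        time.sleep(max(0.0, next_send - time.monotonic()))
    rx.join()
    return rtts, count

# A loopback echo responder stands in for the real reflector here.
echo = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
echo.bind(("127.0.0.1", 0))

def serve():
    while True:
        data, addr = echo.recvfrom(64)
        echo.sendto(data, addr)

threading.Thread(target=serve, daemon=True).start()
rtts, sent = run_probes(echo.getsockname())
print(f"received {len(rtts)}/{sent} replies")
```

Per-tag loss then falls out directly: probes whose sequence number never comes back are counted against the tag that was current when they were sent.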
>
> - Correct the 98th-percentile latencies for packet loss by multiplying each one by the number of probes *sent* with the appropriate tag, and then dividing it by the number of probes *received* with that tag. It is not necessary to report packet loss in any other way, *except* for the "idle" tag.
>
> - Convert the corrected 98th-percentile latencies into "response frequencies" by dividing one second by them. The typical figures above would become: idle 20.0 Hz, download 4.0 Hz, upload 2.0 Hz, both 1.25 Hz - assuming there was no packet loss. These figures are comparable in meaning and importance to "frames per second" figures in games.
>
> - Report these response frequencies, to a precision of at least one decimal place, alongside, and with equal visual importance to, the bandwidth figures. For example:
>
> IDLE:     Response 20.0 Hz  Packet loss 0.00 %
> DOWNLOAD: Response  4.0 Hz  Bandwidth 20.00 Mbps
> UPLOAD:   Response  2.0 Hz  Bandwidth  2.56 Mbps
> BIDIRECT: Response  1.3 Hz  Bandwidth 15.23 / 2.35 Mbps
>
> - Improving the response figures in the loaded condition will probably also improve the *bidirectional* bandwidth figures as a side-effect, while having a minimal effect on the *unidirectional* figures. A connection with such qualities can legitimately be described as supporting multiple activities at the same time. A connection with the example figures shown above can *not*.
>
>
> The next trick is getting across the importance of acknowledging that more than one person in the average household wants to use the Internet at the same time these days, and they often want to do different things from each other. In this case, the simplest valid argument probably has a lot going for it.
>
> An illustration might help to get the concept across.
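The percentile, loss-correction and inversion stages above can be condensed into a short sketch (the `response_report` helper is hypothetical; the inputs are the typical figures quoted earlier, with no loss assumed):

```python
def response_report(samples_ms, sent, received):
    """98th-percentile latency, corrected for loss, inverted to Hz.

    samples_ms: latency samples for one tag, in milliseconds;
    sent/received: probe counts for that tag, used for loss correction.
    """
    ordered = sorted(samples_ms)
    p98 = ordered[min(len(ordered) - 1, round(0.98 * (len(ordered) - 1)))]
    corrected_ms = p98 * sent / received   # scale up for lost probes
    return 1000.0 / corrected_ms           # response frequency in Hz

# Typical poor figures from the text, assuming sent == received (no loss):
for tag, ms in [("IDLE", 50), ("DOWNLOAD", 250), ("UPLOAD", 500), ("BOTH", 800)]:
    hz = response_report([ms] * 100, sent=100, received=100)
    print(f"{tag:9s} Response {hz:5.2f} Hz")
```

Note that any loss makes the corrected latency larger and the reported frequency smaller, so a lossy link cannot hide behind a low percentile figure.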
> A household with four users in different rooms: father in the study downloading a video, mother in the kitchen on the (VoIP) phone, son in his bedroom playing a game, daughter in her bedroom uploading photos. All via a single black-box modem and connection. Captions should emphasise that mother and son both need low latency (a high response frequency), while father and daughter need high bandwidth (in opposite directions!), and that they're doing all these things at the same time.
>
> - Jonathan Morton
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat