From: Paolo Valente
To: Neil Davies
Cc: Jonathan Morton, bloat
Date: Mon, 4 May 2015 12:10:56 +0200
Subject: Re: [Bloat] Detecting bufferbloat from outside a node
Message-Id: <0F8CB21C-792F-4F95-BC49-BED3DF0A2100@unimore.it>
In-Reply-To: <2F4DCB53-1E46-4829-B2F8-F8131664D1FF@pnsol.com>
References: <72DB0260-F0DF-426F-A3F3-ECF5D8AF228F@pnsol.com> <766042D4-0C90-4C77-9033-07B8E436C35B@pnsol.com> <2F4DCB53-1E46-4829-B2F8-F8131664D1FF@pnsol.com>
List-Id: General list for discussing Bufferbloat

I have tried to fully digest this information (thanks), but there is still a piece that I am missing. To highlight it, I would like to try an oversimplified example. I hope this will make it easier to point out flaws in my understanding.

Suppose that one wants or needs to discover whether outbound and/or inbound packets experience high internal queueing delays in a given node A, because some buffers inside the node are bloated.
For any packet leaving or entering the node, the per-hop or end-to-end delays that the packet experiences outside the node are exactly the same, regardless of whether the packet left the node after a high internal output-queueing delay, or will experience a high internal input-queueing delay after being received by the node. If this statement is true, then, since no information of any sort is available about queueing delays inside the node, and since the delays measurable from outside the node are invariant with respect to the internal queueing delays, how can we deduce internal delays from external ones?

Thanks,
Paolo

On 28 Apr 2015, at 12:23, Neil Davies wrote:

> On 28 Apr 2015, at 10:58, Sebastian Moeller wrote:
>
>> Hi Neil,
>>
>> On Apr 28, 2015, at 09:17, Neil Davies wrote:
>>
>>> Jonathan
>>>
>>> The timestamps don't change very quickly - dozens (or more) of packets can have the same timestamp, so it doesn't give you the appropriate discrimination power. Timed observations at key points give you all you need (actually, appropriately gathered, they give you all you can possibly know by observation).
>>
>> But this has two issues:
>> 1) “timed observations”: relatively easy if all nodes are under your control, otherwise hard.
>> I know about the CERN paper, but they had all nodes under their control, symmetric bandwidth and a shipload of samples, so over the wild internet “timed observations” are still hard (and harder as the temporal precision requirement goes up).
>
> ∆Q (with its improper CDF semantics and G, S and V basis set) has composition and decomposition properties - this means that you don’t need to be able to observe everywhere. Even in Lucian’s case his observation points were limited (certain systems); the rest of the analysis is derived using the properties of the ∆Q calculus.
>
> Lucian also demonstrated how the standard timing observation issues (which include clock drift and distributed accuracy) can be resolved in a practical situation: starting from libpcap captures on machines, he reproduced results for which the CERN guys had built specialist h/w with better than 20ns timing only 5 years before.
>
> The good thing about Lucian’s thesis is that it is in the public domain - but we use the same approach over wide (i.e. world) networks and get the same properties (unfortunately that is done in a commercial context). This all arises because we can perform the appropriate measurement error analysis, and hence use standard statistical techniques.
>
>>
>> 2) “key points”: once you know the key points you must already have a decent understanding of the effective topology of the network, which again over the wider internet is much harder than if one has all nodes under control.
>
> Not really - the key points (as a start) are the end ones, and those you have (reasonable) access to. Even if you don’t have access to the *actual* end points, you can easily spin up a measurement point that is very close (in ∆Q terms) to the ones you are interested in - AWS and Google Compute are your friends here.
>
>>
>> I am not sure how Paolo’s “no-touching” problem fits into the requirements for your deltaQ (meta-)math ;)
>
> I see “no touching” as “no modification” - you can’t deduce information in the absence of data. What you need to understand is the minimum data requirements to achieve the measurement outcome - ∆Q calculus gives you that handle.
>
>>
>> Best Regards
>> Sebastian
>>
>>>
>>> Neil
>>>
>>> On 28 Apr 2015, at 00:11, Jonathan Morton wrote:
>>>
>>>> On 27 Apr 2015 23:31, "Neil Davies" wrote:
>>>>>
>>>>> Hi Jonathan
>>>>>
>>>>> On 27 Apr 2015, at 16:25, Jonathan Morton wrote:
>>>>>
>>>>>> One thing that might help you here is the TCP Timestamps option. The timestamps thus produced are opaque, but you can observe them and measure the time intervals between their production and echo. You should be able to infer something from that, with care.
>>>>>>
>>>>>> To determine the difference between loaded and unloaded states, you may need to observe for an extended period of time. Eventually you'll observe some sort of bulk flow, even if it's just a software update cycle. It's not quite so certain that you'll observe an idle state, but it is sufficient to observe an instance of the link not being completely saturated, which is likely to occur at least occasionally.
>>>>>>
>>>>>> - Jonathan Morton
>>>>>
>>>>> We looked at using TCP timestamps early on in our work. The problem is that they don't really help extract the fine-grained information needed. The timestamps can move in very large steps, and the accuracy (and precision) can vary widely from implementation to implementation.
>>>>
>>>> Well, that's why you have to treat them as opaque, just like I said. Ignore whatever meaning the end host producing them might embed in them, and simply watch which ones get echoed back and when.
>>>> You only have to rely on the resolution of your own clocks.
>>>>
>>>>> The timestamps are there to try and get a gross (if my memory serves me right ~100ms) approximation to the RTT - not good enough for reasoning about TCP based interactive/"real time" apps
>>>>
>>>> On the contrary, these timestamps can indicate much better precision than that; in particular they indicate an upper bound on the instantaneous RTT which can be quite tight under favourable circumstances. On a LAN, you could reliably determine that the RTT was below 1ms this way.
>>>>
>>>> Now, what it doesn't give you is a strict lower bound. But you can often look at what's going on in that TCP stream and determine that favourable circumstances exist, such that the upper bound RTT estimate is probably reasonably tight. Or you could observe that the stream is mostly idle, and thus probably influenced by delayed acks and Nagle's algorithm, and discount that measurement accordingly.
>>>>
>>>> - Jonathan Morton
>>>>
>>>
>>> _______________________________________________
>>> Bloat mailing list
>>> Bloat@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/bloat

--
Paolo Valente
Algogroup
Dipartimento di Fisica, Informatica e Matematica
Via Campi, 213/B
41125 Modena - Italy
homepage: http://algogroup.unimore.it/people/paolo/
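
Illustration of the "treat the timestamps as opaque" idea quoted above from Jonathan Morton: a minimal Python sketch, not code from anyone in the thread. It assumes a passive observation point that has already parsed the TCP Timestamps option out of each segment of one flow. The names `Segment`, `rtt_upper_bounds`, and the fields `t_us`, `outbound`, `tsval`, `tsecr` are hypothetical, and the trace is made up. For each timestamp value, the sketch records when it was first seen travelling away from the observer and when it was first echoed back, using only the observer's own clock. Because the echo cannot have been triggered by a segment sent before that first sighting, the difference is an upper bound on the round trip between the observer and the far end, not an exact RTT.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Segment(NamedTuple):
    """One observed TCP segment of a single flow (hypothetical record type)."""
    t_us: int       # capture time in microseconds, from the observer's own clock
    outbound: bool  # True if the segment travels away from the observation point
    tsval: int      # TSval field of the TCP Timestamps option (treated as opaque)
    tsecr: int      # TSecr field of the TCP Timestamps option (treated as opaque)


def rtt_upper_bounds(segments: Iterable[Segment]) -> List[Tuple[int, int]]:
    """Return (tsval, upper bound on RTT in microseconds) pairs.

    Only equality of the opaque values is used; their units and tick rate
    are never interpreted.
    """
    first_seen: Dict[int, int] = {}  # outbound TSval -> first capture time
    echoed: Set[int] = set()         # TSvals whose first echo was already used
    samples: List[Tuple[int, int]] = []
    for seg in segments:
        if seg.outbound:
            # Many segments may share one TSval; keep only the first sighting.
            first_seen.setdefault(seg.tsval, seg.t_us)
        elif seg.tsecr in first_seen and seg.tsecr not in echoed:
            # First echo of this value: later echoes would only loosen the bound.
            echoed.add(seg.tsecr)
            samples.append((seg.tsecr, seg.t_us - first_seen[seg.tsecr]))
    return samples


# Made-up trace: two outbound segments share TSval 100, which is echoed twice.
trace = [
    Segment(0, True, 100, 0),
    Segment(2_000, True, 100, 0),
    Segment(30_000, False, 900, 100),
    Segment(31_000, False, 900, 100),
    Segment(50_000, True, 101, 900),
    Segment(130_000, False, 901, 101),
]

for tsval, bound_us in rtt_upper_bounds(trace):
    print(f"TSval {tsval}: RTT <= {bound_us} us")
```

As noted in the thread, such samples are loose when the stream is mostly idle (delayed acks, Nagle's algorithm), so a real tool would need some way to discount them. The sketch also ignores wrap-around of the 32-bit timestamp field, which could cause a stale value to match.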