Subject: Re: Getting current interface queue sizes
From: Justin McCann
To: Jim Gettys
Cc: bloat-devel@lists.bufferbloat.net
Date: Mon, 7 Mar 2011 16:18:02 -0500

On Mon, Mar 7, 2011 at 1:28 PM, Jim Gettys wrote:
> Cisco is far from unique. I found it impossible to get this information
> from Linux. Dunno about other operating systems.
> It's one of the things we need to fix in general.

So I'm not the only one. :) I'm looking to get this for Linux, and am willing
to implement it if necessary, and was looking for the One True Way. I assume
reporting back through netlink is the way to go; a rough sketch of the sort of
thing I have in mind is below.

> Exactly what the right metric(s) is (are), is interesting, of course. The
> problem with only providing instantaneous queue depth is that while it tells
> you you are currently suffering, it won't really help you detect transient
> bufferbloat due to web traffic, etc., unless you sample at a very high rate.
> I really care about those frequent 100-200 ms impulses I see in my traffic.
> So a bit of additional information would be goodness.
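For what it's worth, the qdisc layer's own counters are at least reachable
over rtnetlink today (that's where "tc -s qdisc" gets its backlog numbers),
even if the driver's ring occupancy isn't, as far as I can tell. Something
along these lines is the kind of hook I have in mind. It's an untested sketch
using the pyroute2 netlink bindings, so the library choice and the
TCA_STATS/TCA_STATS2 attribute handling are assumptions on my part, not
anything blessed:

#!/usr/bin/env python
# Untested sketch: dump per-qdisc queue statistics via rtnetlink
# (RTM_GETQDISC). Uses the pyroute2 bindings; the same query could be
# made with raw netlink sockets or libnl instead.
from pyroute2 import IPRoute

def qdisc_stats(ifname):
    ipr = IPRoute()
    try:
        idx = ipr.link_lookup(ifname=ifname)[0]
        for q in ipr.get_qdiscs(index=idx):
            kind = q.get_attr('TCA_KIND')
            # struct tc_stats: bytes, packets, drops, overlimits, qlen,
            # backlog, ... (exact field names depend on the bindings)
            stats = q.get_attr('TCA_STATS')
            # newer kernels also supply the nested TCA_STATS2 form
            stats2 = q.get_attr('TCA_STATS2')
            print(ifname, kind, stats or stats2)
    finally:
        ipr.close()

if __name__ == '__main__':
    qdisc_stats('eth0')   # interface name is just an example

Of course the backlog there only covers the qdisc; it says nothing about what
is sitting in the driver or hardware ring, which I take to be the part that is
effectively impossible to see right now.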
My PhD research is focused on automatically diagnosing these sorts of hiccups
on a local host. I collect a common set of statistics across the entire local
stack every 100 ms, then run a diagnosis algorithm to detect which parts of
the stack (connections, applications, interfaces) aren't doing their job
sending/receiving packets.

Among the research questions: what statistics are necessary/sufficient for
this kind of diagnosis, what their semantics should be, and what the largest
useful sample interval is.

It turns out that when send/recv stops altogether, the queue lengths indicate
where things are being held up, which is what led to this discussion. I have
them for TCP (via web100), but since my diagnosis rules are generic, I'd like
to get them for the interfaces as well. I don't expect that an Ethernet driver
would stop transmitting for a few hundred milliseconds at a time, but a
wireless driver might have to.

    Justin
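P.S. In case it helps to make the 100 ms sampling concrete, here is a toy
version of the per-interval check. It is not my actual tool; the interface
name and the queued_above_driver() stub are placeholders. The rule is roughly:
if nothing was transmitted during an interval while something is still queued
above the driver, flag the interface as the component that isn't doing its
job.

#!/usr/bin/env python
# Toy sketch of the 100 ms sampling loop: read an interface's packet
# counters from sysfs each interval and flag intervals where nothing moved.
# 'eth0' and queued_above_driver() are placeholders.
import time

IFACE = 'eth0'
INTERVAL = 0.1          # 100 ms sample interval

def read_counter(name):
    with open('/sys/class/net/%s/statistics/%s' % (IFACE, name)) as f:
        return int(f.read())

def queued_above_driver():
    # Stand-in for the qdisc backlog / queue length discussed above.
    return 0

prev_tx = read_counter('tx_packets')
while True:
    time.sleep(INTERVAL)
    tx = read_counter('tx_packets')
    # Diagnosis rule, roughly: no packets left the interface this interval,
    # but something is still waiting to be sent, so the interface (or its
    # driver) is the part of the stack not doing its job.
    if tx == prev_tx and queued_above_driver() > 0:
        print('interface looks stalled at %.3f' % time.time())
    prev_tx = tx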