<div class="gmail_quote">On Mon, Mar 7, 2011 at 1:28 PM, Jim Gettys <span dir="ltr"><<a href="mailto:jg@freedesktop.org" target="_blank">jg@freedesktop.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
> Cisco is far from unique. I found it impossible to get this information
> from Linux. Dunno about other operating systems.
>
> It's one of the things we need to fix in general.

So I'm not the only one. :) I'm looking to get this for Linux, am willing
to implement it if necessary, and was looking for the One True Way. I
assume reporting back through netlink is the way to go.
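For the curious, here is a minimal sketch of that direction: ask rtnetlink
for an RTM_GETLINK dump and pull the per-interface counters out of the
IFLA_STATS attribute. Treat it as a sketch rather than production code;
the constants are copied from linux/netlink.h and linux/rtnetlink.h, and
real code would also handle NLMSG_ERROR:

    #!/usr/bin/env python
    # Sketch: dump per-interface counters via rtnetlink (RTM_GETLINK).
    import socket, struct

    NLM_F_REQUEST = 0x01
    NLM_F_DUMP    = 0x300      # NLM_F_ROOT | NLM_F_MATCH
    RTM_NEWLINK   = 16
    RTM_GETLINK   = 18
    NLMSG_DONE    = 3
    IFLA_IFNAME   = 3
    IFLA_STATS    = 7          # payload is struct rtnl_link_stats

    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                         socket.NETLINK_ROUTE)
    sock.bind((0, 0))

    # nlmsghdr (len, type, flags, seq, pid) + struct ifinfomsg (16 bytes)
    msg = struct.pack("=LHHLL", 32, RTM_GETLINK,
                      NLM_F_REQUEST | NLM_F_DUMP, 1, 0)
    msg += struct.pack("=BBHiII", socket.AF_UNSPEC, 0, 0, 0, 0, 0)
    sock.sendto(msg, (0, 0))

    done = False
    while not done:
        data = sock.recv(65536)
        off = 0
        while off < len(data):
            nl_len, nl_type = struct.unpack_from("=LH", data, off)
            if nl_type == NLMSG_DONE:
                done = True
                break
            if nl_type == RTM_NEWLINK:
                # skip nlmsghdr (16) + ifinfomsg (16), then walk the rtattrs
                a, name, stats = off + 32, None, None
                while a < off + nl_len:
                    rta_len, rta_type = struct.unpack_from("=HH", data, a)
                    payload = data[a + 4:a + rta_len]
                    if rta_type == IFLA_IFNAME:
                        name = payload.rstrip(b"\0").decode()
                    elif rta_type == IFLA_STATS:
                        # first four u32s: rx_packets, tx_packets,
                        # rx_bytes, tx_bytes
                        stats = struct.unpack_from("=4I", payload)
                    a += (rta_len + 3) & ~3  # rtattrs are 4-byte aligned
                print(name, stats)
            off += (nl_len + 3) & ~3

Of course, those are the same counters /proc/net/dev and
/sys/class/net/<iface>/statistics/ already export; what's missing is the
queue-occupancy side of the story, which is the point of this thread.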
> Exactly what the right metric(s) is (are) is interesting, of course. The
> problem with only providing instantaneous queue depth is that while it
> tells you you are currently suffering, it won't really help you detect
> transient bufferbloat due to web traffic, etc., unless you sample at a
> very high rate. I really care about those frequent 100-200ms impulses I
> see in my traffic. So a bit of additional information would be goodness.
My PhD research is focused on automatically diagnosing these sorts of
hiccups on a local host. I collect a common set of statistics across the
entire local stack every 100 ms, then run a diagnosis algorithm to detect
which parts of the stack (connections, applications, interfaces) aren't
doing their job sending/receiving packets.
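The collection loop itself is nothing exotic; it has roughly this shape
(the interface name is a placeholder, and the real diagnosis rules look
at much more than a single counter pair):

    import time

    def read_counters(iface):
        # /sys/class/net/<iface>/statistics/ is one cheap per-interface
        # source of the standard counters on Linux
        base = "/sys/class/net/%s/statistics/" % iface
        with open(base + "tx_packets") as f:
            tx = int(f.read())
        with open(base + "rx_packets") as f:
            rx = int(f.read())
        return tx, rx

    prev = read_counters("eth0")
    while True:
        time.sleep(0.1)                    # the 100 ms sample interval
        cur = read_counters("eth0")
        if cur == prev:
            # no packets moved this interval; a real diagnosis pass would
            # now check whether anything was actually queued to send
            print("eth0: no send/recv progress this interval")
        prev = cur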
Among the research questions: what stats are necessary/sufficient for
this kind of diagnosis, what should their semantics be, and what is the
largest useful sample interval?
It turns out that when send/recv stops altogether, the queue lengths
indicate where things are being held up, which is what led to this
discussion. I have them for TCP (via web100), but since my diagnosis
rules are generic, I'd like to get them for the interfaces as well. I
don't expect that an Ethernet driver would stop transmitting for a few
hundred milliseconds at a time, but a wireless driver might have to.
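For interface queues, the closest stopgap I know of is the qdisc backlog
that iproute2's tc already prints; a rough sketch of scraping it, assuming
the tc binary is installed (and noting that some qdiscs never fill the
backlog field in, which is part of the problem):

    import re, subprocess

    def qdisc_backlog(dev):
        # `tc -s qdisc show` prints a line like
        #   "backlog 12340b 17p requeues 0"
        out = subprocess.check_output(["tc", "-s", "qdisc", "show",
                                       "dev", dev])
        m = re.search(r"backlog (\d+)b (\d+)p", out.decode())
        return (int(m.group(1)), int(m.group(2))) if m else None

    print(qdisc_backlog("eth0"))   # (bytes queued, packets queued)

Forking tc ten times a second is clearly not a long-term answer, which is
why a proper netlink export of queue occupancy would be so welcome.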
 Justin