Dave Taht writes: > It is highly unlikely these patches would be accepted by upstream as > allocating a skb in the fast path is probably a no-no. However at > least part of them seem generically useful - instrumenting pfifo_fast > in particular - and certainly helpful for debugging in general. Right, well, I never envisioned it was something people would turn on if they wanted to maximise performance. > I had hoped it would be possible to do this without mucking with > locking, tho. :( Well I'm not really doing any locking as it is; the whole thing is called from a point where the qdisc lock is already held. The only "mucking" I'm doing is making the gnet_stats stuff optionally *not* lock anything. Not so sure that is a good idea, but it seemed better than unlocking everything before calling into gnet_stats*. Of course an alternative would be to just not use that, but thought reusing the existing code would be worthwhile. > In my own case I'd like to have a qdisc_drop_with_stats > (skb,sch,"fq_codel",reason) that could be compiled in or out easily, > sending the packet headers of the dropped packet to a listener much as > you do here. This would be useful for monitoring, have potential as a > routing metric, and make it easier to analyze the behavior of various > qdisc types, without having to poll and lock via tc. Yeah, that should be quite doable starting from what I have, I think. What would "reason" be? > Email syntax for a RFC PATCH is RFC PATCH, not PATCH RFC Noted. > unnecessary change. try to touch as little as possible... There is an > emacs mode for linux indentation, btw which helps avoid annoyingly > messing up existing code. Yeah, punted on cleaning up the whitespace changes; will go over those again. > It's not clear to me where this stuff enters userspace or not, and in > general it's safer to append new enums/defines rather than insert > them. It's a number used for assigning the netlink multicast group; userspace includes the header file and subscribes to the same group number. Put it there to group related stuff together... > So this acted up on a SMP kernel? Not really; as noted above, gen_stats by default locks the qdisc before accessing it, but since I'm now calling it from somewhere where it's already locked, I had to get it to not do that. I'm not entirely sure it's exactly the same lock, but think so... > QDISC_BROADCAST_DEQUEUE_STATS maybe? Oh yeah, completely forgot to fix up the compile option after adding the sysctl... > Elsewhere you've defined the sysctl. Would argue for sysfs... Eh? > Is there a way to also merely detect if there is a listener present? Dunno. I assume that since the netlink stuff has multicast semantics it would just do something reasonable (such as discard the packet) if there's no listeners. I can see how detecting it would cut down on some overhead though; I'll see if I can find something. My alternative was to abort if the sysctl parameter is set to some magic value (0? negative number?). > Useful for debugging, but invasive for general use, far more > heavyweight than tossing it into netlink. Yeah, that wasn't supposed to stay in there. > NLMSG_SPACE seems arbitrary, and perhaps this needs to be GFP_ATOMIC? > (But I am totally not an expert down here) Right, will investigate. > Am thinking this could be a callback in the Qdisc_ops struct... Well right now it's the other way around; this is called for every qdisc, and dumps the general qdisc stats along with anything that the qdisc itself defines (through the dump_stats callback in qdisc_ops). It's the same way the stuff that tc talks to works, and makes explicit support from each qdisc optional. Also, another unresolved issue with this patch: if there's more than one qdisc on an interface (e.g. a tree of HTBs), it will probably be called for each one dequeueing a packet; not sure what to do about that; maybe just limit it to the root qdisc? -Toke