oprofiling is much saner looking now with rc6-smoketest

Dave Taht dave.taht at gmail.com
Tue Aug 30 21:58:29 EDT 2011

I have put the current rc6 smoketest up at:


So far it's proving very stable. Wireless performance is excellent and
wired performance dramatically improved. No crash bugs thus far,
though I had a scare...

For the final rc6, which I hope to have done by Friday, I'm in the
process of cleanly re-assembling the patch set (sorry, the sources are
a bit of a mess at present). For this rc, I'm hoping that a new
iptables lands, in particular, and I have numerous other little things
in the queue to sort out.

All that said, getting oprofile running is not hard, and I do
appreciate smoke testers helping out, as I don't think I'll be able
to get another release candidate done before Linux Plumbers.

Install the correct image on your router from the above, via the web
interface or sysupgrade -n. Then:

edit /etc/opkg.conf to have that URL in it
opkg update
opkg install oprofile
cd /tmp
mkdir /tmp/oprofile
wget http://huchra.bufferbloat.net/~d/rc6-smoke-captures/vmlinux
# (saving profile data to flash is a bad idea)
opcontrol --vmlinux=/tmp/vmlinux --session-dir=/tmp/oprofile

opcontrol --start
# do your testing
opcontrol --dump

opreport -c # or whatever options you like.
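
The profiling steps above could be wrapped in a small helper, roughly
like this. It is a sketch only: profile_run and the RUN dry-run variable
are names made up for illustration (not part of oprofile), and it assumes
oprofile is already installed and /tmp/vmlinux matches the running image.

```shell
#!/bin/sh
# Sketch only: wraps the opcontrol/opreport steps from this mail.
# profile_run and RUN are hypothetical names, not part of oprofile.

VMLINUX=${VMLINUX:-/tmp/vmlinux}
SESSION_DIR=${SESSION_DIR:-/tmp/oprofile}  # keep profile data off flash
RUN=${RUN:-}                               # RUN=echo prints commands instead of running them

profile_run() {
    # Arguments are the workload to profile.
    $RUN mkdir -p "$SESSION_DIR" || return 1
    $RUN opcontrol --vmlinux="$VMLINUX" --session-dir="$SESSION_DIR" || return 1
    $RUN opcontrol --start || return 1
    $RUN "$@"                  # do your testing
    $RUN opcontrol --dump || return 1
    $RUN opreport -c           # or whatever options you like
}
```

Set RUN=echo first to preview the commands without touching the router,
then call profile_run followed by whatever test command you want profiled.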

On Tue, Aug 30, 2011 at 6:45 PM, Dave Taht <dave.taht at gmail.com> wrote:
> On Tue, Aug 30, 2011 at 6:01 PM, Rick Jones <rick.jones2 at hp.com> wrote:
>> On 08/30/2011 05:32 PM, Dave Taht wrote:
>>> It bugs me that iptables and conntrack eat so much CPU for what
>>> is an internal-only connection, i.e. one that doesn't need conntracking.
>> The csum_partial is a bit surprising - I thought every NIC and its dog
>> offered CKO these days - or is that something happening with
>> ip_tables/conntrack?
> If this chipset supports it, so far as I know, it isn't documented or
> implemented.
>> I also thought that Linux used an integrated
>> copy/checksum in at least one direction, or did that go away when CKO became
>> prevalent?
> Don't know.
>> If this is inbound, and there is just plain checksumming and not anything
>> funny from conntrack, I would have expected checksum to be much larger than
>> copy.  Checksum (in the inbound direction) will take the cache misses and
>> the copy would not.  Unless... the data cache of the processor is getting
>> completely trashed - say from the netserver running on the router not
>> keeping up with the inbound data fully and so the copy gets "far away" from
>> the checksum verification.
> 220Mbit isn't good enough for ya? Previous tests ran at about 140Mbit; the
> gain is due to some major optimizations by Felix that fixed a bunch of
> misalignment issues. Through the router, I've seen 260Mbit - which is
> perilously close to the speed that I can drive it at from the test boxes.
>> Does perf/perf_events (whatever the follow-on to perfmon2 is called) have
>> support for the CPU used in the device?  (Assuming it even has a PMU to be
>> queried in the first place)
> Yes. Don't think it's enabled. It is running flat out, according to top.
>>> That said, I understand that people like their statistics, and me,
>>> I'm trying to make split-tcp work better, ultimately, one day....
>>> I'm going to rerun this without the fw rules next.
>> It would be interesting to see if the csum time goes away.  Long ago and far
>> away when I was beating on a 32-core system with aggregate netperf TCP_RR
>> and enabling or not FW rules, conntrack had a non-trivial effect indeed on
>> performance.
> Stays about the same. iptables time drops. How to disable conntrack?
> Don't you only really need it for NAT?
>> http://markmail.org/message/exjtzel7vq2ugt66#query:netdev%20conntrack%20rick%20jones%2032%20netperf+page:1+mid:s5v5kylvmlfrpb7a+state:results
>> I think will get to the start of that thread.  The subject is '32 core
>> net-next stack/netfilter "scaling"'
>> rick jones
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> http://the-edge.blogspot.com


More information about the Bloat-devel mailing list