oprofiling is much saner looking now with rc6-smoketest

Dave Taht dave.taht at gmail.com
Tue Aug 30 21:45:55 EDT 2011


On Tue, Aug 30, 2011 at 6:01 PM, Rick Jones <rick.jones2 at hp.com> wrote:
> On 08/30/2011 05:32 PM, Dave Taht wrote:

>> It bugs me that iptables and conntrack eat so much cpu for what
>> is an internal-only connection, i.e. one that doesn't need
>> conntracking.
>
> The csum_partial is a bit surprising - I thought every NIC and its dog
> offered CKO these days - or is that something happening with
> ip_tables/conntrack?

If this chipset supports it, so far as I know, it isn't documented or
implemented.

> I also thought that Linux used an integrated
> copy/checksum in at least one direction, or did that go away when CKO became
> prevalent?

Don't know.
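
For anyone following along: csum_partial is the kernel's software routine
for the 16-bit ones-complement Internet checksum (RFC 1071), and an
"integrated copy/checksum" folds that sum into the same pass that copies
the data, so the buffer is only walked once. A minimal Python sketch of
both ideas follows. It is illustrative only and is not the kernel code;
the function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def copy_and_checksum(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and checksum it in one pass (hypothetical helper).

    dst must be at least len(src) bytes long.
    """
    total = 0
    n = len(src)
    for i in range(0, n - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]
    if n % 2:
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8  # odd trailing byte, zero-padded
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Both functions return the same value for the same input; the second one
just touches each byte once instead of once for the copy and once for the
checksum, which is where the cache-miss argument comes from.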

>
> If this is inbound, and there is just plain checksumming and not anything
> funny from conntrack, I would have expected checksum to be much larger than
> copy.  Checksum (in the inbound direction) will take the cache misses and
> the copy would not.  Unless... the data cache of the processor is getting
> completely trashed - say from the netserver running on the router not
> keeping up with the inbound data fully and so the copy gets "far away" from
> the checksum verification.

220Mbit isn't good enough for ya? Previous tests ran at about 140Mbit;
the jump is due to some major optimizations by Felix that fixed a bunch
of misalignment issues. Through the router, I've seen 260Mbit - which is
perilously close to the speed that I can drive it at from the test boxes.

>
> Does perf/perf_events (whatever the followon to perfmon2 is called) have
> support for the CPU used in the device?  (Assuming it even has a PMU to be
> queried in the first place)

Yes, but I don't think it's enabled. The CPU is running flat out,
according to top.

>
>> That said, I understand that people like their statistics, and me,
>> I'm trying to make split-tcp work better, ultimately, one day....
>>
>> I'm going to rerun this without the fw rules next.
>
> It would be interesting to see if the csum time goes away.  Long ago and far
> away when I was beating on a 32-core system with aggregate netperf TCP_RR
> and enabling or not FW rules, conntrack had a non-trivial effect indeed on
> performance.

The csum time stays about the same; the iptables time drops. How do you
disable conntrack? Don't you only really need it for NAT?

>
> http://markmail.org/message/exjtzel7vq2ugt66#query:netdev%20conntrack%20rick%20jones%2032%20netperf+page:1+mid:s5v5kylvmlfrpb7a+state:results
>
> I think will get to the start of that thread.  The subject is '32 core
> net-next stack/netfilter "scaling"'
>
> rick jones
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com


