Hey,
I ported over most of the xdp-pping code, then changed the entry point and packet parsing code to reuse the work already done in xdp-cpumap-tc (it has already parsed a big chunk of the packet, so there's no need to do it twice). Then I switched the maps to per-CPU maps and had to pin them; otherwise the two tc instances don't share data properly. Right now the output is just stubbed, since I still have to port the perfmap output code. In the meantime, I'm dumping a bunch of extra data to the kernel debug pipe so I can see roughly what the output would look like.
With debug enabled and just logging, I'm now getting about 4.9 Gbits/sec on a single-stream iperf between two VMs (with a shaper VM in the middle). :-)
So my question: how would you prefer to receive this data? I'll have to write a daemon that provides userspace control (periodic cleanup as well as reading the performance stream), so the world's kinda our oyster. I can stick to Kathie's original format (and dump it to a named pipe, perhaps?), use a condensed format that only shows what you want to use, or emit an efficient binary format if you feel like parsing that...
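To make the binary option a bit more concrete, here's a rough sketch of what parsing it could look like on your end. Everything in it is an assumption for illustration: the record layout (timestamp, RTT, IPv4 source/destination, ports), the field widths, the byte order, and the names `RECORD`, `RttSample`, `pack_sample` and `parse_stream` are all made up, and none of this is what the code emits today.

```python
import ipaddress
import struct
from typing import Iterator, NamedTuple

# Hypothetical fixed-size record, network byte order, no padding:
#   u64 timestamp_ns | u32 rtt_us | 4-byte src IPv4 | 4-byte dst IPv4 | u16 sport | u16 dport
RECORD = struct.Struct("!QI4s4sHH")


class RttSample(NamedTuple):
    timestamp_ns: int
    rtt_us: int
    src: str
    dst: str
    sport: int
    dport: int


def pack_sample(s: RttSample) -> bytes:
    """Encode one sample (stands in for what the daemon would write)."""
    return RECORD.pack(
        s.timestamp_ns,
        s.rtt_us,
        ipaddress.IPv4Address(s.src).packed,
        ipaddress.IPv4Address(s.dst).packed,
        s.sport,
        s.dport,
    )


def parse_stream(buf: bytes) -> Iterator[RttSample]:
    """Decode every complete record in buf, ignoring a trailing partial one."""
    usable = len(buf) - (len(buf) % RECORD.size)
    for ts, rtt, src, dst, sport, dport in RECORD.iter_unpack(buf[:usable]):
        yield RttSample(
            ts,
            rtt,
            str(ipaddress.IPv4Address(src)),
            str(ipaddress.IPv4Address(dst)),
            sport,
            dport,
        )


if __name__ == "__main__":
    sample = RttSample(1_000_000, 250, "10.0.0.1", "10.0.0.2", 5201, 40000)
    for rec in parse_stream(pack_sample(sample)):
        print(rec)
```

Fixed-size records would keep the reader simple whether the bytes arrive over a named pipe or some other channel: read a chunk, decode the whole records, and carry any leftover bytes into the next read.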
(I'll post some code soon, getting sleepy)
Thanks,
Herbert