<div dir="ltr"><div>Hey,</div><div><br></div><div>I've had some pretty good success with merging xdp-pping ( <a href="https://github.com/xdp-project/bpf-examples/blob/master/pping/pping.h">https://github.com/xdp-project/bpf-examples/blob/master/pping/pping.h</a> ) into xdp-cpumap-tc ( <a href="https://github.com/xdp-project/xdp-cpumap-tc">https://github.com/xdp-project/xdp-cpumap-tc</a> ).</div><div><br></div><div>I ported over most of the xdp-pping code, and then changed the entry point and packet parsing code to make use of the work already done in xdp-cpumap-tc (it has already parsed a big chunk of the packet, so there's no need to do it twice). Then I switched the maps to per-cpu maps, and had to pin them - otherwise the two tc instances don't properly share data. Right now, output is just stubbed - I've still got to port the perfmap output code. Instead, I'm dumping a bunch of extra data to the kernel debug pipe, so I can see roughly what the output would look like.</div><div><br></div><div>With debug enabled and output limited to that logging, I'm now getting about 4.9 Gbits/sec on single-stream iperf between two VMs (with a shaper VM in the middle). :-)</div><div><br></div><div>So my question: how would you prefer to receive this data? I'll have to write a daemon that provides userspace control (periodic cleanup as well as reading the performance stream), so the world's kinda our oyster. I can stick to Kathie's original format (and dump it to a named pipe, perhaps?), use a condensed format that only shows what you want to use, or emit an efficient binary format if you feel like parsing that...</div><div><br></div><div>(I'll post some code soon; I'm getting sleepy.)<br></div><div><br></div><div>Thanks,</div><div>Herbert<br></div></div>