On Nov 22, 2017, at 7:49 PM, Pete Heist <peteheist@gmail.com> wrote:

On Nov 22, 2017, at 7:38 PM, Dave Taht <dave.taht@gmail.com> wrote:

It is somewhat unfair to not include the pfifo bandwidth on the test
(a cpu cost/byte might be a better metric), also pfifo_fast has three
tiers of classification in it.

Yeah, it’s probably better to not try to subtract the pfifo_fast system time out in the way that I did. I should probably just compare cake with and without the change, using a more accurate tool.

I don’t see how the change could hurt, but I’m now also not sure it helps much either. I guess it’s just two divs per call to cake_hash, and cake_hash is obviously going to be called more often at GigE.

I didn’t figure out ‘perf’ for this, but I did instrument cake_hash in a simple way with calls to local_clock_ns using ‘stap’. Results are on the ‘stap’ tab:

https://docs.google.com/spreadsheets/d/1LKoq5NaswuHm9H1atXoZA1AhNDg6L4UYS3Pn5lCsb1I/edit#gid=1493356365

It’s a head scratcher, but I saw about a 3% reduction in mean time spent in cake_hash for the “optimized” version when limited at 950 Mbit, and a very slight slowdown when unlimited. “Confounding”...(by Estee Lauder).

Whether or not those results are either correct or statistically significant, it doesn’t look like it’s worth too much more effort, and I can leave it to you whether you want this change or not. I don’t see the harm in it, and neither do I see much of a benefit.
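
For anyone who wants to check whether a difference like that 3% is more than noise, here is a minimal sketch in Python of one way to do it, assuming the per-call cake_hash durations for each build can be exported as two lists of nanosecond values. It computes the percent change in the mean and Welch's t statistic. The function names are my own, and the sample numbers are made up for illustration; they are not the spreadsheet data.

```python
import math
import statistics


def mean_reduction_pct(baseline, optimized):
    """Percent reduction in mean duration going from baseline to optimized."""
    base_mean = statistics.fmean(baseline)
    return 100.0 * (base_mean - statistics.fmean(optimized)) / base_mean


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    # Squared standard error of each sample mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical per-call durations in ns, NOT the measured results.
    baseline_ns = [410, 395, 402, 420, 398, 405, 415, 400]
    optimized_ns = [398, 385, 390, 405, 388, 392, 401, 389]
    t, df = welch_t(baseline_ns, optimized_ns)
    print(f"mean reduction: {mean_reduction_pct(baseline_ns, optimized_ns):.1f}%")
    print(f"Welch t = {t:.2f}, df ~ {df:.1f}")
```

As a rough guide, with large samples an absolute t value above about 2 corresponds to significance at the usual 5% level. This only speaks to noise in the samples, not to systematic effects such as the overhead of the probes themselves.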