[Cake] an experiment with an alternate hasher

Dave Taht dave.taht at gmail.com
Sun Mar 26 12:37:22 EDT 2017


On Sun, Mar 26, 2017 at 9:16 AM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>
>> On 26 Mar, 2017, at 19:00, Dave Taht <dave.taht at gmail.com> wrote:
>>
>> popcount is, regrettably, an sse4.2-only instruction
>
> A read through the ARM ISA Quick Reference Card:
>
> http://infocenter.arm.com/help/topic/com.arm.doc.qrc0001m/QRC0001_UAL.pdf
>
> …shows that there is no equivalent instruction on ARM CPUs at least up to ARMv7, which I think covers all current-generation consumer-grade routers.

All the x86_64 routing platforms at my command have it, notably the
PC Engines APU2.

Finding a suitable algorithm (or algorithms) for ARM and MIPS remains
on my mind.

>
> However, the operation can be constructed using log2(N) operations on any modern CPU as a sequence of masks, shifts and adds.  GCC has a “builtin” intrinsic function to use a popcnt instruction where present, and this algorithm otherwise.

Yes, I have the __builtin_popcount version under test too. It comes to
something like 20 instructions without -msse4.2. :(
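
For reference, the mask/shift/add construction mentioned above can be
sketched roughly as follows. This is a minimal illustration, not the
code under test: the function names popcount32_swar and
popcount32_builtin are made up for this example, and it assumes a
32-bit input and a GCC-compatible compiler for the builtin.

```c
#include <stdint.h>

/* Portable 32-bit population count built only from masks, shifts and
 * adds: five halving steps for a 32-bit word.
 * (Hypothetical helper name, for illustration only.) */
static inline uint32_t popcount32_swar(uint32_t x)
{
	x = (x & 0x55555555u) + ((x >> 1) & 0x55555555u); /* 2-bit sums */
	x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
	x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	x = x + (x >> 8);                                 /* 16-bit sums */
	x = x + (x >> 16);                                /* 32-bit sum */
	return x & 0x3fu;                                 /* result is 0..32 */
}

/* The GCC/Clang builtin: compiles to a single popcnt when the target
 * allows it, and to a fallback sequence otherwise. */
static inline uint32_t popcount32_builtin(uint32_t x)
{
	return (uint32_t)__builtin_popcount(x);
}
```

Both functions should return the same value for every input; which
one is cheaper depends on whether the target has a popcnt instruction.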

There is a wide variety of popcnt implementations for SSE and NEON:

https://github.com/WojciechMula/sse-popcount.git

The big advantage of the sse4.2 implementation is that it works in the
main register set, not the SSE registers (and it can be live patched
in, too). And it only takes a clock.

One thing that really irks me about all these sorts of benchmarks
(there's a good one for hashes, too) is that they ignore the startup
cost, which really dominates for us - we do three hashes and move on.

>
> Obviously this will only be of any use if the resulting hash is of good quality.

Yep, I need to run this through some real data. I just really enjoyed
fitting the whole routine into 28 bytes.

> An obvious problem with popcnt is that inputs of 1, 2, 4, 8, etc have the same popcnt (1),

srcport, dstport and protocol have plenty of bits.

I'm not really sure what the distribution would look like on real
data, but (as one example) dnsmasq tries to hand out IP addresses
based on your MAC address rather than sequentially, so you get a
somewhat better distribution than sequential. Maybe.
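
To put a number on the collision concern quoted above: a bare popcount
of a 16-bit value (a port number, say) can only produce 17 different
results, 0 through 16. The sketch below tallies how many 16-bit inputs
land on each result. It says nothing about the hasher under test,
whose code is not shown in this message; popcount_histogram16 is a
made-up name, and the builtin assumes a GCC-compatible compiler.

```c
#include <stdint.h>

/* Count how many 16-bit values share each popcount result.
 * hist[k] ends up holding the number of inputs with exactly k bits set.
 * (Hypothetical helper, for illustration only.) */
static void popcount_histogram16(uint32_t hist[17])
{
	uint32_t v;
	int k;

	for (k = 0; k < 17; k++)
		hist[k] = 0;
	for (v = 0; v <= 0xFFFFu; v++)
		hist[__builtin_popcount(v)]++;
}
```

The 16 powers of two all fall into hist[1], which is the property
being pointed out; the buckets in the middle are far more crowded.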

>and it is trivial for an attacker to exploit this property.

cake uses a set-associative hash. Any "attacker" merely has to send
1k+ different kinds of flows to saturate it.
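
A rough sketch of what a set-associative flow lookup looks like, to
show why saturation needs many distinct flows rather than a few
colliding ones. This is not cake's actual code: the struct, the
function name flow_lookup, and the sizes (1024 queues, 8 ways) are
assumptions chosen for illustration.

```c
#include <stdint.h>

#define QUEUES 1024u /* assumed table size */
#define WAYS   8u    /* assumed associativity; must divide QUEUES */

struct flow_table {
	uint32_t tag[QUEUES];  /* full hash of the flow owning each queue */
	uint8_t  used[QUEUES]; /* nonzero if the queue is occupied */
};

/* Map a flow hash to a queue index. The hash selects a set of WAYS
 * adjacent queues; the flow reuses its existing queue if it has one,
 * otherwise takes the first free queue in the set. Only when every way
 * is held by another flow does it share a queue (*collision = 1). */
static uint32_t flow_lookup(struct flow_table *t, uint32_t hash,
			    int *collision)
{
	uint32_t reduced = hash % QUEUES;
	uint32_t base = reduced - (reduced % WAYS); /* first queue of set */
	uint32_t i;

	*collision = 0;
	for (i = 0; i < WAYS; i++)
		if (t->used[base + i] && t->tag[base + i] == hash)
			return base + i;
	for (i = 0; i < WAYS; i++) {
		if (!t->used[base + i]) {
			t->used[base + i] = 1;
			t->tag[base + i] = hash;
			return base + i;
		}
	}
	*collision = 1;
	return reduced;
}
```

Under these assumptions, two flows whose hashes reduce to the same
index still get separate queues until all the ways of that set are
taken.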


>  - Jonathan Morton
>



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

