Please take a look at the first comment.
There are multiple implementations of the same functionality that must be shared: RowStorage::hashData(), Row::hash(), Row::colUpdateMariaDBHasher(), colUpdateMariaDBHasherTypeless().
They all contain the same or almost the same code, which must be merged so that any change to the common hashing code applies to all of them.
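To illustrate the kind of merge I mean, here is a minimal sketch. The namespace, hashBytes() and both wrappers are hypothetical names, not existing code, and FNV-1a is only a stand-in for whichever algorithm we settle on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

namespace sketch
{
// Hypothetical single entry point. Every caller funnels through this
// function, so replacing the algorithm is a one-place change.
// FNV-1a (64-bit) is used here purely as a placeholder.
inline uint64_t hashBytes(const void* data, size_t len, uint64_t seed = 0)
{
  assert(data != nullptr || len == 0);
  const auto* p = static_cast<const unsigned char*>(data);
  uint64_t h = 14695981039346656037ULL ^ seed;  // FNV offset basis mixed with the seed
  for (size_t i = 0; i < len; ++i)
  {
    h ^= p[i];
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Hypothetical thin wrappers standing in for the row-level and
// column-level callers. They contain no hashing logic of their own.
inline uint64_t hashRowData(std::string_view rowBytes)
{
  return hashBytes(rowBytes.data(), rowBytes.size());
}

inline uint64_t hashInt64Column(int64_t value, uint64_t seed)
{
  return hashBytes(&value, sizeof value, seed);
}
}  // namespace sketch
```

With this shape, the four functions listed above would each reduce to a short wrapper around one shared routine.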
The benchmark tests from the xxHash project demonstrate that MDB's hash_bin_sort is slow in terms of hash calculation speed for arguments of different lengths > 1 byte, and that its hash lacks avalanche properties. So hash_bin_sort looks more like a plain mapping function than a hash.
MDB's hash has a peculiar collision bench test output. All other hash functions that have avalanche properties produce far more collisions.
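For reference, this is roughly what an avalanche check measures: flip one input bit and count how many output bits change, where a good hash flips about half of them. The sketch below is not the xxHash benchmark and does not call hash_bin_sort. It uses the MurmurHash3 fmix64 finalizer as a sample mixer and a deliberately weak, made-up weakMap() for contrast.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
// MurmurHash3's 64-bit finalizer, used as an example of a well-mixing function.
inline uint64_t fmix64(uint64_t k)
{
  k ^= k >> 33;
  k *= 0xff51afd7ed558ccdULL;
  k ^= k >> 33;
  k *= 0xc4ceb9fe1a85ec53ULL;
  k ^= k >> 33;
  return k;
}

// Made-up weak function for contrast: a simple arithmetic mapping.
inline uint64_t weakMap(uint64_t k)
{
  return k * 31 + 7;
}

// Number of set bits in x.
inline int popcount64(uint64_t x)
{
  int n = 0;
  while (x)
  {
    x &= x - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

// Average fraction of the 64 output bits that change when a single input
// bit is flipped. Values near 0.5 indicate good avalanche behaviour.
template <typename Hash>
double avalancheRatio(Hash hash, uint64_t samples)
{
  assert(samples > 0);
  uint64_t flipped = 0;
  uint64_t x = 0x9e3779b97f4a7c15ULL;
  for (uint64_t s = 0; s < samples; ++s)
  {
    x = x * 6364136223846793005ULL + 1442695040888963407ULL;  // LCG step to vary the input
    const uint64_t base = hash(x);
    for (int bit = 0; bit < 64; ++bit)
      flipped += popcount64(base ^ hash(x ^ (1ULL << bit)));
  }
  return static_cast<double>(flipped) / (static_cast<double>(samples) * 64.0 * 64.0);
}
}  // namespace sketch
```

Calling sketch::avalancheRatio(sketch::fmix64, 1000) should land close to 0.5, while the same call with sketch::weakMap stays far below that.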
The current MM3 works well for arguments shorter than 4 bytes. However, it becomes slower than the xxh3 algorithm as the argument size rises, e.g. by 20% for 8-byte ints.
Here are the stats for the proposed xxh3.
The modified code for the tests is available here. I also included a simple test that demonstrates the overall latency of calculating hashes with our current MurMur3, MDB's hash_bin_sort, and xxh3. The test code requires the xxHash static or dynamic library.
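For anyone who wants to reproduce the idea without the linked code, a latency test can be shaped roughly as below. This is not the linked test: timeHash(), BenchResult and standInHash() are hypothetical, and the stand-in hash exists only so the sketch builds without external libraries. Any function with the same (const void*, size_t) shape, for example XXH3_64bits() from xxhash.h, could be passed instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch
{
struct BenchResult
{
  double nsPerHash;   // average wall-clock nanoseconds per hash call
  uint64_t checksum;  // folded results, keeps the compiler from dropping the loop
};

// Times `rounds` passes over `keys`, hashing each 8-byte key once per pass.
template <typename Hash>
BenchResult timeHash(Hash hash, const std::vector<uint64_t>& keys, int rounds)
{
  assert(!keys.empty() && rounds > 0);
  uint64_t checksum = 0;
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < rounds; ++r)
    for (const uint64_t& k : keys)
      checksum ^= hash(&k, sizeof k) + static_cast<uint64_t>(r);
  const auto stop = std::chrono::steady_clock::now();
  const double ns = std::chrono::duration<double, std::nano>(stop - start).count();
  return {ns / (static_cast<double>(keys.size()) * rounds), checksum};
}

// Placeholder hash so the sketch is self-contained. Not one of the
// algorithms compared above.
inline uint64_t standInHash(const void* data, size_t len)
{
  const auto* p = static_cast<const unsigned char*>(data);
  uint64_t h = 0x9e3779b97f4a7c15ULL;
  for (size_t i = 0; i < len; ++i)
    h = (h ^ p[i]) * 0x100000001b3ULL;
  return h;
}
}  // namespace sketch
```

Running timeHash() once per candidate over the same key vector gives per-call latencies that can be compared directly, although a single run like this is noisy and should be repeated.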