In terms of code, the current state is as follows:
# if SIZEOF_SIZE_T < 8
|
inline size_t ut_fold_ull(uint64_t d) noexcept
|
{
|
size_t n1= uint32_t(d), n2= uint32_t(d >> 32);
|
return ((((n1 ^ n2 ^ 1653893711) << 8) + n1) ^ 1463735687) + n2;
|
}
|
# else
|
# define ut_fold_ull(d) d
|
# endif
|
The 32-bit version is equivalent to the existing definition. The numeric constants seem to be arbitrary. I remember that Heikki Tuuri had a habit of creating "random numbers" by typing them on the upper row of the keyboard. Especially the 1463735687 does not look very random: it contains a long sequence of near-consecutive digits.
In fact, this function looks like "obfuscated hashing" mentioned by https://sortingsearching.com/2020/05/21/hashing.html. That function also mentions Universal hashing and a hash function devised by Carter and Wegman back in 1977:
#define h(x) ((a*x+b)%p)%m
|
Here, p would be a prime number and a and b would be random integers between 0 and p-1. The m applied to this case could be 1<<32, and a suitable p could be 0x1000000F. What if we "randomly" choose a=0 to avoid one expensive multiplication?
#include <cstdint>
|
uint32_t ut_fold_ull(uint64_t num) noexcept
|
{
|
return uint32_t((3 + num) % 0x1000000F);
|
}
|
On IA-32, the above would emit add and adc for the addition (no matter which constant we choose) and a call to the runtime library function __umoddi3, which looks very expensive. (It is unavoidable; it has to perform 64-bit division arithmetics on a 32-bit ISA.)
We know that GCC knows how to optimize mach_read_from_4() and friends, What if we used a single carry-less addition, a.k.a. exclusive OR?
static unsigned ut_fold_ull(unsigned long long num)
|
{
|
return ((unsigned) num) ^ (unsigned) (num >> 32);
|
}
|
int main (int argc, char **argv)
|
{
|
return ut_fold_ull((unsigned)argc * 123456);
|
}
|
The main function would translate to the following IA-32 instructions:
imul 123456,0x4(%esp),%eax
|
ret
|
I think that the above solution should be good
We should keep in mind that the only use of ut_fold_ull is to shrink 64-bit table or index identifiers to 32 bits. On 32-bit systems, I would assume that the most significant half of these IDs would typically be 0; nobody should be creating billions of tables on such a small machine. So, I would opt for the following simple function for narrower-than-64-bit ISA:
#include <cstdint>
|
uint32_t ut_fold_ull(uint64_t num) noexcept
|
{
|
return uint32_t(num) ^ uint32_t(num >> 32);
|
}
|
In terms of code, the current state is as follows:
# if SIZEOF_SIZE_T < 8
{
}
# else
# define ut_fold_ull(d) d
The 32-bit version is equivalent to the existing definition. The numeric constants seem to be arbitrary. I remember that Heikki Tuuri had a habit of creating "random numbers" by typing them on the upper row of the keyboard. Especially the 1463735687 does not look very random: it contains a long sequence of near-consecutive digits.
In fact, this function looks like "obfuscated hashing" mentioned by https://sortingsearching.com/2020/05/21/hashing.html. That function also mentions Universal hashing and a hash function devised by Carter and Wegman back in 1977:
Here, p would be a prime number and a and b would be random integers between 0 and p-1. The m applied to this case could be 1<<32, and a suitable p could be 0x1000000F. What if we "randomly" choose a=0 to avoid one expensive multiplication?
#include <cstdint>
uint32_t ut_fold_ull(uint64_t num) noexcept
{
}
On IA-32, the above would emit add and adc for the addition (no matter which constant we choose) and a call to the runtime library function __umoddi3, which looks very expensive. (It is unavoidable; it has to perform 64-bit division arithmetics on a 32-bit ISA.)
We know that GCC knows how to optimize mach_read_from_4() and friends, What if we used a single carry-less addition, a.k.a. exclusive OR?
{
}
{
}
The main function would translate to the following IA-32 instructions:
imul 123456,0x4(%esp),%eax
ret
I think that the above solution should be good
We should keep in mind that the only use of ut_fold_ull is to shrink 64-bit table or index identifiers to 32 bits. On 32-bit systems, I would assume that the most significant half of these IDs would typically be 0; nobody should be creating billions of tables on such a small machine. So, I would opt for the following simple function for narrower-than-64-bit ISA:
#include <cstdint>
uint32_t ut_fold_ull(uint64_t num) noexcept
{
}