🎉 Celebrating 25 Years of GameDev.net! 🎉

Not many can claim 25 years on the Internet! Join us in celebrating this milestone. Learn more about our history, and thank you for being a part of our community!

CRC32 of uint64 in uint32

Started by
3 comments, last by Alberth 5 years, 2 months ago

Hello,

I take a look at the hash function of Unreal Engine 4 on github.

I do not understand a special case for int64 and uint64.


inline uint32 GetTypeHash( const uint64 A )
{
    return (uint32)A+((uint32)(A>>32) * 23);
}

inline uint32 GetTypeHash( const int64 A )
{
    return (uint32)A+((uint32)(A>>32) * 23);
}

What I do not understand is how and why it's works?

I can't find any algorithm or trick like that on internet, The std for long long and unsigned long run the murmur hash for thoses types.

Can anybody can explain me or redirect me to a link that explain that?

Why adding the low with the high part * 23 is relevant? why 23?

Thank you very much,

Best Regards,

Advertisement

This is just a hash function, not CRC32.  It "works" if collisions are rare for expected input.  Multiplying by a small prime number like 23 has the effect of reducing the chance of collisions for certain types of input, at the cost of increasing the chance of collisions for other types of input.  In particular, if the input is a bit field, then the multiplication avoids a collision between 0x00000001 and 0x00010000, at the cost of creating a collision between 0x00000000 and 0x10000000.

Thank you, I see.

I make some research about hash combine and prime number in hash distribution.

Multiplying by a prime number performs a better distribution and reduce collision has your just told me.

Then what I did not understand was the addition... Then I understand that Unreal Engine 4 except that integer are two's complement so addition is a logical OR. Logical OR is a possible way to combine hash but the best for mathematical point of view is an XOR like Java do.

So If i'm right


inline uint32 GetTypeHash( const uint64 A )
{
    return (uint32)A ^ ((uint32)(A>>32) * 23);
}

has less collision than 


inline uint32 GetTypeHash( const uint64 A )
{
    return (uint32)A+((uint32)(A>>32) * 23);
}

 

Does this is correct for you gamedev programmer's ? :)

Note1: This is perfectly explains in the tab here https://www.javamex.com/tutorials/collections/hash_function_guidelines.shtml

This also means that multiplying by 31 (33 as it say is not a prime) is better than 23.

Quote
Two ints that have a biased distribution. (x * p) ^ y
Where p is a prime number4 and/or odd number close to a power of 2, such as 17 or 33.

Multiplying one of the numbers (or, put another way, shifting plus adding/subtracting from itself) will help to ensure that the "biased" bits of one number interact with the more "random" bits of the other, so that the bias is "ironed out" over more of the bits.

 

There is no theoretical best generic solution. You're compressing zillions of bits to much less bits, so collisions are going to happen. The optimal hashing function depends on what kind of bit patterns you expect to happen. Different expectations lead to different optimal hash function. In other words, "optimal hashing" changes if you change what data you are storing and how.

I use the general rule to avoid hashing as much as possible where it matters, linear indexing is always much much faster. At places where hashing becomes a viable option, hashing function is not critical any more by definition, so take whatever looks good, and improve if you get a performance problem.

This topic is closed to new replies.

Advertisement