Sunday, September 10, 2017

Good Hash Functions

Let m = number of slot in the hash table. k is the key we want to hash. We will be discussing some hashing mechanisms.

Division method
A well-known hash method is doing mod
h(k) = k mod m
We need to be careful with choice of m.

m should not have small divisor. For example, if d = 2 is a divisor of m, and the keys are all even,
then the hash function will always choose even slots and all the odd slots will be empty.

we want the h(k) depends on all the bits of k. m should not be a power of 2.
lets say m = 2 ^ k. then the hash value will not depend on all the bits of the key, k.

 001 100 111 mod 2 ^ 3 = 111 (the last 3 bits)
 in general k mod (2 ^ t) = t lower bits of k

pick m, such that it's a prime and not close to power of 2 or 10.

Division method is simple, but division takes a lot of cycles.

Multiplication method
Lets assume the word size of the computer = w.
let, m = 2 ^ r
h(k) = ((A * k) mod (2 ^ w)) >> (w - r), where A is odd, 2 ^ (w-1) < A < 2 ^ (w). A is not close to power of 2.

How it works:
A has w bits, k has w bits. A * k has 2w bits.
A * k mod (2 ^ w) has lower w bits of A * k.
((A * k) mod (2 ^ w)) >> (w - r) gives r bits slot result.

When multiplication is done, there is great mixing of bits of k.
m = 2 ^ 3 = 1000
k = 1001
A = 5 = 101
w = 4
A * k
                        1001
                        0101
-------------------------------
                        1001
                      00000
                    100100
                  0000000
--------------------------------
                 00101101 (hash result will be 110)

notice how the bits in the middle are nicely mixed with many other bits. this is one of the reasons the multiplication method works nicely.
the mod and shift operations are very fast, as it only discards some digits. that is one of the reasons it is faster than the division method.

Reference:
[0] https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/lecture-8-hashing-with-chaining/
[1] https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-046j-introduction-to-algorithms-sma-5503-fall-2005/video-lectures/lecture-7-hashing-hash-functions/

No comments:

Post a Comment