Recall that a set of hash functions (mapping the elements of a universe to the buckets is universal if for every distinct , the probability that and collide, assuming that the hash function is chosen uniformly at random from , is at most . In this problem you will prove that a collision probability of is essentially the best possible. Precisely, suppose that is a family of hash functions mapping to , as above. Show that there must be a pair of distinct elements such that, if is chosen uniformly at random from , then
ANSWER: Let be the size of the set , for . That is, maps inputs to bucket 1, maps inputs to bucket 2, and so on.
- Let us fix a hash function .
- Let .
- Given = number of buckets.
Then, the number of distinct that collide in bucket 1 is . The number of distinct that collide in all buckets is, therefore, (call this expression ). Note that . We are interested in lower-bounding the expression .
Suppose . Let us define be the set of all distinct . Note that . Let be an indicator random variable, such that, given a hash function and distinct :
Taking expectation of :
By the Pigeon hole Principle, we get: There exists such that the above inequality holds.