The hat-check problem

Posted by Antonio Catalano on April 11, 2018
Once you eliminate your number one problem, number two gets a promotion.
G. Weinberg

This problem is taken from the book "Introduction to Algorithms" by Cormen, Leiserson, Rivest, and Stein (CLRS), a long and demanding book, but still deemed the bible of algorithms. Let's quote the text:

Each of n customers gives a hat to a hat-check person at a restaurant. The hat-check person gives the hats back to the customers in a random order. What is the expected number of customers who get back their own hat?

Let's set aside the mathematical proof and try to figure out how to solve the problem with more of a "coding" approach. The first idea is that each person corresponds to a unique hat, so a dictionary seems the perfect object to work with. The second is that we need a random generator, so we will import the random module and use its choice(seq) method, which returns a single element chosen at random from the sequence passed as its argument. Finally, we run a Monte Carlo simulation (in other words, we repeat the test N times, where N is a large number, relying on the law of large numbers) to arrive at the outcome. Let's see the code:
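As a quick sanity check on how random.choice behaves (the colour list here is just for illustration), note that it always returns one element of the sequence it is given, and a one-element sequence always yields that element:

```python
import random

colours = ['red', 'blue', 'green']

# choice picks one element, uniformly at random, from a non-empty sequence
pick = random.choice(colours)
print(pick in colours)         # always True
print(random.choice(['red']))  # a one-element sequence always yields 'red'
```

This last property is why the final customer in the simulation below can only receive the single hat that remains.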

Hat-check problem

import random

dizio = {'Kristine': 'red', 'John': 'blue', 'Eric': 'green',
         'Peter': 'white', 'Paul': 'black', 'Anthony': 'brown'}

total = 0
N = 100000

for _ in range(N):  # Monte Carlo simulation with 100k tests
    count = 0
    r = list(dizio.values())  # the sequence with the colours of the hats

    for name in dizio:  # we iterate through the dict keys
        if len(r) > 1:  # at least 2 hats are still in the list
            k = random.choice(r)
            if dizio[name] == k:  # line 12
                count += 1
            r.remove(k)  # line 14
        elif len(r) == 1:  # last customer: only one hat left
            k = r[0]
            if dizio[name] == k:
                count += 1
    total += count

print("The expected number of customers "
      "who get back their own hat is: {}".format(total / N))


We create a dictionary of six people in our example and assign a hat of a given colour to each person. Each colour is a unique label, so every colour is different from the others. Before every nested loop we rebuild the list with the colours from the dictionary. As you can notice, in each test we increase count only when the random.choice method returns the same colour that is paired with the current key (see the comment # line 12); otherwise we continue the loop without increasing the counter. At # line 14 we remove the randomly chosen colour from the sequence of remaining colours. When we reach the last key of the dictionary (the last customer's name), len(r) is necessarily 1, because the number of remaining people and the number of remaining colours always coincide.

At the end of every nested loop (a single test) we add the count variable to the total variable. When all 100,000 tests have finished, the outcome is the ratio between total and N.
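The choice-and-remove loop above draws a uniformly random permutation of the hats one colour at a time. A sketch of an equivalent, slightly shorter version using random.shuffle, with the same hypothetical names:

```python
import random

dizio = {'Kristine': 'red', 'John': 'blue', 'Eric': 'green',
         'Peter': 'white', 'Paul': 'black', 'Anthony': 'brown'}

def one_trial(hats):
    """Shuffle the hats once and count how many customers get their own back."""
    shuffled = list(hats.values())
    random.shuffle(shuffled)  # a uniformly random permutation of the hats
    return sum(own == got for own, got in zip(hats.values(), shuffled))

N = 100_000
total = sum(one_trial(dizio) for _ in range(N))
print(total / N)  # close to 1
```

Shuffling once per trial and comparing positions avoids the repeated list.remove calls, but both versions sample the same distribution over permutations.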

The output:

The expected number of customers who get back their own hat is: 1.00738

So, on average, only one person gets back their own hat. OK... but what happens if we change the number of people? Here we have only 6 people... what if we build a dictionary of 10, 100 or 1000 different people paired with 10, 100, 1000 different hats?
...
Nothing changes. The outcome is always 1: it is independent of the number n of people. If you don't believe me, add more items to the dictionary in your editor and let me know... Strangeness of Math ;)
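To check this without typing hundreds of dictionary entries, we can sketch a parameterised version where people and hats are simply labelled with integers (the names no longer matter, only the random matching does):

```python
import random

def expected_matches(n, trials=10_000):
    """Monte Carlo estimate of the expected number of customers
    who get their own hat back, among n customers."""
    total = 0
    hats = list(range(n))               # person i owns hat i
    for _ in range(trials):
        shuffled = hats[:]
        random.shuffle(shuffled)        # hand the hats back at random
        total += sum(i == h for i, h in enumerate(shuffled))
    return total / trials

for n in (10, 100, 1000):
    print(n, expected_matches(n))       # each estimate is close to 1
```

Whatever n you pick, the estimate hovers around 1, which is exactly the "strangeness" described above.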

(You can find the mathematical proof that the result is exactly 1 somewhere on the Web: by linearity of expectation, each of the n customers gets their own hat back with probability 1/n, so the expected total is n * 1/n = 1. A Monte Carlo simulation converges towards this exact result, but every run will obviously give a slightly different number.)