Skip to main content

Section 10.1 The Pigeonhole Principle

The Pigeonhole Principle is a technique that you can apply when you are faced with items chosen from a number of different categories of items, and you want to know whether or not some of them must come from the same category, without looking at all of the items.

Suppose I will be teaching an independent study course in graph theory to two students next semester, and I want to use Bondy & Murty's “Graph Theory” text book. It has been issued in two editions, and I don't care which edition we use, but I want both students to have the same edition.

I find a web site on which someone has posted that they have three copies of the text for sale, but they don't say which editions they are. Without any more information, I know that if I buy these texts, I will have suitable texts for my students.

The reasoning is straightforward. The first book could be edition \(1\) or edition \(2\text{.}\) If the second text is the same as the first, then I have what I need, so the only possible problem is if the first two books consist of one copy of edition \(1\text{,}\) and one copy of edition \(2\text{.}\) But then the third book must match one or the other of the first two, since there are only two editions, so I will have two copies of one or the other of the editions.

This idea can be generalised in several ways. We'll look at the most straightforward generalisation first.

Amongst the first \(m\) items, either two of the items are from the same category (so we are done), or there is exactly one item from each of the \(m\) categories. Since \(n>m\text{,}\) there is at least one more item. This item must fall into the same category as one of the previous items, since every category already has an item.

The name of this principle comes from the idea that it can be stated with the categories being a row of holes, and the items being pigeons who are assigned to these holes.

In Example 10.1.1, the categories were the editions, and the items were the text books.

Example 10.1.1 was a very direct and straightforward application of the Pigeonhole Principle. The Principle can also apply in much more subtle and surprising ways.

Maria makes a bet with Juan. He must buy her at least one chocolate bar every day for the next \(60\) days. If, at the end of that time, she cannot point out a span of consecutive days in which the number of chocolate bars he bought for her was precisely \(19\text{,}\) then she will pay for all of the chocolate bars and give them back to him. If she can find such a span, then she gets to keep the chocolate bars. To limit the size of the bet, they agree in advance that Juan will not buy more than \(100\) chocolate bars in total.

Is there a way for Juan to win this bet?

Solution

The answer is no. For \(1 \le i \le 60\text{,}\) let \(a_i\) represent the number of chocolate bars that Juan has bought for Maria by the end of day \(i\text{.}\) Then \(1 \le a_1 \lt a_2\lt \ldots \lt a_{60} \le 100\text{.}\) Maria is hoping that for some \(i\lt j\text{,}\) she will be able to find that \(a_i+19=a_j\text{.}\) We therefore also need to consider the values \(a_1+19\lt a_2+19\lt \ldots \lt a_{60}+19\text{.}\) By the bounds on \(a_1\) and \(a_{60}\text{,}\) we have \(a_1+19\ge 20\text{,}\) and \(a_{60}+19 \le 119\text{.}\) Thus, the values \(a_1, \ldots, a_{60}, a_1+19, \ldots, a_{60}+19\) are \(120\) numbers all of which lie between \(1\) and \(119\text{.}\)

By the Pigeonhole Principle, at least two of these numbers must be equal, but we know that the \(a_i\)s are strictly increasing (as are the \(a_i+19\)s), so there must exist some \(i\lt j\) such that \(a_i+19=a_j\text{.}\) Maria must point to the span of days from the start of day \(i+1\) to the end of day \(j\text{,}\) since in this span Juan bought her \(19\) chocolate bars.

In fact, Juan could not win a bet of this nature that lasted more than 56 days, but proving this requires more detailed analysis specific to the numbers involved, and is not really relevant to this course.

Here is another example that would be hard to prove if you didn't know the Pigeonhole Principle.

Fix \(n\text{,}\) and colour each point of the plane with one of \(n\) colours. Prove that there exists a rectangle whose four corners are the same colour.

Proof

Take a grid of points with \(n+1\) rows and \(n\binom{n+1}{2}+1\) columns. We claim that this grid will contain such a rectangle.

Since \(n\) colours have been used, and there are \(n+1\) points in each column, by the Pigeonhole Principle each column must contain at least two grid points that are the same colour.

In any column, there are \(\binom{n+1}{2}\) possible locations in which a pair of points of the same colour could appear. Thus there are at most \(\binom{n+1}{2}\) ways to position two points of colour \(1\) in a column so that the points do not occupy the same two locations in more than one of these columns. The same is true for each of the \(n\) colours. Therefore, we can create a maximum of \(n\binom{n+1}{2}\) columns, each having two points of some colour, in such a way as to avoid having the same colour occupy the same two locations in more than one of the columns. Since we have \(n\binom{n+1}{2}+1\) columns, there must exist some pair of columns such that the same colour does occupy the same two locations in both of the columns. These four points form a rectangle whose corners all have the same colour.

So far we have only thought about guaranteeing that there are at least two items in some category. Sometimes we might want to know that there are at least \(k\) items in some category, where \(k>2\text{.}\) There is a generalisation of the Pigeonhole Principle that applies to such situations.

Amongst the first \(km\) items, either \(k+1\) of the items are from the same category (so we are done), or there are exactly \(k\) items from each of the \(m\) categories. Since \(n>km\text{,}\) there is at least one more item. This item must fall into the same category as one of the previous items. Since every category already has \(k\) items, this means that there will be \(k+1\) items in this category.

Notice that the Pigeonhole Principle is a special case of the Generalised Pigeonhole Principle, obtained by taking \(k=1\text{.}\)

The population of West Lethbridge in the \(2014\) census was \(35,377\text{.}\)

Show that at least \(97\) residents of West Lethbridge share a birthday. If you live in West Lethbridge, how many people can you be sure have the same birthday as you?

Solution

For the first part of this question, we apply the generalised pigeonhole principle, with \(m=366\) (for the \(366\) days of the year, counting February \(29\) since it is just as legitimate a birthday as any other despite being more uncommon), \(k=96\text{,}\) and \(n=35,377\text{.}\) We have

\begin{equation*} n=35,377>km=96\cdot 366=35,136\text{,} \end{equation*}

so the Generalised Pigeonhole Principle tells us that at least \(k+1=97\) people must share a birthday.

For the second part of the question, the answer is \(0\text{.}\) There is no reason why every single other person in West Lethbridge might not have their birthday on the day after yours (although that particular possibility is quite unlikely). There is certainly no guarantee that any of them has the same birthday as yours.

Notice that although we have found in the above example that some group of at least \(97\) people in West Lethbridge must have the same birthday, we have no idea of which \(97\) people are involved, or of what the joint birthday is. This is rather remarkable, but is an example of a type of proof that is quite common in advanced mathematics. Such proofs are referred to as “non-constructive,” since they prove that something exists, without giving you any idea of how to find (or construct) such a thing.

The theorem we are about to present is not in itself central to the rest of the material in this book, but we include it here because of its proof. This proof demonstrates that a relatively straightforward idea such as the Generalised Pigeonhole Principle can be used in subtle and surprising ways to prove results that at first glance do not seem to relate to spreading things out across a variety of categories. If you are interested in combinatorics (or other higher mathematics), reading and understanding the approaches and ideas that are used in this proof may be of considerable value to you. The theorem and its proof are due to Pál Erdős (1913—1996) and György Szekeres (1911—2005).

Define a function \(f\) that maps each element of \(S\) to the length of the longest increasing subsequence that begins with that element.

If there exists some \(s \in S\) such that \(f(s)\ge a+1\text{,}\) then we are done. So we may assume that \(f(s)\le a\) for every \(s \in S\text{.}\) Since there is always an increasing sequence of length at least \(1\) starting at any element of \(S\text{,}\) we in fact have \(1 \le f(s) \le a\) for every \(s \in S\text{,}\) so there are \(a\) possible values for the outputs of \(f\text{.}\) Since \(|S|=ab+1\text{,}\) and \(ab+1>ab\text{,}\) the Generalised Pigeonhole Principle tells us that at least \(b+1\) elements of \(S\) must have the same output under the function \(f\text{.}\)

We claim that if \(x\) is before \(y\) in \(S\) and \(f(x)=f(y)\text{,}\) then \(x>y\text{.}\) By assumption, \(x\neq y\) (all values of \(S\) are distinct), so the only other possibility is \(x\lt y\text{.}\) If \(x\lt y\text{,}\) then taking \(x\) followed by an increasing subsequence of length \(f(y)\) that starts at \(y\text{,}\) would give an increasing subsequence of length \(f(y)+1\) that starts at \(x\text{,}\) contradicting \(f(x)=f(y)\text{.}\) This contradiction shows that \(x\lt y\) is not possible, so \(x>y\text{,}\) as claimed.

Let \(s_1, s_2, \ldots, s_{b+1}\) be elements of \(S\) that have the same output under \(f\text{,}\) and appear in this order. Then by the claim we proved in the previous paragraph, \(s_1>s_2>\ldots >s_{n+1}\text{,}\) which is a decreasing subsequence of length \(b+1\text{.}\)

For the sake of completeness in our presentation of the Erdős-Szekeres Theorem, we note that their result is best possible. That is, \(ab+1\) is the smallest possible length for \(S\) that guarantees the property. that \(S\) contains either an increasing subsequence of length \(a+1\) or a decreasing subsequence of length \(b+1\text{.}\)

To demonstrate this fact, for any \(a,b\text{,}\) we present a sequence of length \(ab\) in which the longest increasing sequence has length \(a\) and the longest decreasing subsequence has length \(b\text{.}\) One such sequence is

\begin{equation*} b, b-1, \ldots, 1; 2b, 2b-1, \ldots, b+1; \ldots; ab, ab-1, \ldots, (a-1)b+1\text{.} \end{equation*}

Any increasing subsequence can only have one entry from each of the \(a\) subsequences of length \(b\) that are separated by semicolons, so can only have length \(a\text{.}\) Any decreasing subsequence must be entirely contained within one of the subsequences of length \(b\) that are separated by semicolons, so can only have length \(b\text{.}\)

In the Pigeonhole Principle we considered how many items we must have in order to ensure that there are at least two items in some category. In the Generalised Pigeonhold Principle, we expanded on this by considering how many items we must have in order to ensure that there are at least \(k+1\) items in some category, where \(k\) can be any positive integer (taking \(k=1\) in this gives the Pigeonhole Principle). Sometimes, applications arise in which not all categories are equally valuable. It may be the case that we're happy if we have at least two items in one specific category, but it will require at least five items in any other category in order to satisfy us. Or each category may have its own specific number of items that we want, and satisfying this requirement for any one category is what we are looking for. This situation is covered by the most general form of the pigeonhole principle.

Amongst the first

\begin{equation*} n_1+n_2+\ldots+n_m-m \end{equation*}

items, either there is some \(1\le i\le m\) such that at least \(n_i\) of the items fall into the \(i\)th category, or there are precisely \(n_i-1\) objects in the \(i\)th category, for every \(1 \le i \le m\text{.}\) Since there is at least one more item, this item must fall into the \(i\)th category for some \(1 \le i \le m\text{,}\) which means that there will be \(n_i\) items in this category.

Notice that the Generalised Pigeonhole Principle is a special case of the “Even more generalised pigeonhole principle,” obtained by taking

\begin{equation*} n_1=n_2=\ldots=n_m=k\text{.} \end{equation*}

Terry wants to bring some of their special stuffed peppers dish to a family potluck, wants to ensure there is enough for everyone who will want it, and wants to use only one colour of peppers to avoid any arguments or disappointment. They also plan to bring a pasta salad dish that everyone likes. They know that of the \(12\) people who will be at the potluck, everyone will eat the dish if it is made with red peppers; there are \(4\) people who will not eat it if it is made with yellow peppers (but everyone else will); there are \(5\) people who will not eat it if it is made with orange peppers (but everyone else will); and there are \(6\) people who will eat it if it is made with green peppers. The local greenhouse has a great deal on peppers at the farmer's market this week, but you can't request specific colours; you get whatever they give you. How many peppers must Terry buy in order to ensure that they have enough to bring their pepper dish to the potluck?

Solution

Terry needs to end up with either \(12\) red peppers, \(8\) yellow peppers, \(7\) orange peppers, or \(6\) green peppers. So we take \(n_1=12\text{,}\) \(n_2=8\text{,}\) \(n_3=7\text{,}\) and \(n_4=6\) in the statement of the Even more generalised pigeonhole principle. Notice that we have \(m=4\) categories. Now the statement tells us that given at least

\begin{equation*} n_1+n_2+n_3+n_4-4+1 = 12+8+7+6-4+1=30 \end{equation*}

peppers, Terry is guaranteed to have either \(12\) red peppers, \(8\) yellow peppers, \(7\) orange peppers, or \(6\) green peppers.

If you are struggling with this, think again about the worst thing that could happen: Terry could end up with one pepper too few of each colour. This would happen if they had \(11\) red peppers, \(7\) yellow peppers, \(6\) orange peppers, and \(5\) green peppers, for a total of \(29\) peppers. As soon as they have one more, they must have enough of some colour.

Suppose Ali owes Tomas $10, and wants to give him a number of identical pieces of currency to pay her debt. Her bank only gives out currency in loonies (which have a value of $1), twonies (which have a value of $2), five-dollar bills, or ten-dollar bills, and does not take requests for specific kinds of currency. How much money must Ali request from the teller, if she wants to be sure to have $10 in identical pieces of currency with which to pay Tomas?

Solution

If Ali gets any $10 bills she can give one of those to Tomas and is done. If she gets at least two $5 bills, she is done. If she gets at least five twonies, she is done, and if she gets at least \(10\) loonies she is done. So the most money she can get without being able to give Tomas their $10 in a single type of currency, is \(9\) loonies, \(4\) twonies, and a $5 bill, for a total of $22. Therefore, if Ali asks for $23, she is guaranteed to be able to pay Tomas in a single type of currency.

Although the above example does not directly use the “Even more generalised pigeonhole principle” because it asks for the value of the currency Ali needs to request rather than the number of items she must request, it uses the same ideas and should be helpful in understanding the concept.

  1. Show that in any positioning of \(17\) rooks on an \(8\)-by-\(8\) chessboard, there must be at least three rooks none of which threaten each other (i.e. no two of which lie in the same row or column).

  2. Sixteen people must sit in a row of eighteen chairs. Prove that somewhere in the row there must be six adjacent chairs all occupied.

  3. An artist has produced a large work of art to be carried in a parade. Part of the concept is that it must be carried by people of roughly the same size (i.e., either all adults, or all children). The artist has left it to the last minute to find people to carry this, and is in a bit of a panic. They don't know if they will be able to assemble enough of either adults or children to carry the piece, so they decides to ask everyone they see, until they have enough volunteers. It takes \(15\) adults to carry the piece, or \(23\) children. If everyone approached agrees to help, how many people does the artist need to approach before they are sure to have enough people to carry their art in the parade?

  4. Let \(n\) be odd, let \(a\) be even, and let \(\pi\colon\{1,\ldots, n\} \to \{1, \ldots, n\}\) be a permutation. Prove that the product

    \begin{equation*} (a+1-\pi(1))(a+2-\pi(2)) \cdots (a+n-\pi(n)) \end{equation*}

    is even. Is the same conclusion necessarily true if \(n\) is even or if \(a\) is odd? Give a proof or a counterexamples in each case.

  5. Let \(n \ge 1\text{,}\) let \(x\) be a positive integer, and let \(S\) be a subset of cardinality \(n+1\) from \(\{x, x^2, \ldots, x^{2n}\}\text{.}\) Prove that there exist two numbers in \(S\) whose product is \(x^{2n+1}\text{.}\)

  6. Show that in every set of \(n + 1\) distinct integers, there exist two elements \(a\) and \(b\) such that \(a - b\) is divisible by \(n\text{.}\)

  7. A drawer contains socks of \(8\) different colours. How many socks must you pull out of the drawer to be certain that you have two of the same colour?

  8. There are \(15\) students in a Combinatorics class. Explain how you know that two of them have their birthday in the same month.

  9. A pizza restaurant has \(8\) different toppings. Every day in October, they will put a \(2\)-topping pizza on sale (a pizza that has two distinct toppings on it; the order of the toppings does not matter). Prove that the same pizza will be on sale on two different days.

  10. Suppose \(A\) is a set of \(10\) natural numbers between \(1\) and \(100\) (inclusive). Show that two different subsets of \(A\) have the same sum. For example, if

    \begin{equation*} A = \{ 2, 6, 13, 30, 45, 59, 65, 82, 88, 97 \}\text{,} \end{equation*}

    then the subsets \(\{ 6, 13, 45, 65\}\) and \(\{ 2, 30, 97\}\) both add up to \(129\text{.}\)
    Hint

    Compare the answers to two questions: How many subsets of \(A\) are there? Since there are only \(10\) elements of \(A\text{,}\) and each of them is at most \(100\text{,}\) how many different possible sums are there?

  11. Consider any set of \(5\) points in the plane that have integer coordinates. Prove that there is some pair from these \(5\) points such that the midpoint of the line segment joining this pair of points also has integer coordinates. Give an example of \(4\) points in the plane that do not have this property, and list all of the midpoints as evidence.