Section 3.3 The Binomial Theorem

Here is an algebraic example in which “\(n\) choose \(r\)” arises naturally.

Consider

\begin{equation*} (a+b)^4=(a+b)(a+b)(a+b)(a+b)\text{.} \end{equation*}

If you try to multiply this out, you must systematically choose the \(a\) or the \(b\) from each of the four factors, and make sure that you make every possible combination of choices sooner or later.

One way of breaking this task down into smaller pieces is to separate it into five parts, depending on how many of the factors you choose \(a\)s from (\(4\text{,}\) \(3\text{,}\) \(2\text{,}\) \(1\text{,}\) or \(0\)). Each time you choose \(4\) of the \(a\)s, you will obtain a single contribution to the coefficient of the term \(a^4\text{;}\) each time you choose \(3\) of the \(a\)s, you will obtain a single contribution to the coefficient of the term \(a^3b\text{;}\) each time you choose \(2\) of the \(a\)s, you will obtain a single contribution to the coefficient of the term \(a^2b^2\text{;}\) each time you choose \(1\) of the \(a\)s, you will obtain a single contribution to the coefficient of the term \(ab^3\text{;}\) and each time you choose \(0\) of the \(a\)s, you will obtain a single contribution to the coefficient of the term \(b^4\text{.}\) In other words, the coefficient of a particular term \(a^ib^{4-i}\) will be the number of ways in which you can choose \(i\) of the factors from which to take an \(a\text{,}\) taking a \(b\) from the other \(4-i\) factors (where \(0 \le i \le 4\)).

Let's go through each of these cases separately. By Theorem 3.2.3, there is \(\binom{4}{4}=1\) way to choose four factors from which to take \(a\)s. (Clearly, you must choose an \(a\) from every one of the four factors.) Thus, the coefficient of \(a^4\) will be \(1\text{.}\)

If you want to take \(a\)s from three of the four factors, Theorem 3.2.3 tells us that there are \(\binom{4}{3}=4\) ways in which to choose the factors from which you take the \(a\)s. (Specifically, these four ways consist of taking the \(b\) from any one of the four factors, and the \(a\)s from the other three factors.) Thus, the coefficient of \(a^3b\) will be \(4\text{.}\)

If you want to take \(a\)s from two of the four factors, and \(b\)s from the other two, Theorem 3.2.3 tells us that there are \(\binom{4}{2}=6\) ways in which to choose the factors from which you take the \(a\)s (then take \(b\)s from the other two factors). This is a small enough example that you could easily work out all six ways by hand if you wish. Thus, the coefficient of \(a^2b^2\) will be \(6\text{.}\)

If you want to take an \(a\) from just one of the four factors, Theorem 3.2.3 tells us that there are \(\binom{4}{1}=4\) ways in which to choose the factor from which you take the \(a\text{.}\) (Specifically, these four ways consist of taking the \(a\) from any one of the four factors, and the \(b\)s from the other three factors.) Thus, the coefficient of \(ab^3\) will be \(4\text{.}\)

Finally, by Theorem 3.2.3, there is \(\binom{4}{0}=1\) way to choose zero factors from which to take \(a\)s. (Clearly, you must choose a \(b\) from every one of the four factors.) Thus, the coefficient of \(b^4\) will be \(1\text{.}\)

Putting all of this together, we see that

\begin{equation*} (a+b)^4=a^4+4a^3b+6a^2b^2+4ab^3+b^4\text{.} \end{equation*}

In fact, if we leave the coefficients in the original form in which we worked them out, we see that

\begin{equation*} (a+b)^4=\binom{4}{4}a^4+\binom{4}{3}a^3b+\binom{4}{2}a^2b^2+\binom{4}{1}ab^3+\binom{4}{0}b^4\text{.} \end{equation*}
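The counting argument in this example can be checked by brute force. The following is a minimal sketch, not part of the original text: the helper name `count_choices` is hypothetical. It lists every way of picking an \(a\) or a \(b\) from each of the four factors and tallies how many of those picks contain exactly \(i\) copies of \(a\).

```python
from collections import Counter
from itertools import product
from math import comb


def count_choices(n):
    """Return the coefficients of a^n, a^(n-1) b, ..., b^n in (a+b)^n.

    Each element of product("ab", repeat=n) is one way of choosing
    an 'a' or a 'b' from each of the n factors.
    """
    tally = Counter()
    for choice in product("ab", repeat=n):
        tally[choice.count("a")] += 1  # this pick contributes to a^i b^(n-i)
    # List the coefficients starting from the a^n term.
    return [tally[i] for i in range(n, -1, -1)]


coefficients = count_choices(4)
print(coefficients)  # [1, 4, 6, 4, 1]

# The tallies agree with the binomial coefficients "4 choose i".
print(coefficients == [comb(4, i) for i in range(4, -1, -1)])  # True
```

The enumeration visits all \(2^4=16\) picks, so it is only practical for small exponents; its purpose is to confirm the counting argument, not to replace it.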

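This example generalises into a significant theorem of mathematics, the Binomial Theorem: for any \(a\) and \(b\text{,}\) and any natural number \(n\text{,}\)

\begin{equation*} (a+b)^n=\sum_{r=0}^n\binom{n}{r}a^rb^{n-r}\text{.} \end{equation*}

As a special case, \((1+x)^n=\sum_{r=0}^n\binom{n}{r}x^r\text{.}\)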

As in Example 3.3.1, we see that the coefficient of \(a^rb^{n-r}\) in \((a+b)^n\) will be the number of ways of choosing \(r\) of the \(n\) factors from which we'll take the \(a\) (taking the \(b\) from the other \(n-r\) factors). By Theorem 3.2.3, there are \(\binom{n}{r}\) ways of making this choice.

For the special case, begin by observing that \((1+x)^n=(x+1)^n\text{;}\) then take \(a=x\) and \(b=1\) in the general formula. Use the fact that \(1^{n-r}=1\) for any integers \(n\) and \(r\text{.}\)
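A numerical check is no substitute for the proof, but it can catch an index slip when the formula is applied. The sketch below is an illustration under assumptions: the function names are hypothetical, and only small integer values are tested. It compares \(\sum_{r=0}^n\binom{n}{r}a^rb^{n-r}\) with \((a+b)^n\text{,}\) and does the same for the special case with \(a=x\) and \(b=1\text{.}\)

```python
from math import comb


def binomial_sum(a, b, n):
    """Right-hand side of the Binomial Theorem: sum of C(n, r) a^r b^(n-r)."""
    return sum(comb(n, r) * a**r * b ** (n - r) for r in range(n + 1))


def special_case_sum(x, n):
    """Right-hand side of the special case: sum of C(n, r) x^r."""
    return sum(comb(n, r) * x**r for r in range(n + 1))


# Integer arithmetic is exact in Python, so equality can be tested directly.
for n in range(8):
    for a in range(-3, 4):
        for b in range(-3, 4):
            assert binomial_sum(a, b, n) == (a + b) ** n
        assert special_case_sum(a, n) == (1 + a) ** n

print("all checks passed")
```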

The Binomial Theorem has been known for a long time. In China it was often used as a way to work out numerical estimations for high powers of mixed numbers, for example by taking the first few terms from the expansion of something like \((5+0.11)^5\text{.}\)
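To see why the first few terms already give a good estimate, the partial sums of the expansion of \((5+0.11)^5\) can be computed one term at a time. This is a sketch for illustration only: the helper `partial_sums` is hypothetical, and exact fractions are used so that rounding does not blur the comparison.

```python
from fractions import Fraction
from math import comb


def partial_sums(big, small, n):
    """Running totals of the terms C(n, r) * big^(n-r) * small^r, r = 0..n."""
    totals = []
    running = Fraction(0)
    for r in range(n + 1):
        running += comb(n, r) * Fraction(big) ** (n - r) * Fraction(small) ** r
        totals.append(running)
    return totals


sums = partial_sums(5, Fraction(11, 100), 5)
for r, total in enumerate(sums):
    print(f"terms up to r={r}: {float(total)}")

# The last partial sum is the exact value of (5 + 0.11)^5.
assert sums[-1] == Fraction(511, 100) ** 5
```

Because each successive term carries another factor of \(0.11\text{,}\) the terms shrink quickly: the first three terms give \(3483.875\text{,}\) while the exact value is \(3484.2114263551\text{.}\)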

From the theorem above, we see that the values \(\binom{n}{r}\) are the coefficients of the terms in the Binomial Theorem.

Definition 3.3.3.

Expressions of the form \(\binom{n}{r}\) are referred to as binomial coefficients.

There are some nice, simple consequences of the Binomial Theorem.

Substituting \(a=b=1\) into the Binomial Theorem immediately gives \(\sum_{r=0}^n\binom{n}{r}=2^n\) for any natural number \(n\text{.}\)

From the special case of the Binomial Theorem, we have

\begin{equation*} (1+x)^n=\sum_{r=0}^n\binom{n}{r}x^r\text{.} \end{equation*}

If we differentiate both sides, we obtain

\begin{equation*} n(1+x)^{n-1}=\sum_{r=0}^n r\binom{n}{r}x^{r-1}\text{.} \end{equation*}

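Substituting \(x=-1\) gives the result \(\sum_{r=0}^n r\binom{n}{r}(-1)^{r-1}=0\) for \(n \ge 2\) (the left-hand side, \(n(1+x)^{n-1}\text{,}\) is zero at \(x=-1\) whenever \(n \ge 2\)).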
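Both consequences are easy to test for small \(n\text{.}\) The sketch below uses hypothetical function names and is only a numerical sanity check. It evaluates the sum obtained from \(a=b=1\) and the sum obtained by differentiating and setting \(x=-1\text{.}\) The second sum is started at \(r=1\text{,}\) since the \(r=0\) term is zero.

```python
from math import comb


def row_sum(n):
    """Sum of C(n, r) over r = 0..n, obtained by setting a = b = 1."""
    return sum(comb(n, r) for r in range(n + 1))


def alternating_weighted_sum(n):
    """Sum of r * C(n, r) * (-1)^(r-1) over r = 1..n (the r = 0 term is zero)."""
    return sum(r * comb(n, r) * (-1) ** (r - 1) for r in range(1, n + 1))


for n in range(1, 11):
    print(n, row_sum(n), alternating_weighted_sum(n))

assert all(row_sum(n) == 2**n for n in range(11))
assert all(alternating_weighted_sum(n) == 0 for n in range(2, 11))
```

For \(n=1\) the second sum equals \(1\) rather than \(0\text{,}\) matching the left-hand side \(n(1+x)^{n-1}\text{,}\) which is the constant \(1\) when \(n=1\text{.}\)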

Remark 3.3.6.

We've encountered a number of “theorems” earlier in this book, and this is probably a term you've seen before. This is the first time we've stated a result that we haven't called a “theorem”, so it's worth spending a few moments reviewing the various terms we'll use for results, and the circumstances under which each is appropriate.

The term theorem is generally reserved for significant results. A result that we might want to refer to from other courses or in other contexts should receive this term: an example of this is the Fundamental Theorem of Calculus, or from this course, the Binomial Theorem. In higher mathematics we don't generally call something a theorem if its proof is too easy, but within a course like this we tend to use the term more broadly, for results that are important within the context of the course.

A corollary, such as the ones we see here, is a result that follows as an easy or direct consequence of another result. Typically the proof of a corollary should be very short. It may in large measure involve looking at the previous result (often a theorem) from a slightly different perspective, or taking a special case of it.

A lemma is a self-contained result that provides a stepping stone to one or more results of greater interest. Lemmas are used in a variety of situations, including:

  • if a particular fact is needed more than once in a proof, it is often broken out into a lemma in order to avoid repeating the argument;
  • if a proof is long and complex but breaks down into a series of steps that are reasonably self-contained, these may be separated into lemmas to make the arguments easier to follow;
  • if a mathematician thinks that one piece of their argument may be of use in other contexts or in its own right, they may separate it out into a lemma.

A proposition is a self-contained result that is not a step along the way to another result, but is not sufficiently significant to deserve to be called a theorem.

These terms aren't always used with precision. When we encounter Euler's handshaking lemma later in the course, you might reasonably think it deserves to be called a theorem. However, perhaps because it is usually used to show other interesting results rather than on its own, the term “lemma” has become part of its title.

Use the Binomial Theorem to evaluate the following:

  1. \(\sum_{i=1}^n \binom{n}{i}2^i\text{.}\)

  2. the coefficient of \(a^2b^3c^2d^4\) in \((a+b)^5(c+d)^6\text{.}\)

  3. the coefficient of \(a^2b^6c^3\) in \((a+b)^5(b+c)^6\text{.}\)

  4. the coefficient of \(a^3b^2\) in \((a+b)^5+(a+b^2)^4\text{.}\)