In the previous section we found that the definition of matrix addition was very intuitive, and we ended that section discussing the fact that eventually we’d like to know what it means to multiply matrices together.
In the spirit of the last section, take another wild stab: what do you think
but this is, in fact, not right. (You could define multiplication this way; you’ll even find that it satisfies plenty of nice properties. Unfortunately, nice properties don’t make up for the fact that this definition just isn’t useful.) The actual answer is
If you can look at this one example and suddenly understand exactly how matrix multiplication works, then you are probably smarter than the author. While matrix multiplication isn’t hard, it isn’t nearly as intuitive as matrix addition is.
To further muddy the waters (before we clear them), consider
Our experience from the last section would lead us to believe that this is not defined, but our confidence is probably a bit shaken by now. In fact, this multiplication is defined, and it is
You may see some similarity in this answer to what we got before, but again, probably not enough to really figure things out.
Before diving in to the general definition of matrix multiplication, let’s start simple, with row and column vectors. Recall from Definition 4.1.3 in Section 4.1 that a row vector is a \(1\times n\) matrix of the form \(\vec a = \bbm a_1 \amp a_2 \amp \cdots \amp a_n\ebm\text{,}\) and a column vector is an \(m\times 1\) matrix of the form \(\vec{b} = \bbm b_1\\b_2\\\vdots \\ b_m\ebm\text{.}\)
Definition 4.2.1. Multiplying a row vector by a column vector.
Let \(\vu\) be a \(1\times n\) row vector with entries \(u_1, u_2, \cdots, u_n\) and let \(\vvv\) be an \(n\times 1\) column vector with entries \(v_1, v_2, \cdots, v_n\text{.}\) The product of \(\vu\) and \(\vvv\text{,}\) denoted \(\dotp uv\) or \(\vu\vvv\text{,}\) is
\begin{equation*}
\vu\vvv = u_1v_1 + u_2v_2 + \cdots + u_nv_n\text{.}
\end{equation*}
Notice that this is essentially the same as the definition of the dot product given at the beginning of Section 2.7. There are two key points to notice about the product defined in Definition 4.2.1:
In order for the product \(\vu\vvv\) to be defined, \(\vu\) and \(\vvv\) need to have the same number of entries.
To multiply \(\vu\) and \(\vvv\text{,}\) we multiply the corresponding entries, and then add up the resulting values.
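The two key points above can be sketched in a few lines of Python. This is only an illustration: the function name `row_times_column` and the sample entries are our own choices, not part of the text, and each vector is stored as a flat list of its entries.

```python
def row_times_column(u, v):
    """Multiply a 1 x n row vector u by an n x 1 column vector v.

    Both vectors are given as flat lists of their entries.
    """
    # Key point 1: the vectors must have the same number of entries.
    if len(u) != len(v):
        raise ValueError("product undefined: different numbers of entries")
    # Key point 2: multiply corresponding entries, then add the results.
    return sum(ui * vi for ui, vi in zip(u, v))


# Illustrative values (assumed, not taken from the text).
result = row_times_column([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6
```

Calling the function with lists of different lengths raises an error, mirroring the fact that such a product is not defined.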
\(\vu\vy\) is not defined; Definition 4.2.1 specifies that in order to multiply a row vector and column vector, they must have the same number of entries.
\(\vu\vvv\) is not defined; we only know how to multiply row vectors by column vectors. We haven’t defined how to multiply two row vectors (in general, it can’t be done).
The product \(\vx\vu\) is defined, but we don’t know how to do it yet. Right now, we only know how to multiply a row vector times a column vector; we don’t know how to multiply a column vector times a row vector. (That’s right: \(\vu\vx \neq \vx\vu!\))
Now that we understand how to multiply a row vector by a column vector, we are ready to define matrix multiplication.
Definition 4.2.3. Matrix Multiplication.
Let \(\tta\) be an \(m\times r\) matrix, and let \(\ttb\) be an \(r\times n\) matrix. The matrix product of \(\tta\) and \(\ttb\text{,}\) denoted \(\tta\cdot\ttb\text{,}\) or simply \(\tta\ttb\text{,}\) is the \(m\times n\) matrix \(\ttm\) whose entry in the \(i\)th row and \(j\)th column is the product of the \(i\)th row of \(\tta\) and the \(j\)th column of \(\ttb\text{.}\)
It may help to illustrate it in this way. Let matrix \(\tta\) have rows \(\vec{a}_{1}\text{,}\) \(\vec{a}_{2}\text{,}\) \(\cdots\text{,}\) \(\vec{a}_{m}\) and let \(\ttb\) have columns \(\vec{b}_{1}\text{,}\) \(\vec{b}_{2}\text{,}\) \(\cdots\text{,}\) \(\vec{b}_{n}\text{.}\) Thus \(\tta\) looks like
Two quick notes about this definition. First, notice that in order to multiply \(\tta\) and \(\ttb\text{,}\) the number of columns of \(\tta\) must be the same as the number of rows of \(\ttb\) (we refer to these as the “inner dimensions”). Second, the resulting matrix has the same number of rows as \(\tta\) and the same number of columns as \(\ttb\) (we refer to these as the “outer dimensions”).
\begin{equation*}
\overbrace{(m\times\hspace{-38pt} \underbrace{r) \times (r}_\text{these inner dimensions must match}\hspace{-38pt}\times n)}^\text{final dimensions are the outer dimensions}
\end{equation*}
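For readers who like to experiment, the definition and the dimension rule can be sketched in Python as follows. The function name `matmul`, the list-of-rows representation, and the sample matrices are assumptions made for illustration; they are not the matrices used in the text.

```python
def matmul(A, B):
    """Return the product AB of an m x r matrix A and an r x n matrix B.

    Matrices are stored as lists of rows. Entry (i, j) of the result is
    the product of row i of A with column j of B.
    """
    m, r = len(A), len(A[0])
    # The inner dimensions must match: columns of A = rows of B.
    if len(B) != r:
        raise ValueError("inner dimensions do not match")
    n = len(B[0])
    # The result takes the outer dimensions: m rows and n columns.
    return [
        [sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
        for i in range(m)
    ]


# Illustrative 2 x 2 and 2 x 3 matrices (assumed values).
A = [[1, 2],
     [3, 4]]
B = [[1, 0, 2],
     [0, 1, 3]]
M = matmul(A, B)  # a 2 x 3 matrix
```

With these sample matrices, `matmul(B, A)` raises an error, since a \(2\times 3\) matrix times a \(2\times 2\) matrix has mismatched inner dimensions.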
Of course, this will make much more sense when we see an example.
Example 4.2.4. A more general matrix product.
Revisit the matrix product we saw at the beginning of this section; multiply
Let’s call our first matrix \(\tta\) and the second \(\ttb\text{.}\) We should first check to see that we can actually perform this multiplication. Matrix \(\tta\) is \(2\times 2\) and \(\ttb\) is \(2\times 3\text{.}\) The “inner” dimensions match up, so we can compute the product; the “outer” dimensions tell us that the product will be \(2\times 3\text{.}\) Let
The entry \(m_{11}\) is in the first row and first column; therefore to find its value, we need to multiply the first row of \(\tta\) by the first column of \(\ttb\text{.}\) Thus
Let’s first check to make sure this product is defined. Again calling the first matrix \(\tta\) and the second \(\ttb\text{,}\) we see that \(\tta\) is a \(3\times 2\) matrix and \(\ttb\) is a \(2\times4\) matrix; the inner dimensions match so the product is defined, and the product will be a \(3\times 4\) matrix,
Again, we’ll call the first matrix \(\tta\) and the second \(\ttb\text{.}\) Checking the dimensions of each matrix, we see that \(\tta\) is a \(2\times 3\) matrix, whereas \(\ttb\) is a \(2\times2\) matrix. The inner dimensions do not match, therefore this multiplication is not defined.
Example 4.2.7. A vector product revisited.
In Example 4.2.2, we were told that the product \(\vx\vu\) was defined, where
although we were not shown what that product was. Find \(\vx\vu\text{.}\)
Solution.
Again, we need to check to make sure the dimensions work correctly (remember that even though we are referring to \(\vu\) and \(\vx\) as vectors, they are, in fact, just matrices).
The column vector \(\vx\) has dimensions \(3\times1\text{,}\) whereas the row vector \(\vu\) has dimensions \(1\times 3\text{.}\) Since the inner dimensions match, the matrix product is defined; the outer dimensions tell us that the product will be a \(3\times3\) matrix, as shown below:
To compute the entry \(m_{11}\text{,}\) we multiply the first row of \(\vx\) by the first column of \(\vu\text{.}\) What is the first row of \(\vx\text{?}\) Simply the number \(-2\text{.}\) What is the first column of \(\vu\text{?}\) Just the number 1. Thus \(m_{11} = -2\text{.}\) (This does seem odd, but through checking, you can see that we are indeed following the rules.)
What about the entry \(m_{12}\text{?}\) Again, we multiply the first row of \(\vx\) by the second column of \(\vu\text{;}\) that is, we multiply \(-2(2)\text{.}\) So \(m_{12} = -4\text{.}\)
What about \(m_{23}\text{?}\) Multiply the second row of \(\vx\) by the third column of \(\vu\text{;}\) multiply \(4(3)\text{,}\) so \(m_{23} = 12\text{.}\)
One final example: \(m_{31}\) comes from multiplying the third row of \(\vx\text{,}\) which is 3, by the first column of \(\vu\text{,}\) which is 1. Therefore \(m_{31} = 3\text{.}\)
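The remaining entries can be checked with a short Python sketch. The entries of \(\vx\) and \(\vu\) used here are inferred from the values quoted in the solution (rows \(-2\text{,}\) 4, 3 for \(\vx\) and columns 1, 2, 3 for \(\vu\)); the helper `matmul` is our own illustrative function, not something defined in the text.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]


x = [[-2], [4], [3]]   # 3 x 1 column vector (entries inferred from the solution)
u = [[1, 2, 3]]        # 1 x 3 row vector (entries inferred from the solution)

xu = matmul(x, u)  # column times row: a 3 x 3 matrix
ux = matmul(u, x)  # row times column: a 1 x 1 matrix
```

Here `xu` is a \(3\times 3\) matrix whose entries include \(m_{11} = -2\text{,}\) \(m_{12} = -4\text{,}\) \(m_{23} = 12\) and \(m_{31} = 3\text{,}\) while `ux` has a single entry, which illustrates concretely that \(\vu\vx \neq \vx\vu\text{.}\)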
In this last example, we saw a “nonstandard” multiplication (at least, it felt nonstandard). Studying the entries of this matrix, it seems that there are several different patterns that can be seen amongst the entries. (Remember that mathematicians like to look for patterns. Also remember that we often guess wrong at first; don’t be scared, and try to identify some patterns.)
In Section 4.1, we identified the zero matrix \(\tto\) that had a nice property in relation to matrix addition (i.e., \(\tta+\tto = \tta\) for any matrix \(\tta\)). In the following example we’ll identify a matrix that works well with multiplication as well as some multiplicative properties. For instance, we’ve learned how \(1\cdot\tta = \tta\text{;}\) is there a matrix that acts like the number 1? That is, can we find a matrix \(\ttx\) where \(\ttx\cdot\tta=\tta\text{?}\) (We made a guess in Section 4.1 that maybe a matrix of all 1s would work, but you can probably already see that this guess is doomed to failure.)
This example is simply chock full of interesting ideas; it is almost hard to think about where to start.
Interesting Idea #1
Notice that in our example, \(\tta\ttb \neq \ttb\tta\text{!}\) When dealing with numbers, we were used to the idea that \(ab = ba\text{.}\) With matrices, multiplication is not commutative. (Of course, we can find special situations where it does work. In general, though, it doesn’t.)
Interesting Idea #2
Right before this example we wondered if there was a matrix that “acted like the number 1,” and guessed it may be a matrix of all 1s. However, we found out that such a matrix does not work in that way; in our example, \(\tta\ttb \neq \tta\text{.}\) We did find that \(\tta\tti = \tti\tta = \tta\text{.}\) There is a Multiplicative Identity; it just isn’t what we thought it would be. And just as \(1^2 = 1\text{,}\) \(\tti^2 = \tti\text{.}\)
Interesting Idea #3
When dealing with numbers, we are very familiar with the notion that “If \(ax = bx\text{,}\) then \(a=b\text{.}\)” (As long as \(x\neq 0\text{.}\)) Notice that, in our example, \(\ttb\ttb = \ttb\ttc\text{,}\) yet \(\ttb\neq\ttc\text{.}\) In general, just because \(\tta\ttx = \ttb\ttx\text{,}\) we cannot conclude that \(\tta =\ttb\text{.}\)
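The three ideas above can be reproduced with small matrices in Python. The matrices below are hypothetical, chosen only for illustration (they are not necessarily the ones from the example), and `matmul` is our own helper.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Hypothetical 2 x 2 matrices chosen for illustration.
A = [[1, 2], [3, 4]]
B = [[1, 1], [1, 1]]   # a matrix of all 1s
C = [[2, 1], [4, 3]]
I = [[1, 0], [0, 1]]   # 1s on the diagonal, 0s elsewhere

# Idea 1: multiplication is not commutative.
not_commutative = matmul(A, B) != matmul(B, A)

# Idea 2: the all-1s matrix does not act like 1, but I does.
ones_fail = matmul(A, B) != A
identity_works = matmul(A, I) == A and matmul(I, A) == A

# Idea 3: AB = CB even though A and C are different matrices.
no_cancellation = matmul(A, B) == matmul(C, B) and A != C
```

All four flags evaluate to `True` for these matrices. A single example like this shows that a rule can fail; it does not show how often it fails.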
Matrix multiplication is turning out to be a very strange operation. We are very used to multiplying numbers, and we know a bunch of properties that hold when using this type of multiplication. When multiplying matrices, though, we probably find ourselves asking two questions, “What does work?” and “What doesn’t work?” We’ll answer these questions; first we’ll do an example that demonstrates some of the things that do work.
Example 4.2.9. Exploring properties of matrix multiplication.
We’ll compute each of these without showing all the intermediate steps. Keep in mind order of operations: things that appear inside of parentheses are computed first.
In looking at our example, we should notice two things. First, it looks like the “distributive property” holds; that is, \(\tta(\ttb+\ttc) = \tta\ttb + \tta\ttc\text{.}\) This is nice, as many algebraic techniques we have learned in the past (when doing “ordinary algebra”) will still work. Second, it looks like the “associative property” holds; that is, \(\tta(\ttb\ttc) = (\tta\ttb)\ttc\text{.}\) This is nice, for it tells us that when we are multiplying several matrices together, we don’t have to be particularly careful about how we group them into pairs (although, as we saw earlier, we must keep the matrices in the same left-to-right order).
In leading to an important theorem, let’s define a matrix we saw in an earlier example.
Definition 4.2.10. Identity Matrix.
The \(n\times n\) matrix with 1s on the diagonal and zeros elsewhere is the \(n\times n\) identity matrix, denoted \(\tti_n\text{.}\) When the context makes the dimension of the identity clear, the subscript is generally omitted.
Note that while the zero matrix can come in all different shapes and sizes, the identity matrix is always a square matrix. We show a few identity matrices below.
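As a small sketch, the definition translates directly into Python: entry \((i,j)\) is 1 when \(i=j\) and 0 otherwise. The function name `identity` is our own choice.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    # 1s on the diagonal (row index equals column index), 0s elsewhere.
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


I2 = identity(2)
I3 = identity(3)
```

Because the function takes a single size `n`, every matrix it produces is square, in line with the remark above.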
In our examples above, we have seen examples of things that do and do not work. We should be careful about what examples prove, though. If someone were to claim that \(\tta\ttb = \ttb\tta\) is always true, one would only need to show them one example where it is false, and we would know the person was wrong. However, if someone claims that \(\tta(\ttb+\ttc) = \tta\ttb+\tta\ttc\) is always true, we can’t prove this with just one example. We need something more powerful; we need a true proof.
In this text, we forgo most proofs. The reader should know, though, that when we state something in a theorem, there is a proof that backs up what we state. Our justification comes from something stronger than just examples.
Now we give the good news of what does work when dealing with matrix multiplication.
Theorem 4.2.11. Properties of Matrix Multiplication.
Let \(\tta\text{,}\) \(\ttb\) and \(\ttc\) be matrices whose sizes are such that the following operations make sense, and let \(k\) be a scalar. The following equalities hold:
The above box contains some very good news, and probably some very surprising news. Matrix multiplication probably seems to us like a very odd operation, so we probably wouldn’t have been surprised if we were told that \(\tta(\ttb\ttc)\neq(\tta\ttb)\ttc\text{.}\) It is a very nice thing that the Associative Property does hold.
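A numerical spot check can make these properties feel less abstract. The sketch below tests the distributive and associative properties, along with the standard fact that a scalar can be moved through a product, on hypothetical \(2\times 2\) matrices; the helper names and the matrix entries are assumptions made for illustration. A passing check on one set of matrices is not a proof.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def matadd(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(k, A):
    """Scalar multiple kA."""
    return [[k * a for a in row] for row in A]


# Hypothetical matrices and scalar chosen for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
k = 3

distributive = matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C))
associative = matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
scalar = scale(k, matmul(A, B)) == matmul(scale(k, A), B) == matmul(A, scale(k, B))
```

Integer entries are used so that the comparisons are exact; with floating-point entries, rounding could make equal products compare as different.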
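As we near the end of this section, we raise one more issue of notation. We define \(\tta^0 = \tti\text{.}\) If \(n\) is a positive integer, we define
\begin{equation*}
\tta^n = \underbrace{\tta\cdot\tta\cdot\cdots\cdot\tta}_{n \text{ times}}\text{.}
\end{equation*}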
With numbers, we are used to \(a^{-n} = \frac{1}{a^n}\text{.}\) Do negative exponents work with matrices, too? The answer is yes, sort of. We’ll have to be careful, and we’ll cover the topic in detail once we define the inverse of a matrix. For now, though, we recognize the fact that \(\tta^{-1} \neq \frac{1}{\tta}\text{,}\) for \(\frac{1}{\tta}\) makes no sense; we don’t know how to “divide” by a matrix.
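Nonnegative powers, at least, are easy to sketch in Python: start from the identity (the zeroth power) and multiply by the matrix \(n\) times. The names `mat_pow`, `matmul` and `identity` and the sample matrix are our own; the sketch assumes the matrix is square and deliberately rejects negative exponents, which must wait for the inverse.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_pow(A, n):
    """Return A^n for a square matrix A and an integer n >= 0."""
    if n < 0:
        raise ValueError("negative powers require the matrix inverse")
    result = identity(len(A))  # the zeroth power is the identity
    for _ in range(n):
        result = matmul(result, A)
    return result


# Hypothetical matrix chosen for illustration.
A = [[1, 1], [0, 1]]
A0 = mat_pow(A, 0)
A3 = mat_pow(A, 3)
```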
We end this section with a reminder of some of the things that do not work with matrix multiplication. The good news is that there are really only two things on this list.
Matrix multiplication is not commutative; that is, in general, \(\tta\ttb \neq \ttb\tta\text{.}\)
In general, just because \(\tta\ttx = \ttb\ttx\text{,}\) we cannot conclude that \(\tta=\ttb\text{.}\)
The bad news is that these ideas pop up in many places where we don’t expect them. For instance, we are used to