In the previous section we found that the definition of matrix addition was very intuitive, and we ended that section discussing the fact that eventually we'd like to know what it means to multiply matrices together.
but this is, in fact, not right. (You could define multiplication this way; you'll even find that it satisfies plenty of nice properties. Unfortunately, nice properties don't make up for the fact that this definition just isn't useful.) The actual answer is
If you can look at this one example and suddenly understand exactly how matrix multiplication works, then you are probably smarter than the author. While matrix multiplication isn't hard, it isn't nearly as intuitive as matrix addition is.
Our experience from the last section would lead us to believe that this is not defined, but our confidence is probably a bit shaken by now. In fact, this multiplication is defined, and it is
Before diving into the general definition of matrix multiplication, let's start simple, with row and column vectors. Recall from Definition 4.1.3 in Section 4.1 that a row vector is a \(1\times n\) matrix of the form \(\vec a = \bbm a_1 \amp a_2 \amp \cdots \amp a_n\ebm\text{,}\) and a column vector is an \(m\times 1\) matrix of the form \(\vec{b} = \bbm b_1\\b_2\\\vdots \\ b_m\ebm\text{.}\)
Definition 4.2.1. Multiplying a row vector by a column vector.
Let \(\vu\) be a \(1\times n\) row vector with entries \(u_1, u_2, \cdots, u_n\) and let \(\vvv\) be an \(n\times 1\) column vector with entries \(v_1, v_2, \cdots, v_n\text{.}\) The product of \(\vu\) and \(\vvv\text{,}\) denoted \(\dotp uv\) or \(\vu\vvv\text{,}\) is
\begin{equation*}
\vu\vvv = u_1v_1 + u_2v_2 + \cdots + u_nv_n\text{.}
\end{equation*}
Notice that this is essentially the same as the definition of the dot product given at the beginning of Section 2.7. There are two key points to notice about the product defined in Definition 4.2.1:
In order for the product \(\vu\vvv\) to be defined, \(\vu\) and \(\vvv\) need to have the same number of entries.
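For instance, with the illustrative vectors \(\vu = \bbm 1 \amp 2 \amp 3 \ebm\) and \(\vvv = \bbm 4\\5\\6 \ebm\) (numbers chosen purely for demonstration), the definition gives
\begin{equation*}
\vu\vvv = 1(4) + 2(5) + 3(6) = 4 + 10 + 18 = 32\text{.}
\end{equation*}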
\(\vu\vy\) is not defined; Definition 4.2.1 specifies that in order to multiply a row vector and column vector, they must have the same number of entries.
\(\vu\vvv\) is not defined; we only know how to multiply row vectors by column vectors. We haven't defined how to multiply two row vectors (in general, it can't be done).
The product \(\vx\vu\) is defined, but we don't know how to do it yet. Right now, we only know how to multiply a row vector times a column vector; we don't know how to multiply a column vector times a row vector. (That's right: \(\vu\vx \neq \vx\vu\text{!}\))
Let \(A\) be an \(m\times r\) matrix, and let \(B\) be an \(r\times n\) matrix. The matrix product of \(A\) and \(B\text{,}\) denoted \(A\cdot B\text{,}\) or simply \(AB\text{,}\) is the \(m\times n\) matrix \(M\) whose entry in the \(i\)th row and \(j\)th column is the product of the \(i\)th row of \(A\) and the \(j\)th column of \(B\text{.}\)
It may help to illustrate it in this way. Let matrix \(A\) have rows \(\vec{a}_{1}\text{,}\) \(\vec{a}_{2}\text{,}\) \(\cdots\text{,}\) \(\vec{a}_{m}\) and let \(B\) have columns \(\vec{b}_{1}\text{,}\) \(\vec{b}_{2}\text{,}\) \(\cdots\text{,}\) \(\vec{b}_{n}\text{.}\) Thus \(A\) looks like
Two quick notes about this definition. First, notice that in order to multiply \(A\) and \(B\text{,}\) the number of columns of \(A\) must be the same as the number of rows of \(B\) (we refer to these as the "inner dimensions"). Secondly, the resulting matrix has the same number of rows as \(A\) and the same number of columns as \(B\) (we refer to these as the "outer dimensions").
\begin{equation*}
\overbrace{(m\times\hspace{-38pt} \underbrace{r) \times (r}_\text{these inner dimensions must match}\hspace{-38pt}\times n)}^\text{final dimensions are the outer dimensions}
\end{equation*}
Of course, this will make much more sense when we see an example.
Let's call our first matrix \(A\) and the second \(B\text{.}\) We should first check to see that we can actually perform this multiplication. Matrix \(A\) is \(2\times 2\) and \(B\) is \(2\times 3\text{.}\) The "inner" dimensions match up, so we can compute the product; the "outer" dimensions tell us that the product will be \(2\times 3\text{.}\) Let \(M = AB\text{.}\)
The entry \(m_{11}\) is in the first row and first column; therefore, to find its value, we need to multiply the first row of \(A\) by the first column of \(B\text{.}\)
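To see this with concrete numbers, suppose (picking two small matrices purely for illustration, not necessarily those of this example)
\begin{equation*}
A = \bbm 1 \amp 2\\3 \amp 4 \ebm \quad \text{and} \quad B = \bbm 5 \amp 6 \amp 7\\8 \amp 9 \amp 10 \ebm\text{.}
\end{equation*}
Then \(m_{11} = 1(5) + 2(8) = 21\text{,}\) and computing each remaining entry the same way gives
\begin{equation*}
M = AB = \bbm 21 \amp 24 \amp 27\\47 \amp 54 \amp 61 \ebm\text{.}
\end{equation*}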
Let's first check to make sure this product is defined. Again calling the first matrix \(A\) and the second \(B\text{,}\) we see that \(A\) is a \(3\times 2\) matrix and \(B\) is a \(2\times4\) matrix; the inner dimensions match, so the product is defined, and the product will be a \(3\times 4\) matrix.
Again, we'll call the first matrix \(A\) and the second \(B\text{.}\) Checking the dimensions of each matrix, we see that \(A\) is a \(2\times 3\) matrix, whereas \(B\) is a \(2\times2\) matrix. The inner dimensions do not match; therefore, this multiplication is not defined.
Again, we need to check to make sure the dimensions work correctly (remember that even though we are referring to \(\vu\) and \(\vx\) as vectors, they are, in fact, just matrices).
The column vector \(\vx\) has dimensions \(3\times1\text{,}\) whereas the row vector \(\vu\) has dimensions \(1\times 3\text{.}\) Since the inner dimensions match, the matrix product is defined; the outer dimensions tell us that the product will be a \(3\times3\) matrix, as shown below:
To compute the entry \(m_{11}\text{,}\) we multiply the first row of \(\vx\) by the first column of \(\vu\text{.}\) What is the first row of \(\vx\text{?}\) Simply the number \(-2\text{.}\) What is the first column of \(\vu\text{?}\) Just the number 1. Thus \(m_{11} = -2\text{.}\) (This does seem odd, but through checking, you can see that we are indeed following the rules.)
What about the entry \(m_{12}\text{?}\) We multiply the first row of \(\vx\) by the second column of \(\vu\text{;}\) that is, we multiply \(-2(2)\text{.}\) So \(m_{12} = -4\text{.}\)
What about \(m_{23}\text{?}\) Multiply the second row of \(\vx\) by the third column of \(\vu\text{;}\) multiply \(4(3)\text{,}\) so \(m_{23} = 12\text{.}\)
One final example: \(m_{31}\) comes from multiplying the third row of \(\vx\text{,}\) which is 3, by the first column of \(\vu\text{,}\) which is 1. Therefore \(m_{31} = 3\text{.}\)
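Assembling all nine entries in this way (the computations above tell us that \(\vx\) has entries \(-2\text{,}\) \(4\) and \(3\text{,}\) while \(\vu\) has entries \(1\text{,}\) \(2\) and \(3\)), we find
\begin{equation*}
\vx\vu = \bbm -2\\4\\3 \ebm \bbm 1 \amp 2 \amp 3 \ebm = \bbm -2 \amp -4 \amp -6\\4 \amp 8 \amp 12\\3 \amp 6 \amp 9 \ebm\text{.}
\end{equation*}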
In this last example, we saw a "nonstandard" multiplication (at least, it felt nonstandard). Studying the entries of this matrix, it seems that there are several different patterns to be seen among the entries. (Remember that mathematicians like to look for patterns. Also remember that we often guess wrong at first; don't be afraid to try to identify some patterns.)
In Section 4.1, we identified the zero matrix \(\tto\) that had a nice property in relation to matrix addition (i.e., \(A+\tto = A\) for any matrix \(A\)). In the following example we'll identify a matrix that works well with multiplication, as well as some multiplicative properties. For instance, we've learned that \(1\cdot A = A\text{;}\) is there a matrix that acts like the number 1? That is, can we find a matrix \(X\) where \(X\cdot A=A\text{?}\) (We made a guess in Section 4.1 that maybe a matrix of all 1s would work, but you can probably already see that this guess is doomed to failure, as the quick computation below shows.)
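Here is that quick computation, with a generic \(2\times 2\) matrix standing in for \(A\text{:}\)
\begin{equation*}
\bbm 1 \amp 1\\1 \amp 1 \ebm \bbm a \amp b\\c \amp d \ebm = \bbm a+c \amp b+d\\a+c \amp b+d \ebm\text{,}
\end{equation*}
which equals \(\bbm a \amp b\\c \amp d \ebm\) only in very special cases, so the matrix of all 1s cannot be a multiplicative identity.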
Notice that in our example, \(AB \neq BA\text{!}\) When dealing with numbers, we are used to the idea that \(ab = ba\text{.}\) With matrices, multiplication is not commutative. (Of course, we can find special situations where it does work. In general, though, it doesn't.)
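A small pair of matrices (our own, not the ones from the example) makes this easy to see:
\begin{equation*}
\bbm 1 \amp 1\\0 \amp 1 \ebm \bbm 1 \amp 0\\1 \amp 1 \ebm = \bbm 2 \amp 1\\1 \amp 1 \ebm\text{,} \quad \text{whereas} \quad \bbm 1 \amp 0\\1 \amp 1 \ebm \bbm 1 \amp 1\\0 \amp 1 \ebm = \bbm 1 \amp 1\\1 \amp 2 \ebm\text{.}
\end{equation*}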
Right before this example we wondered if there was a matrix that "acted like the number 1," and guessed it may be a matrix of all 1s. However, we found out that such a matrix does not work in that way; in our example, \(AB \neq A\text{.}\) We did find that \(AI = IA = A\text{.}\) There is a Multiplicative Identity; it just isn't what we thought it would be. And just as \(1^2 = 1\text{,}\) \(I^2 = I\text{.}\)
When dealing with numbers, we are very familiar with the notion that "if \(ax = bx\text{,}\) then \(a=b\)" (as long as \(x\neq 0\)). Notice that, in our example, \(BB = BC\text{,}\) yet \(B\neq C\text{.}\) In general, just because \(AX = BX\text{,}\) we cannot conclude that \(A = B\text{.}\)
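For another concrete instance (again with matrices of our own choosing), let
\begin{equation*}
B = \bbm 1 \amp 0\\0 \amp 1 \ebm\text{,} \quad C = \bbm 0 \amp 1\\1 \amp 0 \ebm \quad \text{and} \quad X = \bbm 1 \amp 1\\1 \amp 1 \ebm\text{.}
\end{equation*}
Then \(BX = CX = \bbm 1 \amp 1\\1 \amp 1 \ebm\text{,}\) even though \(B \neq C\text{.}\)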
Matrix multiplication is turning out to be a very strange operation. We are very used to multiplying numbers, and we know a bunch of properties that hold when using this type of multiplication. When multiplying matrices, though, we probably find ourselves asking two questions, "What does work?" and "What doesn't work?" We'll answer these questions; first we'll do an example that demonstrates some of the things that do work.
We'll compute each of these without showing all the intermediate steps. Keep in mind the order of operations: things that appear inside of parentheses are computed first.
In looking at our example, we should notice two things. First, it looks like the "distributive property" holds; that is, \(A(B+C) = AB + AC\text{.}\) This is nice, as many algebraic techniques we have learned about in the past (when doing "ordinary algebra") will still work. Secondly, it looks like the "associative property" holds; that is, \(A(BC) = (AB)C\text{.}\) This is nice, for it tells us that when we are multiplying several matrices together, we don't have to be particularly careful in what order we multiply certain pairs of matrices together.
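As a quick numeric check of the distributive property (with small matrices chosen just for illustration): if \(A = \bbm 1 \amp 0\\0 \amp 2 \ebm\text{,}\) \(B = \bbm 1 \amp 2\\3 \amp 4 \ebm\) and \(C = \bbm 5 \amp 6\\7 \amp 8 \ebm\text{,}\) then
\begin{equation*}
A(B+C) = \bbm 1 \amp 0\\0 \amp 2 \ebm \bbm 6 \amp 8\\10 \amp 12 \ebm = \bbm 6 \amp 8\\20 \amp 24 \ebm = \bbm 1 \amp 2\\6 \amp 8 \ebm + \bbm 5 \amp 6\\14 \amp 16 \ebm = AB + AC\text{.}
\end{equation*}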
The \(n\times n\) matrix with 1s on the diagonal and zeros elsewhere is the \(n\times n\) identity matrix, denoted \(I_n\text{.}\) When the context makes the dimension of the identity clear, the subscript is generally omitted.
Note that while the zero matrix can come in all different shapes and sizes, the identity matrix is always a square matrix. We show a few identity matrices below.
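\begin{equation*}
I_2 = \bbm 1 \amp 0\\0 \amp 1 \ebm\text{,} \quad I_3 = \bbm 1 \amp 0 \amp 0\\0 \amp 1 \amp 0\\0 \amp 0 \amp 1 \ebm\text{,} \quad I_4 = \bbm 1 \amp 0 \amp 0 \amp 0\\0 \amp 1 \amp 0 \amp 0\\0 \amp 0 \amp 1 \amp 0\\0 \amp 0 \amp 0 \amp 1 \ebm
\end{equation*}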
In our examples above, we have seen examples of things that do and do not work. We should be careful about what examples prove, though. If someone were to claim that \(AB = BA\) is always true, we would only need to show them one example where it fails, and we would know the claim is wrong. However, if someone claims that \(A(B+C) = AB+AC\) is always true, we can't prove this with just one example. We need something more powerful; we need a true proof.
In this text, we forgo most proofs. The reader should know, though, that when we state something in a theorem, there is a proof that backs up what we state. Our justification comes from something stronger than just examples.
Theorem 4.2.11. Properties of Matrix Multiplication.
Let \(A\text{,}\) \(B\) and \(C\) be matrices whose sizes are such that the following operations make sense, and let \(k\) be a scalar. The following equalities hold:

Associative Property: \(A(BC) = (AB)C\text{.}\)

Distributive Property: \(A(B+C) = AB + AC\) and \((B+C)A = BA + CA\text{.}\)

\(k(AB) = (kA)B = A(kB)\text{.}\)

Identity Property: if \(A\) is an \(m\times n\) matrix, then \(I_mA = AI_n = A\text{.}\)
The above box contains some very good news, and probably some very surprising news. Matrix multiplication probably seems to us like a very odd operation, so we probably wouldn't have been surprised if we were told that \(A(BC)\neq(AB)C\text{.}\) It is a very nice thing that the Associative Property does hold.
With numbers, we are used to \(a^{-n} = \frac{1}{a^n}\text{.}\) Do negative exponents work with matrices, too? The answer is yes, sort of. We'll have to be careful, and we'll cover the topic in detail once we define the inverse of a matrix. For now, though, we recognize the fact that \(A^{-1} \neq \frac{1}{A}\text{,}\) for \(\frac{1}{A}\) makes no sense; we don't know how to "divide" by a matrix.
We end this section with a reminder of some of the things that do not work with matrix multiplication. The good news is that there are really only two things on this list.
Matrix multiplication is not commutative; that is, in general, \(AB \neq BA\text{.}\)