Let us now think in terms of matrices. We have learned of the identity matrix \(I\) that “acts like the number 1.” That is, if \(A\) is a square matrix, then
\begin{equation*}
IA = AI = A\text{.}
\end{equation*}
If we had a matrix, which we'll call \(A^{-1}\text{,}\) where \(A^{-1}A=I\text{,}\) then by analogy to our algebra example above it seems like we might be able to solve the linear system \(A\vx = \vb\) for \(\vx\) by multiplying both sides of the equation by \(A^{-1}\text{.}\) That is, perhaps
\begin{equation*}
\vx = A^{-1}\vb\text{.}
\end{equation*}
There is no guarantee that such a matrix is going to exist for an arbitrary \(n\times n\) matrix \(A\text{,}\) but if it does, we say that \(A\) is invertible.
Of course, there is a lot of speculation here. We don't know in general that such a matrix \(A^{-1}\) exists, nor, if it does, whether it is unique (despite the use of the definite article in stating that \(X\) is “the” inverse of \(A\)). However, we do know how to solve the matrix equation \(AX = B\text{,}\) so we can use that technique to solve the equation \(AX = I\) for \(X\text{.}\) This seems like it will get us close to what we want. Let's practice this once and then study our results.
We know how to solve this from the previous section: we form the proper augmented matrix, put it into reduced row echelon form, and interpret the results.
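For concreteness, here is a minimal worked instance, using an illustrative matrix of our own choosing. Suppose
\begin{equation*}
A = \bbm 1 \amp 2 \\ 3 \amp 7 \ebm\text{.}
\end{equation*}
We form the augmented matrix \(\bbm A \amp I \ebm\) and row reduce:
\begin{equation*}
\bbm 1 \amp 2 \amp 1 \amp 0 \\ 3 \amp 7 \amp 0 \amp 1 \ebm \quad \overrightarrow{\text{rref}} \quad \bbm 1 \amp 0 \amp 7 \amp -2 \\ 0 \amp 1 \amp -3 \amp 1 \ebm\text{.}
\end{equation*}
The last two columns give the solution to \(AX = I\text{:}\) namely, \(X = \bbm 7 \amp -2 \\ -3 \amp 1 \ebm\text{,}\) and a quick multiplication confirms that \(AX = I\text{.}\)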
Looking at our previous example, we are tempted to jump in and call the matrix \(X\) that we found “\(A^{-1}\text{.}\)” However, two obstacles stand in our way.

First, matrix multiplication is not commutative in general, so while we solved \(AX = I\text{,}\) we do not yet know that \(XA = I\) as well; both products must equal \(I\) for \(X\) to be the inverse of \(A\text{.}\)

Secondly, we have seen examples of matrices where \(AB = AC\text{,}\) but \(B\neq C\text{.}\) So even though \(AX = I\text{,}\) it is possible that another matrix \(Y\) exists where \(AY = I\text{.}\) If this is the case, using the notation \(A^{-1}\) would be misleading, since it could refer to more than one matrix.
These obstacles are not insurmountable. The first obstacle was that we knew \(AX=I\) but didn't know that \(XA=I\text{.}\) That's easy enough to check, though. Let's look at \(A\) and \(X\) from our previous example.
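Using the illustrative matrices above, we compute
\begin{equation*}
XA = \bbm 7 \amp -2 \\ -3 \amp 1 \ebm \bbm 1 \amp 2 \\ 3 \amp 7 \ebm = \bbm 7-6 \amp 14-14 \\ -3+3 \amp -6+7 \ebm = \bbm 1 \amp 0 \\ 0 \amp 1 \ebm = I\text{.}
\end{equation*}
So in this case, \(XA = I\) as well.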
Perhaps this first obstacle isn't much of an obstacle after all. Of course, we only have one example where it worked, so this doesn't mean that it always works. We have good news, though: it always does work. The only “bad” news is that this is a bit harder to prove. For now, we will state it as a theorem, but the proof will have to wait until later: see the proof of Theorem 4.5.1.
The second obstacle is easier to address. We want to know if another matrix \(Y\) exists where \(AY = I = YA\text{.}\) Let's suppose that it does. Consider the expression \(XAY\text{.}\) Since matrix multiplication is associative, we can group this any way we choose. We could group this as \((XA)Y\text{;}\) this results in
\begin{equation*}
(XA)Y = IY = Y\text{.}
\end{equation*}
We could also group \(XAY\) as \(X(AY)\text{.}\) This tells us
\begin{equation*}
X(AY) = XI = X\text{.}
\end{equation*}
Combining the two ideas above, we see that \(X = XAY = Y\text{;}\) that is, \(X=Y\text{.}\) We conclude that there is only one matrix \(X\) where \(XA = I = AX\text{.}\) (Even if we think we have two, we can do the above exercise and see that we really just have one.)
Theorem 4.4.4. Uniqueness of Solutions to \(AX=I_n\).
Let \(A\) be an \(n\times n\) matrix and let \(X\) be a matrix where \(AX = I_n\text{.}\) Then \(X\) is unique; it is the only matrix that satisfies this equation. In other words, if \(A\) is an \(n\times n\) matrix and \(AX = AY = I_n\text{,}\) then \(X=Y = A^{-1}\text{.}\)
Thus, we were justified in Definition 4.4.1 in calling \(A^{-1}\) “the” inverse of \(A\) (rather than merely “an” inverse). Theorem 4.4.4 is incredibly important in practice. It tells us that if we are able to establish that either \(AX=I_n\) or \(XA = I_n\) for some matrix \(X\text{,}\) then we can immediately conclude two things: first, that \(A\) is invertible, and second, that \(X=A^{-1}\text{.}\) We put this observation to use in the next example.
Thus, if we set \(X=A^4\text{,}\) then \(AX = I_n\text{,}\) so by Theorem 4.4.3 and Theorem 4.4.4, \(A\) is invertible, and \(A^{-1} = A^4\text{.}\)
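For a concrete instance of this idea (with an illustrative matrix of our own choosing and a smaller power), the \(3\times 3\) matrix
\begin{equation*}
A = \bbm 0 \amp 0 \amp 1 \\ 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \ebm
\end{equation*}
cyclically permutes the standard basis vectors, so \(A^3 = I_3\text{.}\) Writing this as \(A(A^2) = I_3\text{,}\) the same reasoning shows that \(A\) is invertible with \(A^{-1} = A^2\text{.}\)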
At this point, it is natural to wonder which \(n\times n\) matrices will be invertible. Will any non-zero matrix do? (No.) Are such matrices a rare occurrence? (No.) As we proceed through this chapter and the next, we will see that there are many different conditions one can place on an \(n\times n\) matrix that are equivalent to the statement “The matrix \(A\) is invertible.” Before we begin our attempt to answer this question in general, let's look at a particular example.
Solving the equation \(AX = I\) for \(X\) will give us the inverse of \(A\text{.}\) Forming the appropriate augmented matrix and finding its reduced row echelon form gives us
We have just seen that not all matrices are invertible. The attentive reader might have been able to spot the source of the trouble in the previous example: notice that the second row of \(A\) is a multiple of the first, so the row operation \(R_2-2R_1\to R_2\) created a row of zeros. Can you think of a condition that would signal trouble for a general \(n\times n\) matrix? Here, we need to think back to the various theoretical concepts we've encountered, such as rank, span, and linear independence. Let us think of the rows of \(A\) as row vectors.
The elementary row operations that we perform on a matrix either rearrange these vectors, or create new vectors that are linear combinations of the old ones. The only way we end up with a row of zeros in the reduced row echelon form of \(A\) is if one of the rows of \(A\) can be written as a linear combination of the others; that is, if the rows of \(A\) are linearly dependent. We also know that if there is a row of zeros in the reduced row echelon form of \(A\text{,}\) then not every row contains a leading 1. Recalling that the rank of \(A\) is equal to the number of leading 1s in the reduced row echelon form of \(A\text{,}\) we have the following:
The claim that “the following statements are equivalent” in Theorem 4.4.7 means that as soon as we know that one of the statements on the list is true, we can immediately conclude that the others are true as well. This is also the case if we know one of the statements is false. For example, if we know that \(\operatorname{rank}(A)\lt n\text{,}\) then we can immediately conclude that \(A\) is not invertible.
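For instance (again with an illustrative matrix of our own), the second row of
\begin{equation*}
A = \bbm 1 \amp 2 \\ 2 \amp 4 \ebm
\end{equation*}
is twice the first, so \(\operatorname{rank}(A) = 1 \lt 2\text{.}\) We may conclude that \(A\) is not invertible without any further computation.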
Let's sum up what we've learned so far. We've discovered that if a matrix has an inverse, it has only one. Therefore, we gave that special matrix a name: “the inverse.” Finally, we describe the most general way to find the inverse of a matrix, and a way to tell when a matrix does not have one.
Let \(A\) be an \(n \times n\) matrix. To find \(A^{-1}\text{,}\) put the augmented matrix
\begin{equation*}
\bbm A \amp I_n \ebm
\end{equation*}
into reduced row echelon form. If the result is of the form
\begin{equation*}
\bbm I_n \amp X \ebm,
\end{equation*}
then \(A^{-1} = X\text{.}\) If not (that is, if the first \(n\) columns of the reduced row echelon form are not \(I_n\)), then \(A\) is not invertible.
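To see the procedure detect a non-invertible matrix, we can apply it to the rank-deficient illustrative matrix \(A = \bbm 1 \amp 2 \\ 2 \amp 4 \ebm\) from above:
\begin{equation*}
\bbm 1 \amp 2 \amp 1 \amp 0 \\ 2 \amp 4 \amp 0 \amp 1 \ebm \quad \overrightarrow{\text{rref}} \quad \bbm 1 \amp 2 \amp 0 \amp 1/2 \\ 0 \amp 0 \amp 1 \amp -1/2 \ebm\text{.}
\end{equation*}
The first two columns of the result are not \(I_2\text{,}\) so this \(A\) is not invertible.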
In general, given a matrix \(A\text{,}\) to find \(A^{-1}\) we form the augmented matrix \(\bbm A\amp I \ebm\text{,}\) put it into reduced row echelon form, and interpret the result. In the case of a \(2\times 2\) matrix, though, there is a shortcut. We give the shortcut in terms of a theorem.
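Informally, the shortcut is the familiar formula: if \(A = \bbm a \amp b \\ c \amp d \ebm\text{,}\) then \(A\) is invertible precisely when \(ad - bc \neq 0\text{,}\) in which case
\begin{equation*}
A^{-1} = \frac{1}{ad-bc}\bbm d \amp -b \\ -c \amp a \ebm\text{,}
\end{equation*}
a claim one can verify directly by multiplying this matrix against \(A\) and obtaining \(I_2\text{.}\)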
We started this section out by speculating that just as we solved algebraic equations of the form \(ax=b\) by computing \(x = a^{-1}b\text{,}\) we might be able to solve matrix equations of the form \(A\vx = \vb\) by computing \(\vx = A^{-1}\vb\text{.}\) If \(A^{-1}\) does exist, then we can solve the equation \(A\vx = \vb\) this way. Consider:
\begin{align*}
A \vx \amp = \vb \amp \amp \text{ (original equation)}\\
A^{-1}A\vx \amp = A^{-1}\vb \amp \amp \text{ (multiply both sides on the left by } A^{-1})\\
I\vx \amp = A^{-1}\vb \amp \amp \text{ (since } A^{-1}A=I)\\
\vx \amp = A^{-1}\vb \amp \amp \text{ (since } I\vx = \vx)\text{.}
\end{align*}
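To make this concrete, we can reuse the illustrative matrix from earlier in the section, where we found that \(A = \bbm 1 \amp 2 \\ 3 \amp 7 \ebm\) has inverse \(A^{-1} = \bbm 7 \amp -2 \\ -3 \amp 1 \ebm\text{.}\) To solve \(A\vx = \vb\) with, say, \(\vb = \bbm 1 \\ 2 \ebm\text{,}\) we compute
\begin{equation*}
\vx = A^{-1}\vb = \bbm 7 \amp -2 \\ -3 \amp 1 \ebm \bbm 1 \\ 2 \ebm = \bbm 3 \\ -1 \ebm\text{,}
\end{equation*}
and indeed \(A \bbm 3 \\ -1 \ebm = \bbm 1 \\ 2 \ebm = \vb\text{.}\)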
Letโs step back and think about this for a moment. The only thing we know about the equation \(A\vx = \vb\) is that \(A\) is invertible. We also know that solutions to \(A\vx = \vb\) come in three forms: exactly one solution, infinitely many solutions, and no solution. We just showed that if \(A\) is invertible, then \(A\vx = \vb\) has at least one solution. We showed that by setting \(\vx\) equal to \(A^{-1}\vb\text{,}\) we have a solution. Is it possible that more solutions exist?
No. Suppose we are told that a known vector \(\vvv\) is a solution to the equation \(A\vx = \vb\text{;}\) that is, we know that \(A\vvv=\vb\text{.}\) We can repeat the above steps:
\begin{align*}
A\vvv \amp = \vb\\
A^{-1}A\vvv \amp = A^{-1}\vb\\
I\vvv \amp = A^{-1}\vb\\
\vvv \amp = A^{-1}\vb\text{.}
\end{align*}
This shows that any solution \(\vvv\) must equal \(A^{-1}\vb\text{;}\) there is only one solution.
Theorem 4.4.12. Invertible Matrices and Solutions to \(A\vx = \vb\).
Let \(A\) be an invertible \(n\times n\) matrix, and let \(\vb\) be any \(n\times 1\) column vector. Then the equation \(A\vx = \vb\) has exactly one solution, namely
\begin{equation*}
\vx = A^{-1}\vb\text{.}
\end{equation*}
A corollary to this theorem is: if \(A\) is not invertible, then \(A\vx = \vb\) does not have exactly one solution. It may have infinitely many solutions, or it may have no solution, and we would need to examine the reduced row echelon form of the augmented matrix \(\bbm A \amp \vb \ebm\) to see which case applies.
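Both of these cases occur. Taking the non-invertible illustrative matrix \(A = \bbm 1 \amp 2 \\ 2 \amp 4 \ebm\) from before,
\begin{equation*}
\bbm 1 \amp 2 \amp 1 \\ 2 \amp 4 \amp 2 \ebm \quad \overrightarrow{\text{rref}} \quad \bbm 1 \amp 2 \amp 1 \\ 0 \amp 0 \amp 0 \ebm \qquad \text{and} \qquad \bbm 1 \amp 2 \amp 1 \\ 2 \amp 4 \amp 3 \ebm \quad \overrightarrow{\text{rref}} \quad \bbm 1 \amp 2 \amp 0 \\ 0 \amp 0 \amp 1 \ebm\text{,}
\end{equation*}
so \(A\vx = \bbm 1 \\ 2 \ebm\) has infinitely many solutions, while \(A\vx = \bbm 1 \\ 3 \ebm\) has none.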
Knowing a matrix is invertible is incredibly useful. Among many other reasons, if you know \(A\) is invertible, then you know for sure that \(A\vx = \vb\) has a solution (as we just stated in Theorem 4.4.12). In the next section we'll demonstrate many different properties of invertible matrices, including several different ways of determining whether a matrix is invertible.