We ended the previous section by stating that invertible matrices are important. Since they are, in this section we study invertible matrices in two ways. First, we look at ways to tell whether or not a matrix is invertible, and second, we study properties of invertible matrices (that is, how they interact with other matrix operations).
In the last section we stated Theorem 4.4.7, in which we listed several properties of a matrix that are equivalent to that matrix being invertible, including the fact that an \(n\times n\) matrix must have rank \(n\) in order to be invertible.
We begin this section by collecting additional properties that are equivalent to a matrix being invertible. Some of these results were established in the previous section, but we state them again here for ease of reference.
Note that the theorem uses the phrase “the following statements are equivalent.” Recall that when two or more statements are equivalent, it means that the truth of any one of them implies that the rest are also true; if any one of the statements is false, then they are all false. So, for example, if we determined that the equation \(A\vx = \zero\) had exactly one solution (and \(A\) was an \(n\times n\) matrix) then we would know that \(A\) was invertible, that \(A\vx = \vb\) had only one solution, that the reduced row echelon form of \(A\) was \(I\text{,}\) etc.
Since this chain of implications circles back on itself, each of the statements implies the others. (For example, to show (e) implies (c), we start at (e) and circle around through (a) until we get to (c).)
(a) \(\Rightarrow\) (b): Suppose \(A\vx = \zero\) for some vector \(\vx\text{.}\) Since \(A\) is invertible, we can multiply both sides of this equation on the left by \(A^{-1}\text{,}\) giving us
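\begin{equation*}
A^{-1}A\vx = A^{-1}\zero \quad \Rightarrow \quad I\vx = \zero \quad \Rightarrow \quad \vx = \zero.
\end{equation*}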
(b) \(\Rightarrow\) (c): Suppose the only solution to \(A\vx = \zero\) is \(\vx = \zero\text{.}\) Why must the reduced row echelon form of \(A\) be equal to \(I\text{?}\) If we let \(R\) denote the reduced row echelon form of \(A\text{,}\) we know that to solve \(A\vx = \zero\text{,}\) we form the augmented matrix \(\bbm A\amp \zero\ebm\) and reduce:
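\begin{equation*}
\bbm A\amp \zero\ebm \quad \longrightarrow \quad \bbm R\amp \zero\ebm.
\end{equation*}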
If the \(R\neq I\text{,}\) then \(R\) has a row of zeros, and thus, so does \(\bbm R\amp \zero\ebm\text{,}\) in which case the system \(A\vx = \zero\) would have at least one parameter, and thus infinitely many solutions. Since weโre assuming that \(A\vx = \zero\) has the unique solution \(\vx=\zero\text{,}\) it follows that \(R\) cannot have a row of zeros, and thus \(R=I\text{.}\)
(c) \(\Rightarrow\) (d): Suppose that the reduced row echelon form of \(A\) is equal to \(I\text{.}\) It follows that when solving the system \(A\vx = \vb\text{,}\) we would have
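\begin{equation*}
\bbm A\amp \vb\ebm \quad \longrightarrow \quad \bbm I\amp \vec{c}\,\ebm
\end{equation*}
for some column vector \(\vec{c}\text{,}\) so that \(A\vx = \vb\) has the unique solution \(\vx = \vec{c}\text{.}\)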
(d) \(\Rightarrow\) (e): Suppose that \(A\vx = \vb\) has a unique solution for every column vector \(\vb\text{.}\) Then in particular, we have a unique solution \(\vec{c}_j\) to the systems \(A\vx=\ven{j}\) for \(j=1,\ldots, n\text{,}\) where \(\ven{1}, \ven{2}, \ldots, \ven{n}\) are the standard basis vectors in \(\R^n\) (and also the columns of \(I_n\)). If we let
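\begin{equation*}
C = \bbm \vec{c}_1 \amp \vec{c}_2 \amp \cdots \amp \vec{c}_n \ebm,
\end{equation*}
then the \(j\)th column of \(AC\) is \(A\vec{c}_j = \ven{j}\text{,}\) so \(AC = I_n\text{.}\)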
(e) \(\Rightarrow\) (f): Suppose that \(AC=I_n\) for some \(n\times n\) matrix \(C\text{.}\) We claim that property (b) holds for the matrix \(C\text{.}\) To see this, note that since \(AC = I_n\text{,}\) if \(C\vx = \zero\text{,}\) then
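\begin{equation*}
\vx = I_n\vx = (AC)\vx = A(C\vx) = A\zero = \zero.
\end{equation*}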
Since property (b) holds for the matrix \(C\text{,}\) it follows that properties (c), (d), and (e) do as well, by what we’ve proven so far. Thus, there exists a matrix \(D\) such that \(CD = I_n\text{.}\) We can complete our proof by showing that \(D=A\text{,}\) and this is the case since (recalling that we’ve assumed \(AC=I_n\))
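\begin{equation*}
A = AI_n = A(CD) = (AC)D = I_nD = D.
\end{equation*}
It follows that \(CA = CD = I_n\text{,}\) so property (f) holds with \(B = C\text{.}\)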
(f) \(\Rightarrow\) (a): Suppose that \(BA=I_n\) for some matrix \(B\text{.}\) Using the same argument we just gave, it then follows that \(AB=I_n\text{,}\) and if \(AB=BA=I_n\text{,}\) then by definition \(A\) is invertible and \(B=A^{-1}\text{.}\)
So we have come up with a list of statements that are all equivalent to the statement “\(A\) is invertible.” Again, if any one of them is true (or false), then they are all true (or all false).
Theorem 4.5.1 states formally that if \(A\) is invertible, then \(A\vx = \vb\) has exactly one solution, namely \(A^{-1}\vb\text{.}\) What if \(A\) is not invertible? What are the possibilities for solutions to \(A\vx = \vb\text{?}\)
We know that \(A\vx = \vb\) cannot have exactly one solution; if it did, then by our theorem \(A\) would be invertible. Recalling that linear equations have either one solution, infinitely many solutions, or no solution, we are left with the latter two options when \(A\) is not invertible. This idea is important and so we’ll state it again as a Key Idea.
In Theorem 4.5.1 we’ve come up with a list of ways in which we can tell whether or not a matrix is invertible. At the same time, we have come up with a list of properties of invertible matrices: things we know are true about them. (For instance, if we know that \(A\) is invertible, then we know that \(A\vx = \vb\) has only one solution.)
We now go on to discover other properties of invertible matrices. Specifically, we want to find out how invertibility interacts with other matrix operations. For instance, if we know that \(A\) and \(B\) are invertible, what is the inverse of \(A+B\text{?}\) What is the inverse of \(AB\text{?}\) What is “the inverse of the inverse?” We’ll explore these questions through an example.
We notice immediately that the two rows of \(A+B\) are the same! Subtracting Row 1 from Row 2 would produce a row of zeros, so \(A+B\) has rank \(1\lt 2\text{,}\) and therefore cannot be invertible.
Is there some sort of relationship between \((AB)^{-1}\text{,}\) \(A^{-1}\text{,}\) and \(B^{-1}\text{?}\) A first guess that seems plausible is \((AB)^{-1} = A^{-1}B^{-1}\text{.}\) Is this true? Using our work from above, we have
Obviously, this is not equal to \((AB)^{-1}\text{.}\) Before we do some further guessing, let’s think about what the inverse of \(AB\) is supposed to do. The inverse, which we’ll call \(C\text{,}\) is supposed to be a matrix such that
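\begin{equation*}
(AB)C = C(AB) = I_n.
\end{equation*}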
In examining the expression \((AB)C\text{,}\) we see that we want \(B\) to somehow “cancel” with \(C\text{.}\) What “cancels” \(B\text{?}\) An obvious answer is \(B^{-1}\text{.}\) This gives us a thought: perhaps we got the order of \(A^{-1}\) and \(B^{-1}\) wrong before. After all, we were hoping to find that
\begin{equation*}
ABA^{-1}B^{-1} \overset{\text{?}}{=} I,
\end{equation*}
but algebraically speaking, it is hard to cancel out these terms. (Recall that matrix multiplication is not commutative: \(AB\neq BA\) in general.)
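If we instead try \(C = B^{-1}A^{-1}\text{,}\) the cancellation works out:
\begin{equation*}
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AI_nA^{-1} = AA^{-1} = I_n.
\end{equation*}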
Since \((AB)(B^{-1}A^{-1})=I_n\text{,}\) we know immediately from Theorem 4.5.1 that \((AB)^{-1} = B^{-1}A^{-1}\text{.}\) Note also that our argument above was completely general, so this result holds true for any pair of \(n\times n\) matrices \(A\) and \(B\text{.}\) Let’s confirm this with our example matrices.
Is there some sort of connection between \((A^{-1})^{-1}\) and \(A\text{?}\) The answer is pretty obvious: they are equal. The โinverse of the inverseโ returns one to the original matrix.
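Indeed, since \(A^{-1}A = AA^{-1} = I\text{,}\) the matrix \(A\) satisfies the definition of the inverse of \(A^{-1}\text{,}\) so \((A^{-1})^{-1} = A\text{.}\)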
Is there some sort of relationship between \((A+B)^{-1}\text{,}\) \(A^{-1}\text{,}\) and \(B^{-1}\text{?}\) Certainly, if we were forced to make a guess without working any examples, we would guess that
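\begin{equation*}
(A+B)^{-1} \overset{\text{?}}{=} A^{-1} + B^{-1}.
\end{equation*}
However, we saw above that \(A+B\) need not even be invertible, so this guess cannot be correct in general.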
Let’s summarize the results of this example. If \(A\) and \(B\) are both invertible matrices, then so is their product, \(AB\text{.}\) We demonstrated this with our example, and there is more to be said. Let’s suppose that \(A\) and \(B\) are \(n\times n\) matrices, but we don’t yet know if they are invertible. If \(AB\) is invertible, then both \(A\) and \(B\) are invertible; if \(AB\) is not invertible, then either \(A\) or \(B\) is not invertible.
In short, invertibility “works well” with matrix multiplication. However, we saw that it doesn’t work well with matrix addition. Knowing that \(A\) and \(B\) are invertible does not help us find the inverse of \((A+B)\text{;}\) in fact, the latter matrix may not even be invertible.
The matrix \(A\) in the previous example is a diagonal matrix: the only nonzero entries of \(A\) lie on the diagonal. The relationship between \(A\) and \(A^{-1}\) in the above example seems pretty strong, and it holds true in general. We’ll state this and summarize the results of this section with the following theorem.
If \(A\) is a diagonal matrix, with diagonal entries \(d_1, d_2, \cdots , d_n\text{,}\) where none of the diagonal entries are 0, then \(A^{-1}\) exists and is a diagonal matrix. Furthermore, the diagonal entries of \(A^{-1}\) are \(1/d_1, 1/d_2, \cdots, 1/d_n\text{.}\)
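For instance, taking the diagonal matrix below (any diagonal matrix with nonzero diagonal entries would do), if
\begin{equation*}
A = \bbm 2 \amp 0 \amp 0\\0 \amp -3 \amp 0\\0 \amp 0 \amp 5\ebm, \quad \text{then} \quad A^{-1} = \bbm 1/2 \amp 0 \amp 0\\0 \amp -1/3 \amp 0\\0 \amp 0 \amp 1/5\ebm,
\end{equation*}
as can be checked directly by multiplying the two matrices together.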
We end this section with a comment about solving systems of equations “in real life.” Solving a system \(A\vx = \vb\) by computing \(A^{-1}\vb\) seems pretty slick, so it would make sense that this is the way it is normally done. However, in practice, this is rarely done. There are two main reasons why this is the case.
First, computing \(A^{-1}\) and \(A^{-1}\vb\) is “expensive” in the sense that it takes up a lot of computing time. Certainly, our calculators have no trouble dealing with the \(3 \times 3\) cases we often consider in this textbook, but in real life the matrices being considered are very large (as in, hundreds of thousands of rows and columns). Computing \(A^{-1}\) alone is rather impractical, and we waste a lot of time if we come to find out that \(A^{-1}\) does not exist. Even if we already know what \(A^{-1}\) is, computing \(A^{-1}\vb\) is computationally expensive; Gaussian elimination is faster.
Secondly, computing \(A^{-1}\) using the method we’ve described often gives rise to numerical roundoff errors. Even though computers often do computations to an accuracy of more than 8 decimal places, after thousands of computations, rounding off can cause big errors. (A “small” \(1,000 \times 1,000\) matrix has \(1,000,000\) entries! That’s a lot of places to have roundoff errors accumulate!) It is not unheard of to have a computer compute \(A^{-1}\) for a large matrix, and then immediately have it compute \(AA^{-1}\) and not get the identity matrix. (The result is usually very close, with the numbers on the diagonal close to 1 and the other entries near 0. But it isn’t exactly the identity matrix.)
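For readers who would like to see this behavior for themselves, the following is a minimal sketch of such an experiment in Python, assuming the NumPy library is available; the matrix size and the particular random matrix are our own choices for illustration.
\begin{verbatim}
import numpy as np

n = 1000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))   # a random n-by-n matrix (almost surely invertible)
b = rng.standard_normal(n)

# The "slick" approach: compute the inverse, then multiply.
A_inv = np.linalg.inv(A)
x_from_inverse = A_inv @ b

# The standard approach: solve A x = b directly (essentially Gaussian elimination).
x_from_solve = np.linalg.solve(A, b)

# A times its computed inverse is very close to, but usually not exactly, the identity.
print(np.max(np.abs(A @ A_inv - np.eye(n))))
# The two computed solutions also agree only up to roundoff error.
print(np.max(np.abs(x_from_inverse - x_from_solve)))
\end{verbatim}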
Therefore, in real life, solutions to \(A\vx = \vb\) are usually found using the methods we learned in Section 3.6. It turns out that even with all of our advances in mathematics, it is hard to beat the basic method that Gauss introduced a long time ago.
Create a random \(6\times 6\) matrix \(A\text{,}\) then have a calculator or computer compute \(AA^{-1}\text{.}\) Was the identity matrix returned exactly? Comment on your results.
Use a calculator or computer to compute \(AA^{-1}\text{,}\) where \(A = \bbm 1 \amp 2 \amp 3\amp 4\\1\amp 4\amp 9\amp 16\\1\amp 8\amp 27\amp 64\\1\amp 16\amp 81\amp 256\ebm.\) Was the identity matrix returned exactly? Comment on your results.