We ended the previous section by stating that invertible matrices are important. In this section we study them in two ways: first, we look at ways to tell whether or not a matrix is invertible; second, we study properties of invertible matrices (that is, how invertibility interacts with other matrix operations).
In the last section we stated Theorem 4.4.7, in which we listed several properties of a matrix that are equivalent to that matrix being invertible, including the fact that an \(n\times n\) matrix must have rank \(n\) in order to be invertible.
We begin this section by collecting additional properties that are equivalent to a matrix being invertible. Some of these results were established in the previous section, but we state them again here for ease of reference.
Theorem 4.5.1. Invertible Matrix Theorem.
Let \(\tta\) be an \(n\times n\) matrix. The following statements are equivalent.
(a) \(\tta\) is invertible.
(b) The equation \(\ttaxo\) has exactly one solution (namely, \(\vx = \zero\)).
(c) The reduced row echelon form of \(\tta\) is \(\tti\text{.}\)
(d) The equation \(\ttaxb\) has exactly one solution for every \(n\times 1\) vector \(\vb\text{.}\)
(e) There exists a matrix \(\ttc\) such that \(\tta\ttc = \tti\text{.}\)
(f) There exists a matrix \(\ttb\) such that \(\ttb\tta = \tti\text{.}\)
Note that the theorem uses the phrase “the following statements are equivalent.” Recall that when two or more statements are equivalent, it means that the truth of any one of them implies that the rest are also true; if any one of the statements is false, then they are all false. So, for example, if we determined that the equation \(\ttaxo\) had exactly one solution (and \(\tta\) was an \(n\times n\) matrix) then we would know that \(\tta\) was invertible, that \(\ttaxb\) had only one solution, that the reduced row echelon form of \(\tta\) was \(\tti\text{,}\) etc.
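Let’s see exactly why all of these statements are equivalent. What we will do is establish the following chain of implications:
\[\text{(a)} \Rightarrow \text{(b)} \Rightarrow \text{(c)} \Rightarrow \text{(d)} \Rightarrow \text{(e)} \Rightarrow \text{(f)} \Rightarrow \text{(a)}\text{.}\]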
Since this chain of implications circles back on itself, each of the statements implies the others. (For example, to show (e) implies (c), we start at (e) and circle around through (a) until we get to (c).)
(a) \(\Rightarrow\) (b): Suppose \(\ttaxo\) for some vector \(\vx\text{.}\) Since \(A\) is invertible, we can multiply both sides of this equation by \(A^{-1}\text{,}\) giving us
\[\vx = I\vx = (A^{-1}A)\vx = A^{-1}(A\vx) = A^{-1}\zero = \zero\text{.}\]
Thus, the only possible solution to the system is \(\vx = \zero\text{.}\)
(b) \(\Rightarrow\) (c): Suppose the only solution to \(\ttaxo\) is \(\vx = \zero\text{.}\) Why must the reduced row echelon form of \(A\) be equal to \(I\text{?}\) If we let \(R\) denote the reduced row echelon form of \(A\text{,}\) we know that to solve \(\ttaxo\text{,}\) we form the augmented matrix \(\bbm A\amp \zero\ebm\) and reduce:
\[\bbm A\amp \zero\ebm \quad \overrightarrow{\text{rref}} \quad \bbm R\amp \zero\ebm\text{.}\]
If \(R\neq I\text{,}\) then \(R\) has a row of zeros, and thus, so does \(\bbm R\amp \zero\ebm\text{,}\) in which case the system \(\ttaxo\) would have at least one parameter, and thus infinitely many solutions. Since we’re assuming that \(\ttaxo\) has the unique solution \(\vx=\zero\text{,}\) it follows that \(R\) cannot have a row of zeros, and thus \(R=I\text{.}\)
(c) \(\Rightarrow\) (d): Suppose that the reduced row echelon form of \(A\) is equal to \(I\text{.}\) It follows that when solving the system \(\ttaxb\text{,}\) we would have
\[\bbm A\amp \vb\ebm \quad \overrightarrow{\text{rref}} \quad \bbm I\amp \vec{c}\ebm\]
for some column vector \(\vec{c}\text{,}\) and thus \(\vx = \vec{c}\) is the unique solution to \(\ttaxb\text{.}\)
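(d) \(\Rightarrow\) (e): Suppose that \(\ttaxb\) has a unique solution for every column vector \(\vb\text{.}\) Then in particular, we have a unique solution \(\vec{c}_j\) to each of the systems \(A\vx=\ven{j}\) for \(j=1,\ldots, n\text{,}\) where \(\ven{1}, \ven{2}, \ldots, \ven{n}\) are the standard basis vectors in \(\R^n\) (and also the columns of \(I_n\)). If we let \(C = \bbm \vec{c}_1\amp \vec{c}_2\amp \cdots\amp \vec{c}_n\ebm\) be the matrix whose columns are these solutions, then
\[AC = \bbm A\vec{c}_1\amp A\vec{c}_2\amp \cdots\amp A\vec{c}_n\ebm = \bbm \ven{1}\amp \ven{2}\amp \cdots\amp \ven{n}\ebm = I_n\text{,}\]
so (e) holds.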
(e) \(\Rightarrow\) (f): Suppose that \(AC=I_n\) for some \(n\times n\) matrix \(C\text{.}\) We claim that Property (b) holds for the matrix \(C\text{.}\) To see this, note that since \(AC = I_n\text{,}\) if \(C\vx = \zero\text{,}\) then
\[\vx = I_n\vx = (AC)\vx = A(C\vx) = A\zero = \zero\text{.}\]
Since Property (b) holds for the matrix \(C\text{,}\) it follows that Properties (c), (d), and (e) do as well, by what we’ve proven so far. Thus, there exists a matrix \(D\) such that \(CD = I_n\text{.}\) We can complete our proof by showing that \(D=A\text{,}\) and this is the case since (recalling that we’ve assumed \(AC=I_n\))
\[D = I_nD = (AC)D = A(CD) = AI_n = A\text{,}\]
and therefore \(CA = CD = I_n\text{.}\)
Thus, (f) holds. (Note that this argument finally establishes the truth of Theorem 4.4.3.)
(f) \(\Rightarrow\) (a): Suppose that \(BA=I_n\) for some matrix \(B\text{.}\) Using the same argument we just gave, it then follows that \(AB=I_n\text{,}\) and if \(AB=BA=I_n\text{,}\) then by definition \(A\) is invertible and \(B=A^{-1}\text{.}\)
So we came up with a list of statements that are all equivalent to the statement “\(\tta\) is invertible.” Again, if we know that any one of them is true (or false), then they are all true (or all false).
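The equivalences can also be checked by hand on a small matrix. The following is a minimal sketch in Python using exact fractions; the sample matrix and the helper names `rref`, `matmul`, and `identity` are illustrative assumptions, not part of the text. It row reduces \(\bbm A\amp I\ebm\) so that the right-hand block collects the solutions of \(A\vx=\ven{j}\), as in the step (d) \(\Rightarrow\) (e).

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # Scale the pivot row so that its leading entry is 1.
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Clear every other entry in the pivot column.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


A = [[1, 2], [3, 4]]  # hypothetical sample matrix
n = len(A)

# Statement (c): the reduced row echelon form of A is I.
rref_is_identity = rref(A) == identity(n)

# Statements (d) and (e): reduce [A | I]; column j of the right block solves A x = e_j.
augmented = rref([row + [int(i == j) for j in range(n)] for i, row in enumerate(A)])
C = [row[n:] for row in augmented]

print(rref_is_identity)             # statement (c)
print(matmul(A, C) == identity(n))  # statement (e): AC = I
print(matmul(C, A) == identity(n))  # statement (f): CA = I
```

For this sample matrix all three checks print `True`, which is consistent with the theorem: once one statement holds, the others must hold as well.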
Theorem 4.5.1 states formally that if \(\tta\) is invertible, then \(\ttaxb\) has exactly one solution, namely \(\ttai\vb\text{.}\) What if \(\tta\) is not invertible? What are the possibilities for solutions to \(\ttaxb\text{?}\)
We know that \(\ttaxb\) cannot have exactly one solution; if it did, then by our theorem \(\tta\) would be invertible. Recalling that systems of linear equations have either one solution, infinitely many solutions, or no solution, we are left with the latter two options when \(\tta\) is not invertible. This idea is important and so we’ll state it again as a Key Idea.
Key Idea 4.5.2. Solutions to \(\ttaxb\) and the Invertibility of \(\tta\).
Consider the system of linear equations \(\ttaxb\text{.}\)
If \(\tta\) is invertible, then \(\ttaxb\) has exactly one solution, namely \(\ttai\vb\text{.}\)
If \(\tta\) is not invertible, then \(\ttaxb\) has either infinitely many solutions or no solution.
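The three cases in this Key Idea can be told apart by comparing the rank of \(\tta\) with the rank of the augmented matrix \(\bbm \tta\amp \vb\ebm\text{.}\) Below is a minimal sketch in Python with exact fractions; the function names `rank` and `classify` and the sample systems are illustrative assumptions.

```python
from fractions import Fraction


def rank(rows):
    """Count the pivots found while row reducing the matrix."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(pivots, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivots], m[pivot] = m[pivot], m[pivots]
        # Eliminate the entries below the pivot.
        for r in range(pivots + 1, len(m)):
            factor = m[r][col] / m[pivots][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[pivots])]
        pivots += 1
        if pivots == len(m):
            break
    return pivots


def classify(A, b):
    """Describe the solution set of A x = b for a square matrix A."""
    n = len(A)
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    if rank(A) == n:
        return "exactly one solution"       # A is invertible
    if rank(augmented) > rank(A):
        return "no solution"                # inconsistent system
    return "infinitely many solutions"      # consistent, with a free variable


print(classify([[1, 2], [3, 4]], [5, 6]))  # invertible A
print(classify([[1, 2], [2, 4]], [3, 6]))  # singular A, consistent system
print(classify([[1, 2], [2, 4]], [3, 7]))  # singular A, inconsistent system
```

The second and third calls use the same non-invertible matrix; only the right-hand side decides whether there are infinitely many solutions or none.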
In Theorem 4.5.1 we’ve come up with a list of ways in which we can tell whether or not a matrix is invertible. At the same time, we have come up with a list of properties of invertible matrices — things we know that are true about them. (For instance, if we know that \(\tta\) is invertible, then we know that \(\ttaxb\) has only one solution.)
We now go on to discover other properties of invertible matrices. Specifically, we want to find out how invertibility interacts with other matrix operations. For instance, if we know that \(\tta\) and \(\ttb\) are invertible, what is the inverse of \(\tta+\ttb\text{?}\) What is the inverse of \(\tta\ttb\text{?}\) What is “the inverse of the inverse?” We’ll explore these questions through an example.
We notice immediately that the two rows of \(A+B\) are the same! Subtracting Row 1 from Row 2 would produce a row of zeros, so \(A+B\) has rank \(1\lt 2\text{,}\) and therefore cannot be invertible.
To compute \((5\tta)^{-1}\text{,}\) we compute \(5\tta\) and then apply Theorem 4.4.10.
We now look for connections between \(\ttai\text{,}\) \(\ttbi\text{,}\) \((\tta\ttb)^{-1}\text{,}\) \((\ttai)^{-1}\) and \((\tta+\ttb)^{-1}\text{.}\)
Is there some sort of relationship between \((\tta\ttb)^{-1}\) and \(\ttai\) and \(\ttbi\text{?}\) A first guess that seems plausible is \((\tta\ttb)^{-1} = \ttai\ttbi\text{.}\) Is this true? Using our work from above, we have
Obviously, this is not equal to \((\tta\ttb)^{-1}\text{.}\) Before we do some further guessing, let’s think about what the inverse of \(\tta\ttb\) is supposed to do. The inverse (let’s call it \(\ttc\)) is supposed to be a matrix such that
\[(\tta\ttb)\ttc = \ttc(\tta\ttb) = \tti\text{.}\]
In examining the expression \((\tta\ttb)\ttc\text{,}\) we see that we want \(\ttb\) to somehow “cancel” with \(\ttc\text{.}\) What “cancels” \(\ttb\text{?}\) An obvious answer is \(\ttbi\text{.}\) This gives us a thought: perhaps we got the order of \(\ttai\) and \(\ttbi\) wrong before. After all, we were hoping to find that
\[(\tta\ttb)(\ttai\ttbi) = \tti\text{,}\]
but algebraically speaking, it is hard to cancel out these terms. (Recall that matrix multiplication is not commutative: \(\tta\ttb\neq \ttb\tta\) in general.)
However, switching the order of \(\ttai\) and \(\ttbi\) gives us some hope. Is \((\tta\ttb)^{-1} = \ttbi\ttai\text{?}\) Let’s see.
\[(\tta\ttb)(\ttbi\ttai) = \tta(\ttb\ttbi)\ttai = \tta\tti\ttai = \tta\ttai = \tti\text{.}\]
Since \((AB)(B^{-1}A^{-1})=I_n\text{,}\) we know immediately from Theorem 4.5.1 that \((\tta\ttb)^{-1} = \ttbi\ttai\text{.}\) Note also that our argument above was completely general, so this result holds true for any pair of invertible \(n\times n\) matrices \(A\) and \(B\text{.}\) Let’s confirm this with our example matrices.
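The same check can be sketched in Python for any pair of invertible \(2\times 2\) matrices. The matrices below are hypothetical stand-ins chosen for illustration (they are not claimed to be the matrices of the example), and `inverse_2x2` assumes the standard formula \(\frac{1}{ad-bc}\bbm d\amp -b\\-c\amp a\ebm\) for the inverse of \(\bbm a\amp b\\c\amp d\ebm\text{.}\)

```python
from fractions import Fraction


def inverse_2x2(m):
    """Invert a 2x2 matrix with the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


A = [[1, 2], [3, 5]]  # hypothetical invertible matrix
B = [[2, 1], [1, 1]]  # hypothetical invertible matrix

inv_AB = inverse_2x2(matmul(A, B))
right_order = matmul(inverse_2x2(B), inverse_2x2(A))  # B^{-1} A^{-1}
wrong_order = matmul(inverse_2x2(A), inverse_2x2(B))  # A^{-1} B^{-1}

print(inv_AB == right_order)  # True
print(inv_AB == wrong_order)  # False
```

For these two matrices, \((AB)^{-1}\) and \(B^{-1}A^{-1}\) are both \(\bbm -8\amp 3\\11\amp -4\ebm\text{,}\) while \(A^{-1}B^{-1} = \bbm -7\amp 9\\4\amp -5\ebm\text{.}\)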
Is there some sort of connection between \((\ttai)^{-1}\) and \(\tta\text{?}\) The answer is pretty obvious: they are equal. The “inverse of the inverse” returns one to the original matrix.
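Is there some sort of relationship between \((\tta+\ttb)^{-1}\text{,}\) \(\ttai\) and \(\ttbi\text{?}\) Certainly, if we were forced to make a guess without working any examples, we would guess that
\[(\tta+\ttb)^{-1} = \ttai+\ttbi\text{.}\]
However, we saw above that \(\tta+\ttb\) is not even invertible for our example matrices, so this guess cannot be correct in general.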
Let’s summarize the results of this example. If \(\tta\) and \(\ttb\) are both invertible matrices, then so is their product, \(\tta\ttb\text{.}\) We demonstrated this with our example, and there is more to be said. Let’s suppose that \(\tta\) and \(\ttb\) are \(n\times n\) matrices, but we don’t yet know if they are invertible. If \(\tta\ttb\) is invertible, then each of \(\tta\) and \(\ttb\) is; if \(\tta\ttb\) is not invertible, then either \(\tta\) or \(\ttb\) is not invertible.
In short, invertibility “works well” with matrix multiplication. However, we saw that it doesn’t work well with matrix addition. Knowing that \(\tta\) and \(\ttb\) are invertible does not help us find the inverse of \((\tta+\ttb)\text{;}\) in fact, the latter matrix may not even be invertible.
Let’s do one more example, then we’ll summarize the results of this section in a theorem.
Example 4.5.4. Computing the inverse of a diagonal matrix.
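Find the inverse of \(\tta = \bbm 2\amp 0\amp 0\\0\amp 3\amp 0\\0\amp 0\amp -7\ebm\text{.}\)
Since \(\tta\) is already diagonal, row reducing \(\bbm \tta\amp \tti\ebm\) only requires dividing each row by its diagonal entry, which gives
\[\ttai = \bbm 1/2\amp 0\amp 0\\0\amp 1/3\amp 0\\0\amp 0\amp -1/7\ebm\text{.}\]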
The matrix \(\tta\) in the previous example is a diagonal matrix: the only nonzero entries of \(\tta\) lie on the diagonal. The relationship between \(\tta\) and \(\ttai\) in the above example seems pretty strong, and it holds true in general. We’ll state this and summarize the results of this section with the following theorem.
Theorem 4.5.5. Properties of Invertible Matrices.
Let \(\tta\) and \(\ttb\) be \(n\times n\) invertible matrices. Then:
\(\tta\ttb\) is invertible; \((\tta\ttb)^{-1} = \ttbi\ttai\text{.}\)
\(\ttai\) is invertible; \((\ttai)^{-1} = \tta\text{.}\)
\(k\tta\) is invertible for any nonzero scalar \(k\text{;}\) \((k\tta)^{-1} = \frac 1k \ttai\text{.}\)
If \(\tta\) is a diagonal matrix, with diagonal entries \(d_1, d_2, \cdots , d_n\text{,}\) where none of the diagonal entries are 0, then \(\ttai\) exists and is a diagonal matrix. Furthermore, the diagonal entries of \(\ttai\) are \(1/d_1, 1/d_2, \cdots, 1/d_n\text{.}\)
Furthermore,
If a product \(\tta\ttb\) is not invertible, then \(\tta\) or \(\ttb\) is not invertible.
If \(\tta\) or \(\ttb\) is not invertible, then \(\tta\ttb\) is not invertible.
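Several of these properties are easy to spot-check with exact arithmetic. The sketch below is one way to do so in Python; the helper names `inverse` and `scale`, the \(2\times 2\) sample matrix, and the scalar \(5\) are illustrative assumptions, while the diagonal matrix is the one from Example 4.5.4.

```python
from fractions import Fraction


def inverse(rows):
    """Invert a square matrix by row reducing [A | I]; raise if A is not invertible."""
    n = len(rows)
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row, then clear the rest of the column.
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[n:] for row in m]


def scale(k, rows):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in rows]


A = [[2, 1], [5, 3]]                    # hypothetical invertible matrix
D = [[2, 0, 0], [0, 3, 0], [0, 0, -7]]  # diagonal matrix from Example 4.5.4
k = 5                                   # hypothetical nonzero scalar

# The inverse of the inverse is the original matrix.
print(inverse(inverse(A)) == A)
# The inverse of kA is (1/k) times the inverse of A.
print(inverse(scale(k, A)) == scale(Fraction(1, k), inverse(A)))
# The inverse of a diagonal matrix holds the reciprocals of the diagonal entries.
print(inverse(D) == [[Fraction(1, 2), 0, 0], [0, Fraction(1, 3), 0], [0, 0, Fraction(-1, 7)]])
```

All three comparisons print `True`. A check on a handful of matrices is of course not a proof; it only illustrates what the theorem asserts.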
We end this section with a comment about solving systems of equations “in real life.” Solving a system \(\ttaxb\) by computing \(\ttai\vb\) seems pretty slick, so it would make sense that this is the way it is normally done. However, in practice, this is rarely done. There are two main reasons why this is the case.
First, computing \(\ttai\) and \(\ttai\vb\) is “expensive” in the sense that it takes up a lot of computing time. Certainly, our calculators have no trouble dealing with the \(3 \times 3\) cases we often consider in this textbook, but in real life the matrices being considered are very large (as in, hundreds of thousands of rows and columns). Computing \(\ttai\) alone is rather impractical, and we waste a lot of time if we come to find out that \(\ttai\) does not exist. Even when \(\ttai\) does exist, computing it and then forming \(\ttai\vb\) takes more arithmetic than solving \(\ttaxb\) directly; Gaussian elimination is faster.
Secondly, computing \(\ttai\) using the method we’ve described often gives rise to numerical roundoff errors. Even though computers often do computations with an accuracy to more than 8 decimal places, after thousands of computations, rounding off can cause big errors. (A “small” \(1,000 \times 1,000\) matrix has \(1,000,000\) entries! That’s a lot of places to have roundoff errors accumulate!) It is not unheard of to have a computer compute \(\ttai\) for a large matrix, and then immediately have it compute \(\tta\ttai\) and not get the identity matrix. (The result is usually very close, with the numbers on the diagonal close to 1 and the other entries near 0. But it isn’t exactly the identity matrix.)
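This effect is easy to reproduce. The following is a minimal sketch in Python using ordinary floating-point numbers; the choice of a \(5\times 5\) Hilbert matrix (entries \(1/(i+j-1)\)) as the test matrix and the helper names are assumptions made for illustration. It computes \(\ttai\) by Gauss-Jordan elimination with row swaps and then measures how far \(\tta\ttai\) is from the identity.

```python
def inverse(rows):
    """Invert a square matrix with floating-point Gauss-Jordan elimination."""
    n = len(rows)
    m = [[float(x) for x in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(rows)]
    for col in range(n):
        # Partial pivoting: use the largest available entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is not invertible")
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[n:] for row in m]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


n = 5
# Hilbert matrix: a standard test case that is sensitive to roundoff (assumed example).
A = [[1 / (i + j + 1) for j in range(n)] for i in range(n)]
product = matmul(A, inverse(A))

# Largest deviation of A * A^{-1} from the identity matrix.
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(n) for j in range(n))
print(max_error)
```

The printed deviation is typically tiny but not zero, and it tends to grow as `n` increases, which matches the behavior described above.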
Therefore, in real life, solutions to \(\ttaxb\) are usually found using the methods we learned in Section 3.6. It turns out that even with all of our advances in mathematics, it is hard to beat the basic method that Gauss introduced a long time ago.
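For comparison, here is a minimal sketch of solving \(\ttaxb\) directly, without ever forming \(\ttai\text{.}\) It uses forward elimination with row swaps followed by back substitution; the function name `solve` and the sample system are illustrative assumptions, and production software relies on carefully tuned library routines rather than code like this.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with back substitution."""
    n = len(A)
    m = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(A, b)]
    # Forward elimination: create zeros below each pivot.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("A is not invertible, so there is no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * p for a, p in zip(m[r], m[col])]
    # Back substitution: solve for the last variable first.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]  # hypothetical sample system
b = [8, -11, -3]
x = solve(A, b)
print(x)  # approximately [2, 3, -1]
```

Only the augmented column is carried through the elimination, rather than the \(n\) extra columns needed to build \(\ttai\text{,}\) which is where the savings come from.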
Exercises
Exercise Group.
Matrices \(\tta\) and \(\ttb\) are given. Compute \((\tta\ttb)^{-1}\) and \(\ttb^{-1}\tta^{-1}\text{.}\)
A \(2\times 2\) matrix \(\tta\) is given. Compute \(\ttai\) and \((\ttai)^{-1}\) using Theorem 4.4.10.
5.
\(\tta = \bbm -3 \amp 5 \\1 \amp -2 \ebm\)
6.
\(\tta = \bbm 3 \amp 5\\2 \amp 4 \ebm\)
7.
\(\tta = \bbm 2 \amp 7\\1 \amp 3 \ebm\)
8.
\(\tta = \bbm 9 \amp 0 \\7 \amp 9 \ebm\)
9.
Find \(2\times 2\) matrices \(\tta\) and \(\ttb\) that are each invertible, but \(\tta+\ttb\) is not.
10.
Create a random \(6\times 6\) matrix \(\tta\text{,}\) then have a calculator or computer compute \(\tta\ttai\text{.}\) Was the identity matrix returned exactly? Comment on your results.
11.
Use a calculator or computer to compute \(\tta\ttai\text{,}\) where \(\tta = \bbm 1 \amp 2 \amp 3\amp 4\\1\amp 4\amp 9\amp 16\\1\amp 8\amp 27\amp 64\\1\amp 16\amp 81\amp 256\ebm.\) Was the identity matrix returned exactly? Comment on your results.