In this section we introduce the concept of an elementary matrix. Elementary matrices are relatively simple objects, as their name suggests, but as we will see, they give us a simple method for understanding why our algorithm for computing the inverse of a matrix works. When we reach the discussion of determinants and their properties, we'll also see that elementary matrices provide a simple proof of an important result: the product rule for determinants.
Note that by reversing the elementary row operation used to create an \(n\times n\) elementary matrix \(E\text{,}\) we could equally well say that \(E\) is a matrix of rank \(n\) that is one row operation away from its reduced row-echelon form (namely, the identity matrix).
The main reason that elementary matrices are useful is that they give us a way of encoding (or more to the point, keeping track of) the elementary row operations used to define them. The primary result is the following:
Theorem 4.6.2. Effect of Multiplication by an Elementary Matrix.
Let \(A\) be an \(n\times k\) matrix and suppose that the matrix \(B\) is obtained from \(A\) using an elementary row operation. Then \(B=EA\text{,}\) where \(E\) is the elementary matrix obtained from the identity matrix using the same row operation used to obtain \(B\) from \(A\text{.}\)
In other words, if we use the notation \(X\xrightarrow{RO}Y\) to denote that a particular row operation is applied to the matrix \(X\) to obtain the matrix \(Y\text{,}\) then whenever
\begin{equation*}
A\xrightarrow{RO} B
\end{equation*}
and
\begin{equation*}
I\xrightarrow{RO} E,
\end{equation*}
we have \(B = EA\text{.}\)
Example 4.6.3. Multiplication by an elementary matrix.
Let \(A = \bbm 2\amp -1\amp 1\\0\amp 3\amp 2\\4\amp 1\amp -1\ebm\text{.}\) For each of the following elementary row operations, construct the corresponding elementary matrix, and verify that multiplication by this matrix performs the desired row operation: \(R_1\leftrightarrow R_3\text{;}\) \(2R_1\to R_1\text{;}\) and \(R_3-2R_1\to R_3\text{.}\)
Applying the row operation \(R_1\leftrightarrow R_3\) to the \(3\times 3\) identity matrix gives us the elementary matrix \(E_1 = \bbm 0 \amp 0 \amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm\text{,}\) and we can easily verify that
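\begin{equation*}
E_1A = \bbm 0\amp 0\amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm\bbm 2\amp -1\amp 1\\0\amp 3\amp 2\\4\amp 1\amp -1\ebm = \bbm 4\amp 1\amp -1\\0\amp 3\amp 2\\2\amp -1\amp 1\ebm\text{,}
\end{equation*}
which is indeed the matrix obtained from \(A\) by exchanging rows 1 and 3.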
For the row operation \(2R_1\to R_1\) the corresponding elementary matrix is \(E_2 = \bbm 2 \amp 0 \amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\text{,}\) and we have
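\begin{equation*}
E_2A = \bbm 2\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\bbm 2\amp -1\amp 1\\0\amp 3\amp 2\\4\amp 1\amp -1\ebm = \bbm 4\amp -2\amp 2\\0\amp 3\amp 2\\4\amp 1\amp -1\ebm\text{,}
\end{equation*}
which is the matrix obtained from \(A\) by doubling the first row.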
The elementary matrix corresponding to the row operation \(R_3-2R_1\to R_3\) is \(E_3 = \bbm 1 \amp 0 \amp 0\\0\amp 1\amp 0\\-2\amp 0\amp 1\ebm\text{,}\) and
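\begin{equation*}
E_3A = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\-2\amp 0\amp 1\ebm\bbm 2\amp -1\amp 1\\0\amp 3\amp 2\\4\amp 1\amp -1\ebm = \bbm 2\amp -1\amp 1\\0\amp 3\amp 2\\0\amp 3\amp -3\ebm\text{,}
\end{equation*}
which is the matrix obtained from \(A\) by subtracting twice row 1 from row 3.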
Notice that every elementary row operation is reversible. If we apply a row operation of the type \(R_i\leftrightarrow R_j\) to a matrix, applying it again will return our matrix to its original state. (Swapping two rows and then swapping them back again results in no net change.) It follows that any “Type 1” elementary matrix obtained by a row operation of this type is its own inverse. In terms of row operations,
\begin{equation*}
I \xrightarrow{R_i\leftrightarrow R_j} E \xrightarrow{R_i\leftrightarrow R_j} I\text{.}
\end{equation*}
But we know that applying the second row operation is the same as multiplying on the left by \(E\text{;}\) therefore, we have \(E(E) = E^2 = I\text{.}\) It follows from the definition of the inverse that \(E=E^{-1}\text{.}\)
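For instance, the matrix \(E_1\) of Example 4.6.3 swaps rows 1 and 3, and indeed
\begin{equation*}
E_1^2 = \bbm 0\amp 0\amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm\bbm 0\amp 0\amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm = I\text{,}
\end{equation*}
so \(E_1^{-1} = E_1\text{.}\)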
Now let us consider the other two types of row operation. For a “Type 2” elementary matrix, obtained using a row operation of the type \(kR_i\to R_i\text{,}\) we multiply one of the rows of our matrix by a nonzero constant \(k\text{.}\) If we then multiply that row by \(\dfrac{1}{k}\text{,}\) we will be back to where we started.
Thus, if \(E\) is obtained from the identity using the row operation \(kR_i\to R_i\text{,}\) then \(E^{-1}\) is obtained from the identity using the row operation \(\dfrac{1}{k}R_i\to R_i\text{.}\) For any matrix \(A\) we have
\begin{equation*}
A \xrightarrow{kR_i\to R_i} EA \xrightarrow{\frac{1}{k}R_i\to R_i} E^{-1}(EA) = A\text{.}
\end{equation*}
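For instance, the matrix \(E_2\) of Example 4.6.3 was obtained from the identity using the row operation \(2R_1\to R_1\text{,}\) so its inverse corresponds to \(\frac{1}{2}R_1\to R_1\text{:}\)
\begin{equation*}
E_2^{-1}E_2 = \bbm \frac{1}{2}\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\bbm 2\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm = I\text{.}
\end{equation*}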
Finally, for a “Type 3” elementary matrix, obtained from the identity using a row operation of the type \(R_i+kR_j\to R_i\text{,}\) we add a multiple of one row to another. If we want to undo the effect of adding \(k\) times row \(j\) to row \(i\text{,}\) we simply subtract the same multiple of row \(j\) from row \(i\text{.}\) Thus, \(E^{-1}\) is obtained from the identity using the row operation \(R_i-kR_j\to R_i\text{.}\)
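For instance, the matrix \(E_3\) of Example 4.6.3 was obtained from the identity using the row operation \(R_3-2R_1\to R_3\text{,}\) so its inverse corresponds to \(R_3+2R_1\to R_3\text{:}\)
\begin{equation*}
E_3^{-1}E_3 = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\2\amp 0\amp 1\ebm\bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\-2\amp 0\amp 1\ebm = I\text{.}
\end{equation*}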
The matrix \(E_1\) is a Type 1 elementary matrix, obtained by exchanging rows 1 and 2. Thus, we have \(E_1^{-1} = E_1\text{.}\) This is easily verified by checking that \(E_1^2 = I\text{.}\)
The matrix \(E_2\) is a Type 3 elementary matrix, obtained from the identity using the row operation \(R_1-3R_3\to R_1\text{.}\) The opposite row operation is \(R_1+3R_3\to R_1\text{;}\) thus, \(E_2^{-1} = \bbm 1 \amp 0 \amp 3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\text{.}\) Again, we can easily check that \(E_2E_2^{-1}=I\text{.}\)
The matrix \(E_3\) is a Type 2 elementary matrix, obtained from the identity matrix using the row operation \(\frac{3}{7}R_3\to R_3\text{.}\) The opposite row operation is \(\frac{7}{3}R_3\to R_3\text{,}\) since \(\dfrac{1}{3/7} = \dfrac{7}{3}\text{.}\) It follows that \(E_3^{-1} = \bbm 1 \amp 0 \amp 0\\0\amp 1\amp 0\\0\amp 0\amp \frac{7}{3}\ebm\text{.}\)
Consider an \(n\times n\) matrix \(A\text{.}\) Suppose we wish to reduce \(A\) to its reduced row-echelon form (to solve a system of equations, perhaps, or to determine the null space of \(A\text{,}\) etc.). To do so, we carry out a series of elementary row operations, say
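\begin{equation*}
A \xrightarrow{RO_1} A_1 \xrightarrow{RO_2} A_2 \xrightarrow{RO_3} \cdots \xrightarrow{RO_k} R\text{,}
\end{equation*}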
where \(R\) is the reduced row-echelon form of \(A\text{.}\) Let \(E_1, E_2, \ldots, E_k\) be the elementary matrices corresponding to the elementary row operations \(RO_1, RO_2, \ldots, RO_k\text{.}\) Then we have
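\begin{equation*}
E_k\cdots E_2E_1A = R\text{.}
\end{equation*}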
Now, the reduced row-echelon form \(R\) might have one or more rows of zeros, depending on the rank of \(A\text{,}\) but let us focus for now on the case where \(\operatorname{rank}(A)=n\text{,}\) in which case we know that \(R=I_n\text{,}\) the \(n\times n\) identity matrix.
Letting \(B=E_k\cdots E_2E_1\text{,}\) we have \(BA=I_n\text{,}\) and it follows from the Invertible Matrix Theorem that \(B=A^{-1}\text{.}\) We have the following theorem.
Theorem 4.6.5. The inverse is a product of elementary matrices.
Let \(A\) be an invertible \(n\times n\) matrix, and let \(E_1, E_2, \ldots, E_k\) be the elementary matrices corresponding (in order) to the elementary row operations used to reduce \(A\) to the identity matrix. Then
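\begin{equation*}
A^{-1} = E_kE_{k-1}\cdots E_2E_1\text{.}
\end{equation*}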
This result makes sense in the context of our algorithm for computing the inverse. Recall that to compute \(A^{-1}\text{,}\) we form the augmented matrix \([\begin{array}{c|c}A\amp I\end{array}]\text{,}\) and apply elementary row operations until we reach the reduced row-echelon form \([\begin{array}{c|c} I\amp A^{-1}\end{array}]\text{.}\)
Notice that at each step, applying an elementary row operation to an augmented matrix \([\begin{array}{c|c} M\amp N\end{array}]\) is the same as multiplying both \(M\) and \(N\) by the corresponding elementary matrix. In terms of elementary matrices, our algorithm looks like the following:
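\begin{equation*}
[\begin{array}{c|c}A\amp I\end{array}] \xrightarrow{RO_1} [\begin{array}{c|c}E_1A\amp E_1\end{array}] \xrightarrow{RO_2} [\begin{array}{c|c}E_2E_1A\amp E_2E_1\end{array}] \xrightarrow{RO_3} \cdots \xrightarrow{RO_k} [\begin{array}{c|c}E_k\cdots E_1A\amp E_k\cdots E_1\end{array}] = [\begin{array}{c|c}I\amp A^{-1}\end{array}]\text{.}
\end{equation*}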
We also have the following consequence of our above theorem (which we may view as an additional entry in our list of equivalent statements in the Invertible Matrix Theorem): an \(n\times n\) matrix \(A\) is invertible if and only if \(A\) can be written as a product of elementary matrices.
To see why this result is true, recall from Theorem 4.5.5 that if \(A\) and \(B\) are invertible \(n\times n\) matrices, then so is \(AB\text{,}\) and \((AB)^{-1} = B^{-1}A^{-1}\text{.}\) We can easily extend this result to products of three or more matrices. If \(A_1, A_2, \ldots, A_k\) are all invertible \(n\times n\) matrices, then \(A_1A_2\cdots A_k\) is invertible, and
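\begin{equation*}
(A_1A_2\cdots A_k)^{-1} = A_k^{-1}\cdots A_2^{-1}A_1^{-1}\text{.}
\end{equation*}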
We know from our discussion above that every elementary matrix is invertible; therefore, if \(A=E_1E_2\cdots E_k\) is a product of elementary matrices, then \(A\) is invertible.
Conversely, suppose that \(A\) is invertible. By Theorem 4.6.5, \(A^{-1} = E_kE_{k-1}\cdots E_2E_1\text{,}\) where \(E_1, E_2, \ldots, E_k\) are the elementary matrices corresponding to the elementary row operations used to carry \(A\) to the identity matrix. Thus, we have
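\begin{equation*}
A = \left(A^{-1}\right)^{-1} = (E_kE_{k-1}\cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1}\cdots E_{k-1}^{-1}E_k^{-1}\text{.}
\end{equation*}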
Since the inverse of an elementary matrix is another elementary matrix, our result follows. Note that when we take the inverse of the product of elementary matrices, we must reverse the order of multiplication.
We use Gauss-Jordan elimination to carry the matrix \(A\) to its reduced row-echelon form. For each elementary row operation performed, we keep track of the corresponding elementary matrix and its inverse.
\begin{equation*}
A = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}E_5^{-1}E_6^{-1}E_7^{-1}E_8^{-1}E_9^{-1}\text{.}
\end{equation*}
(Ideally we'd actually write out the matrices above, but since we needed nine steps to get \(A\) to the identity, limitations on space prevent us from doing so.)
Find the inverse of the given elementary matrix, and state the row operation corresponding to the inverse. (Note: these are the same matrices as in Exercises 1–4.)