The main reason that elementary matrices are useful is that they give us a way of encoding (or more to the point, keeping track of) the elementary row operations used to define them. The primary result is the following:
In other words, using the notation \(X\xrightarrow{RO}Y\) to mean that a particular row operation applied to the matrix \(X\) produces the matrix \(Y\text{,}\) if
\begin{equation*}
A\xrightarrow{RO} B
\end{equation*}
and
\begin{equation*}
I\xrightarrow{RO} E,
\end{equation*}
then \(B=EA\text{.}\)
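This correspondence is easy to check numerically. The following is a minimal sketch using plain Python lists (no external libraries); the particular \(2\times 2\) matrix and the helper names are our own illustration, not part of the text.

```python
# Check that applying a row operation to A gives the same result as
# left-multiplying A by the corresponding elementary matrix E.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def swap_rows(M, i, j):
    """Apply the row operation R_i <-> R_j, returning a new matrix."""
    N = [row[:] for row in M]
    N[i], N[j] = N[j], N[i]
    return N

A = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]

E = swap_rows(I, 0, 1)    # I --(R1 <-> R2)--> E
B = swap_rows(A, 0, 1)    # A --(R1 <-> R2)--> B

print(matmul(E, A) == B)  # True: B = EA
```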
Inverses of Elementary Matrices.
Notice that every elementary row operation is reversible. If we apply a row operation of the type \(R_i\leftrightarrow R_j\) to a matrix, applying it again will return our matrix to its original state. (Swapping two rows and then swapping them back again results in no net change.) It follows that any “Type 1” elementary matrix obtained by a row operation of this type is its own inverse. In terms of row operations,
\begin{equation*}
I \xrightarrow{R_i\leftrightarrow R_j} E \xrightarrow{R_i\leftrightarrow R_j} I\text{.}
\end{equation*}
But we know that applying the second row operation is the same as multiplying on the left by \(E\text{;}\) therefore, we have \(E(E) = E^2 = I\text{.}\) It follows from the definition of the inverse that \(E=E^{-1}\text{.}\)
Now let us consider the other two types of row operation. A “Type 2” elementary matrix is obtained using a row operation of the type \(kR_i\to R_i\text{,}\) which multiplies one of the rows of our matrix by a nonzero constant \(k\text{.}\) If we then multiply that row by \(\dfrac{1}{k}\text{,}\) we will be back where we started.
Thus, if \(E\) is obtained from the identity using the row operation \(kR_i\to R_i\text{,}\) then \(E^{-1}\) is obtained from the identity using the row operation \(\dfrac{1}{k}R_i\to R_i\text{.}\) For any matrix \(A\) we have
\begin{equation*}
A \xrightarrow{kR_i\to R_i} EA \xrightarrow{\frac{1}{k}R_i\to R_i} E^{-1}(EA) = A\text{.}
\end{equation*}
Finally, for a “Type 3” elementary matrix, obtained from the identity using a row operation of the type \(R_i+kR_j\to R_i\text{,}\) we add a multiple of one row to another. If we want to undo the effect of adding \(k\) times row \(j\) to row \(i\text{,}\) we simply subtract \(k\) times row \(j\) from row \(i\text{.}\) Thus, \(E^{-1}\) is obtained from the identity using the row operation \(R_i-kR_j\to R_i\text{.}\)
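All three inverse rules can be verified in a few lines of code. Here is a quick sketch with \(2\times 2\) examples of our own choosing (the specific values of \(k\) are illustrative, not from the text): in each case, the elementary matrix built from the inverse row operation multiplies with the original to give \(I\text{.}\)

```python
# Verify that the inverse row operation produces the inverse
# elementary matrix, for each of the three types.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I = [[1, 0], [0, 1]]

# Type 1: R1 <-> R2 is its own inverse.
E1 = [[0, 1], [1, 0]]
assert matmul(E1, E1) == I

# Type 2: kR1 -> R1 with k = 4 is undone by (1/4)R1 -> R1.
E2     = [[4, 0],    [0, 1]]
E2_inv = [[0.25, 0], [0, 1]]
assert matmul(E2_inv, E2) == I

# Type 3: R1 + 5R2 -> R1 is undone by R1 - 5R2 -> R1.
E3     = [[1, 5],  [0, 1]]
E3_inv = [[1, -5], [0, 1]]
assert matmul(E3_inv, E3) == I
```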
Example 4.6.4. Inverses of elementary matrices.
Determine the inverse of each of the elementary matrices below:
\(E_1 = \bbm 0 \amp 1 \amp 0\\1\amp 0\amp 0\\0\amp 0\amp 1\ebm\text{.}\)
\(E_2 = \bbm 1 \amp 0 \amp -3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\text{.}\)
\(E_3 = \bbm 1 \amp 0 \amp 0\\0\amp 1\amp 0\\0\amp 0\amp \frac{3}{7}\ebm\text{.}\)
Solution.
The matrix \(E_1\) is a Type 1 elementary matrix, obtained by exchanging rows 1 and 2. Thus, we have \(E_1^{-1} = E_1\text{.}\) This is easily verified by checking that \(E_1^2 = I\text{.}\)
The matrix \(E_2\) is a Type 3 elementary matrix, obtained from the identity using the row operation \(R_1-3R_3\to R_1\text{.}\) The opposite row operation is \(R_1+3R_3\to R_1\text{;}\) thus, \(E_2^{-1} = \bbm 1 \amp 0 \amp 3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\text{.}\) Again, we can easily check that \(E_2E_2^{-1}=I\text{.}\)
The matrix \(E_3\) is a Type 2 elementary matrix, obtained from the identity matrix using the row operation \(\frac{3}{7}R_3\to R_3\text{.}\) The opposite row operation is \(\frac{7}{3}R_3\to R_3\text{,}\) since \(\dfrac{1}{3/7} = \dfrac{7}{3}\text{.}\) It follows that \(E_3^{-1} = \bbm 1 \amp 0 \amp 0\\0\amp 1\amp 0\\0\amp 0\amp \frac{7}{3}\ebm\text{.}\)
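The three checks suggested in the solution can be carried out computationally. Below is a sketch (our own, not the book's code) that confirms each claimed inverse, using exact rational arithmetic for the \(\frac{3}{7}\) entry.

```python
from fractions import Fraction

# Verify Example 4.6.4: each claimed inverse multiplies with its
# elementary matrix to give the 3x3 identity.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

E1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]        # R1 <-> R2 (self-inverse)
E2 = [[1, 0, -3], [0, 1, 0], [0, 0, 1]]       # R1 - 3R3 -> R1
E2_inv = [[1, 0, 3], [0, 1, 0], [0, 0, 1]]    # R1 + 3R3 -> R1
E3 = [[1, 0, 0], [0, 1, 0], [0, 0, Fraction(3, 7)]]
E3_inv = [[1, 0, 0], [0, 1, 0], [0, 0, Fraction(7, 3)]]

assert matmul(E1, E1) == I3
assert matmul(E2, E2_inv) == I3
assert matmul(E3, E3_inv) == I3
```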
Elementary matrices and inverses.
Consider an \(n\times n\) matrix \(A\text{.}\) Suppose we wish to reduce \(A\) to its reduced row-echelon form (to solve a system of equations, perhaps, or to determine the null space of \(A\)). To do so, we carry out a series of elementary row operations, say
\begin{equation*}
A \xrightarrow{RO_1} A_1 \xrightarrow{RO_2} A_2 \xrightarrow{RO_3} \cdots \xrightarrow{RO_k} A_k = R\text{,}
\end{equation*}
where \(R\) is the reduced row-echelon form of \(A\text{.}\) Let \(E_1, E_2, \ldots, E_k\) be the elementary matrices corresponding to the elementary row operations \(RO_1\text{,}\) \(RO_2, \ldots, RO_k\text{.}\) Then we have
\begin{align*}
A_1 \amp = E_1A\\
A_2 \amp = E_2A_1 = (E_2E_1)A\\
\vdots \amp \quad \vdots\\
R = A_k \amp = E_kA_{k-1} = (E_kE_{k-1}\cdots E_2E_1)A\text{.}
\end{align*}
Now, the reduced row-echelon form \(R\) might have one or more rows of zeros, depending on the rank of \(A\text{,}\) but let us focus for now on the case where \(\operatorname{rank}(A)=n\text{,}\) in which case we know that \(R=I_n\text{,}\) the \(n\times n\) identity matrix.
In this case, we have (putting \(R=I\) in the last equality above):
\begin{equation*}
(E_k\cdots E_2E_1)A = I_n\text{.}
\end{equation*}
Letting \(B=E_k\cdots E_2E_1\text{,}\) we have \(BA=I_n\text{,}\) and it follows from the Invertible Matrix Theorem that \(B=A^{-1}\text{.}\) We have the following theorem.
Theorem 4.6.5. The inverse is a product of elementary matrices.
Let \(A\) be an invertible \(n\times n\) matrix, and let \(E_1, E_2, \ldots, E_k\) be the elementary matrices corresponding (in order) to the elementary row operations used to reduce \(A\) to the identity matrix. Then
\begin{equation*}
A^{-1} = E_k\cdots E_2E_1\text{.}
\end{equation*}
This result makes sense in the context of our algorithm for computing the inverse. Recall that to compute \(A^{-1}\text{,}\) we form the augmented matrix \([\begin{array}{c|c}A\amp I\end{array}]\text{,}\) and apply elementary row operations until we reach the reduced row-echelon form \([\begin{array}{c|c} I\amp A^{-1}\end{array}]\text{.}\)
Notice that at each step, applying an elementary row operation to an augmented matrix \([\begin{array}{c|c} M\amp N\end{array}]\) is the same as multiplying both \(M\) and \(N\) by the corresponding elementary matrix. In terms of elementary matrices, our algorithm looks like the following:
\begin{align*}
[\begin{array}{c|c} A\amp I\end{array}] \amp \xrightarrow{RO_1} [\begin{array}{c|c} E_1A\amp E_1I\end{array}]\\
\amp \xrightarrow{RO_2} [\begin{array}{c|c} E_2(E_1A) \amp E_2E_1\end{array}] \quad \text{(Note } E_1I=E_1)\\
\amp \quad \vdots\\
\amp \xrightarrow{RO_k} [\begin{array}{c|c} (E_k\cdots E_2E_1)A \amp E_k\cdots E_2E_1\end{array}] = [\begin{array}{c|c} I_n \amp A^{-1}\end{array}]\text{.}
\end{align*}
Since we have \((E_k\cdots E_2E_1)A=I_n\) on the left, it follows that we must have \(E_k\cdots E_2E_1 = A^{-1}\) on the right.
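The algorithm above translates directly into code. The following is a minimal Gauss-Jordan sketch (our illustration, not the book's): we row-reduce the augmented matrix \([\begin{array}{c|c}A\amp I\end{array}]\text{,}\) and when the left block reaches \(I\text{,}\) the right block is \(A^{-1}\text{.}\) Exact rational arithmetic avoids floating-point error.

```python
from fractions import Fraction

def invert(A):
    """Invert A by row-reducing [A | I] to [I | A^{-1}]."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rationals.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Swap a nonzero pivot into place (Type 1 operation).
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        # Scale the pivot row so the pivot is 1 (Type 2 operation).
        M[col] = [x / M[col][col] for x in M[col]]
        # Clear the rest of the column (Type 3 operations).
        for r in range(n):
            if r != col:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    # Return the right block, which is now A^{-1}.
    return [row[n:] for row in M]

A = [[2, -1, 3], [-1, 0, 4], [1, -1, 3]]
print(invert(A))
```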
We also have the following consequence of our above theorem (which we may view as an additional entry in our list of equivalent statements in the Invertible Matrix Theorem):
Theorem 4.6.6. Invertible matrices are products of elementary matrices.
An \(n\times n\) matrix \(A\) is invertible if and only if it is a product of elementary matrices.
To see why this result is true, recall from Theorem 4.5.5 that if \(A\) and \(B\) are invertible \(n\times n\) matrices, then so is \(AB\text{,}\) and \((AB)^{-1} = B^{-1}A^{-1}\text{.}\) We can easily extend this result to products of three or more matrices. If \(A_1, A_2, \ldots, A_k\) are all invertible \(n\times n\) matrices, then \(A_1A_2\cdots A_k\) is invertible, and
\begin{equation*}
(A_1A_2\cdots A_k)^{-1} = A_k^{-1}\cdots A_2^{-1}A_1^{-1}\text{.}
\end{equation*}
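The reversed-order rule is easy to confirm numerically. Here is a tiny sketch with \(2\times 2\) matrices of our own choosing, inverting via the adjugate formula.

```python
from fractions import Fraction

# Check the reversed-order rule (AB)^{-1} = B^{-1} A^{-1}.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    a, b = M[0]
    c, d = M[1]
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 3], [0, 1]]

lhs = inv2(matmul(A, B))        # (AB)^{-1}
rhs = matmul(inv2(B), inv2(A))  # B^{-1} A^{-1}
print(lhs == rhs)               # True
```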
We know from our discussion above that every elementary matrix is invertible; therefore, if \(A=E_1E_2\cdots E_k\) is a product of elementary matrices, then \(A\) is invertible.
Conversely, suppose that \(A\) is invertible. From the previous theorem, we know that \(A^{-1}\) is a product of elementary matrices; namely,
\begin{equation*}
A^{-1} = E_k\cdots E_2E_1\text{,}
\end{equation*}
where \(E_1\text{,}\) \(E_2, \ldots, E_k\) are the elementary matrices corresponding to the elementary row operations used to carry \(A\) to the identity matrix. Thus, we have
\begin{align*}
A \amp = (A^{-1})^{-1}\\
\amp = (E_k\cdots E_2E_1)^{-1}\\
\amp = E_1^{-1}E_2^{-1}\cdots E_k^{-1}\text{.}
\end{align*}
Since the inverse of an elementary matrix is another elementary matrix, our result follows. Note that when we take the inverse of the product of elementary matrices, we must reverse the order of multiplication.
Example 4.6.7. Writing an invertible matrix as a product of elementary matrices.
Write the matrix \(A = \bbm 2\amp -1\amp 3\\-1\amp 0\amp 4\\1\amp -1\amp 3\ebm\) as a product of elementary matrices, if possible.
Solution.
We use Gauss-Jordan elimination to carry the matrix \(A\) to its reduced row-echelon form. For each elementary row operation performed, we keep track of the corresponding elementary matrix and its inverse.
\begin{align*}
\bbm 2\amp -1\amp 3\\-1\amp 0\amp 4\\1\amp -1\amp 3\ebm \xrightarrow{R_1\leftrightarrow R_3}\amp \bbm 1\amp -1\amp 3\\-1\amp 0\amp 4\\2\amp -1\amp 3\ebm \amp E_1 \amp = \bbm 0\amp 0\amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm \amp E_1^{-1} \amp = \bbm 0\amp 0\amp 1\\0\amp 1\amp 0\\1\amp 0\amp 0\ebm\\
\xrightarrow{R_2+R_1\to R_2}\amp \bbm 1\amp -1\amp 3\\0\amp -1\amp 7\\2\amp -1\amp 3\ebm \amp E_2 \amp = \bbm 1\amp 0\amp 0\\1\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_2^{-1} \amp =\bbm 1\amp 0\amp 0\\-1\amp 1\amp 0\\0\amp 0\amp 1\ebm\\
\xrightarrow{R_3-2R_1\to R_3} \amp \bbm 1\amp -1\amp 3\\0\amp -1\amp 7\\0\amp 1\amp -3\ebm \amp E_3 \amp =\bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\-2\amp 0\amp 1\ebm \amp E_3^{-1}\amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\2\amp 0\amp 1\ebm\\
\xrightarrow{R_2\leftrightarrow R_3} \amp \bbm 1\amp -1\amp 3\\0\amp 1\amp -3\\0\amp -1\amp 7\ebm \amp E_4 \amp = \bbm 1\amp 0\amp 0\\0\amp 0\amp 1\\0\amp 1\amp 0\ebm \amp E_4^{-1} \amp = \bbm 1\amp 0\amp 0\\0\amp 0\amp 1\\0\amp 1\amp 0\ebm\\
\xrightarrow{R_3+R_2\to R_3}\amp \bbm 1\amp -1\amp 3\\0\amp 1\amp -3\\0\amp 0\amp 4\ebm \amp E_5 \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 1\amp 1\ebm \amp E_5^{-1} \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp -1\amp 1\ebm\\
\xrightarrow{\frac{1}{4}R_3\to R_3}\amp \bbm 1\amp -1\amp 3\\0\amp 1\amp -3\\0\amp 0\amp 1\ebm \amp E_6 \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp \frac{1}{4}\ebm \amp E_6^{-1} \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 4\ebm\\
\xrightarrow{R_2+3R_3\to R_2} \amp \bbm 1\amp -1\amp 3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_7 \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp 3\\ 0\amp 0\amp 1\ebm \amp E_7^{-1} \amp = \bbm 1\amp 0\amp 0\\0\amp 1\amp -3\\0\amp 0\amp 1\ebm\\
\xrightarrow{R_1-3R_3\to R_1}\amp \bbm 1\amp -1\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_8 \amp = \bbm 1\amp 0\amp -3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_8^{-1} \amp = \bbm 1\amp 0\amp 3\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\\
\xrightarrow{R_1+R_2\to R_1}\amp \bbm 1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_9 \amp = \bbm 1\amp 1\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm \amp E_9^{-1} \amp = \bbm 1\amp -1\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\ebm\text{.}
\end{align*}
We then have
\begin{equation*}
A^{-1} = E_9E_8E_7E_6E_5E_4E_3E_2E_1
\end{equation*}
and
\begin{equation*}
A = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}E_5^{-1}E_6^{-1}E_7^{-1}E_8^{-1}E_9^{-1}\text{.}
\end{equation*}
(Ideally we’d actually write out the matrices above, but since we needed nine steps to get \(A\) to the identity, limitations on space prevent us from doing so.)
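Although multiplying out the nine matrices by hand is impractical, the product is quick to check computationally. The following sketch (our own, not the book's code) multiplies \(E_1^{-1}E_2^{-1}\cdots E_9^{-1}\) from the table above and recovers \(A\text{.}\)

```python
# Multiply the nine inverse elementary matrices from Example 4.6.7,
# in order, and confirm the product equals A.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

inverses = [
    [[0, 0, 1], [0, 1, 0], [1, 0, 0]],     # E1^{-1}: R1 <-> R3
    [[1, 0, 0], [-1, 1, 0], [0, 0, 1]],    # E2^{-1}: R2 - R1 -> R2
    [[1, 0, 0], [0, 1, 0], [2, 0, 1]],     # E3^{-1}: R3 + 2R1 -> R3
    [[1, 0, 0], [0, 0, 1], [0, 1, 0]],     # E4^{-1}: R2 <-> R3
    [[1, 0, 0], [0, 1, 0], [0, -1, 1]],    # E5^{-1}: R3 - R2 -> R3
    [[1, 0, 0], [0, 1, 0], [0, 0, 4]],     # E6^{-1}: 4R3 -> R3
    [[1, 0, 0], [0, 1, -3], [0, 0, 1]],    # E7^{-1}: R2 - 3R3 -> R2
    [[1, 0, 3], [0, 1, 0], [0, 0, 1]],     # E8^{-1}: R1 + 3R3 -> R1
    [[1, -1, 0], [0, 1, 0], [0, 0, 1]],    # E9^{-1}: R1 - R2 -> R1
]

product = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
for M in inverses:
    product = matmul(product, M)  # builds E1^{-1} E2^{-1} ... E9^{-1}

print(product)  # [[2, -1, 3], [-1, 0, 4], [1, -1, 3]], which is A
```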