
Section 5.4 Direct Sums and Invariant Subspaces

This section continues the discussion of direct sums (from Section 1.8) and invariant subspaces (from Section 4.1), to better understand the structure of linear operators.

Subsection 5.4.1 Invariant subspaces

Definition 5.4.1.

Given an operator \(T:V\to V\text{,}\) we say that a subspace \(U\subseteq V\) is \(T\)-invariant if \(T(\uu)\in U\) for all \(\uu\in U\text{.}\)
Given a basis \(B=\basis{u}{k}\) of \(U\text{,}\) note that \(U\) is \(T\)-invariant if and only if \(T(\uu_i)\in U\) for each \(i=1,2,\ldots, k\text{.}\)
For any operator \(T:V\to V\text{,}\) there are four subspaces that are always \(T\)-invariant:
\begin{equation*} \{\zer\}, V, \ker T, \text{ and } \im T\text{.} \end{equation*}
Of course, some of these subspaces might be the same; for example, if \(T\) is invertible, then \(\ker T = \{\zer\}\) and \(\im T = V\text{.}\)
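The invariance of \(\ker T\) and \(\im T\) can be checked numerically. The following sketch (a hypothetical example using SymPy; the matrix is not from the text) verifies both claims for a sample singular operator \(T(\vv) = A\vv\) on \(\mathbb{R}^3\):

```python
import sympy as sp

# Hypothetical singular operator on R^3, represented by A, so T(v) = A v
A = sp.Matrix([[1, 2, 3],
               [2, 4, 6],
               [1, 1, 1]])

# ker T is T-invariant: T sends each kernel vector to 0, which stays in the kernel
for v in A.nullspace():
    assert A * v == sp.zeros(3, 1)

# im T is T-invariant: applying T to anything in the column space of A
# lands back in the column space
col_basis = A.columnspace()
M = sp.Matrix.hstack(*col_basis)
for w in col_basis:
    img = A * w
    # A*w is a linear combination of the column-space basis iff
    # appending it does not increase the rank
    assert sp.Matrix.hstack(M, img).rank() == M.rank()
```

Since \(T(\ker T) = \{\zer\} \subseteq \ker T\) and \(T(\im T) \subseteq \im T\) always hold, the assertions pass for any choice of \(A\).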

Exercise 5.4.2.

Show that for any linear operator \(T\text{,}\) the subspaces \(\ker T\) and \(\im T\) are \(T\)-invariant.
Hint.
In each case, choose an element \(\vv\) of the subspace. What does the definition of the space tell you about that element? (For example, if \(\vv\in\ker T\text{,}\) what is the value of \(T(\vv)\text{?}\)) Then show that \(T(\vv)\) also fits the definition of that space.
A subspace \(U\) is \(T\)-invariant if \(T\) does not map any vectors in \(U\) outside of \(U\text{.}\) Notice that if we shrink the domain of \(T\) to \(U\text{,}\) then we get an operator from \(U\) to \(U\text{,}\) since the image \(T(U)\) is contained in \(U\text{.}\)

Definition 5.4.3.

Let \(T:V\to V\) be a linear operator, and let \(U\) be a \(T\)-invariant subspace. The restriction of \(T\) to \(U\text{,}\) denoted \(T|_U\text{,}\) is the operator \(T|_U:U\to U\) defined by \(T|_U(\uu)=T(\uu)\) for all \(\uu\in U\text{.}\)

Exercise 5.4.4.

True or false: the restriction \(T|_U\) is the same function as the operator \(T\text{.}\)
Answer.
False. The definition of a function includes its domain and codomain. Since the domain of \(T|_U\) is different from that of \(T\text{,}\) they are not the same function.
A lot can be learned by studying the restrictions of an operator to invariant subspaces. Indeed, the textbook by Axler does almost everything from this point of view. One reason to study invariant subspaces is that they allow us to put the matrix of \(T\) into simpler forms.
Reducing a matrix to block triangular form is useful because it simplifies computations such as finding determinants and eigenvalues, which are otherwise computationally expensive. In particular, if a matrix \(A\) has the block form
\begin{equation*} A = \bbm A_{11} \amp A_{12} \amp \cdots \amp A_{1n}\\ 0\amp A_{22} \amp \cdots \amp A_{2n}\\ \vdots \amp \vdots \amp \ddots \amp \vdots\\ 0 \amp 0 \amp \cdots \amp A_{nn}\ebm\text{,} \end{equation*}
where the diagonal blocks are square matrices, then \(\det(A) = \det(A_{11})\det(A_{22})\cdots \det(A_{nn})\) and \(c_A(x) = c_{A_{11}}(x)c_{A_{22}}(x)\cdots c_{A_{nn}}(x)\text{.}\)
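These two identities are easy to confirm computationally. The sketch below (a hypothetical example using SymPy; the particular blocks are not from the text) builds a block upper-triangular matrix and checks both the determinant and characteristic polynomial factorizations:

```python
import sympy as sp

x = sp.symbols('x')

# Hypothetical block upper-triangular matrix with square diagonal blocks
A11 = sp.Matrix([[2, 1], [0, 3]])
A22 = sp.Matrix([[5]])
A12 = sp.Matrix([[7], [4]])          # off-diagonal block: arbitrary entries

A = sp.Matrix(sp.BlockMatrix([[A11, A12],
                              [sp.zeros(1, 2), A22]]))

# det(A) = det(A11) * det(A22)
assert A.det() == A11.det() * A22.det()

# c_A(x) = c_{A11}(x) * c_{A22}(x)
cA = A.charpoly(x).as_expr()
product = A11.charpoly(x).as_expr() * A22.charpoly(x).as_expr()
assert sp.expand(cA - product) == 0
```

Changing the entries of the off-diagonal block \(A_{12}\) leaves both results unchanged, which is exactly why block triangular form is so convenient.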

Subsection 5.4.2 Eigenspaces

An important source of invariant subspaces is eigenspaces. Recall that for any real number \(\lambda\text{,}\) and any operator \(T:V\to V\text{,}\) we define
\begin{equation*} E_\lambda(T) = \ker(T-\lambda 1_V) = \{\vv\in V \,|\, T(\vv) = \lambda\vv\}\text{.} \end{equation*}
For most values of \(\lambda\text{,}\) we’ll have \(E_\lambda(T)=\{\zer\}\text{.}\) The values of \(\lambda\) for which \(E_\lambda(T)\) is non-trivial are precisely the eigenvalues of \(T\text{.}\) Note that since similar matrices have the same characteristic polynomial, any matrix representation \(M_B(T)\) will have the same eigenvalues as \(T\text{.}\) The eigenspaces \(E_\lambda(T)\) and \(E_\lambda(M_B(T))\) are not generally equal, but the map sending each vector to its coordinate vector with respect to \(B\) carries one onto the other. In other words, the two eigenspaces are isomorphic, although the isomorphism depends on a choice of basis.
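Computing an eigenspace amounts to computing a kernel. As a sketch (a hypothetical matrix, not from the text), the following SymPy code finds the eigenvalues of a matrix representation as the roots of the characteristic polynomial, then computes each \(E_\lambda\) as the nullspace of \(A - \lambda I\):

```python
import sympy as sp

# Hypothetical matrix representation M_B(T) of an operator on R^2
A = sp.Matrix([[4, 1],
               [2, 3]])

lam = sp.symbols('lam')

# Eigenvalues: the values of lambda for which E_lambda is non-trivial,
# i.e. the roots of det(A - lam*I)
eigenvalues = sp.solve((A - lam * sp.eye(2)).det(), lam)

# E_lambda(T) = ker(T - lambda 1_V): the nullspace of A - lam*I
for ev in eigenvalues:
    basis = (A - ev * sp.eye(2)).nullspace()
    assert len(basis) >= 1          # for an eigenvalue, the eigenspace is non-trivial
    for v in basis:
        assert A * v == ev * v      # each basis vector satisfies T(v) = lambda*v
```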

Subsection 5.4.3 Direct Sums

Recall that for any subspaces \(U,W\) of a vector space \(V\text{,}\) the sets
\begin{align*} U+W \amp =\{\uu+\ww \,|\, \uu\in U \text{ and } \ww\in W\}\\ U\cap W \amp = \{\vv \in V \,|\, \vv\in U \text{ and } \vv\in W\} \end{align*}
are subspaces of \(V\text{.}\) Saying that \(\vv\in U+W\) means that \(\vv\) can be written as a sum of a vector in \(U\) and a vector in \(W\text{.}\) However, this representation may not be unique. If \(\vv\in U\cap W\) is nonzero, \(\uu\in U\) and \(\ww\in W\text{,}\) then we can write \((\uu+\vv)+\ww = \uu + (\vv+\ww)\text{,}\) giving two different representations of the same vector as an element of \(U+W\text{,}\) since \(\uu+\vv\in U\) and \(\vv+\ww\in W\text{.}\)
We proved in Theorem 1.8.9 in Section 1.8 that for any \(\vv\in U+W\text{,}\) there exist unique vectors \(\uu\in U\) and \(\ww\in W\) such that \(\vv=\uu+\ww\text{,}\) if and only if \(U\cap W=\{\zer\}\text{.}\)
In Definition 1.8.8, we said that a sum \(U+W\) where \(U\cap W=\{\zer\}\) is called a direct sum, written as \(U\oplus W\text{.}\)
Typically we are interested in the case that the two subspaces sum to \(V\text{.}\) Recall from Definition 1.8.11 that if \(V = U\oplus W\text{,}\) we say that \(W\) is a complement of \(U\text{.}\) We also say that \(U\oplus W\) is a direct sum decomposition of \(V\text{.}\) Of course, the orthogonal complement \(U^\bot\) of a subspace \(U\) is a complement in this sense, if \(V\) is equipped with an inner product. (Without an inner product we have no concept of “orthogonal”.) But even if we don’t have an inner product, finding a complement is not too difficult, as the next example shows.

Example 5.4.7. Finding a complement by extending a basis.

The easiest way to determine a direct sum decomposition (or equivalently, a complement) is through the use of a basis. Suppose \(U\) is a subspace of \(V\) with basis \(\basis{e}{k}\text{,}\) and extend this to a basis
\begin{equation*} B = \{\mathbf{e}_1,\ldots, \mathbf{e}_k,\mathbf{e}_{k+1},\ldots, \mathbf{e}_n\} \end{equation*}
of \(V\text{.}\) Let \(W = \spn\{\mathbf{e}_{k+1},\ldots, \mathbf{e}_n\}\text{.}\) Then clearly \(U+W=V\text{,}\) and \(U\cap W=\{\zer\}\text{,}\) since if \(\vv\in U\cap W\text{,}\) then \(\vv\in U\) and \(\vv\in W\text{,}\) so we have
\begin{equation*} \vv = a_1\mathbf{e}_1+\cdots + a_k\mathbf{e}_k = b_1\mathbf{e}_{k+1}+\cdots+b_{n-k}\mathbf{e}_{n}\text{,} \end{equation*}
which gives
\begin{equation*} a_1\mathbf{e}_1+\cdots + a_k\mathbf{e}_k-b_1\mathbf{e}_{k+1}-\cdots - b_{n-k}\mathbf{e}_n=\zer\text{,} \end{equation*}
so \(a_1=\cdots=b_{n-k}=0\) by the linear independence of \(B\text{,}\) showing that \(\vv=\zer\text{.}\)
Conversely, if \(V=U\oplus W\text{,}\) and we have bases \(\basis{u}{k}\) of \(U\) and \(\basis{w}{l}\) of \(W\text{,}\) then
\begin{equation*} B = \{\uu_1,\ldots, \uu_k,\ww_1,\ldots, \ww_l\} \end{equation*}
is a basis for \(V\text{.}\) Indeed, \(B\) spans \(V\text{,}\) since every element of \(V\) can be written as \(\vv=\uu+\ww\) with \(\uu\in U,\ww\in W\text{.}\) Independence follows by reversing the argument above: if
\begin{equation*} a_1\uu_1+\cdots + a_k\uu_k+b_1\ww_1+\cdots + b_l\ww_l=\zer \end{equation*}
then \(a_1\uu_1+\cdots + a_k\uu_k = -b_1\ww_1-\cdots -b_l\ww_l\text{,}\) and equality is only possible if both sides belong to \(U\cap W = \{\zer\}\text{.}\) Since \(\basis{u}{k}\) is independent, the \(a_i\) have to be zero, and since \(\basis{w}{l}\) is independent, the \(b_j\) have to be zero.
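Both directions of this argument can be illustrated concretely. In the sketch below (a hypothetical decomposition of \(\mathbb{R}^3\), not from the text), combining a basis of \(U\) with a basis of \(W\) gives an invertible matrix, and solving a linear system recovers the unique decomposition \(\vv = \uu + \ww\text{:}\)

```python
import sympy as sp

# Hypothetical direct sum decomposition of R^3:
# U = span{u1, u2}, W = span{w1}, with U ∩ W = {0}
u1 = sp.Matrix([1, 0, 0])
u2 = sp.Matrix([0, 1, 0])
w1 = sp.Matrix([1, 1, 1])

B = sp.Matrix.hstack(u1, u2, w1)

# The combined list is a basis of R^3 exactly when B is invertible
assert B.det() != 0

# Unique decomposition of any v: solve B c = v, then
# v = (c1*u1 + c2*u2) + (c3*w1), with the first part in U and the second in W
v = sp.Matrix([2, 5, 3])
c = B.solve(v)
u_part = c[0] * u1 + c[1] * u2
w_part = c[2] * w1
assert u_part + w_part == v
```

Uniqueness of the coefficients \(c\) (and hence of the decomposition) is exactly the invertibility of \(B\text{.}\)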
The argument given in the second part of Example 5.4.7 has an immediate but important consequence: if \(V = U\oplus W\text{,}\) then \(\dim V = \dim U + \dim W\text{.}\)

Example 5.4.9.

Suppose \(V=U\oplus W\text{,}\) where \(U\) and \(W\) are \(T\)-invariant subspaces for some operator \(T:V\to V\text{.}\) Let \(B_U=\basis{u}{m}\) and let \(B_W = \basis{w}{n}\) be bases for \(U\) and \(W\text{,}\) respectively. Determine the matrix of \(T\) with respect to the basis \(B=B_U\cup B_W\) of \(V\text{.}\)
Solution.
Since we don’t know the map \(T\) or anything about the bases \(B_U,B_W\text{,}\) we’re looking for a fairly general statement here. Since \(U\) is \(T\)-invariant, we must have \(T(\uu_i)\in U\) for each \(i=1,\ldots, m\text{.}\) Similarly, \(T(\ww_j)\in W\) for each \(j=1,\ldots, n\text{.}\) This means that we have
\begin{align*} T(\uu_1) \amp = a_{11}\uu_1 + \cdots + a_{m1}\uu_m + 0\ww_1+\cdots + 0\ww_n\\ \amp \vdots \\ T(\uu_m) \amp = a_{1m}\uu_1 + \cdots + a_{mm}\uu_m+0\ww_1+\cdots + 0\ww_n\\ T(\ww_1) \amp = 0\uu_1 + \cdots + 0\uu_m+b_{11}\ww_1 + \cdots + b_{n1}\ww_n \\ \amp \vdots \\ T(\ww_n) \amp = 0\uu_1 + \cdots + 0\uu_m+b_{1n}\ww_1 + \cdots + b_{nn}\ww_n \end{align*}
for some scalars \(a_{ij},b_{ij}\text{.}\) If we set \(A = [a_{ij}]_{m\times m}\) and \(B = [b_{ij}]_{n\times n}\text{,}\) then we have
\begin{equation*} M_B(T) = \bbm A \amp 0\\0\amp B\ebm\text{.} \end{equation*}
Moreover, we can also see that \(A = M_{B_U}(T|_U)\text{,}\) and \(B = M_{B_W}(T|_W)\text{.}\)
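This block-diagonal structure can be seen in a concrete computation. The sketch below (a hypothetical operator on \(\mathbb{R}^3\text{,}\) not from the text) builds \(T\) so that a plane \(U\) and a line \(W\) are both invariant, then verifies that changing to the basis \(B = B_U\cup B_W\) produces the block-diagonal matrix \(M_B(T)\text{:}\)

```python
import sympy as sp

# Hypothetical setup: B_U = {u1, u2} spans a T-invariant plane U, and
# B_W = {w1} spans a T-invariant line W, with R^3 = U ⊕ W
u1 = sp.Matrix([1, 0, 0])
u2 = sp.Matrix([0, 1, 0])
w1 = sp.Matrix([1, 1, 1])
P = sp.Matrix.hstack(u1, u2, w1)    # change-of-basis matrix for B = B_U ∪ B_W

# Construct T with both subspaces invariant: in the basis B its matrix is
# block diagonal, with A_block = M_{B_U}(T|_U) and B_block = M_{B_W}(T|_W)
A_block = sp.Matrix([[2, 1], [0, 3]])
B_block = sp.Matrix([[5]])
D = sp.diag(A_block, B_block)
T = P * D * P.inv()                  # matrix of T in the standard basis

# Recover the block-diagonal form: M_B(T) = P^{-1} T P
assert P.inv() * T * P == D

# Check invariance directly
for u in (u1, u2):
    assert (T * u)[2] == 0           # image has third coordinate 0, so it stays in U
assert T * w1 == 5 * w1              # image stays in W (w1 is an eigenvector here)
```

In the standard basis, \(T\) has no visible block structure; only the basis adapted to the invariant decomposition reveals it.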