We close this chapter with some discussion of a theoretical nature. First, we will attempt to gain some additional insight into the (initially mysterious) definition of matrix multiplication by revisiting it from the point of view of linear transformations. We’ll then introduce two fundamental subspaces associated with a matrix transformation.
Subsection 5.4.1 Matrix multiplication as function composition
Recall that one of the ways we can obtain new functions from old ones is via function composition. Given two functions \(f(x)\) and \(g(x)\) (where \(x\in\R\)), we can form the compositions
\begin{equation*}
(f\circ g)(x) = f(g(x)) \quad\text{and}\quad (g\circ f)(x) = g(f(x))\text{.}
\end{equation*}
Both functions here are defined from \(\R\) to \(\R\text{,}\) but in general, neither needs to be a linear transformation in the sense of this section. In fact, if \(f:\R\to\R\) satisfies Definition 5.2.3, then we must have \(f(x) = ax\) for some real number \(a\text{.}\) If \(g(x) = bx\) is another linear transformation from \(\R\) to \(\R\text{,}\) notice that we have
\begin{equation*}
(g\circ f)(x) = g(f(x)) = g(ax) = b(ax) = (ba)x\text{.}
\end{equation*}
Thus, to compose two linear transformations from \(\R\) to \(\R\text{,}\) we simply multiply the constants used to define the transformations.
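For example, if \(f(x) = 2x\) and \(g(x) = 3x\text{,}\) then
\begin{equation*}
(g\circ f)(x) = g(f(x)) = g(2x) = 3(2x) = 6x\text{,}
\end{equation*}
so the composition is again a linear transformation, and its constant \(6\) is the product of the constants \(3\) and \(2\text{.}\)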
Now, what about a general linear transformation \(S:\R^n\to \R^m\text{?}\) We know that any such transformation is a matrix transformation: we must have
\begin{equation*}
S(\vec x) = A\vec x
\end{equation*}
for any \(\vec x\in\R^n\text{,}\) where \(A\) is an \(m\times n\) matrix. Since we’re multiplying an \(m\times n\) matrix by an \(n\times 1\) matrix, the rules of matrix multiplication ensure that the output \(\vec y = A\vec x\) is an element of \(\R^m\text{.}\)
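Schematically, the sizes match up like this:
\begin{equation*}
\underbrace{A}_{m\times n}\;\underbrace{\vec x}_{n\times 1} = \underbrace{\vec y}_{m\times 1}\text{.}
\end{equation*}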
Suppose now that we want to define the composition \((S\circ T)(\vec x)\) for some other linear transformation \(T\text{.}\) Recall the following rule of function composition:
In order for the composition \(S\circ T\) to be defined, the range of \(T\) must be contained in the domain of \(S\).
That is, since \(S\circ T\) is defined by \((S\circ T)(\vec x) = S(T(\vec x))\text{,}\) the vector \(T(\vec x)\) (which by definition is in the range of \(T\)) must belong to the domain of \(S\text{.}\) This means that we must have \(T(\vec x)\in \R^n\text{,}\) so we have
\begin{equation*}
T:\R^k\to \R^n
\end{equation*}
for some natural number \(k\text{.}\) On the other hand, we know that if \(T\) is a linear transformation, then \(T\) is defined by matrix multiplication: \(T(\vec x) = B\vec x\) for some \(n\times k\) matrix \(B\text{.}\)
Let us now recall one of the rules of matrix multiplication:
For the matrix product \(AB\) to be defined, the number of columns of \(A\) must equal the number of rows of \(B\).
That is, if \(A\) is an \(m\times n\) matrix, then \(B\) must be an \(n\times k\) matrix for some \(k\text{.}\) But this is the same conclusion as above! What is the connection? Well, if we follow the rules for function composition, if \(T(\vec x) = B\vec x\) and \(S(\vec y) = A\vec y\text{,}\) we must have
\begin{equation*}
(S\circ T)(\vec x) = S(T(\vec x)) = A(B\vec x) = (AB)\vec x\text{,}
\end{equation*}
where the last equality is due to the associative property of matrix multiplication from Theorem 4.2.11. Thus, we see that:
Composition of linear transformations is the same as multiplication of the corresponding matrices!
Looking at things from the point of view of matrix transformations gives us two insights into the nature of matrix multiplication:
When \(A\) and \(B\) are both \(n\times n\) matrices, the transformations \(S(\vec x) = A\vec x\) and \(T(\vec x) = B\vec x\) are both maps from \(\mathbb{R}^n\) to \(\mathbb{R}^n\text{,}\) and we can define both
\begin{equation*}
(S\circ T)(\vec x) = (AB)\vec x \quad\text{and}\quad (T\circ S)(\vec x) = (BA)\vec x\text{.}
\end{equation*}
Our experience with functions teaches us that most of the time, \(S\circ T \neq T\circ S\text{,}\) so of course it makes sense that \(AB\neq BA\) in general!
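For example, taking
\begin{equation*}
A = \bbm 1\amp 1\\0\amp 1\ebm \quad\text{and}\quad B = \bbm 1\amp 0\\1\amp 1\ebm\text{,}
\end{equation*}
we find
\begin{equation*}
AB = \bbm 2\amp 1\\1\amp 1\ebm\text{,}\quad\text{while}\quad BA = \bbm 1\amp 1\\1\amp 2\ebm\text{,}
\end{equation*}
so \(AB\neq BA\text{,}\) and correspondingly \(S\circ T\neq T\circ S\text{.}\)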
The fact that \(AB\) is defined only when the number of columns of \(A\) matches the number of rows of \(B\) is simply a consequence of the fact that \(S\circ T\) is only defined if the range of \(T\) is a subset of the domain of \(S\text{.}\)
What about the “row times column” rule for determining the entries of \(AB\text{?}\) Let’s look at how things work in 2D. Suppose we’ve defined linear transformations
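\(S(\vec x) = A\vec x\) and \(T(\vec x) = B\vec x\text{,}\) where (in notation we choose here for illustration)
\begin{equation*}
A = \bbm a_{11}\amp a_{12}\\a_{21}\amp a_{22}\ebm \quad\text{and}\quad B = \bbm b_{11}\amp b_{12}\\b_{21}\amp b_{22}\ebm\text{.}
\end{equation*}
Computing the composition directly, we find
\begin{equation*}
S(T(\vec x)) = A\bbm b_{11}x_1+b_{12}x_2\\b_{21}x_1+b_{22}x_2\ebm = \bbm a_{11}b_{11}+a_{12}b_{21}\amp a_{11}b_{12}+a_{12}b_{22}\\a_{21}b_{11}+a_{22}b_{21}\amp a_{21}b_{12}+a_{22}b_{22}\ebm\vec x\text{.}
\end{equation*}
Notice that each entry of the resulting matrix is a row of \(A\) times a column of \(B\text{,}\)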
and this is exactly the rule for multiplying two \(2\times 2\) matrices! Of course, we can repeat the above argument for the general case where \(A\) is \(m\times n\) and \(B\) is \(n\times k\text{,}\) but you can probably guess that the algebra gets a bit messy on that one, so we’ll spare you the details.
Subsection 5.4.2 Column space
When we discussed the composition of linear transformations above, we briefly mentioned that this involves the consideration of the range. Recall that the range of a function is the set of all possible outputs when every input in the domain is considered. For example, with the function \(f(x)=x^2\text{,}\) where \(x\) can be any real number, the range is the set of all real numbers \(y\geq 0\text{.}\) (If \(y=x^2\) and \(x\) is real, then \(y\) can’t be negative.)
If we’re given a linear transformation \(T:\R^n\to \R^m\text{,}\) we might want to know what sort of vectors \(\vec y \in \R^m\) can be obtained from \(T\text{.}\) Consider Examples 5.1.3 and 5.1.5 from way back at the beginning of the section. In Example 5.1.3, the vectors \(A\vec x\) and \(A\vec y\) were non-parallel, and therefore independent. It follows that for any other vector \(\vec z\in\R^2\text{,}\) we can find scalars \(a\) and \(b\) such that
\begin{equation*}
\vec z = a(A\vec x) + b(A\vec y) = A(a\vec x + b\vec y)\text{,}
\end{equation*}
so every vector in \(\R^2\) can be written as the output of the transformation \(T(\vec x) = A\vec x\text{.}\) On the other hand, using the matrix \(A = \bbm 1\amp -1\\1\amp -1\ebm\) in Example 5.1.5, for any vector \(\bbm a\\b\ebm\in\R^2\text{,}\) we have
\begin{equation*}
A\bbm a\\b\ebm = \bbm 1\amp -1\\1\amp -1\ebm\bbm a\\b\ebm = \bbm a-b\\a-b\ebm = (a-b)\bbm 1\\1\ebm\text{,}
\end{equation*}
so the only vectors in the range of \(T(\vec x) = A\vec x\) are those parallel to the vector \(\bbm 1\\1\ebm\text{.}\)
Next, we’re going to consider a general matrix transformation \(T:\R^n\to \R^m\) given by \(T(\vec x) = A\vec x\text{,}\) but we’ll play around with the multiplication a little bit. Writing \(\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_n\) for the columns of \(A\text{,}\) by definition (and a bit of manipulation), we have
\begin{equation*}
A\vec x = \bbm \vec{a}_1\amp \vec{a}_2\amp \cdots\amp \vec{a}_n\ebm\bbm x_1\\x_2\\\vdots\\x_n\ebm = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_n\vec{a}_n\text{.}
\end{equation*}
Thus, whenever we multiply a vector by a matrix, the result is a linear combination of the columns of \(A\text{!}\) If we think of each column of \(A\) as a column vector in \(\R^m\text{,}\) we can make the following definition:
Definition 5.4.1. The column space of a matrix.
The column space of an \(m\times n\) matrix \(A\) is the subspace of \(\R^m\) spanned by the columns of \(A\text{:}\)
\begin{equation*}
\operatorname{col}(A) = \operatorname{span}\{\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_n\}\text{,}
\end{equation*}
where \(\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_n\) denote the columns of \(A\text{.}\)
This definition comes with two observations. First, the column space of \(A\) is precisely the range of the matrix transformation \(T(\vec x) = A\vec x\text{:}\) we saw above that every output \(A\vec x\) is a linear combination of the columns of \(A\text{,}\) and conversely, every such linear combination is an output of \(T\text{.}\) Second, since the range of \(T\) can be written as a span, it is automatically a subspace of \(\R^m\) according to Theorem 5.3.5. The range of a linear transformation is one of the more important examples of a subspace.
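For example,
\begin{equation*}
\bbm 1\amp 2\\3\amp 4\ebm\bbm 5\\6\ebm = 5\bbm 1\\3\ebm + 6\bbm 2\\4\ebm = \bbm 17\\39\ebm\text{,}
\end{equation*}
so the output \(\bbm 17\\39\ebm\) belongs to the span of the columns \(\bbm 1\\3\ebm\) and \(\bbm 2\\4\ebm\text{.}\)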
To give a more useful description of the column space, we rely on Theorem 5.4.2 below, whose proof is too technical for this text. To help with the statement of this theorem, we first introduce one more bit of terminology. We will call a column of a matrix \(A\) a pivot column if the corresponding column in the reduced row echelon form of \(A\) contains a leading 1.
Theorem 5.4.2. Basis for the column space of a matrix.
A basis for the column space of an \(m\times n\) matrix \(A\) is given by the set of pivot columns of \(A\text{.}\)
We will illustrate Theorem 5.4.2 with an example. It’s important to note that while we need to find the reduced row echelon form of \(A\) in order to find the pivot columns, the columns we want are those of the original matrix \(A\text{,}\) not its RREF.
Example 5.4.3. Finding a basis for the column space.
Determine a basis for the column space of the matrix
Row reducing \(A\) to its reduced row echelon form \(R\text{,}\) we note that \(R\) has leading 1s in columns 1, 2, and 3. It follows that
\begin{equation*}
B = \left\{\bbm 1\\2\\-1\ebm, \bbm 0\\-1\\1\ebm, \bbm 2\\0\\3\ebm\right\}
\end{equation*}
is a basis for \(\operatorname{col}(A)\text{.}\)
Let’s make a few observations about the previous example. Notice that we have three leading 1s, so \(\operatorname{rank}(A) = 3\text{.}\) In particular, there is a leading 1 in each row, so we’re guaranteed that the system \(\ttaxb\) is consistent, no matter what the vector \(\vec{b}\) is. Since the number of pivot columns of \(A\) is equal to the number of leading 1s, we obtain the following result.
Theorem 5.4.4. Dimension of the column space.
The dimension of the column space of a matrix \(A\) (or equivalently, the dimension of the range of the matrix transformation defined by \(A\)) is equal to the rank of \(A\text{.}\)
To see why this result can be useful, note that in the previous example, the matrix transformation \(T(\vx) = A\vx\) defines a linear transformation \(T:\R^4\to \R^3\text{.}\) There are three vectors in the basis for \(\operatorname{col}(A)\text{,}\) so the column space of \(A\) (and thus the range of \(T\)) is three-dimensional. The only three-dimensional subspace of \(\R^3\) is \(\R^3\) itself, so the range of \(T\) is all of \(\R^3\text{:}\) no matter what vector \(\vec b\in\R^3\) we choose, we’re guaranteed to be able to find a vector \(\vx \in \R^4\) such that \(A\vx = \vec b\text{.}\)
The key observation here is that the question “Does \(\ttaxb\) have a solution?” is equivalent to the question “Does the vector \(\vec{b}\) belong to \(\operatorname{col}(A)\text{?}\)” Unfortunately, while we may gain some insight from noticing that these questions are the same, we are no further ahead when it comes to answering them. Whatever version we prefer, the only way to get an answer is to compute the reduced row echelon form of \(\left[\begin{array}{c|c} A\amp \vec{b}\end{array}\right]\text{.}\)
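For instance (a small example chosen here for illustration), suppose \(A = \bbm 1\amp 2\\2\amp 4\ebm\) and \(\vec{b} = \bbm 1\\3\ebm\text{.}\) Row reducing the augmented matrix, we find
\begin{equation*}
\left[\begin{array}{cc|c} 1\amp 2\amp 1\\2\amp 4\amp 3\end{array}\right] \to \left[\begin{array}{cc|c} 1\amp 2\amp 1\\0\amp 0\amp 1\end{array}\right]\text{,}
\end{equation*}
which is inconsistent, so \(\vec{b}\) does not belong to \(\operatorname{col}(A) = \operatorname{span}\left\{\bbm 1\\2\ebm\right\}\text{.}\)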
Suppose we repeated Example 5.4.3 using the matrix \(A\) from Example 3.6.9. Both cases involved a matrix of size \(3\times 4\text{,}\) but the matrix from Example 3.6.9 had rank 2, so the column space of \(A\) is only two-dimensional. In this case, the system \(\ttaxb\) will be consistent if \(\vec{b}\) belongs to the span of the first two columns of \(A\text{,}\) and inconsistent otherwise.
Reading off the first two columns of \(A\text{,}\) we find that \(\operatorname{col}(A)\) is the span of those two columns. We know that this is a plane through the origin in \(\mathbb{R}^3\text{,}\) but how do we quickly determine what vectors belong to this plane? There’s an easy way and a hard way. The easy way is to compute the cross product of the two columns, as we did in many of the problems from Section 2.6. The cross product gives us a normal vector for the plane, and we find that a vector \(\bbm a\\b\\c\ebm\) belongs to the column space of \(A\) if and only if
\begin{equation*}
-4a+2b+c=0,
\end{equation*}
using the scalar form for the equation of a plane in \(\mathbb{R}^3\text{.}\) Having done it the easy way, let us do things once more the hard way. (Why do it the hard way if the easy way works? Because if we’re in any other case than a two-dimensional subspace of \(\R^3\text{,}\) the hard way is the only option we have!) The hard way is to solve the equation \(\ttaxb\) for an arbitrary vector \(\vb = \bbm a\\b\\c\ebm\text{.}\) As with the previous examples, we set up the augmented matrix and reduce:
We stopped before getting all the way to the reduced row echelon form, but we’re far enough along to realize that the only way our system can be consistent is if the last entry in the third row is equal to zero. This gives us the condition
\begin{equation*}
-\tfrac{2}{5}a+\tfrac{1}{5}b+\tfrac{1}{10}c=0\text{,}
\end{equation*}
which (after multiplying both sides by 10) is exactly the same as what we found using the cross product.
Subsection 5.4.3 Null space
The other important example of a subspace associated with a matrix is the null space. The null space of an \(m\times n\) matrix \(A\) is simply the set of all vectors \(\vec{x}\in\R^n\) such that \(A\vec x = \vec 0\text{.}\)
Definition 5.4.5. The null space of a matrix.
The null space of an \(m\times n\) matrix \(A\) is denoted by \(\operatorname{null}(A)\text{,}\) and defined by
\begin{equation*}
\operatorname{null}(A) = \{\vec x\in\R^n \,|\, A\vec x = \vec 0\}\text{.}
\end{equation*}
For example, we saw in Example 5.1.5 that the vector \(\vec x = \bbm 1\\1\ebm\) belongs to the null space of the matrix \(A = \bbm 1\amp -1\\1\amp -1\ebm\text{,}\) since
\begin{equation*}
A\vec x = \bbm 1\amp -1\\1\amp -1\ebm\bbm 1\\1\ebm = \bbm 1-1\\1-1\ebm = \bbm 0\\0\ebm\text{.}
\end{equation*}
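On the other hand, the vector \(\bbm 1\\0\ebm\) does not belong to \(\operatorname{null}(A)\text{,}\) since
\begin{equation*}
\bbm 1\amp -1\\1\amp -1\ebm\bbm 1\\0\ebm = \bbm 1\\1\ebm \neq \bbm 0\\0\ebm\text{.}
\end{equation*}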
Given a general \(m\times n\) matrix \(A\text{,}\) we know from Section 3.6 that determining the null space amounts to simply solving a homogeneous system of linear equations. Let us see how this works in an example.
Example 5.4.6. Determining the null space of a matrix.
Since the null space of \(A\) is equal to the set of all solutions \(\vx\) to the matrix equation \(A\vec x = \vec 0\text{,}\) we proceed by forming the proper augmented matrix and putting it into reduced row echelon form, which we do below.
We see that the null space of \(A\) contains infinitely many solutions to the equation \(A\vx = \vec 0\text{;}\) any choice of \(x_2\) gives us one of these solutions. For instance, picking \(x_2=2\) gives the solution
This is a particularly nice solution, since there are no fractions! In fact, since the parameter \(t\) can take on any real value, there is nothing preventing us from defining a new parameter \(s = t/2\text{,}\) and then
Our solutions are multiples of a vector, and hence we can graph this, as done in Figure 5.4.7.
In Example 5.4.6, we saw that the solution is a line through the origin, and thus, we can conclude that \(\operatorname{null}(A)\) is a subspace! In fact, this is no coincidence: it is guaranteed by our next theorem.
Theorem 5.4.8. The null space of a matrix is a subspace.
For any \(m\times n\) matrix \(A\text{,}\) \(\operatorname{null}(A)\) is a subspace of \(\R^n\text{.}\)
The proof of this theorem is simple. First, note that \(\operatorname{null}(A)\) is nonempty, since \(A\vec 0 = \vec 0\) gives \(\vec 0\in\operatorname{null}(A)\text{.}\) Now suppose \(\vec x, \vec y\in \operatorname{null}(A)\text{.}\) By definition, this means \(A\vec x = \vec 0\) and \(A\vec y = \vec 0\text{.}\) Using the properties of matrix multiplication, we have
\begin{equation*}
A(\vec x + \vec y) = A\vec x + A\vec y = \vec 0 + \vec 0 = \vec 0\text{,}
\end{equation*}
so \(\vec x + \vec y\in \operatorname{null}(A)\text{.}\) Similarly, for any scalar \(c\text{,}\)
\begin{equation*}
A(c\vec x) = c(A\vec x) = c\vec 0 = \vec 0\text{,}
\end{equation*}
so \(c\vec x\in \operatorname{null}(A)\text{.}\) It follows from the definition of a subspace that \(\operatorname{null}(A)\) is a subspace of \(\R^n\text{.}\)
In Section 2.7 we discussed the fact that whenever we have a subspace of \(\R^n\text{,}\) it can be useful to determine a basis for our subspace. Recall from Definition 3.6.8 that the general solution to a homogeneous system of linear equations can be written in terms of certain basic solutions. In the context of null spaces, these basic solutions are just such a basis.
Although we will not prove it here, the basic solutions to a homogeneous system are always linearly independent. Moreover, it follows from the definition of \(\operatorname{null}(A)\) that any \(\vx \in\operatorname{null}(A)\) can be written as a linear combination of the basic solutions. In the language of Section 2.7, the basic solutions to a homogeneous system \(A\vx = \vec 0\) form a basis for the null space of \(A\text{.}\) This is an important point to remember, so we emphasize it in the following Key Idea.
Key Idea 5.4.9. Basis for the null space of a matrix.
The basic solutions to the homogeneous system \(A\vec{x}=\vec{0}\) form a basis for the null space of \(A\text{.}\) That is, if \(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k\) are the basic solutions to \(A\vec{x}=\vec{0}\text{,}\) then
\begin{equation*}
\operatorname{null}(A) = \operatorname{span}\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k\}\text{.}
\end{equation*}
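For instance (a small example we include for illustration), if \(A = \bbm 1\amp 2\ebm\text{,}\) then solving \(A\vx = \vec 0\) gives \(x_1 = -2x_2\text{,}\) so setting the free variable \(x_2 = t\) yields
\begin{equation*}
\vx = \bbm -2t\\t\ebm = t\bbm -2\\1\ebm\text{,}\quad\text{so}\quad \operatorname{null}(A) = \operatorname{span}\left\{\bbm -2\\1\ebm\right\}\text{,}
\end{equation*}
and the single basic solution \(\bbm -2\\1\ebm\) is a basis for \(\operatorname{null}(A)\text{.}\)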
Again, determining \(\operatorname{null}(A)\) is the same as solving the homogeneous system \(A\vx = \vec 0\text{,}\) and by Key Idea 5.4.9, a basis for \(\operatorname{null}(A)\) is given by the basic solutions to this system. As usual, to find the basic solutions, we set up the augmented matrix of the system and reduce:
are the basic solutions to \(A\vx = \vec 0\text{.}\) Since the null space of \(A\) is equal to the set of solutions to \(A\vx = \vec 0\text{,}\) and since every solution to \(A\vx = \vec 0\) can be written in terms of \(\vvv\) and \(\vec{w}\text{,}\) it follows that
and that \(\{\vvv, \vec{w}\}\) is a basis for \(\operatorname{null}(A)\text{.}\)
Another reason the null space is interesting is that it lets us determine whether or not a linear transformation is one-to-one. Suppose \(T:\R^n\to\R^m\) is a linear transformation defined by \(T(\vec x) = A\vec x\text{.}\) We know that \(T(\vec 0) = \vec 0\text{,}\) so \(\vec 0\in \operatorname{null}(A)\) (as it must be, since \(\operatorname{null}(A)\) is a subspace). If we have any non-zero vector \(\vec v\in\operatorname{null}(A)\text{,}\) then \(T\) cannot be one-to-one, since we’d have
\begin{equation*}
T(\vec v) = A\vec v = \vec 0 = A\vec 0 = T(\vec 0)\text{,}
\end{equation*}
even though \(\vec v\neq \vec 0\text{.}\)
Thus, if \(\operatorname{null}(A)\neq \{\vec 0\}\text{,}\) then \(T\) is not one-to-one. On the other hand, suppose \(\operatorname{null}(A)=\{\vec 0\}\text{,}\) and that \(T(\vec x) = T(\vec y)\) for vectors \(\vec x, \vec y\in \R^n\text{.}\) Then we have
\begin{equation*}
\vec 0 = T(\vec x) - T(\vec y) = A\vec x - A\vec y = A(\vec x - \vec y)\text{,}
\end{equation*}
so that \(\vec x - \vec y\in \operatorname{null}(A) = \{\vec 0\}\text{,}\) which means that \(\vec x - \vec y = \vec{0}\text{,}\) and thus \(\vec x = \vec y\text{.}\) We have proved the following:
Theorem 5.4.11. Null space and one-to-one transformations.
Let \(T:\R^n\to\R^m\) be defined by \(T(\vec x) = A\vec x\) for some \(m\times n\) matrix \(A\text{.}\) Then \(T\) is one-to-one if and only if \(\operatorname{null}(A) = \{\vec 0\}\text{.}\)
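For example, we saw above that the matrix \(A = \bbm 1\amp -1\\1\amp -1\ebm\) has the non-zero vector \(\bbm 1\\1\ebm\) in its null space, so the transformation \(T(\vec x) = A\vec x\) is not one-to-one: indeed,
\begin{equation*}
T\bbm 1\\1\ebm = \bbm 0\\0\ebm = T\bbm 0\\0\ebm\text{.}
\end{equation*}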
The final result we’ll state provides an interesting (and powerful) relationship between the null and column spaces.
Theorem 5.4.12. The Fundamental Theorem of Linear Transformations.
Let \(T:\R^n\to\R^m\) be a linear transformation defined by \(T(\vec x) = A\vec x\) for some \(m\times n\) matrix \(A\text{.}\) Then
\begin{equation*}
\dim \operatorname{col}(A) + \dim \operatorname{null}(A) = n\text{.}
\end{equation*}
This result is sometimes known as the “rank-nullity theorem”; it gives the relationship between the rank of a matrix \(A\text{,}\) which is equal to the dimension of its column space, and the nullity of \(A\text{,}\) which is defined to be the dimension of its null space.
A formal proof of this result is beyond the scope of this course, but the intuition we’ve gained from solving systems should make it plausible. Recall that Item 3 in Theorem 3.6.15 on the relationship between the rank of a matrix and types of solutions gives us the equation
\begin{equation*}
k + \operatorname{rank}(A) = n\text{,}
\end{equation*}
where \(k\) is the number of parameters in the general solution of \(A\vx = \vec b\text{.}\) Now, we know from Definition 3.6.8 that the number of parameters in the general solution to \(\ttaxb\) is equal to the number of basic solutions to the system \(A\vx = \vec 0\text{,}\) and that the basic solutions to \(\ttaxo\) form a basis for \(\operatorname{null}(A)\text{.}\) From this, we can conclude that
\begin{equation*}
k = \dim \operatorname{null}(A)\text{.}
\end{equation*}
We also claimed in Theorem 5.4.4 above that the rank of \(A\) is equal to the dimension of its column space. Putting these facts together, we can see why the rank-nullity theorem must hold.
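As a quick check, consider Example 5.4.3: there \(A\) was a \(3\times 4\) matrix (so \(n = 4\)) with \(\operatorname{rank}(A) = 3\text{.}\) The column space was three-dimensional, and the general solution to \(A\vx = \vec 0\) has \(4-3 = 1\) parameter, so the null space is one-dimensional. Sure enough,
\begin{equation*}
\dim \operatorname{col}(A) + \dim \operatorname{null}(A) = 3 + 1 = 4 = n\text{.}
\end{equation*}
Let’s confirm that the result holds in one more example.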
In the example below, we determine a basis for the null space of a matrix \(A\text{,}\) and whether or not the vector \(\vec{b}\) belongs to the column space of \(A\text{.}\)
Solution.
We’ll tackle the null space first. We form the augmented matrix for the system \(\ttaxo\text{,}\) put it into reduced row echelon form, and interpret the result.
It is easy to confirm that both \(\vu\) and \(\vvv\) are solutions to the linear system \(\tta\vx=\zero\text{.}\) (Just multiply \(\tta\vu\) and \(\tta\vvv\) and see that both are \(\zero\text{.}\)) Since both are solutions to a homogeneous system of linear equations, any linear combination of \(\vu\) and \(\vvv\) will be a solution, too. Moreover, \(\vu\) and \(\vvv\) are the basic solutions to this system, so by Key Idea 5.4.9 they form a basis for the null space of \(A\text{.}\)
Now let’s tackle the column space. Determining whether or not \(\vec b\) belongs to the column space is the same as solving the system \(\tta\vx = \vb\text{.}\) Once again we put the associated augmented matrix into reduced row echelon form and interpret the results.
Again, we pull apart this vector, but this time we break it into three vectors: one with “\(s\)” stuff, one with “\(t\)” stuff, and one with just constants.
Note that \(\tta\vec{x}_p = \vb\text{;}\) by itself, \(\vec{x}_p\) is a solution. The fact that we have at least one vector \(\vec{x}_p\) such that \(A\vec{x}_p=\vec{b}\) tells us that \(\vb\) belongs to the range of the transformation \(T(\vx) = A\vx\text{.}\) The fact that there is more than one solution corresponds to the fact that the null space of \(A\) is non-trivial.
Why don’t we graph this solution as we did in the past? Before we had only two variables, meaning the solution could be graphed in 2D. Here we have four variables, meaning that our solution “lives” in 4D. You can draw this on paper, but it is very confusing.
For further verification of Theorem 5.4.12, the reader is encouraged to revisit the examples of Section 3.6 and re-interpret them in the context of null space and column space.
Exercises 5.4.4 Exercises
Exercise Group.
Determine a basis for the null space and column space of the given matrix, and verify Theorem 5.4.12.