In the previous section we discussed standard transformations of the Cartesian plane — rotations, reflections, etc. As a motivating example for this section’s study, let’s consider another transformation. Let’s find the matrix that moves the unit square one unit to the right (see Figure 5.2.1). This is called a translation.
Our work from the previous section allows us to find the matrix quickly. By looking at the picture, it is easy to see that \(\veone\) is moved to \(\bbm 2\\0\ebm\) and \(\vetwo\) is moved to \(\bbm 1\\1\ebm\text{.}\) Therefore, the transformation matrix should be
However, look at Figure 5.2.2 where the unit square is drawn after being transformed by \(\tta\text{.}\) It is clear that we did not get the desired result; the unit square was not translated, but rather stretched/sheared in some way.
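This failure is easy to reproduce numerically. The following sketch (an illustrative aside in Python with NumPy, not part of the text’s development) applies the matrix built from the images of \(\veone\) and \(\vetwo\) to the four corners of the unit square:

```python
import numpy as np

# Columns are the images of e1 and e2: e1 -> (2,0), e2 -> (1,1).
A = np.array([[2, 1],
              [0, 1]])

# Corners of the unit square, stored as columns.
corners = np.array([[0, 1, 1, 0],
                    [0, 0, 1, 1]])

print(A @ corners)
# [[0 2 3 1]
#  [0 0 1 1]]
# A true translation would send the corners to (1,0), (2,0), (2,1), (1,1).
```

Note that the corner at the origin stays put, since \(A\vec 0 = \vec 0\); the square is sheared, not shifted.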
What did we do wrong? We will answer this question, but first we need to develop a few thoughts and vocabulary terms.
We’ve been using the term “transformation” to describe how we’ve changed vectors. In fact, “transformation” is synonymous with “function.” We are used to functions like \(f(x) = x^2\text{,}\) where the input is a number and the output is another number. In the previous section, we learned about transformations (functions) where the input was a vector and the output was another vector. If \(\tta\) is a “transformation matrix,” then we could create a function of the form \(T(\vx) = \tta\vx\text{.}\) That is, a vector \(\vx\) is the input, and the output is \(\vx\) multiplied by \(\tta\text{.}\)
When we defined \(f(x) = x^2\) above, we let the reader assume that the input was indeed a number. If we wanted to be complete, we should have stated
\begin{equation*}
f:\R\to\R \quad \text{ where } \quad f(x)=x^2\text{.}
\end{equation*}
The first part of that line told us that the input was a real number (that was the first \(\R\)) and the output was also a real number (the second \(\R\)).
To define a transformation where a 2D vector is transformed into another 2D vector via multiplication by a \(2\times 2\) matrix \(\tta\text{,}\) we should write
We now define a special type of transformation (function).
Definition 5.2.3. Linear Transformation.
A transformation \(T:\R^n\to\R^m\) is a linear transformation if it satisfies the following two properties:
\(T(\vx + \vy) = T(\vx) + T(\vy)\) for all vectors \(\vx\) and \(\vy\text{,}\) and
\(T(k\vx)= kT(\vx)\) for all vectors \(\vx\) and all scalars \(k\text{.}\)
If \(T\) is a linear transformation, it is often said that “\(T\) is linear.”
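For readers who like to experiment, the two defining properties can be spot-checked on random inputs. This is an illustrative Python/NumPy sketch, not part of the text; passing such a check is only evidence of linearity, while a single failure is a genuine disproof:

```python
import numpy as np

def looks_linear(T, n, trials=100, tol=1e-9):
    """Spot-check T(x+y) = T(x)+T(y) and T(kx) = kT(x) on random data.

    Passing is evidence of linearity, not a proof; a failure disproves it.
    """
    rng = np.random.default_rng(0)
    for _ in range(trials):
        x, y = rng.normal(size=n), rng.normal(size=n)
        k = rng.normal()
        if not np.allclose(T(x + y), T(x) + T(y), atol=tol):
            return False
        if not np.allclose(T(k * x), k * T(x), atol=tol):
            return False
    return True

A = np.array([[2, 1], [0, 1]])
print(looks_linear(lambda x: A @ x, 2))                 # True: a matrix map
print(looks_linear(lambda x: x + np.array([1, 0]), 2))  # False: translation
```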
Note that the two defining properties of a linear transformation, when combined, tell us that linear transformations “map linear combinations to linear combinations.” That is, if we know the values of \(T(\vec{v}_1), T(\vec{v}_2), \ldots, T(\vec{v}_k)\text{,}\) and we’re given \(\vec{v} = c_1\vec{v}_1+c_2\vec{v}_2+\cdots+c_k\vec{v}_k\text{,}\) then
Compute the value of \(T\left(\bbm 1\\11\ebm\right)\text{.}\)
Solution.
This problem takes some more work, since we first need to figure out how to write the vector \(\bbm 1\\11\ebm\) as a linear combination of the vectors \(\bbm 2\\1\ebm\) and \(\bbm -1\\3\ebm\text{.}\) That is, we need to find scalars \(a\) and \(b\) such that
If we multiply the first equation by 3 and add it to the second, we get \(7a+0b=14\text{,}\) so \(a=2\text{.}\) Plugging this value back into either equation gives us \(b=3\text{,}\) so we have
Notice that in Example 5.2.5, in order to make use of the properties of our linear transformation, we had to first solve a system of linear equations. With two equations in two unknowns, it’s not too hard to come up with the answer. As the size and number of the vectors involved increases, such problems cannot be tackled without a systematic method for solving systems of linear equations. Fortunately, we will be introducing just such a method in Chapter 3.
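As a computational aside (illustrative Python/NumPy, not part of the text), a linear-system solver finds the same coefficients:

```python
import numpy as np

# Columns are the two given vectors; solving A @ [a, b] = target
# finds the coefficients of the linear combination.
A = np.array([[2, -1],
              [1,  3]])
target = np.array([1, 11])
a, b = np.linalg.solve(A, target)
print(a, b)  # 2.0 3.0
```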
The previous two examples show us what we can do if we know in advance that our transformation is linear. The next two examples show us how to determine if a given transformation is indeed a linear transformation.
Example 5.2.6. Identifying linear transformations.
Determine whether or not the transformation \(T:\R^{2}\to\R^{3}\) is a linear transformation, where
So far it seems that \(T\) is indeed linear, for it worked in one example with arbitrarily chosen vectors and an arbitrarily chosen scalar. Now we need to show that it is always true.
Consider \(T(\vx+\vy)\text{.}\) By the definition of \(T\text{,}\) we have
By Item 2 of Theorem 4.2.11, the Distributive Property holds for matrix multiplication. (Recall that a vector is just a special type of matrix, so this theorem applies to matrix–vector multiplication as well.) So \(\tta(\vx+\vy) = \tta\vx + \tta\vy\text{.}\) Recognize now that this last part is just \(T(\vx) + T(\vy)\text{!}\) We repeat the above steps, all together:
\begin{align*}
T(\vx+\vy) \amp = \tta(\vx+\vy) \quad \text{ (by the definition of } T \text{ in this example)}\\
\amp = \tta\vx + \tta\vy \quad \text{ (by the Distributive Property)}\\
\amp = T(\vx) + T(\vy) \quad \text{ (again, by the definition of } T)\text{.}
\end{align*}
Therefore, no matter what \(\vx\) and \(\vy\) are chosen, \(T(\vx+\vy) = T(\vx) + T(\vy)\text{.}\) Thus the first part of the linearity definition is satisfied.
The second part is satisfied in a similar fashion. Let \(k\) be a scalar, and consider:
\begin{align*}
T(k\vx) \amp = \tta(k\vx) \quad \text{ (by the definition of } T \text{ in this example)}\\
\amp = k\tta\vx \quad \text{ (by Theorem 4.2.11, Item 3)}\\
\amp = kT(\vx) \quad \text{ (again, by the definition of } T)
\end{align*}
Since \(T\) satisfies both parts of the definition, we conclude that \(T\) is a linear transformation.
In the previous two examples of transformations, we saw one transformation that was not linear and one that was. One might wonder “Why is linearity important?”, which we’ll address shortly.
First, consider how we proved the transformation in Example 5.2.7 was linear. We defined \(T\) by matrix multiplication, that is, \(T(\vx) = \tta\vx\text{.}\) We proved \(T\) was linear using properties of matrix multiplication — we never considered the specific values of \(\tta\text{!}\) That is, we didn’t just choose a good matrix for \(T\text{;}\) any matrix \(\tta\) would have worked. This leads us to an important theorem. The first part we have essentially just proved; the second part we won’t prove, although its truth is very powerful.
Theorem 5.2.8. Matrices and Linear Transformations.
Define \(T:\R^{n}\to\R^{m}\) by \(T(\vx) = \tta\vx\text{,}\) where \(\tta\) is an \(m\times n\) matrix. Then \(T\) is a linear transformation.
Let \(T:\R^{n}\to\R^{m}\) be any linear transformation. Then there exists a unique \(m\times n\) matrix \(\tta\) such that \(T(\vx) = \tta\vx\text{.}\)
The second part of the theorem says that all linear transformations can be described using matrix multiplication. Given any linear transformation, there is a matrix that completely defines that transformation. This important matrix gets its own name.
Definition 5.2.9. Standard Matrix of a Linear Transformation.
Let \(T:\R^{n}\to\R^{m}\) be a linear transformation. By Theorem 5.2.8, there is a matrix \(\tta\) such that \(T(\vx) = \tta\vx\text{.}\) This matrix \(\tta\) is called the standard matrix of the linear transformation \(T\), and is denoted \([\, T\, ]\text{.}\)
While exploring all of the ramifications of Theorem 5.2.8 is outside the scope of this text, let it suffice to say that since 1) linear transformations are very, very important in economics, science, engineering and mathematics, and 2) the theory of matrices is well developed and easy to implement by hand and on computers, then 3) it is great news that these two concepts go hand in hand.
We have already used the second part of this theorem in a small way. In the previous section we looked at transformations graphically and found the matrices that produced them. At the time, we didn’t realize that these transformations were linear, but indeed they were.
This brings us back to the motivating example with which we started this section. We tried to find the matrix that translated the unit square one unit to the right. Our attempt failed, and we have yet to determine why. Given our link between matrices and linear transformations, the answer is likely “the translation transformation is not a linear transformation.” While that is a true statement, it doesn’t really explain things all that well. Is there some way we could have recognized that this transformation wasn’t linear? (That is, apart from applying the definition directly?)
Yes, there is. Consider the second part of the linear transformation definition. It states that \(T(k\vx) = kT(\vx)\) for all scalars \(k\text{.}\) If we let \(k=0\text{,}\) we have \(T(0\vx) = 0\cdot T(\vx)\text{,}\) or more simply, \(T(\zero) = \zero\text{.}\) That is, if \(T\) is to be a linear transformation, it must send the zero vector to the zero vector.
This is a quick way to see that the translation transformation fails to be linear. By shifting the unit square to the right one unit, the corner at the point \((0,0)\) was sent to the point \((1,0)\text{,}\) i.e.,
\begin{equation*}
\text{the vector } \bbm 0\\0\ebm \text{ was sent to the vector } \bbm 1\\0\ebm\text{.}
\end{equation*}
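We can confirm this failure in one line of code. The sketch below is an illustrative Python/NumPy aside (the function name `translate_right` is our own); it shows the translation violating the necessary condition \(T(\zero)=\zero\text{:}\)

```python
import numpy as np

def translate_right(x):
    # A hypothetical implementation of "shift one unit to the right".
    return x + np.array([1, 0])

zero = np.zeros(2)
print(translate_right(zero))  # [1. 0.] -- not the zero vector, so not linear
```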
This property relating to \(\zero\) is important, so we highlight it here.
Key Idea 5.2.10. Linear Transformations and \(\zero\).
Let \(T:\R^{n}\to\R^{m}\) be a linear transformation. Then:
That is, the zero vector in \(\R^{n}\) gets sent to the zero vector in \(\R^{m}\text{.}\)
The Standard Matrix of a Linear Transformation.
It is often the case that while one can describe a linear transformation, one doesn’t know what matrix performs that transformation (i.e., one doesn’t know the standard matrix of that linear transformation). How do we systematically find it? We’ll need a new definition.
Definition 5.2.11. Standard Unit Vectors.
In \(\R^{n}\text{,}\) the standard unit vectors \(\vei\) are the vectors with a \(1\) in the \(i\)th entry and \(0\)s everywhere else.
We’ve already seen these vectors in the previous section. In \(\R^{2}\text{,}\) we identified
How do these vectors help us find the standard matrix of a linear transformation? Recall again our work in the previous section. There, we practised looking at the transformed unit square and deducing the standard transformation matrix \(\tta\text{.}\) We did this by making the first column of \(\tta\) the vector where \(\veone\) ended up and making the second column of \(\tta\) the vector where \(\vetwo\) ended up. One could represent this with:
That is, \(T(\veone)\) is the vector where \(\veone\) ends up, and \(T(\vetwo)\) is the vector where \(\vetwo\) ends up.
The same holds true in general. Given a linear transformation \(T:\R^{n}\to\R^{m}\text{,}\) the standard matrix of \(T\) is the matrix whose \(i\)th column is the vector where \(\vei\) ends up. To see that this is the case, note that any vector \(\vec x\in\R^n\) can be written as
where \(A = \bbm T(\veone) \amp T(\vetwo) \amp \cdots \amp T(\ven{n})\ebm\) is the \(m\times n\) matrix whose columns are given by the vectors \(T(\ven{i})\text{,}\) for \(i=1,2,\ldots, n\text{.}\) Thus, we have the following theorem.
Theorem 5.2.12. The Standard Matrix of a Linear Transformation.
Let \(T:\R^{n}\to\R^{m}\) be a linear transformation. Then \([\, T \, ]\) is the \(m\times n\) matrix:
We find the columns of \(\TT\) by finding where \(\veone\text{,}\) \(\vetwo\) and \(\vethree\) are sent, that is, we find \(T(\veone)\text{,}\) \(T(\vetwo)\) and \(T(\vethree)\text{.}\)
They match! (Of course they do. That was the whole point.)
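The construction is easy to try on a computer. In this illustrative Python/NumPy sketch, the map \(T\) is made up for demonstration and is not one of the text’s examples; we build its standard matrix column by column from the images of the standard unit vectors and check that matrix multiplication reproduces \(T\text{:}\)

```python
import numpy as np

# An illustrative linear map T: R^2 -> R^3 (not from the text's examples).
def T(x):
    x1, x2 = x
    return np.array([x1 + 2*x2, 3*x1 - x2, x2])

# Stack T(e_i) as columns to get the standard matrix [T].
e = np.eye(2)
A = np.column_stack([T(e[:, i]) for i in range(2)])
print(A)

# Matrix multiplication now reproduces T on any input.
x = np.array([4.0, 5.0])
print(np.allclose(A @ x, T(x)))  # True
```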
Let’s do another example, one that is more application oriented.
Example 5.2.14. An application to baseball.
A baseball team manager has collected basic data concerning his hitters. He has the number of singles, doubles, triples, and home runs they have hit over the past year. For each player, he wants two more pieces of information: the total number of hits and the total number of bases.
Using the techniques developed in this section, devise a method for the manager to accomplish his goal.
Solution.
If the manager only wants to compute this for a few players, then he could do it by hand fairly easily. After all:
total # hits = # of singles + # of doubles + # of triples + # of home runs,
and
total # bases = # of singles + \(2\,\times\) # of doubles + \(3\,\times\) # of triples + \(4\,\times\) # of home runs.
However, if he has a lot of players to do this for, he would likely want a way to automate the work. One way of approaching the problem starts with recognizing that he wants to input four numbers into a function (i.e., the number of singles, doubles, etc.) and he wants two numbers as output (i.e., number of hits and bases). Thus he wants a transformation \(T:\R^{4}\to\R^{2}\) where each vector in \(\R^{4}\) can be interpreted as
\begin{equation*}
\bbm \text{\#\ of singles} \\\text{\#\ of doubles} \\ \text{\#\ of triples} \\ \text{\#\ of home runs} \ebm,
\end{equation*}
and each vector in \(\R^{2}\) can be interpreted as
\begin{equation*}
\bbm \text{\#\ of hits} \\\text{\#\ of bases} \ebm\text{.}
\end{equation*}
To find \(\TT\text{,}\) he computes \(T(\veone)\text{,}\) \(T(\vetwo)\text{,}\) \(T(\vethree)\) and \(T(\ven{4})\text{.}\)
meaning the player had 154 hits and 242 total bases.
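The manager’s computation is then a single matrix–vector product. In the sketch below (an illustrative Python/NumPy aside), the stat line is hypothetical, chosen only so that its totals agree with the 154 hits and 242 bases mentioned above:

```python
import numpy as np

# Standard matrix: row 1 counts hits, row 2 weights bases by 1, 2, 3, 4.
A = np.array([[1, 1, 1, 1],
              [1, 2, 3, 4]])

# A hypothetical stat line (singles, doubles, triples, home runs);
# the text does not give the player's actual numbers.
stats = np.array([95, 42, 5, 12])
hits, bases = A @ stats
print(hits, bases)  # 154 242
```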
A question that we should ask concerning the previous example is “How do we know that the function the manager used was actually a linear transformation? After all, we were wrong before — the translation example at the beginning of this section had us fooled at first.”
This is a good point; the answer is fairly easy. Recall from Example 5.2.6 the transformation
where we use the subscripts for \(T\) to remind us which example they came from.
We found that \(T_{\text{5.2.6}}\) was not linear, but stated that \(T_{\text{5.2.7}}\) was (although we didn’t prove this). What made the difference?
Look at the entries of \(T_{\text{5.2.6}}(\vx)\) and \(T_{\text{5.2.7}}(\vx)\text{.}\) \(T_{\text{5.2.6}}\) contains entries where a variable is squared, and where two variables are multiplied together — these prevent \(T_{\text{5.2.6}}\) from being linear. On the other hand, the entries of \(T_{\text{5.2.7}}\) are all of the form \(a_1x_1 + \cdots + a_nx_n\text{;}\) that is, they are just sums of the variables multiplied by coefficients. \(T\) is linear if and only if the entries of \(T(\vx)\) are of this form. (Hence linear transformations are related to linear equations, as defined in Section 3.1.) This idea is important.
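A squared entry really does break additivity, as a quick numerical check shows. The map below is illustrative (Python/NumPy, not the exact transformation from the example), but it has the offending features of a squared variable and a product of two variables:

```python
import numpy as np

# A hypothetical map with a squared entry and a product of variables.
def T(x):
    x1, x2 = x
    return np.array([x1**2, x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(T(x + y))     # [16. 24.]
print(T(x) + T(y))  # [10. 14.]  -- additivity fails, so T is not linear
```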
Key Idea 5.2.15. Conditions on Linear Transformations.
Let \(T:\R^{n}\to\R^{m}\) be a transformation and consider the entries of
Since that fits the model shown in Key Idea 5.2.15, the transformation \(T\) is indeed linear and hence we can find a matrix \(\TT\) that represents it.
Let’s practise this concept further in an example.
Example 5.2.16. Using Key Idea 5.2.15 to identify linear transformations.
Using Key Idea 5.2.15, determine whether or not each of the following transformations is linear.
\(T_1\) is not linear! This may come as a surprise, but we are not allowed to add constants to the variables. By thinking about this, we can see that this transformation is trying to accomplish the translation that got us started in this section — it adds 1 to all the \(x\) values and leaves the \(y\) values alone, shifting everything to the right one unit. However, this is not linear; again, notice how \(\zero\) does not get mapped to \(\zero\text{.}\)
\(T_2\) is also not linear. We cannot divide variables, nor can we put variables inside the square root function (among other things; again, see Section 3.1). This means that the baseball manager would not be able to use matrices to compute a batting average, which is (number of hits)/(number of at bats).
\(T_3\) is linear. Recall that the coefficients \(\sqrt{7}\) and \(\pi\) are just numbers.
In the next section we introduce the concept of a subspace. This subject is closely related to the concepts of span and linear independence covered in Section 2.7. We introduce it here in preparation for our final section, where we will define two important subspaces associated to a matrix transformation: the null space and the column space.
Exercises
Exercise Group.
A transformation \(T\) is given. Determine whether or not \(T\) is linear; if not, state why not.