We already looked at the basics of graphing vectors, and the arithmetic of multiplying matrices. In this section and the next, we will return to the geometric interpretation of vectors given in Chapter 2. Our goal in doing so is to obtain a visual understanding of the definition of matrix multiplication given in Section 4.2. Although the algebraic definition of matrix multiplication appears strange at first, we’ll see that our definition of matrix multiplication allows us to use matrices to define functions that transform one vector into another, just as the functions you’re familiar with from high school or Calculus transform one number into another. We can then visualize matrices and matrix multiplication in terms of their effect on vectors.
Given an \(m\times n\) matrix \(A\) we can define a function \(T\) that takes an \(n\times 1\) column vector \(\vec x\in \mathbb{R}^n\) as input, and produces an \(m\times 1\) column vector \(\vec y\in \mathbb{R}^m\) as output, according to the relationship
\begin{equation*}
\vec y = T(\vec x) = A\vec x\text{.}
\end{equation*}
Such a function is known as a matrix transformation; it is an example of a more general class of functions between vector spaces known as linear transformations.
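For instance, with a \(2\times 3\) matrix chosen purely for illustration, say
\begin{equation*}
A = \bbm 1 \amp 0 \amp 2\\ -1 \amp 3 \amp 1\ebm \quad\text{and}\quad \vec x = \bbm 1\\1\\1\ebm,
\end{equation*}
we get \(\vec y = T(\vec x) = A\vec x = \bbm 3\\3\ebm\text{:}\) this \(T\) takes vectors in \(\mathbb{R}^3\) as input and produces vectors in \(\mathbb{R}^2\) as output.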
The graphical representation of vectors allows us to visualize matrix transformations (at least, in lower dimensions). This visualization plays a key role in applications such as computer graphics. We’ll also see that the desire to define functions via matrix multiplication provides some justification for defining matrix multiplication the way we do.
Subsection 5.1.1 Matrix-Vector Multiplication
To simplify the discussion, and to make it easier for us to picture what’s going on, we’ll restrict ourselves (for now) to vectors in \(\mathbb{R}^2\text{.}\) We want to visualize the result of multiplying a vector by a matrix. In order to multiply a 2D vector by a matrix and get a 2D vector back, our matrix must be a square, \(2\times 2\) matrix.
We’ll start with an example. Given a matrix \(\tta\) and several vectors, we’ll graph the vectors before and after they’ve been multiplied by \(\tta\) and see what we learn.
Example 5.1.1. Multiplying a vector by a matrix.
Let \(\tta\) be a matrix, and \(\vx\text{,}\) \(\vy\text{,}\) and \(\vz\) be vectors as given below.
There are several things to notice. When each vector is multiplied by \(\tta\text{,}\) the result is a vector with a different length (in this example, always longer), and in two of the cases (for \(\vy\) and \(\vz\)), the resulting vector points in a different direction.
This isn’t surprising. In the previous section we learned about matrix multiplication, which is a strange and seemingly unpredictable operation. Would you expect to see some sort of immediately recognizable pattern appear from multiplying a matrix and a vector? (This is a rhetorical question; the expected answer is “No.”) In fact, the surprising thing from the example is that \(\vx\) and \(\tta\vx\) point in the same direction! Why does the direction of \(\vx\) not change after multiplication by \(\tta\text{?}\) (We’ll answer this in Section 7.1 when we learn about something called “eigenvectors.”)
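If you want to experiment with this yourself, a few lines of numpy will do the multiplication and the length/direction comparison for you. The matrix and vectors below are made up purely for illustration; they are not the ones pictured in the example.

```python
import numpy as np

# A made-up 2x2 matrix; swap in any matrix you like.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

for v in (np.array([1.0, 1.0]),
          np.array([1.0, 0.0]),
          np.array([0.0, 1.0])):
    Av = A @ v
    length_ratio = np.linalg.norm(Av) / np.linalg.norm(v)
    # The 2D "cross product" v_x*(Av)_y - v_y*(Av)_x is 0 exactly when
    # Av is parallel to v, i.e. when A did not change v's direction.
    same_direction = np.isclose(v[0] * Av[1] - v[1] * Av[0], 0.0)
    print(v, "->", Av, "| length ratio:", round(length_ratio, 3),
          "| same line:", same_direction)
```

For this particular made-up matrix, the vector \(\bbm 1\\1\ebm\) keeps its direction, a small preview of the eigenvectors mentioned above.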
Different matrices act on vectors in different ways. (That’s one reason we call them “different.”) Some always increase the length of a vector through multiplication, others always decrease the length, others increase the length of some vectors and decrease the length of others, and others still don’t change the length at all. A similar statement can be made about how matrices affect the direction of vectors through multiplication: some change every vector’s direction, some change the direction of “most” vectors but leave some unchanged, and others still don’t change the direction of any vector.
How do we set about studying how matrix multiplication affects vectors? We could just create lots of different matrices and lots of different vectors, multiply, then graph, but this would be a lot of work with very little useful result. It would be too hard to find a pattern of behaviour in this. (Remember, that’s what mathematicians do. We look for patterns.)
Instead, we’ll begin by using a technique we’ve employed often in the past. We have a “new” operation; let’s explore how it behaves with “old” operations. Specifically, we know how to sketch vector addition. What happens when we throw matrix multiplication into the mix? Let’s try an example.
Example 5.1.3. Combining addition and matrix multiplication.
Let \(\tta\) be a matrix and \(\vx\) and \(\vy\) be vectors as given below.
In Figure 5.1.4, we have graphed the above vectors and have included dashed gray vectors to highlight the additive nature of \(\vx+\vy\) and \(\tta(\vx+\vy)\text{.}\) Does anything strike you as interesting?
Let’s not focus on things which don’t matter right now: let’s not focus on how long certain vectors became, nor necessarily how their direction changed. Rather, think about how matrix multiplication interacted with the vector addition.
In some sense, we started with three vectors, \(\vx\text{,}\) \(\vy\text{,}\) and \(\vx+\vy\text{.}\) This last vector is special; it is the sum of the previous two. Now, multiply all three by \(\tta\text{.}\) What happens? We get three new vectors, but the significant thing is this: the last vector is still the sum of the previous two! (We emphasize this by drawing dotted vectors to represent part of the Parallelogram Law.)
Of course, we knew this already: we already knew that \(\tta\vx + \tta\vy = \tta(\vx+\vy)\text{,}\) for this is just the Distributive Property of matrix multiplication given in Theorem 4.2.11. However, now we get to see this graphically.
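For instance, with values chosen just for illustration, if
\begin{equation*}
\tta = \bbm 1 \amp 2\\ 0 \amp 1\ebm, \quad \vx = \bbm 1\\1\ebm, \quad \vy = \bbm 2\\0\ebm,
\end{equation*}
then \(\tta\vx = \bbm 3\\1\ebm\text{,}\) \(\tta\vy = \bbm 2\\0\ebm\text{,}\) and
\begin{equation*}
\tta(\vx+\vy) = \tta\bbm 3\\1\ebm = \bbm 5\\1\ebm = \tta\vx + \tta\vy\text{.}
\end{equation*}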
Let’s do one more example.
Example 5.1.5. Sketching the effect of matrix multiplication.
Let \(\tta\text{,}\) \(\vx\text{,}\) \(\vy\text{,}\) and \(\vz\) be as given below.
These results are interesting. While we won’t explore them in great detail here, notice how \(\vx\) got sent to the zero vector. Notice also that \(\tta\vx\text{,}\) \(\tta\vy\text{,}\) and \(\tta\vz\) are all in a line (as well as \(\vx\text{!}\)). Why is that? Are \(\vx\text{,}\) \(\vy\text{,}\) and \(\vz\) just special vectors, or would any other vector get sent to the same line when multiplied by \(\tta\text{?}\) (Don’t just sit there, try it out!)
Subsection 5.1.2 Transformations of the Cartesian Plane
We studied in Chapter 2 how to visualize vectors and how the matrix arithmetic operations of addition and scalar multiplication can be graphically represented for vectors. In the discussion above, we limited our visual understanding of matrix multiplication to graphing a vector, multiplying it by a matrix, then graphing the resulting vector. In the rest of this section we’ll explore these multiplication ideas in greater depth. Instead of multiplying individual vectors by a matrix \(\tta\text{,}\) we’ll study what happens when we multiply every vector in the Cartesian plane by \(\tta\text{.}\) (No, we won’t do them one by one.)
Because of the Distributive Property, as illustrated in Example 5.1.3, we know that the Cartesian plane will be transformed in a very nice, predictable way. Straight lines will be transformed into other straight lines (and they won’t become curvy, or jagged, or broken). Curved lines will be transformed into other curved lines (perhaps the curve will become “straight,” but it won’t become jagged or broken).
Example 5.1.3 has very significant implications. We usually think of the Cartesian plane as a set of points; we can adjust this thought just slightly and think of it as a set of vectors that point to each of these points. What happens to the Cartesian plane if we multiply every vector in the plane by the same matrix \(\tta\text{?}\)
Checking every single vector in the plane isn’t practical, so we have to look for other ways to visualize the effects of matrix multiplication. One way of studying how the whole Cartesian plane is affected by multiplication by a matrix \(\tta\) is to study how the unit square is affected. The unit square is the square with corners at the points \((0,0)\text{,}\) \((1,0)\text{,}\) \((1,1)\text{,}\) and \((0,1)\text{.}\) Each corner can be represented by the vector that points to it; multiply each of these vectors by \(\tta\) and we can get an idea of how \(\tta\) affects the whole Cartesian plane.
Let’s try an example.
Example 5.1.7. Visualizing a matrix transformation using vectors.
Plot the vectors of the unit square before and after they have been multiplied by \(\tta\text{,}\) where
(Hint: one way of using your calculator to do this for you quickly is to make a \(2\times 4\) matrix whose columns are each of these vectors. In this case, create a matrix
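\begin{equation*}
\ttb = \bbm 0 \amp 1 \amp 1 \amp 0\\ 0 \amp 0 \amp 1 \amp 1\ebm\text{,}
\end{equation*}
whose columns are the corner vectors listed above, and compute the product \(\tta\ttb\text{;}\) its columns are the transformed corners.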
This saves time, especially if you do a similar procedure for multiple matrices \(\tta\text{.}\) Of course, we can save more time by skipping the first column; since it is the column of zeros, it will stay the column of zeros after multiplication by \(\tta\text{.}\))
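In numpy, the same bookkeeping looks like the short sketch below. The matrix \(\tta\) used here is a stand-in chosen for illustration; any \(2\times 2\) matrix works the same way.

```python
import numpy as np

# A stand-in 2x2 matrix (the example's actual matrix works the same way).
A = np.array([[1.0, 4.0],
              [2.0, 3.0]])

# Columns are the unit square's corners: (0,0), (1,0), (1,1), (0,1).
B = np.array([[0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])

print(A @ B)   # columns are the transformed corners
```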
The unit square and its transformation are graphed in Figure 5.1.8, where the shaped vertices correspond to each other across the two graphs. Note how the square got turned into some sort of quadrilateral (it’s actually a parallelogram). A really interesting thing is how the triangular and square vertices seem to have changed places — it is as though the square, in addition to being stretched out of shape, was flipped.
To stress how “straight lines get transformed to straight lines,” consider Figure 5.1.9. Here, the unit square has some additional points drawn on it which correspond to the shaded dots on the transformed parallelogram. Note how relative distances are also preserved: the dot halfway between the black and square dots is transformed to the point halfway between the images of the black and square dots.
Much more can be said about this example. Before we delve into this, though, let’s try one more example.
Example 5.1.10. Visualizing a matrix transformation using a region.
Plot the transformed unit square after it has been transformed by \(\tta\text{,}\) where
We’ll put the vectors that correspond to each corner in a matrix \(\ttb\) as before and then multiply it on the left by \(\tta\text{.}\) Doing so gives:
In Figure 5.1.11 the unit square is again drawn along with its transformation by \(\tta\text{.}\)
Make note of how the square moved. It did not simply “slide” to the left (mathematically, that is called a translation), nor did it “flip” across the \(y\) axis. Rather, it was rotated counterclockwise about the origin by \(90^\circ\text{.}\) In a rotation, the shape of an object does not change; in our example, the square remained a square of the same size.
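In fact, the only matrix transformation that rotates the plane counterclockwise about the origin by \(90^\circ\) is the one given by the matrix
\begin{equation*}
\bbm 0 \amp -1\\ 1 \amp 0\ebm\text{,}
\end{equation*}
the \(\theta = 90^\circ\) case of the general rotation matrix \(\bbm \cos\theta \amp -\sin\theta\\ \sin\theta \amp \cos\theta\ebm\text{.}\) We will see shortly how to read matrices like this off of the transformation they perform.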
We have broached the topic of how the Cartesian plane can be transformed via multiplication by a \(2\times 2\) matrix \(\tta\text{.}\) We have seen a few examples so far, and our intuition as to how the plane is changed has been informed only by seeing how the unit square changes. Let’s explore this further by investigating two questions:
Suppose we want to transform the Cartesian plane in a known way (for instance, we may want to rotate the plane counterclockwise \(180^\circ\)). How do we find the matrix (if one even exists) which performs this transformation?
How does knowing how the unit square is transformed really help in understanding how the entire plane is transformed?
These questions are closely related, and as we answer one, we will help answer the other.
To get started with the first question, look back at Example 5.1.7 and Example 5.1.10 and consider again how the unit square was transformed. In particular, is there any correlation between where the vertices ended up and the matrix \(\tta\text{?}\)
If you are just reading on, and haven’t actually gone back and looked at the examples, go back now and try to make some sort of connection. Otherwise, you may have noted some of the following things:
The zero vector (\(\zero\text{,}\) the “black” corner) never moved. That makes sense, though; \(\tta\zero = \zero\text{.}\)
The “square” corner, i.e., the corner corresponding to the vector \(\bbm 1\\0\ebm\text{,}\) is always transformed to the vector in the first column of \(\tta\text{!}\)
Likewise, the “triangular” corner, i.e., the corner corresponding to the vector \(\bbm 0\\1\ebm\text{,}\) is always transformed to the vector in the second column of \(\tta\text{!}\) (This is less of a surprise, given the result of the previous point.)
The “white dot” corner is always transformed to the sum of the two column vectors of \(\tta\text{.}\) (This observation is a bit more obscure than the first three. It follows from the fact that this corner of the unit square is the “sum” of the other two nonzero corners.)
Let’s now take the time to understand these four points. The first point should be clear; \(\zero\) will always be transformed to \(\zero\) via matrix multiplication. (Hence the hint in the middle of Example 5.1.7, where we are told that we can ignore entering in the column of zeros in the matrix \(\ttb\text{.}\))
We can understand the second and third points simultaneously. Let
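\begin{equation*}
\tta = \bbm a \amp b\\ c \amp d\ebm\text{.}
\end{equation*}
Then
\begin{equation*}
\tta\veone = \bbm a \amp b\\ c \amp d\ebm \bbm 1\\0\ebm = \bbm a\\c\ebm
\quad\text{and}\quad
\tta\vetwo = \bbm a \amp b\\ c \amp d\ebm \bbm 0\\1\ebm = \bbm b\\d\ebm\text{.}
\end{equation*}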
So by mere mechanics of matrix multiplication, the square corner \(\veone\) is transformed to the first column of \(\tta\text{,}\) and the triangular corner \(\vetwo\) is transformed to the second column of \(\tta\text{.}\) A similar argument demonstrates why the white dot corner is transformed to the sum of the columns of \(\tta\text{.}\) (Another way of looking at all of this is to consider what \(\tta\cdot\tti\) is: of course, it is just \(\tta\text{.}\) What are the columns of \(\tti\text{?}\) Just \(\veone\) and \(\vetwo\text{.}\))
Revisit now the question “How do we find the matrix that performs a given transformation on the Cartesian plane?” The answer follows from what we just did. Think about the given transformation and how it would transform the corners of the unit square. Make the first column of \(\tta\) the vector where \(\veone\) goes, and make the second column of \(\tta\) the vector where \(\vetwo\) goes.
Let’s practice this in the context of an example.
Example 5.1.12. Determining a matrix transformation.
Find the matrix \(\tta\) that flips the Cartesian plane about the \(x\) axis and then stretches the plane horizontally by a factor of two.
Solution.
We first consider \(\veone = \bbm 1\\0\ebm\text{.}\) Where does this corner go to under the given transformation? Flipping the plane across the \(x\) axis does not change \(\veone\) at all; stretching the plane sends \(\veone\) to \(\bbm 2\\0\ebm\text{.}\) Therefore, the first column of \(\tta\) is \(\bbm 2\\0\ebm\text{.}\)
Now consider \(\vetwo = \bbm 0\\1\ebm\text{.}\) Flipping the plane about the \(x\) axis sends \(\vetwo\) to the vector \(\bbm 0\\-1\ebm\text{;}\) subsequently stretching the plane horizontally does not affect this vector. Therefore the second column of \(\tta\) is \(\bbm 0\\-1\ebm\text{.}\)
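Putting these two columns together, we have
\begin{equation*}
\tta = \bbm 2 \amp 0\\ 0 \amp -1\ebm\text{.}
\end{equation*}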
To help visualize this, consider Figure 5.1.13 where a shape is transformed under this matrix. Notice how it is turned upside down and is stretched horizontally by a factor of two. (The gridlines are given as a visual aid.)
A while ago we asked two questions. The first was “How do we find the matrix that performs a given transformation?” We have just answered that question (although we will do more to explore it in the future). The second question was “How does knowing how the unit square is transformed really help us understand how the entire plane is transformed?”
Consider Figure 5.1.14 where the unit square (with vertices marked with shapes as before) is shown transformed under an unknown matrix. How does this help us understand how the whole Cartesian plane is transformed? For instance, how can we use this picture to figure out how the point \((2,3)\) will be transformed?
There are two ways to consider the solution to this question. First, we know now how to compute the transformation matrix; the new position of \(\veone\) is the first column of \(\tta\text{,}\) and the new position of \(\vetwo\) is the second column of \(\tta\text{.}\) Therefore, by looking at the figure, we can deduce that
There is another way of doing this which isn’t as computational — it doesn’t involve computing the transformation matrix. Consider the following equalities:
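\begin{equation*}
\bbm 2\\3\ebm = 2\bbm 1\\0\ebm + 3\bbm 0\\1\ebm = 2\veone + 3\vetwo\text{.}
\end{equation*}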
This last equality states something that is somewhat obvious: to arrive at the vector \(\bbm 2\\3\ebm\text{,}\) one needs to go \(2\) units in the \(\veone\) direction and \(3\) units in the \(\vetwo\) direction. To find where the point \((2,3)\) is transformed, one needs to go \(2\) units in the new \(\veone\) direction and \(3\) units in the new \(\vetwo\) direction. This is demonstrated in Figure 5.1.15.
We are coming to grips with how matrix transformations work. We asked two basic questions: “How do we find the matrix for a given transformation?” and “How do we understand the transformation without the matrix?”, and we’ve answered each accompanied by one example. Let’s do another example that demonstrates both techniques at once.
Example 5.1.16. Determining and analyzing a matrix transformation.
First, find the matrix \(\tta\) that transforms the Cartesian plane by stretching it vertically by a factor of \(1.5\text{,}\) then stretches it horizontally by a factor of \(0.5\text{,}\) then rotates it clockwise about the origin \(90^\circ\text{.}\) Secondly, using the new locations of \(\veone\) and \(\vetwo\text{,}\) find the transformed location of the point \((-1,2)\text{.}\)
Solution.
To find \(\tta\text{,}\) first consider the new location of \(\veone\text{.}\) Stretching the plane vertically does not affect \(\veone\text{;}\) stretching the plane horizontally by a factor of \(0.5\) changes \(\veone\) to \(\bbm 1/2\\0\ebm\text{,}\) and then rotating it \(90^\circ\) about the origin moves it to \(\bbm 0\\-1/2\ebm\text{.}\) This is the first column of \(\tta\text{.}\)
Now consider the new location of \(\vetwo\text{.}\) Stretching the plane vertically changes it to \(\bbm 0\\ 3/2\ebm\text{;}\) stretching horizontally does not affect it, and rotating \(90^\circ\) moves it to \(\bbm 3/2\\0\ebm\text{.}\) This is then the second column of \(\tta\text{.}\) This gives
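\begin{equation*}
\tta = \bbm 0 \amp 3/2\\ -1/2 \amp 0\ebm\text{.}
\end{equation*}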
Where does the point \((-1,2)\) get sent to? The corresponding vector \(\bbm-1\\2\ebm\) is found by going \(-1\) units in the \(\veone\) direction and \(2\) units in the \(\vetwo\) direction. Therefore, the transformation will send the vector to \(-1\) units in the new \(\veone\) direction and \(2\) units in the new \(\vetwo\) direction. This is sketched in Figure 5.1.17, along with the transformed unit square. We can also check this multiplicatively:
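\begin{equation*}
\tta\bbm -1\\2\ebm = \bbm 0 \amp 3/2\\ -1/2 \amp 0\ebm \bbm -1\\2\ebm = \bbm 3\\ 1/2\ebm\text{.}
\end{equation*}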
Figure 5.1.18 shows the effects of the transformation on another shape.
Right now we are focusing on transforming the Cartesian plane — we are making 2D transformations. Knowing how to do this provides a foundation for transforming 3D space, which, among other things, is very important when producing 3D computer graphics. Basic shapes can be drawn and then rotated, stretched, and/or moved to other regions of space. This also allows for things like “moving the camera view.” Of course, algebraically, there is nothing stopping us from working with transformations of vectors in \(\mathbb{R}^n\) for any value of \(n\text{.}\) The limitation to two and three dimensions is strictly one of visualization.
What kinds of transformations are possible? We have already seen some of the things that are possible: rotations, stretches, and flips. We have also mentioned some things that are not possible. For instance, we stated that straight lines always get transformed to straight lines. Therefore, we cannot transform the unit square into a circle using a matrix.
Let’s look at some common transformations of the Cartesian plane and the matrices that perform these operations. In the following figures, a transformation matrix will be given alongside a picture of the transformed unit square. (The original unit square is drawn lightly as well to serve as a reference.)
Subsection 5.1.3 2D Matrix Transformations
Horizontal stretch by a factor of \(k\text{.}\)
\begin{equation*}
\bbm k \amp 0\\0 \amp 1\ebm
\end{equation*}
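The same reasoning (the columns of the matrix are the images of \(\veone\) and \(\vetwo\)) gives the standard matrices for the other basic transformations used in this section and in the exercises below.
Vertical stretch by a factor of \(k\text{.}\)
\begin{equation*}
\bbm 1 \amp 0\\0 \amp k\ebm
\end{equation*}
Horizontal shear by a factor of \(k\text{.}\)
\begin{equation*}
\bbm 1 \amp k\\0 \amp 1\ebm
\end{equation*}
Vertical shear by a factor of \(k\text{.}\)
\begin{equation*}
\bbm 1 \amp 0\\k \amp 1\ebm
\end{equation*}
Reflection across the \(x\) axis.
\begin{equation*}
\bbm 1 \amp 0\\0 \amp -1\ebm
\end{equation*}
Reflection across the \(y\) axis.
\begin{equation*}
\bbm -1 \amp 0\\0 \amp 1\ebm
\end{equation*}
Reflection across the line \(y=x\text{.}\)
\begin{equation*}
\bbm 0 \amp 1\\1 \amp 0\ebm
\end{equation*}
Counterclockwise rotation about the origin by an angle \(\theta\text{.}\)
\begin{equation*}
\bbm \cos\theta \amp -\sin\theta\\ \sin\theta \amp \cos\theta\ebm
\end{equation*}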
Now that we have seen a healthy list of transformations that we can perform on the Cartesian plane, let’s practice a few more times creating the matrix that gives the desired transformation. In the following example, we develop our understanding one more critical step.
Example 5.1.19. Determining the matrix of a transformation.
Find the matrix \(\tta\) that transforms the Cartesian plane by performing the following operations in order:
Vertical shear by a factor of 0.5
Counterclockwise rotation about the origin by an angle of \(\theta = 30^\circ\)
Horizontal stretch by a factor of 2
Diagonal reflection across the line \(y=x\)
Solution.
Wow! We already know how to do this — sort of. We know we can find the columns of \(\tta\) by tracing where \(\veone\) and \(\vetwo\) end up, but this also seems difficult. There is so much that is going on. Fortunately, we can accomplish what we need without much difficulty by being systematic.
First, let’s perform the vertical shear. The matrix that performs this is
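\begin{equation*}
\tta_1 = \bbm 1 \amp 0\\ 1/2 \amp 1\ebm\text{,}
\end{equation*}
the standard vertical shear matrix with \(k = 1/2\text{.}\) The counterclockwise rotation by \(30^\circ\) is performed by the standard rotation matrix
\begin{equation*}
\tta_2 = \bbm \cos 30^\circ \amp -\sin 30^\circ\\ \sin 30^\circ \amp \cos 30^\circ\ebm = \bbm \sqrt{3}/2 \amp -1/2\\ 1/2 \amp \sqrt{3}/2\ebm\text{.}
\end{equation*}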
To perform both of these operations, in that order, we multiply by \(\tta_2\tta_1\text{.}\) Continuing in the same way, let \(\tta_3\) and \(\tta_4\) be the matrices that perform the horizontal stretch and the diagonal reflection, respectively; the matrix that performs all four operations, in order, is then \(\tta = \tta_4\tta_3\tta_2\tta_1\text{.}\)
Let’s consider this closely. Suppose we want to know where a vector \(\vx\) ends up. We claim we can find the answer by computing \(\tta\vx\text{.}\) Why does this work? Writing \(\vx_1 = \tta_1\vx\text{,}\) \(\vx_2 = \tta_2\vx_1\text{,}\) \(\vx_3 = \tta_3\vx_2\text{,}\) and \(\vx_4 = \tta_4\vx_3\) for the intermediate results, consider:
\begin{align*}
\tta\vx \amp = \tta_4\tta_3\tta_2\tta_1\vx \amp \amp\\
\amp = \tta_4\tta_3\tta_2(\tta_1\vx) \amp \amp \text{(performs the vertical shear)}\\
\amp = \tta_4\tta_3(\tta_2\vx_1) \amp \amp \text{(performs the rotation)}\\
\amp = \tta_4(\tta_3\vx_2) \amp \amp \text{(performs the horizontal stretch)}\\
\amp = \tta_4\vx_3 \amp \amp \text{(performs the diagonal reflection)}\\
\amp = \vx_4 \amp \amp \text{(the result of transforming } \vx)
\end{align*}
Most readers are not able to visualize exactly what the given list of operations does to the Cartesian plane. In Figure 5.1.20 we sketch the transformed unit square; in Figure 5.1.21 we sketch a shape and its transformation.
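For readers who would rather let the computer do the bookkeeping, here is a minimal numpy sketch of this composition, assuming the standard matrices for each of the four operations:

```python
import numpy as np

# Standard matrices for the four operations, applied in the order A1 -> A4.
A1 = np.array([[1.0, 0.0],
               [0.5, 1.0]])                       # vertical shear by 0.5
theta = np.radians(30)
A2 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])  # rotate 30 degrees CCW
A3 = np.array([[2.0, 0.0],
               [0.0, 1.0]])                       # horizontal stretch by 2
A4 = np.array([[0.0, 1.0],
               [1.0, 0.0]])                       # reflect across y = x

A = A4 @ A3 @ A2 @ A1     # rightmost matrix acts first

corners = np.array([[0.0, 1.0, 1.0, 0.0],         # unit square corners as columns
                    [0.0, 0.0, 1.0, 1.0]])
print(A)                  # the single matrix performing all four operations
print(A @ corners)        # the transformed unit square
```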
Once we know which matrices perform the basic transformations (or know where to find them), performing complex transformations on the Cartesian plane really isn’t that \(\ldots\) complex. It boils down to multiplying by a series of matrices.
We’ve shown many examples of transformations that we can do, and we’ve mentioned just a few that we can’t — for instance, we can’t turn a square into a circle. Why not? Why is it that straight lines get sent to straight lines?
All these questions require us to think like mathematicians — we are being asked to study the properties of an object we just learned about and their connections to things we’ve already learned. We’ll do all this (and more!) in the following section.
Exercises 5.1.4
Exercise Group.
A matrix \(\tta\) is given. Sketch \(\vx\text{,}\) \(\vy\text{,}\) \(\tta\vx\text{,}\) and \(\tta\vy\) on the same Cartesian axes, where
A sketch of the transformed unit square is given. Find the matrix \(\tta\) that performs this transformation.
5.
6.
7.
8.
Exercise Group.
A list of transformations is given. Find the matrix \(\tta\) that performs those transformations, in order, on the Cartesian plane.
9.
vertical shear by a factor of 2
horizontal shear by a factor of 2
10.
horizontal shear by a factor of 2
vertical shear by a factor of 2
11.
horizontal stretch by a factor of 3
reflection across the line \(y=x\)
12.
counterclockwise rotation by an angle of \(45^\circ\)
vertical stretch by a factor of \(1/2\)
13.
clockwise rotation by an angle of \(90^\circ\)
horizontal reflection across the \(y\) axis
vertical shear by a factor of 1
14.
vertical reflection across the \(x\) axis
horizontal reflection across the \(y\) axis
diagonal reflection across the line \(y=x\)
Exercise Group.
Two sets of transformations are given. Sketch the transformed unit square under each set of transformations. Are the transformations the same? Explain why/why not.
15.
a horizontal reflection across the \(y\) axis, followed by a vertical reflection across the \(x\) axis, compared to
a counterclockwise rotation of \(180^\circ\)
16.
a horizontal stretch by a factor of 2 followed by a reflection across the line \(y=x\text{,}\) compared to
a vertical stretch by a factor of 2
17.
a horizontal stretch by a factor of 1/2 followed by a vertical stretch by a factor of 3, compared to
the same operations but in opposite order
18.
a reflection across the line \(y=x\) followed by a reflection across the \(x\) axis, compared to
a reflection across the \(y\) axis, followed by a reflection across the line \(y=x\text{.}\)