Wow! It looks like multiplying \(\tta\vx\) is the same as \(5\vx\text{!}\) This makes us wonder lots of things: is this the only case in the world where something like this happens? (Probably not.) Is \(\tta\) somehow a special matrix, and \(\tta\vx = 5\vx\) for any vector \(\vx\) we pick? (Probably not.) Or maybe \(\vx\) was a special vector, and no matter what \(2\times 2\) matrix \(\tta\) we picked, we would have \(\tta\vx =5\vx\text{.}\) (Again, probably not.)
A more likely explanation is this: given the matrix \(\tta\text{,}\) the number 5 and the vector \(\vx\) formed a special pair that happened to work together in a nice way. It is then natural to wonder if other “special” pairs exist. For instance, could we find a vector \(\vx\) where \(\tta\vx=3\vx\text{?}\)
This equation is hard to solve at first; we are not used to matrix equations where \(\vx\) appears on both sides of “\(=\text{.}\)” Therefore we put off solving this for just a moment to state a definition and make a few comments.
Definition 7.1.1. Eigenvalues and Eigenvectors.
Let \(\tta\) be an \(n\times n\) matrix, \(\vx\) a nonzero \(n\times 1\) column vector and \(\lambda\) a scalar. If
\begin{equation*}
\tta\vx = \lambda\vx\text{,}
\end{equation*}
then \(\vx\) is an eigenvector of \(\tta\) and \(\lambda\) is an eigenvalue of \(\tta\text{.}\)
The word “eigen” is German for “proper” or “characteristic.” Therefore, an eigenvector of \(\tta\) is a “characteristic vector of \(\tta\text{.}\)” This vector tells us something about \(\tta\text{.}\)
Why do we use the Greek letter \(\lambda\) (lambda)? It is pure tradition. Above, we used \(a\) to represent the unknown scalar, since we are used to that notation. We now switch to \(\lambda\) because that is how everyone else does it. (An example of mathematical peer pressure.) Don’t get hung up on this; \(\lambda\) is just a number.
Note that our definition requires that \(\tta\) be a square matrix. If \(\tta\) isn’t square then \(\tta\vx\) and \(\lambda\vx\) will have different sizes, and so they cannot be equal. Also note that \(\vx\) must be nonzero. Why? What if \(\vx = \zero\text{?}\) Then no matter what \(\lda\) is, \(\tta\vx = \lda\vx\text{.}\) This would then imply that every number is an eigenvalue; if every number is an eigenvalue, then we wouldn’t need a definition for it. Therefore we specify that \(\vx\neq \zero\text{.}\)
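To make the definition concrete, here is a minimal sketch of a checker for the condition \(\tta\vx = \lambda\vx\text{.}\) The function names `mat_vec` and `is_eigenpair` are our own inventions. The matrix `A` is an assumption too: it is a hypothetical \(2\times 2\) matrix chosen so that the vector with both entries equal to 1 pairs with the number 5, as in the discussion above. It is not necessarily the matrix used at the start of this section.

```python
def mat_vec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def is_eigenpair(A, x, lam, tol=1e-12):
    """Return True when x is a nonzero vector satisfying A x = lam x."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("A must be square")
    if len(x) != n:
        raise ValueError("x must have one entry per row of A")
    if all(abs(xi) <= tol for xi in x):
        return False  # the zero vector is excluded by the definition
    Ax = mat_vec(A, x)
    return all(abs(axi - lam * xi) <= tol for axi, xi in zip(Ax, x))


# Hypothetical matrix (an assumption made for illustration only).
A = [[1, 4],
     [2, 3]]

print(is_eigenpair(A, [1, 1], 5))  # True:  A x = (5, 5) = 5 x
print(is_eigenpair(A, [1, 1], 3))  # False: (5, 5) is not 3 x
print(is_eigenpair(A, [0, 0], 5))  # False: the zero vector never counts
```

The last call shows why the definition insists on \(\vx\neq\zero\text{:}\) without that check, the zero vector would "work" for every number.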
Our last comment before trying to find eigenvalues and eigenvectors for given matrices deals with “why we care.” Did we stumble upon a mathematical curiosity, or does this somehow help us build better bridges, heal the sick, send astronauts into orbit, design optical equipment, and understand quantum mechanics? The answer, of course, is “Yes.” (Except for the “understand quantum mechanics” part. Nobody truly understands that stuff; they just probably understand it.) This is a wonderful topic in and of itself: we need no external application to appreciate its worth. At the same time, it has many, many applications to “the real world.”
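Back to our math. Given a square matrix \(\tta\text{,}\) we want to find a nonzero vector \(\vx\) and a scalar \(\lda\) such that \(\tta\vx = \lda\vx\text{.}\) We will solve this using the skills we developed in Chapter 4.
\begin{align*}
\tta\vx \amp = \lda\vx\\
\tta\vx - \lda\vx \amp = \zero\\
(\tta-\lda\tti)\vx \amp = \zero
\end{align*}
Think about this last factorization. We are likely tempted to say \(\tta\vx - \lda\vx = (\tta-\lda)\vx\text{,}\) but this really doesn’t make sense. After all, what does “a matrix minus a number” mean? We need the identity matrix in order for this to be logical.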
Let us now think about the equation \((\tta-\lda\tti)\vx=\zero\text{.}\) While it looks complicated, it really is just a matrix equation of the type we solved in Section 3.6. We are just trying to solve \(\ttb\vx=\zero\text{,}\) where \(\ttb = (\tta-\lda\tti)\text{.}\)
We know from our previous work that this type of equation always has a solution, namely, \(\vx = \zero\text{.}\) (Recall this is a homogeneous system of equations.) However, we want \(\vx\) to be an eigenvector and, by the definition, eigenvectors cannot be \(\zero\text{.}\)
This means that we want solutions to \((\tta-\lda\tti)\vx=\zero\) other than \(\vx=\zero\text{.}\) Recall that Theorem 4.4.12 says that if the matrix \((\tta-\lda\tti)\) is invertible, then the only solution to \((\tta-\lda\tti)\vx=\zero\) is \(\vx=\zero\text{.}\) Therefore, in order to have other solutions, we need \((\tta-\lda\tti)\) to not be invertible.
Finally, recall from Theorem 6.4.12 that noninvertible matrices all have a determinant of 0. Therefore, if we want to find eigenvalues \(\lda\) and eigenvectors \(\vx\text{,}\) we need \(\det(\tta-\lda\tti) = 0\text{.}\)
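The chain of reasoning above can be checked numerically. The sketch below computes \(\det(\tta-\lda\tti)\) for a few trial values of \(\lda\) and reports whether nonzero solutions can exist. The helper names `det2` and `shifted` are our own, and the matrix `A` is a hypothetical example (an assumption), not necessarily the matrix used in this section.

```python
def det2(M):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def shifted(A, lam):
    """Return the 2x2 matrix A - lam*I."""
    return [[A[0][0] - lam, A[0][1]],
            [A[1][0], A[1][1] - lam]]


# Hypothetical matrix (an assumption made for illustration only).
A = [[1, 4],
     [2, 3]]

for lam in (-1, 0, 3, 5):
    d = det2(shifted(A, lam))
    if d == 0:
        status = "not invertible: nonzero solutions exist"
    else:
        status = "invertible: only the solution x = 0"
    print(f"lambda = {lam:2d}   det(A - lambda I) = {d:3d}   {status}")
```

For this particular matrix, only the trial values \(-1\) and \(5\) make the determinant vanish.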
Let’s start our practice of this theory by finding \(\lda\) such that \(\det(\tta-\lda\tti) = 0\text{;}\) that is, let’s find the eigenvalues of a matrix.
Example 7.1.2. Computing the eigenvalues of a matrix.
Find the eigenvalues of \(\tta\text{,}\) that is, find \(\lda\) such that \(\det(\tta-\lda\tti) = 0\text{,}\) where
According to our above work, \(\det(\tta-\lda\tti)=0\) when \(\lda = -1,\ 5\text{.}\) Thus, the eigenvalues of \(\tta\) are \(-1\) and \(5\text{.}\)
Earlier, when looking at the same matrix as used in our example, we wondered if we could find a vector \(\vx\) such that \(\tta\vx=3\vx\text{.}\) According to this example, the answer is “No.” With this matrix \(\tta\text{,}\) the only values of \(\lda\) that work are \(-1\) and \(5\text{.}\)
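Let’s restate the above in a different way: It is pointless to try to find a nonzero \(\vx\) where \(\tta\vx=3\vx\text{,}\) for there is no such \(\vx\text{.}\) There are only 2 equations of this form that have a nonzero solution, namely
\begin{equation*}
\tta\vx = -\vx \quad\text{and}\quad \tta\vx = 5\vx\text{.}
\end{equation*}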
As we introduced this section, we gave a vector \(\vx\) such that \(\tta\vx = 5\vx\text{.}\) Is this the only one? Let’s find out while calling our work an example; this will amount to finding the eigenvectors of \(\tta\) that correspond to the eigenvalue 5.
Example 7.1.3. Computing an eigenvector corresponding to a given eigenvalue.
Find \(\vx\) such that \(\tta\vx=5\vx\text{,}\) where
We have infinitely many solutions to the equation \(\tta\vx = 5\vx\text{;}\) any nonzero scalar multiple of the vector \(\bbm 1\\1\ebm\) is a solution. We can do a few examples to confirm this:
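The claim about scalar multiples is easy to test by machine. The following is only a sketch: the matrix `A` is a hypothetical stand-in (an assumption) for which the vector with both entries equal to 1 is an eigenvector with eigenvalue 5, and `mat_vec` is a helper name of our own.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical matrix (an assumption made for illustration only).
A = [[1, 4],
     [2, 3]]
base = [1, 1]

checks = {}
for c in (2, -3, 0.5):
    x = [c * b for b in base]            # a nonzero multiple of (1, 1)
    checks[c] = mat_vec(A, x) == [5 * xi for xi in x]
    print(f"c = {c}: x = {x}, A x = {mat_vec(A, x)}, equals 5x: {checks[c]}")
```

The multiple \(c = 0\) is left out on purpose: it gives the zero vector, which the definition excludes.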
Our method of finding the eigenvalues of a matrix \(\tta\) boils down to determining which values of \(\lambda\) give the matrix \((\tta - \lambda\tti)\) a determinant of 0. In computing \(\det(\tta-\lambda\tti)\text{,}\) we get a polynomial in \(\lambda\) whose roots are the eigenvalues of \(\tta\text{.}\) This polynomial is important and so it gets its own name.
Definition 7.1.4. Characteristic Polynomial.
Let \(\tta\) be an \(n\times n\) matrix. The characteristic polynomial of \(\tta\) is the \(n\)th degree polynomial \(p(\lambda) = \det(\tta-\lambda\tti)\text{.}\)
Our definition just states what the characteristic polynomial is. We know from our work so far why we care: the roots of the characteristic polynomial of an \(n\times n\) matrix \(\tta\) are the eigenvalues of \(\tta\text{.}\)
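For a \(2\times 2\) matrix with rows \((a, b)\) and \((c, d)\text{,}\) expanding \((a-\lambda)(d-\lambda)-bc\) gives \(\lambda^2-(a+d)\lambda+(ad-bc)\text{.}\) The sketch below packages that expansion and compares it against a direct evaluation of the determinant. The function names are our own, and the sample matrix is a hypothetical one (an assumption).

```python
def char_poly_2x2(A):
    """Coefficients (c2, c1, c0) of det(A - lambda*I) = c2*lambda^2 + c1*lambda + c0."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)


def det_shifted(A, lam):
    """Evaluate det(A - lam*I) directly from the definition."""
    (a, b), (c, d) = A
    return (a - lam) * (d - lam) - b * c


# Hypothetical matrix (an assumption made for illustration only).
A = [[1, 4],
     [2, 3]]

c2, c1, c0 = char_poly_2x2(A)
print(c2, c1, c0)  # 1 -4 -5, that is, lambda^2 - 4*lambda - 5

# The two ways of computing p(lambda) agree at every sample point.
for lam in range(-3, 8):
    assert c2 * lam**2 + c1 * lam + c0 == det_shifted(A, lam)
```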
In Examples 7.1.2 and 7.1.3, we found eigenvalues and eigenvectors, respectively, of a given matrix. That is, given a matrix \(\tta\text{,}\) we found values \(\lambda\) and vectors \(\vx\) such that \(\tta\vx = \lambda\vx\text{.}\) The steps that follow outline the general procedure for finding eigenvalues and eigenvectors; we’ll follow this up with some examples.
Key Idea 7.1.5. Finding Eigenvalues and Eigenvectors.
Let \(\tta\) be an \(n\times n\) matrix.
To find the eigenvalues of \(\tta\text{,}\) compute \(p(\lambda)\text{,}\) the characteristic polynomial of \(\tta\text{,}\) set it equal to 0, then solve for \(\lambda\text{.}\)
To find the eigenvectors of \(\tta\text{,}\) for each eigenvalue solve the homogeneous system \((\tta-\lambda\tti)\vx = \zero\text{.}\)
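The two steps of the Key Idea can be sketched in code for the \(2\times 2\) case with real eigenvalues. Everything here is an illustration under stated assumptions: the function names are ours, the matrix `A` is hypothetical, and the eigenvector step uses a shortcut in place of row reduction. If \((p, q)\) is a nonzero row of \(\tta-\lambda\tti\text{,}\) then the vector with entries \(q\) and \(-p\) solves the homogeneous system, because the two rows of a noninvertible \(2\times 2\) matrix are multiples of one another.

```python
import math


def eigenvalues_2x2(A):
    """Step 1: real roots of the characteristic polynomial, in increasing order."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("the characteristic polynomial has no real roots")
    r = math.sqrt(disc)
    return sorted({(trace - r) / 2, (trace + r) / 2})


def eigenvector_2x2(A, lam, tol=1e-12):
    """Step 2: one nonzero solution of (A - lam*I) x = 0."""
    (a, b), (c, d) = A
    for p, q in ((a - lam, b), (c, d - lam)):
        if abs(p) > tol or abs(q) > tol:
            return [q, -p]  # p*q + q*(-p) = 0, so this row is satisfied
    return [1, 0]  # A - lam*I is the zero matrix: any nonzero vector works


# Hypothetical matrix (an assumption made for illustration only).
A = [[1, 4],
     [2, 3]]

for lam in eigenvalues_2x2(A):
    print(lam, eigenvector_2x2(A, lam))
```

Any nonzero multiple of a returned vector is an equally valid eigenvector, so the output need not match a hand computation entry for entry.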
Example 7.1.6. Computing eigenvalues and eigenvectors.
Find the eigenvalues of \(\tta\text{,}\) and for each eigenvalue, find an eigenvector where
Therefore, \(\det(\tta-\lambda\tti) = 0\) when \(\lambda = -6\) and \(12\text{;}\) these are our eigenvalues. (We should note that \(p(\lambda) =\lambda^2-6\lambda-72\) is our characteristic polynomial.)
It sometimes helps to give them names, so we’ll say \(\lambda_1 = -6\) and \(\lambda_2 = 12\text{.}\) Now we find eigenvectors.
For \(\lambda_1=-6\text{,}\) we need to solve the equation \((\tta - (-6)\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We’ve seen a matrix like this before, but we may need a bit of a refreshing. Our first row tells us that \(x_1 = 0\text{,}\) and we see that no rows/equations involve \(x_2\text{.}\) We conclude that \(x_2\) is free. Therefore, our solution, in vector form, is
Notice that in both of our examples so far, we were able to completely factor the characteristic polynomial and obtain two distinct eigenvalues. For \(2\times 2\) matrices, the characteristic polynomial will always be quadratic, and we know that finding roots of quadratic polynomials falls into three categories: those with two distinct roots, like in the examples above, those with one repeated root (for example, \(x^2-2x+1=(x-1)^2\)), and those with no real roots (for example, \(x^2+1\)). In the case of a repeated root, we will have only one eigenvalue. Will we have only one eigenvector, or could there be two? (We’ll have more to say about this later.) What if there are no real roots? Then there are no (real) eigenvalues, so presumably there are no eigenvectors, either. What if we allow for complex roots? Let’s look at some examples.
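The three categories can be seen by applying the quadratic formula to the polynomials mentioned so far: \(\lambda^2-6\lambda-72\) from Example 7.1.6, along with \(x^2-2x+1\) and \(x^2+1\text{.}\) This is a small sketch; the function name `real_roots` is our own.

```python
import math


def real_roots(a, b, c):
    """Distinct real roots of a*x^2 + b*x + c (with a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                  # no real roots
    if disc == 0:
        return [-b / (2 * a)]      # one repeated root
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])


print(real_roots(1, -6, -72))  # [-6.0, 12.0]  two distinct roots
print(real_roots(1, -2, 1))    # [1.0]         one repeated root
print(real_roots(1, 0, 1))     # []            no real roots
```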
Example 7.1.8. A matrix with only one eigenvalue.
Find the eigenvalues and eigenvectors of the matrix
\begin{equation*}
A = \bbm 1 \amp 4\\0 \amp 1\ebm\text{.}
\end{equation*}
Solution.
The transformation \(T(\vx)=A\vx\) defined by \(A\) is an example of a horizontal shear. (See Section 5.1.) Such a transformation leaves horizontal vectors unaffected, but vectors with a nonzero vertical component get pulled to the right: see Figure 7.1.9.
From the diagram we can probably guess that the horizontal vector \(\bbm 1\\0\ebm\) will be an eigenvector with eigenvalue 1, since it is left untouched by the shear transformation. Let’s confirm this analytically.
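We begin as usual by finding the characteristic polynomial. We have
\begin{equation*}
\det(A-\lambda I) = \det\bbm 1-\lambda \amp 4\\0 \amp 1-\lambda\ebm = (1-\lambda)^2\text{,}
\end{equation*}
so \(\lambda = 1\) is the only eigenvalue.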
In each of our examples to this point, every eigenvalue corresponded to a single (independent) eigenvector. Is this always the case? We will not prove it in this textbook, but it turns out that in general, if \(\lambda_0\) is an eigenvalue, the power to which the factor \((\lambda - \lambda_0)\) appears in the characteristic polynomial (called the multiplicity of the eigenvalue \(\lambda_0\)) places an upper limit on the number of independent eigenvectors that can correspond to that eigenvalue.
In Example 7.1.7, we had \(\det(A-\lambda I) = (-3-\lambda)^1(1-\lambda)^1\text{,}\) so the two eigenvalues \(\lambda = -3\) and \(\lambda = 1\) each have multiplicity one, and therefore they each have one corresponding eigenvector. In Example 7.1.8, the eigenvalue \(\lambda=1\) has multiplicity two, but we still had only one corresponding eigenvector. Can we ever have such an eigenvalue with two corresponding eigenvectors?
Example 7.1.10. An eigenvalue of multiplicity two.
Find the eigenvalues and eigenvectors of the matrix
\begin{equation*}
A = \bbm 4 \amp 0\\0 \amp 4\ebm\text{.}
\end{equation*}
Solution.
Here, we notice that \(A\) is a scalar multiple of the identity. As a transformation of the Cartesian plane, the transformation \(T(\vx)=A\vx\) is a dilation: it expands the size of every vector in the plane by a factor of 4. Knowing that this is a transformation that stretches, but does not rotate, we might expect that every nonzero vector is an eigenvector of \(A\text{!}\) Indeed, given \(\vx\neq \zero\text{,}\) we have
\begin{equation*}
A\vx = 4I\vx = 4\vx\text{,}
\end{equation*}
so \(\vx\) is an eigenvector corresponding to the eigenvalue 4.
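Of course, this is pretty much the end of the story here, but let’s get some practice with our algorithm for finding eigenvalues and eigenvectors and confirm our results. We can immediately see that
\begin{equation*}
\det(A-\lambda I) = (4-\lambda)^2\text{,}
\end{equation*}
so \(\lambda = 4\) is the only eigenvalue, with multiplicity two, and
\begin{equation*}
A-4I = \bbm 0 \amp 0\\0 \amp 0\ebm\text{,}
\end{equation*}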
the zero matrix. Again, we see that literally any nonzero vector \(\vx \in\R^2\) qualifies as an eigenvector. We know that we can find at most two independent vectors in \(\R^2\text{,}\) so a simple choice is to take the standard basis vectors \(\ven{1}\) and \(\ven{2}\text{.}\)
Notice that we could have proceeded as usual and attempted to solve the system \((A-4I)\vx=\zero\text{.}\) In this case we get a rather strange augmented matrix:
\begin{equation*}
\left[\begin{array}{cc|c} 0 \amp 0 \amp 0\\ 0 \amp 0 \amp 0\end{array}\right]\text{.}
\end{equation*}
It might seem like there’s absolutely nothing to do here, but we can read off a solution. In this case neither row places any conditions on the variables \(x_1\) and \(x_2\text{,}\) so both are free: \(x_1=s\) and \(x_2=t\) are both parameters, and
\begin{equation*}
\vx = \bbm s\\t\ebm = s\bbm 1\\0\ebm + t\bbm 0\\1\ebm\text{.}
\end{equation*}
Setting \(s=1\) and \(t=0\) gives us the eigenvector \(\ven{1}\text{,}\) and setting \(s=0\text{,}\) \(t=1\) gives us the eigenvector \(\ven{2}\text{.}\)
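We can contrast Examples 7.1.8 and 7.1.10 by counting free variables in \((A-\lambda I)\vx=\zero\text{.}\) The sketch below does this for a \(2\times 2\) matrix by finding the rank of \(A-\lambda I\) and subtracting it from 2. The phrasing in terms of rank is our own shortcut, not a method developed in this section, and the function name is ours.

```python
def independent_eigenvectors_2x2(A, lam, tol=1e-12):
    """Number of free variables in (A - lam*I) x = 0 for a 2x2 matrix A."""
    (a, b), (c, d) = A
    B = [[a - lam, b], [c, d - lam]]
    if all(abs(e) <= tol for row in B for e in row):
        rank = 0   # zero matrix: both variables are free
    elif abs(B[0][0] * B[1][1] - B[0][1] * B[1][0]) <= tol:
        rank = 1   # noninvertible but nonzero: one free variable
    else:
        rank = 2   # invertible: only x = 0, so lam is not an eigenvalue
    return 2 - rank


shear = [[1, 4], [0, 1]]     # the matrix of Example 7.1.8
dilation = [[4, 0], [0, 4]]  # the matrix of Example 7.1.10

print(independent_eigenvectors_2x2(shear, 1))     # 1
print(independent_eigenvectors_2x2(dilation, 4))  # 2
print(independent_eigenvectors_2x2(shear, 2))     # 0 (2 is not an eigenvalue)
```

Both eigenvalues have multiplicity two, yet only the dilation reaches the upper limit of two independent eigenvectors.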
We mentioned above that another possibility is that the characteristic polynomial has no real zeros at all, in which case our matrix has no (real) eigenvalues. Let’s see what we can say in such a situation.
Example 7.1.11. A matrix with complex eigenvalues.
Find the eigenvalues and eigenvectors of the matrix
\begin{equation*}
A = \bbm 0 \amp -1\\1 \amp 0\ebm\text{.}
\end{equation*}
Solution.
Before we proceed, let’s pause and think about this in the context of matrix transformations. If we define the transformation \(T(\vx) = A\vx\text{,}\) we have
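This is because the transformation \(T\) represents a rotation through an angle of \(\frac{\pi}{2}\) (90 degrees). Indeed, \(A\) is a rotation matrix (see Section 5.1) of the form
\begin{equation*}
\bbm \cos\theta \amp -\sin\theta\\ \sin\theta \amp \cos\theta\ebm\text{,}
\end{equation*}
with \(\theta = \frac{\pi}{2}\text{.}\)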
Now, think about the eigenvalue equation \(A\vx = \lambda\vx\text{.}\) In this case, an eigenvector \(\vx\) would be a nonzero vector in the plane such that rotating it by 90 degrees produces a parallel vector! Clearly, this is nonsense, and indeed, we find that
\begin{equation*}
\det(A-\lambda I) = \det\bbm -\lambda \amp -1\\1 \amp -\lambda\ebm = \lambda^2+1\text{,}
\end{equation*}
which has no real roots, so the matrix \(A\) has no real eigenvalues, which makes sense from a geometric point of view.
However, this is not the end of the story, provided that we’re willing to work with complex numbers. Over the complex numbers, we do have two eigenvalues:
\begin{equation*}
\det(A-\lambda I) = \lambda^2+1 = (\lambda-i)(\lambda+i)\text{,}
\end{equation*}
so \(\lambda = i\) and \(\lambda = -i\) are eigenvalues. What are the eigenvectors? We proceed as always, except that the arithmetic in the row operations is a bit trickier with complex numbers. For \(\lambda=i\text{,}\) we have the system \((A-iI)\vx = \zero\text{.}\) We set up the augmented matrix below, and in this case, we’ll proceed step-by-step to the reduced row echelon form.
Notice in the last step that \(-i+i(1) = 0\) gives the zero in the first column, and \(-1+i(-i)=-1+1=0\) gives the zero in the second column. This tells us that \(x_2=t\) is a free (complex!) parameter while \(x_1-ix_2=0\text{,}\) so \(x_1 = ix_2=it\text{.}\) Our vector solution is thus
In this context we’re free to choose any complex value for \(t\text{.}\) Choosing \(t=i\) gives us the solution \(\vx[2] = \bbm 1\\i\ebm\text{.}\)
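Complex arithmetic is easy to get wrong by hand, so a quick check is worthwhile. Python’s built-in complex numbers (written with `1j` for \(i\)) let us test candidate eigenpairs for the rotation matrix. This is only a sketch with helper names of our own. We assume here that the vector with entries \(i\) and \(1\) (from \(x_1 = it\text{,}\) \(x_2 = t\) with \(t=1\)) goes with \(\lambda = i\text{,}\) and that the vector with entries \(1\) and \(i\) goes with \(\lambda=-i\text{;}\) the code confirms both.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list); entries may be complex."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def is_eigenpair(A, x, lam, tol=1e-12):
    """Return True when x is nonzero and A x = lam x (up to rounding)."""
    if all(abs(xi) <= tol for xi in x):
        return False
    return all(abs(axi - lam * xi) <= tol
               for axi, xi in zip(mat_vec(A, x), x))


A = [[0, -1],
     [1, 0]]  # rotation by 90 degrees, the matrix of Example 7.1.11

print(is_eigenpair(A, [1j, 1], 1j))   # True
print(is_eigenpair(A, [1, 1j], -1j))  # True
print(is_eigenpair(A, [1, 1j], 1j))   # False: wrong eigenvalue for this vector
```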
Our last few examples provided interesting departures from the earlier ones where we had two distinct eigenvalues; they also provided examples where we were able to analyze the situation geometrically, by considering the linear transformations defined by the matrix. The reader is encouraged to consider the other examples of transformations given in Section 5.1 and attempt a similar analysis.
So far, our examples have involved \(2\times 2\) matrices. Let’s do an example with a \(3\times 3\) matrix. The only real additional complication here is that our characteristic polynomial will now be a cubic polynomial, so factoring it is going to take some more work.
Example 7.1.12. Eigenvalues and eigenvectors for a \(3\times 3\) matrix.
Find the eigenvalues of \(\tta\text{,}\) and for each eigenvalue, give one eigenvector, where
We first compute the characteristic polynomial, set it equal to 0, then solve for \(\lda\text{.}\) A warning: this process is rather long. We’ll use cofactor expansion along the first row; don’t get bogged down with the arithmetic that comes from each step; just try to get the basic idea of what was done from step to step.
In the last step we factored the characteristic polynomial \(-\lda^3+4\lda^2-\lda -6\text{.}\) Factoring polynomials of degree \(\gt 2\) is not trivial; we’ll assume the reader has access to methods for doing this accurately.
One could also graph this polynomial to find the roots. Graphing will show us that \(\lda = 3\) looks like a root, and a simple calculation will confirm that it is.
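The “simple calculation” can be delegated to a few lines of code. The sketch below evaluates \(-\lda^3+4\lda^2-\lda-6\) at a guessed root and then divides out the corresponding linear factor by synthetic division, leaving a quadratic that is easy to factor. The function names are our own.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest degree down."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def divide_by_linear(coeffs, root):
    """Synthetic division by (x - root); returns (quotient coefficients, remainder)."""
    out = []
    carry = 0
    for c in coeffs:
        carry = carry * root + c
        out.append(carry)
    return out[:-1], out[-1]


p = [-1, 4, -1, -6]  # -lambda^3 + 4*lambda^2 - lambda - 6

print(poly_eval(p, 3))                        # 0, so 3 is a root
quotient, remainder = divide_by_linear(p, 3)
print(quotient, remainder)                    # [-1, 1, 2] 0
print([poly_eval(p, r) for r in (-1, 2, 3)])  # [0, 0, 0]
```

The quotient \(-\lda^2+\lda+2 = -(\lda-2)(\lda+1)\) supplies the remaining two roots.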
Our eigenvalues are \(\lda_1 = -1\text{,}\) \(\lda_2 = 2\) and \(\lda_3 = 3\text{.}\) We now find corresponding eigenvectors.
For \(\lda_1 = -1\text{:}\)
We need to solve the equation \((\tta - (-1)\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) a nice choice would get rid of the fractions. So we’ll set \(x_3 = 2\) and choose \(\vx[1]=\bbm 3\\1\\2\ebm\) as our eigenvector.
For \(\lda_2 = 2\text{:}\)
We need to solve the equation \((\tta - 2\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) again, a nice choice would get rid of the fractions. So we’ll set \(x_3 = 2\) and choose \(\vx[2]=\bbm 2\\1\\2\ebm\) as our eigenvector.
For \(\lda_3 = 3\text{:}\)
We need to solve the equation \((\tta - 3\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We first compute the characteristic polynomial, set it equal to 0, then solve for \(\lda\text{.}\) We’ll use cofactor expansion down the first column (since it has lots of zeros).
Notice that while the characteristic polynomial is cubic, we never actually saw a cubic; we never distributed the \((2-\lda)\) across the quadratic. Instead, we realized that this was a factor of the cubic, and just factored the remaining quadratic. (This makes this example quite a bit simpler than the previous example.)
Our eigenvalues are \(\lda_1 = -2\text{,}\) \(\lda_2 = 2\) and \(\lda_3 = 7\text{.}\) We now find corresponding eigenvectors.
For \(\lda_1 = -2\text{:}\)
We need to solve the equation \((\tta - (-2)\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) a nice choice would get rid of the fractions. So we’ll set \(x_3 = 4\) and choose \(\vx[1]=\bbm -3\\-8\\4\ebm\) as our eigenvector.
For \(\lda_2 = 2\text{:}\)
We need to solve the equation \((\tta - 2\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
This looks funny, so we’ll remind ourselves how to solve this. The first two rows tell us that \(x_2 = 0\) and \(x_3 = 0\text{,}\) respectively. Notice that no row/equation uses \(x_1\text{;}\) we conclude that it is free. Therefore, our solution in vector form is
We can pick any nonzero value for \(x_1\text{;}\) an easy choice is \(x_1 = 1\text{,}\) which gives \(\vx[2]=\bbm 1\\0\\0\ebm\) as our eigenvector.
For \(\lda_3 = 7\text{:}\)
We need to solve the equation \((\tta - 7\tti)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
In this section we have learned about a new concept: given a matrix \(\tta\) we can find certain values \(\lda\) and vectors \(\vx\) where \(\tta\vx =\lda\vx\text{.}\) In the next section we will continue the pattern we have established in this text: after learning a new concept, we see how it interacts with other concepts we know about. That is, we’ll look for connections between eigenvalues and eigenvectors and things like the inverse, determinants, the trace, the transpose, etc.
Exercises
Exercise Group.
A matrix \(\tta\) and one of its eigenvectors are given. Find the eigenvalue of \(\tta\) for the given eigenvector.