Wow! It looks like computing \(A\vx\) gives the same result as \(5\vx\text{!}\) This makes us wonder lots of things: is this the only case in the world where something like this happens? (Probably not.) Is \(A\) somehow a special matrix, and \(A\vx = 5\vx\) for any vector \(\vx\) we pick? (Probably not.) Or maybe \(\vx\) was a special vector, and no matter what \(2\times 2\) matrix \(A\) we picked, we would have \(A\vx =5\vx\text{.}\) (Again, probably not.)
A more likely explanation is this: given the matrix \(A\text{,}\) the number 5 and the vector \(\vx\) formed a special pair that happened to work together in a nice way. It is then natural to wonder if other "special" pairs exist. For instance, could we find a vector \(\vx\) where \(A\vx=3\vx\text{?}\)
This equation is hard to solve at first; we are not used to matrix equations where \(\vx\) appears on both sides of "\(=\text{.}\)" Therefore we put off solving this for just a moment to state a definition and make a few comments.
The word "eigen" is German for "proper" or "characteristic." Therefore, an eigenvector of \(A\) is a "characteristic vector of \(A\text{.}\)" This vector tells us something about \(A\text{.}\)
Why do we use the Greek letter \(\lambda\) (lambda)? It is pure tradition. Above, we used \(a\) to represent the unknown scalar, since we are used to that notation. We now switch to \(\lambda\) because that is how everyone else does it. (An example of mathematical peer pressure.) Don't get hung up on this; it is just a number.
Note that our definition requires that \(A\) be a square matrix. If \(A\) isn't square then \(A\vx\) and \(\lambda\vx\) will have different sizes, and so they cannot be equal. Also note that \(\vx\) must be nonzero. Why? What if \(\vx = \zero\text{?}\) Then no matter what \(\lambda\) is, \(A\vx = \lambda \vx\text{.}\) This would then imply that every number is an eigenvalue; if every number is an eigenvalue, then we wouldn't need a definition for it. Therefore we specify that \(\vx\neq \zero\text{.}\)
Our last comment before trying to find eigenvalues and eigenvectors for given matrices deals with "why we care." Did we stumble upon a mathematical curiosity, or does this somehow help us build better bridges, heal the sick, send astronauts into orbit, design optical equipment, and understand quantum mechanics? The answer, of course, is "Yes." (Except for the "understand quantum mechanics" part. Nobody truly understands that stuff; they just probably understand it.) This is a wonderful topic in and of itself: we need no external application to appreciate its worth. At the same time, it has many, many applications to "the real world."
Back to our math. Given a square matrix \(A\text{,}\) we want to find a nonzero vector \(\vx\) and a scalar \(\lambda\) such that \(A\vx = \lambda \vx\text{.}\) We will solve this using the skills we developed in Chapter 4.
It is tempting to rearrange this equation as \(A\vx - \lambda\vx = \zero\) and then "factor" to get \((A-\lambda)\vx = \zero\text{,}\) but this really doesn't make sense. After all, what does "a matrix minus a number" mean? We need the identity matrix in order for this to be logical.
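Since \(\lambda\vx = \lambda I\vx\text{,}\) we can subtract a matrix rather than a bare scalar:
\begin{align*}
A\vx &= \lambda \vx\\
A\vx - \lambda I\vx &= \zero\\
(A - \lambda I)\vx &= \zero\text{.}
\end{align*}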
Let us now think about the equation \((A-\lambda I)\vx=\zero\text{.}\) While it looks complicated, it really is just a matrix equation of the type we solved in Section 3.6. We are just trying to solve \(B\vx=\zero\text{,}\) where \(B = (A-\lambda I)\text{.}\)
We know from our previous work that this type of equation always has a solution, namely, \(\vx = \zero\text{.}\) (Recall this is a homogeneous system of equations.) However, we want \(\vx\) to be an eigenvector and, by the definition, eigenvectors cannot be \(\zero\text{.}\)
This means that we want solutions to \((A-\lambda I)\vx=\zero\) other than \(\vx=\zero\text{.}\) Recall that Theorem 4.4.12 says that if the matrix \((A-\lambda I)\) is invertible, then the only solution to \((A-\lambda I)\vx=\zero\) is \(\vx=\zero\text{.}\) Therefore, in order to have other solutions, we need \((A-\lambda I)\) to not be invertible.
Finally, recall from Theorem 6.4.12 that noninvertible matrices all have a determinant of 0. Therefore, if we want to find eigenvalues \(\lambda\) and eigenvectors \(\vx\text{,}\) we need \(\det(A-\lambda I) = 0\text{.}\)
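To see what this condition looks like concretely, consider a general \(2\times 2\) matrix \(A = \bbm a&b\\c&d\ebm\text{:}\)
\[
\det(A-\lambda I) = \det\bbm a-\lambda & b\\ c & d-\lambda \ebm = (a-\lambda)(d-\lambda) - bc = 0\text{.}
\]
Expanding gives a quadratic in \(\lambda\text{,}\) so a \(2\times 2\) matrix has at most two real eigenvalues.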
Earlier, when looking at the same matrix as used in our example, we wondered if we could find a vector \(\vx\) such that \(A\vx=3\vx\text{.}\) According to this example, the answer is "No." With this matrix \(A\text{,}\) the only values of \(\lambda\) that work are \(-1\) and \(5\text{.}\)
Let's restate the above in a different way: It is pointless to try to find \(\vx\) where \(A\vx=3\vx\text{,}\) for there is no such \(\vx\text{.}\) There are only 2 equations of this form that have a solution, namely
As we introduced this section, we gave a vector \(\vx\) such that \(A\vx = 5\vx\text{.}\) Is this the only one? Let's find out while calling our work an example; this will amount to finding the eigenvectors of \(A\) that correspond to the eigenvalue of 5.
We have infinitely many solutions to the equation \(A\vx = 5\vx\text{;}\) any nonzero scalar multiple of the vector \(\bbm 1\\1\ebm\) is a solution. We can do a few examples to confirm this:
Our method of finding the eigenvalues of a matrix \(A\) boils down to determining which values of \(\lambda\) give the matrix \((A - \lambda I)\) a determinant of 0. In computing \(\det(A-\lambda I)\text{,}\) we get a polynomial in \(\lambda\) whose roots are the eigenvalues of \(A\text{.}\) This polynomial is important and so it gets its own name.
Let \(A\) be an \(n\times n\) matrix. The characteristic polynomial of \(A\) is the \(n\)th degree polynomial \(p(\lambda) = \det(A-\lambda I)\text{.}\)
Our definition just states what the characteristic polynomial is. We know from our work so far why we care: the roots of the characteristic polynomial of an \(n\times n\) matrix \(A\) are the eigenvalues of \(A\text{.}\)
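As an aside, the roots of the characteristic polynomial can also be checked numerically. Below is a minimal sketch using Python's numpy library; the matrix is a hypothetical one, chosen only because it has eigenvalues \(-1\) and \(5\text{,}\) the values mentioned at the start of this section.

```python
import numpy as np

# A hypothetical 2x2 matrix used purely for illustration.
A = np.array([[1.0, 4.0],
              [2.0, 3.0]])

# np.poly(A) returns the coefficients of det(lambda*I - A),
# highest degree first: here [1, -4, -5], i.e. l^2 - 4l - 5.
coeffs = np.poly(A)

# The roots of the characteristic polynomial are the eigenvalues.
print(np.roots(coeffs))      # [ 5., -1.]

# np.linalg.eig computes the same eigenvalues directly.
print(np.linalg.eig(A)[0])   # [-1.,  5.] (order may differ)
```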
In Examples 2 and 3, we found eigenvalues and eigenvectors, respectively, of a given matrix. That is, given a matrix \(A\text{,}\) we found values \(\lambda\) and vectors \(\vx\) such that \(A\vx = \lambda\vx\text{.}\) The steps that follow outline the general procedure for finding eigenvalues and eigenvectors; we'll follow this up with some examples.
Key Idea 7.1.5. Finding Eigenvalues and Eigenvectors.
Let \(A\) be an \(n\times n\) matrix.
To find the eigenvalues of \(A\text{,}\) compute \(p(\lambda)\text{,}\) the characteristic polynomial of \(A\text{,}\) set it equal to 0, then solve for \(\lambda\text{.}\)
To find the eigenvectors of \(A\text{,}\) for each eigenvalue \(\lambda\) found in the first step, solve the homogeneous system \((A-\lambda I)\vx = \zero\text{.}\) The nonzero solutions are the eigenvectors corresponding to \(\lambda\text{.}\)
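To make the two steps of this Key Idea concrete, here is a small sketch using Python's sympy library; the matrix is again a hypothetical one, not necessarily any matrix used in this section's examples.

```python
import sympy as sp

# A hypothetical matrix for illustration.
A = sp.Matrix([[1, 4],
               [2, 3]])
lam = sp.symbols('lamda')

# Step 1: compute the characteristic polynomial det(A - lamda*I),
# set it equal to 0, and solve for lamda.
p = (A - lam * sp.eye(2)).det()
print(sp.factor(p))                  # (lamda - 5)*(lamda + 1)
print(sp.solve(sp.Eq(p, 0), lam))    # [-1, 5]

# Step 2: for each eigenvalue, solve (A - lamda*I) x = 0.
# eigenvects() returns (eigenvalue, multiplicity, basis) triples.
for triple in A.eigenvects():
    print(triple)
```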
Therefore, \(\det(A-\lambda I) = 0\) when \(\lambda = -6\) and \(12\text{;}\) these are our eigenvalues. (We should note that \(p(\lambda) =\lambda^2-6\lambda-72\) is our characteristic polynomial.)
For \(\lambda_1=-6\text{,}\) we need to solve the equation \((A - (-6)I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We've seen a matrix like this before, but we may need a bit of a refresher. Our first row tells us that \(x_1 = 0\text{,}\) and we see that no rows/equations involve \(x_2\text{.}\) We conclude that \(x_2\) is free. Therefore, our solution, in vector form, is
Notice that in both of our examples so far, we were able to completely factor the characteristic polynomial and obtain two distinct eigenvalues. For \(2\times 2\) matrices, the characteristic polynomial will always be quadratic, and we know that finding roots of quadratic polynomials falls into three categories: those with two distinct roots, like in the examples above, those with one repeated root (for example, \(x^2-2x+1=(x-1)^2\)), and those with no real roots (for example, \(x^2+1\)). In the case of a repeated root, we will have only one eigenvalue. Will we have only one eigenvector, or could there be two? (We'll have more to say about this later.) What if there are no real roots? Then there are no (real) eigenvalues, so presumably there are no eigenvectors, either. What if we allow for complex roots? Let's look at some examples.
The transformation \(T(\vx)=A\vx\) defined by \(A\) is an example of a horizontal shear. (See Section 5.1.) Such a transformation leaves horizontal vectors unaffected, but vectors with a nonzero vertical component get pulled to the right: see Figure 7.1.9.
From the diagram we can probably guess that the horizontal vector \(\bbm 1\\0\ebm\) will be an eigenvector with eigenvalue 1, since it is left untouched by the shear transformation. Let's confirm this analytically.
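Writing the shear in the standard form \(A = \bbm 1&k\\0&1\ebm\text{,}\) where \(k\) is the shear factor (we assume this form here; the computation is the same for any value of \(k\)), we have
\[
A\bbm 1\\0\ebm = \bbm 1(1)+k(0)\\0(1)+1(0)\ebm = \bbm 1\\0\ebm = 1\cdot \bbm 1\\0\ebm\text{,}
\]
so \(\bbm 1\\0\ebm\) is indeed an eigenvector with eigenvalue \(1\text{.}\)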
In each of our examples to this point, every eigenvalue corresponded to a single (independent) eigenvector. Is this always the case? We will not prove it in this textbook, but it turns out that in general, the power to which the factor \((\lambda_0-\lambda)\) corresponding to an eigenvalue \(\lambda_0\) appears in the characteristic polynomial (called the multiplicity of the eigenvalue) places an upper limit on the number of independent eigenvectors that can correspond to that eigenvalue.
In Example 7.1.7, we had \(\det(A-\lambda I) = (-3-\lambda)^1(1-\lambda)^1\text{,}\) so the two eigenvalues \(\lambda = -3\) and \(\lambda = 1\) each have multiplicity one, and therefore they each have one corresponding eigenvector. In Example 7.1.8, the eigenvalue \(\lambda=3\) has multiplicity two, but we still had only one corresponding eigenvector. Can we ever have such an eigenvalue with two corresponding eigenvectors?
Here, we notice that \(A\) is a scalar multiple of the identity. As a transformation of the Cartesian plane, the transformation \(T(\vx)=A\vx\) is a dilation: it expands the size of every vector in the plane by a factor of 4. Knowing that this is a transformation that stretches, but does not rotate, we might expect that every nonzero vector is an eigenvector of \(A\text{!}\) Indeed, given \(\vx\neq \zero\text{,}\) we have
Of course, this is pretty much the end of the story here, but let's get some practice with our algorithm for finding eigenvalues and eigenvectors and confirm our results. We can immediately see that
the zero matrix. Again, we see that literally any nonzero vector \(\vx \in\R^2\) qualifies as an eigenvector. We know that we can find at most two independent vectors in \(\R^2\text{,}\) so a simple choice is to take the standard basis vectors \(\ven{1}\) and \(\ven{2}\text{.}\)
Notice that we could have proceeded as usual and attempted to solve the system \((A-4I)\vx=\zero\text{.}\) In this case we get a rather strange augmented matrix:
It might seem like there's absolutely nothing to do here, but we can read off a solution. In this case neither row places any conditions on the variables \(x_1\) and \(x_2\text{,}\) so both are free: \(x_1=s\) and \(x_2=t\) are both parameters, and
Setting \(s=1\) and \(t=0\) gives us the eigenvector \(\ven{1}\text{,}\) and setting \(s=0\) and \(t=1\) gives us the eigenvector \(\ven{2}\text{.}\)
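As a hedged computational aside, sympy makes this multiplicity phenomenon easy to see. The first matrix below is the dilation \(4I\) from the example above; the second is a hypothetical matrix whose repeated eigenvalue has only one independent eigenvector.

```python
import sympy as sp

# The dilation matrix 4I from the example above: the eigenvalue 4
# has multiplicity 2 and two independent eigenvectors.
D = 4 * sp.eye(2)
print(D.eigenvects())
# [(4, 2, [Matrix([[1],[0]]), Matrix([[0],[1]])])]

# A hypothetical matrix whose eigenvalue 3 also has multiplicity 2,
# but which has only one independent eigenvector.
S = sp.Matrix([[3, 1],
               [0, 3]])
print(S.eigenvects())
# [(3, 2, [Matrix([[1],[0]])])]
```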
We mentioned above that another possibility is that the characteristic polynomial has no real zeros at all, in which case our matrix has no (real) eigenvalues. Let's see what we can say in such a situation.
Before we proceed, let's pause and think about this in the context of matrix transformations. If we define the transformation \(T(\vx) = A\vx\text{,}\) we have
This is because the transformation \(T\) represents a rotation through an angle of \(\frac{\pi}{2}\) (90 degrees). Indeed, \(A\) is a rotation matrix (see Section 5.1) of the form
Now, think about the eigenvalue equation \(A\vx = \lambda\vx\text{.}\) In this case, an eigenvector \(\vx\) would be a vector in the plane such that rotating it by 90 degrees produces a parallel vector! Clearly, this is nonsense, and indeed, we find that
However, this is not the end of the story, provided that we're willing to work with complex numbers. Over the complex numbers, we do have two eigenvalues:
so \(\lambda = i\) and \(\lambda = -i\) are eigenvalues. What are the eigenvectors? We proceed as always, except that the arithmetic in the row operations is a bit trickier with complex numbers. For \(\lambda=i\text{,}\) we have the system \((A-iI)\vx = \zero\text{.}\) We set up the augmented matrix below, and in this case, we'll proceed step-by-step to the reduced row echelon form.
Notice in the last step that \(-i+i(1) = 0\) gives the zero in the first column, and \(-1+i(-i)=-1+1=0\) gives the zero in the second column. This tells us that \(x_2=t\) is a free (complex!) parameter while \(x_1-ix_2=0\text{,}\) so \(x_1 = ix_2=it\text{.}\) Our vector solution is thus
Our last few examples provided interesting departures from the earlier ones where we had two distinct eigenvalues; they also provided examples where we were able to analyze the situation geometrically, by considering the linear transformations defined by the matrix. The reader is encouraged to consider the other examples of transformations given in Section 5.1 and attempt a similar analysis.
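Before moving on, here is one more hedged numerical aside: the rotation example above (the matrix \(\bbm 0&-1\\1&0\ebm\text{,}\) consistent with the row reduction shown) can be checked with numpy, which moves to complex arithmetic automatically.

```python
import numpy as np

# The 90-degree rotation matrix discussed above.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Over the reals there are no eigenvalues; numpy returns the
# complex pair i and -i.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # [0.+1.j  0.-1.j]

# Each column of `eigenvectors` satisfies A @ x = lambda * x;
# for lambda = i, x is a scalar multiple of (i, 1), as found above.
for lam, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, lam * x)
```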
So far, our examples have involved \(2\times 2\) matrices. Let's do an example with a \(3\times 3\) matrix. The only real additional complication here is that our characteristic polynomial will now be a cubic polynomial, so factoring it is going to take some more work.
We first compute the characteristic polynomial, set it equal to 0, then solve for \(\lambda\text{.}\) A warning: this process is rather long. We'll use cofactor expansion along the first row; don't get bogged down with the arithmetic that comes from each step; just try to get the basic idea of what was done from step to step.
In the last step we factored the characteristic polynomial \(-\lambda^3+4\lambda^2-\lambda-6\text{.}\) Factoring polynomials of degree \(\gt 2\) is not trivial; we'll assume the reader has access to methods for doing this accurately.
One could also graph this polynomial to find the roots. Graphing will show us that \(\lambda = 3\) looks like a root, and a simple calculation will confirm that it is.
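Once \(\lambda = 3\) is confirmed as a root, dividing \((\lambda-3)\) out of the cubic factors it completely:
\[
-\lambda^3+4\lambda^2-\lambda-6 = -(\lambda-3)(\lambda^2-\lambda-2) = -(\lambda-3)(\lambda-2)(\lambda+1)\text{,}
\]
giving the eigenvalues \(\lambda=-1\text{,}\) \(\lambda=2\text{,}\) and \(\lambda=3\) used below.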
We need to solve the equation \((A - (-1)I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) a nice choice would get rid of the fractions. So weβll set \(x_3 = 2\) and choose \(\vx[1]=\bbm 3\\1\\2\ebm\) as our eigenvector.
We need to solve the equation \((A - 2I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) again, a nice choice would get rid of the fractions. So weβll set \(x_3 = 2\) and choose \(\vx[2]=\bbm 2\\1\\2\ebm\) as our eigenvector.
We need to solve the equation \((A - 3I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We first compute the characteristic polynomial, set it equal to 0, then solve for \(\lambda\text{.}\) We'll use cofactor expansion down the first column (since it has lots of zeros).
Notice that while the characteristic polynomial is cubic, we never actually saw a cubic; we never distributed the \((2-\lambda )\) across the quadratic. Instead, we realized that this was a factor of the cubic, and just factored the remaining quadratic. (This makes this example quite a bit simpler than the previous example.)
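Concretely: since the characteristic polynomial of a \(3\times 3\) matrix has leading term \(-\lambda^3\text{,}\) and the eigenvalues found below are \(-2\text{,}\) \(2\text{,}\) and \(7\text{,}\) the fully factored form must have been
\[
\det(A-\lambda I) = (2-\lambda)(\lambda+2)(\lambda-7)\text{.}
\]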
We need to solve the equation \((A - (-2)I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
We can pick any nonzero value for \(x_3\text{;}\) a nice choice would get rid of the fractions. So weβll set \(x_3 = 4\) and choose \(\vx[1]=\bbm -3\\-8\\4\ebm\) as our eigenvector.
We need to solve the equation \((A - 2I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
This looks funny, so we'll remind ourselves how to solve this. The first two rows tell us that \(x_2 = 0\) and \(x_3 = 0\text{,}\) respectively. Notice that no row/equation uses \(x_1\text{;}\) we conclude that it is free. Therefore, our solution in vector form is
We need to solve the equation \((A - 7I)\vx = \zero\text{.}\) To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
In this section we have learned about a new concept: given a matrix \(A\) we can find certain values \(\lambda\) and vectors \(\vx\) where \(A\vx =\lambda \vx\text{.}\) In the next section we will continue the pattern we have established in this text: after learning a new concept, we see how it interacts with other concepts we know about. That is, we'll look for connections between eigenvalues and eigenvectors and things like the inverse, determinants, the trace, the transpose, etc.