Section 4.3 Quadratic forms
If you’ve done a couple of calculus courses, you’ve probably encountered conic sections, like the ellipse \(\frac{x^2}{a^2}+\frac{y^2}{b^2}=1\) or the parabola \(\frac{y}{b}=\frac{x^2}{a^2}\text{.}\) You might also recall that your instructor was careful to avoid conic sections with equations including “cross-terms” like \(xy\text{.}\) The reason for this is that sketching a conic section like \(x^2+4xy+y^2=1\) requires the techniques of the previous section.
A basic fact about orthogonal matrices is that they preserve length. Indeed, for any vector \(\xx\) in \(\R^n\) and any orthogonal matrix \(P\text{,}\)
\begin{equation*}
\len{P\xx}^2 = (P\xx)\dotp (P\xx) = (P\xx)^T(P\xx) = (\xx^TP^T)(P\xx) = \xx^T\xx=\len{\xx}^2\text{,}
\end{equation*}
since \(P^TP=I_n\text{.}\)
Note also that since \(P^TP=I_n\) and \(\det P^T=\det P\text{,}\) we have
\begin{equation*}
\det(P)^2=\det(P^TP)=\det(I_n)=1\text{,}
\end{equation*}
so \(\det(P)=\pm 1\text{.}\) If \(\det P=1\text{,}\) we have what is called a special orthogonal matrix. In \(\R^2\) or \(\R^3\text{,}\) multiplication by a special orthogonal matrix is simply a rotation. (If \(\det P=-1\text{,}\) there is also a reflection.)
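The two facts above (length is preserved, and the determinant is \(\pm 1\)) can be checked numerically. The following is a minimal sketch in plain Python; the helper names `rotation`, `matvec`, `det2`, and `length` are illustrative choices, not part of the text, and the test vector is arbitrary.

```python
import math

def rotation(theta):
    """Return the 2x2 rotation matrix for angle theta, as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(P, x):
    """Multiply a 2x2 matrix by a vector in R^2."""
    return [P[0][0] * x[0] + P[0][1] * x[1],
            P[1][0] * x[0] + P[1][1] * x[1]]

def det2(P):
    """Determinant of a 2x2 matrix."""
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def length(x):
    """Euclidean length of a vector."""
    return math.sqrt(sum(t * t for t in x))

P = rotation(math.pi / 4)        # special orthogonal: determinant 1
F = [[1.0, 0.0], [0.0, -1.0]]    # reflection in the x-axis: determinant -1
x = [3.0, 4.0]                   # arbitrary test vector of length 5

# All three lengths agree up to floating-point rounding.
print(length(x), length(matvec(P, x)), length(matvec(F, x)))
print(det2(P), det2(F))
```

Both matrices are orthogonal, so both preserve length; only the sign of the determinant distinguishes the rotation from the reflection.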
We mentioned in the previous section that the Real Spectral Theorem is also referred to as the principal axes theorem. The name comes from the fact that one way to interpret the orthogonal diagonalization of a symmetric matrix is that we are rotating our coordinate system. The original coordinate axes are rotated to new coordinate axes, with respect to which the matrix \(A\) is diagonal. This will become more clear once we apply these ideas to the problem of conic sections mentioned above. First, a definition.
Definition 4.3.1.
For example, \(q_1(x,y)=4x^2-4xy+4y^2\) and \(q_2(x,y,z)=9x^2+4y^2-4xy-2xz+z^2\) are quadratic forms. Note that each term in a quadratic form is of degree two. We omit linear terms, since these can be absorbed by completing the square. The important observation is that every quadratic form can be associated to a symmetric matrix. The diagonal entries are the coefficients \(a_{ii}\) appearing in Definition 4.3.1, while the off-diagonal entries are half the corresponding coefficients \(a_{ij}\text{.}\)
For example, the two quadratic forms given above have the following associated matrices:
\begin{equation*}
A_1 = \bbm 4 \amp -2\\-2\amp 4\ebm \text{ and } A_2 = \bbm 9 \amp -2 \amp -1\\-2\amp 4\amp 0\\-1\amp 0\amp 1\ebm\text{.}
\end{equation*}
The reason for this is that we can then write
\begin{equation*}
q_1(x,y)=\bbm x\amp y\ebm\bbm 4 \amp -2\\-2\amp 4\ebm\bbm x\\y\ebm
\end{equation*}
and
\begin{equation*}
q_2(x,y,z)=\bbm x\amp y\amp z\ebm\bbm 9 \amp -2 \amp -1\\-2\amp 4\amp 0\\-1\amp 0\amp 1\ebm\bbm x\\y\\z\ebm\text{.}
\end{equation*}
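The rule for building the symmetric matrix (diagonal coefficients as they are, cross-term coefficients halved) can be sketched in plain Python. The function names `symmetric_matrix` and `quad_form`, and the dictionary format for the coefficients, are assumptions made for this illustration only; the sample point is arbitrary.

```python
def symmetric_matrix(n, coeffs):
    """Build the symmetric matrix of a quadratic form in n variables.

    coeffs maps index pairs (i, j) with i <= j (0-based) to the
    coefficient a_ij of the term x_i * x_j.
    """
    A = [[0.0] * n for _ in range(n)]
    for (i, j), a in coeffs.items():
        if i == j:
            A[i][i] = float(a)            # diagonal entry: a_ii itself
        else:
            A[i][j] = A[j][i] = a / 2     # off-diagonal entries: half of a_ij
    return A

def quad_form(A, x):
    """Evaluate x^T A x for a square matrix A and vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# q_1(x, y) = 4x^2 - 4xy + 4y^2
A1 = symmetric_matrix(2, {(0, 0): 4, (0, 1): -4, (1, 1): 4})

x, y = 1.5, -2.0                          # arbitrary sample point
direct = 4 * x**2 - 4 * x * y + 4 * y**2  # evaluate the polynomial directly
print(A1)
print(direct, quad_form(A1, [x, y]))      # the two values agree
```

Splitting each cross-term coefficient evenly between the \((i,j)\) and \((j,i)\) entries is what makes the matrix symmetric while leaving the value of the form unchanged.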
Of course, the reason for wanting to associate a symmetric matrix to a quadratic form is that it can be orthogonally diagonalized. Consider the matrix \(A=A_1\) associated to \(q_1\text{.}\)
We find distinct eigenvalues \(\lambda_1=2\) and \(\lambda_2=6\text{.}\) Since \(A\) is symmetric, we know the corresponding eigenvectors will be orthogonal.
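For readers who want the intermediate steps, the eigenvalue computation for \(A_1\) can be sketched as follows (a routine calculation that the text does not spell out). The characteristic polynomial is
\begin{equation*}
\det(\lambda I_2 - A_1) = \det\bbm \lambda-4 \amp 2\\2\amp \lambda-4\ebm = (\lambda-4)^2-4 = (\lambda-2)(\lambda-6)\text{,}
\end{equation*}
and solving \((A_1-\lambda I_2)\xx=\mathbf{0}\) for each root gives
\begin{equation*}
\lambda_1=2:\ \xx_1=\bbm 1\\1\ebm, \qquad \lambda_2=6:\ \xx_2=\bbm -1\\1\ebm\text{.}
\end{equation*}
These satisfy \(\xx_1\dotp\xx_2=-1+1=0\text{,}\) and each has length \(\sqrt{2}\text{,}\) so dividing each by \(\sqrt{2}\) produces an orthonormal pair.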
The resulting orthogonal matrix is \(P=\frac{1}{\sqrt{2}}\bbm 1\amp -1\\1\amp 1\ebm\text{,}\) and we find
\begin{equation*}
P^TAP = \bbm 2\amp 0\\0\amp 6\ebm, \text{ or } A = PDP^T,
\end{equation*}
where \(D = \bbm 2\amp 0\\0\amp 6\ebm\text{.}\) If we write \(x_1,x_2\) in place of \(x,y\) and define new variables \(y_1,y_2\) by
\begin{equation*}
\bbm y_1\\y_2\ebm = P^T\bbm x_1\\x_2\ebm\text{,}
\end{equation*}
then we find that
\begin{align*}
\bbm x_1\amp x_2\ebm A\bbm x_1\\x_2\ebm \amp = \left(\bbm x_1\amp x_2\ebm P\right)D\left(P^T\bbm x_1\\x_2\ebm\right) \\
\amp = \bbm y_1 \amp y_2\ebm\bbm 2\amp 0\\0\amp 6\ebm\bbm y_1\\y_2\ebm\\
\amp = 2y_1^2+6y_2^2\text{.}
\end{align*}
Note that there is no longer any cross term.
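The change of variables can be verified numerically. The sketch below, in plain Python, checks that \(P^TAP\) is the diagonal matrix with entries 2 and 6, and that the form takes the same value in either set of variables. The helper names `transpose` and `matmul` and the sample point are illustrative assumptions.

```python
import math

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[4.0, -2.0], [-2.0, 4.0]]   # matrix of 4x^2 - 4xy + 4y^2
r = 1 / math.sqrt(2)
P = [[r, -r], [r, r]]            # columns are the unit eigenvectors

# P^T A P should be diagonal, with the eigenvalues on the diagonal.
D = matmul(matmul(transpose(P), A), P)
print(D)

# Compare the form in the old and new variables at a sample point.
x1, x2 = 1.0, 3.0
y1, y2 = (row[0] for row in matmul(transpose(P), [[x1], [x2]]))
old = 4 * x1**2 - 4 * x1 * x2 + 4 * x2**2
new = 2 * y1**2 + 6 * y2**2
print(old, new)                  # equal up to floating-point rounding
```

Because the entries of \(P\) involve \(\sqrt{2}\text{,}\) the printed values match the exact ones only up to rounding error.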
Now, suppose we want to graph the conic \(4x_1^2-4x_1x_2+4x_2^2=12\text{.}\) By changing to the variables \(y_1,y_2\) this becomes \(2y_1^2+6y_2^2=12\text{,}\) or \(\frac{y_1^2}{6}+\frac{y_2^2}{2}=1\text{.}\) This is the standard form of an ellipse, but in terms of new variables. How do we graph it? Returning to the definition of our new variables, we find \(y_1=\frac{1}{\sqrt{2}}(x_1+x_2)\) and \(y_2=\frac{1}{\sqrt{2}}(-x_1+x_2)\text{.}\) The \(y_1\) axis should be the line \(y_2=0\text{,}\) or \(x_1=x_2\text{.}\) (Note that this line points in the direction of the eigenvector \(\bbm 1\\1\ebm\text{.}\)) The \(y_2\) axis should be the line \(y_1=0\text{,}\) or \(x_1=-x_2\text{,}\) which is in the direction of the eigenvector \(\bbm -1\\1\ebm\text{.}\)
This lets us see that our new coordinate axes are simply a rotation (by \(\pi/4\)) of the old coordinate axes, and our conic section is, accordingly, an ellipse that has been rotated by the same angle.
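One way to produce points for a sketch of this ellipse is to parametrize it in the \(y\) variables and map back using \(\xx = P\mathbf{y}\text{.}\) The following is a minimal sketch in plain Python; the function name `ellipse_points` and the choice of eight sample points are assumptions for illustration.

```python
import math

def ellipse_points(num=8):
    """Sample points (x1, x2) on the conic 4x1^2 - 4x1x2 + 4x2^2 = 12.

    Points are generated on y1^2/6 + y2^2/2 = 1 and then mapped back
    to the original coordinates via x = P y.
    """
    r = 1 / math.sqrt(2)
    points = []
    for k in range(num):
        t = 2 * math.pi * k / num
        y1 = math.sqrt(6) * math.cos(t)   # semi-axis sqrt(6) along y1
        y2 = math.sqrt(2) * math.sin(t)   # semi-axis sqrt(2) along y2
        x1 = r * (y1 - y2)                # first row of P applied to y
        x2 = r * (y1 + y2)                # second row of P applied to y
        points.append((x1, x2))
    return points

points = ellipse_points()
for x1, x2 in points:
    # Each value should be 12, up to floating-point rounding.
    print(round(x1, 4), round(x2, 4), 4 * x1**2 - 4 * x1 * x2 + 4 * x2**2)

# The first column of P is (cos(theta), sin(theta)); recover the angle.
angle = math.atan2(1 / math.sqrt(2), 1 / math.sqrt(2))
print(angle, math.pi / 4)
```

The longer semi-axis, of length \(\sqrt{6}\text{,}\) lies along the line \(x_1=x_2\text{,}\) which corresponds to the smaller eigenvalue.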