Math1410 The Matrix Trace

Section 6.2 The Matrix Trace

In the previous section, we learned about an operation we can perform on matrices, namely the transpose. Given a matrix \(A\text{,}\) we can “find the transpose of \(A\text{,}\)” which is another matrix. In this section we learn about a new operation called the trace. It is a different type of operation than the transpose. Given a matrix \(A\text{,}\) we can “find the trace of \(A\text{,}\)” which is not a matrix but rather a number. We formally define it here.

Definition 6.2.1. The Trace.

Let \(A\) be an \(n\times n\) matrix. The trace of \(A\text{,}\) denoted \(\tr(A)\text{,}\) is the sum of the diagonal elements of \(A\text{.}\) That is,

\begin{equation*} \tr(A) = a_{11}+a_{22}+\cdots+a_{nn}\text{.} \end{equation*}

This seems like a simple definition, and it really is. Just to make sure it is clear, let’s practice.

Example 6.2.2. Computing the trace of a matrix.

Find the trace of \(A\text{,}\) \(B\text{,}\) \(C\) and \(I_4\text{,}\) where

\begin{equation*} A = \bbm 1 \amp 2\\3 \amp 4\ebm,\ B = \bbm 1\amp 2\amp 0\\3\amp 8\amp 1\\-2\amp 7\amp -5\ebm \ \text{and}\ C=\bbm 1\amp 2\amp 3\\4\amp 5\amp 6\ebm\text{.} \end{equation*}

Solution.

To find the trace of \(A\text{,}\) note that the diagonal elements of \(A\) are 1 and 4. Therefore, \(\tr(A) = 1+4 = 5\text{.}\)

We see that the diagonal elements of \(B\) are 1, 8 and -5, so \(\tr(B) = 1+8-5 = 4\text{.}\)

The matrix \(C\) is not a square matrix, and our definition states that we must start with a square matrix. Therefore \(\tr(C)\) is not defined.

Finally, the diagonal of \(I_4\) consists of four 1s. Therefore \(\tr(I_4) = 4\text{.}\)
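Definition 6.2.1 translates almost word for word into code. The following is a minimal Python sketch, not part of the original text; the function name `trace` and the choice to store a matrix as a list of rows are assumptions made for illustration. It reproduces the four answers of Example 6.2.2, including the refusal to compute \(\tr(C)\text{.}\)

```python
def trace(matrix):
    """Return the sum of the diagonal entries of a square matrix.

    The matrix is a list of rows (a representation assumed for this
    sketch). A non-square matrix raises ValueError, mirroring the
    definition, which only applies to n x n matrices.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("the trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


A = [[1, 2], [3, 4]]
B = [[1, 2, 0], [3, 8, 1], [-2, 7, -5]]
C = [[1, 2, 3], [4, 5, 6]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(trace(A))   # 5
print(trace(B))   # 4
print(trace(I4))  # 4

try:
    trace(C)
except ValueError as err:
    print("tr(C):", err)
```

Raising an error for \(C\) is a design choice of this sketch; it keeps the code from silently returning a number where the definition gives none.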

Now that we have defined the trace of a matrix, we should think like mathematicians and ask some questions. The first questions that should pop into our minds should be along the lines of “How does the trace work with other matrix operations?” (Recall that we asked a similar question once we learned about the transpose.) We should think about how the trace works with matrix addition, scalar multiplication, matrix multiplication, matrix inverses, and the transpose.

We’ll give a theorem that will formally tell us what is true in a moment, but first let’s play with two sample matrices and see if we can see what will happen. Let

\begin{equation*} A = \bbm 2 \amp 1 \amp 3\\2\amp 0\amp -1\\3\amp -1\amp 3\ebm \ \text{and} \ B = \bbm 2\amp 0\amp 1\\-1\amp 2\amp 0\\0\amp 2\amp -1\ebm\text{.} \end{equation*}

It should be clear that \(\tr(A) = 5\) and \(\tr(B) = 3\text{.}\) What is \(\tr(A+B)\text{?}\)

\begin{align*} \tr(A+B) \amp = \tr\left(\bbm 2\amp 1\amp 3\\2\amp 0\amp -1\\3\amp -1\amp 3\ebm + \bbm 2\amp 0\amp 1\\-1\amp 2\amp 0\\0\amp 2\amp -1\ebm\right)\\ \amp = \tr\left(\bbm 4\amp 1\amp 4\\1\amp 2\amp -1\\3\amp 1\amp 2\ebm\right)\\ \amp = 8\text{.} \end{align*}

So we notice that \(\tr(A+B) = \tr(A) + \tr(B)\text{.}\) This probably isn’t a coincidence.

How does the trace work with scalar multiplication? If we multiply \(A\) by 4, then the diagonal elements will be 8, 0 and 12, so \(\tr(4A) = 20\text{.}\) Is it a coincidence that this is 4 times the trace of \(A\text{?}\)
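The two experiments so far, with addition and with scalar multiplication, are easy to repeat by machine. Here is a small Python sketch under the assumption that matrices are stored as lists of rows; the helper names `add` and `scale` are made up for this illustration.

```python
def trace(matrix):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


def scale(k, X):
    """Multiply every entry of X by the scalar k."""
    return [[k * x for x in row] for row in X]


A = [[2, 1, 3], [2, 0, -1], [3, -1, 3]]
B = [[2, 0, 1], [-1, 2, 0], [0, 2, -1]]

print(trace(A), trace(B))   # 5 3
print(trace(add(A, B)))     # 8, which is 5 + 3
print(trace(scale(4, A)))   # 20, which is 4 * 5
```

Agreement on one pair of matrices is evidence, not proof; it only suggests what a general statement should look like.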

Let’s move on to matrix multiplication. How will the trace of \(AB\) relate to the traces of \(A\) and \(B\text{?}\) Let’s see:

\begin{align*} \tr(AB) \amp =\tr\left(\bbm 2\amp 1\amp 3\\2\amp 0\amp -1\\3\amp -1\amp 3\ebm \bbm 2\amp 0\amp 1\\-1\amp 2\amp 0\\0\amp 2\amp -1\ebm\right)\\ \amp = \tr\left(\bbm 3\amp 8\amp -1\\4\amp -2\amp 3\\7\amp 4\amp 0\ebm\right)\\ \amp = 1\text{.} \end{align*}

It isn’t exactly clear what the relationship is among \(\tr(A)\text{,}\) \(\tr(B)\) and \(\tr(AB)\text{.}\) Before moving on, let’s find \(\tr(BA)\text{:}\)

\begin{align*} \tr(BA) \amp = \tr\left(\bbm 2 \amp 0\amp 1\\-1\amp 2\amp 0\\0\amp 2\amp -1\ebm\bbm 2\amp 1\amp 3\\2\amp 0\amp -1\\3\amp -1\amp 3\ebm\right)\\ \amp = \tr \left(\bbm 7 \amp 1 \amp 9 \\2 \amp -1 \amp -5\\ 1 \amp 1 \amp -5 \ebm\right)\\ \amp = 1\text{.} \end{align*}

We notice that \(\tr(AB) = \tr(BA)\text{.}\) Is this coincidental?
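Both products can be recomputed with a few lines of Python. This is only a sketch; `matmul` is a hypothetical helper written for this illustration, and matrices are assumed to be lists of rows.

```python
def trace(matrix):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(X, Y):
    """Matrix product XY: entry (i, j) is row i of X dotted with column j of Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[2, 1, 3], [2, 0, -1], [3, -1, 3]]
B = [[2, 0, 1], [-1, 2, 0], [0, 2, -1]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)                    # [[3, 8, -1], [4, -2, 3], [7, 4, 0]]
print(BA)                    # [[7, 1, 9], [2, -1, -5], [1, 1, -5]]
print(trace(AB), trace(BA))  # 1 1
```

Note that \(AB\) and \(BA\) are different matrices; only their traces agree.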

How are the traces of \(A\) and \(A^{-1}\) related? We compute \(A^{-1}\) and find that

\begin{equation*} A^{-1} = \bbm 1/17 \amp 6/17 \amp 1/17\\9/17 \amp 3/17 \amp -8/17\\ 2/17 \amp -5/17 \amp 2/17 \ebm\text{.} \end{equation*}

Therefore \(\tr(A^{-1}) = 6/17\text{.}\) Again, the relationship isn’t clear.
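Inverses computed by hand invite arithmetic slips, so it can be worth confirming the matrix above before reading anything into its trace. The sketch below assumes Python's standard `fractions` module for exact arithmetic; `matmul` and `trace` are hypothetical helpers written for this illustration. It checks that the stated matrix really satisfies \(AA^{-1} = I\) and then reports both traces.

```python
from fractions import Fraction


def trace(matrix):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(X, Y):
    """Matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[2, 1, 3], [2, 0, -1], [3, -1, 3]]

# The inverse stated in the text, entered as exact fractions over 17.
numerators = [[1, 6, 1], [9, 3, -8], [2, -5, 2]]
A_inv = [[Fraction(n, 17) for n in row] for row in numerators]

identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

print(matmul(A, A_inv) == identity)  # True
print(trace(A))                      # 5
print(trace(A_inv))                  # 6/17
```

Exact fractions are used instead of floating-point numbers so that the comparison with the identity matrix is not disturbed by rounding.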

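An aside on the “size” example at the end of this section, which uses \(A = \bbm 1 \amp -2\\1 \amp 1\ebm\) and \(B = \bbm 6\amp 7\\11\amp -4\ebm\text{:}\) that example brings to light several interesting ideas that we’ll flesh out just a little here.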

Notice that the elements of \(A\) are \(1\text{,}\) \(-2\text{,}\) \(1\) and \(1\text{.}\) Add the squares of these numbers: \(1^2 + (-2)^2 + 1^2 + 1^2 = 7 = \tr(A^TA)\text{.}\)

Notice that the elements of \(B\) are \(6\text{,}\) \(7\text{,}\) \(11\) and \(-4\text{.}\) Add the squares of these numbers: \(6^2 + 7^2 + 11^2 + (-4)^2 = 222 =\tr(B^TB)\text{.}\)

Can you see why this is true? When looking at multiplying \(A^TA\text{,}\) focus only on where the elements on the diagonal come from since they are the only ones that matter when taking the trace.

You can confirm on your own that regardless of the dimensions of \(A\text{,}\) \(\tr(A^TA) = \tr(AA^T)\text{.}\) To see why this is true, consider the previous point. (Recall also that \(A^TA\) and \(AA^T\) are always square, regardless of the dimensions of \(A\text{.}\))

Mathematicians are actually more interested in \(\sqrt{\tr(A^TA)}\) than just \(\tr(A^TA)\text{.}\) The reason for this is a bit complicated; the short answer is that “it works better.” The reason “it works better” is related to the Pythagorean Theorem, of all things. If we know that the legs of a right triangle have lengths \(a\) and \(b\text{,}\) we are more interested in \(\sqrt{a^2+b^2}\) than just \(a^2+b^2\text{.}\) Of course, this explanation raises more questions than it answers; our goal here is just to whet your appetite and get you to do some more reading. A Numerical Linear Algebra book would be a good place to start.
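The observations in this aside can be checked numerically. The Python sketch below is an illustration only: the helpers `transpose`, `matmul`, `trace` and `sum_of_squares` are hypothetical names, matrices are assumed to be lists of rows, and the non-square test matrix is \(C\) from Example 6.2.2. The quantity \(\sqrt{\tr(A^TA)}\) is commonly called the Frobenius norm, a name the text above does not use.

```python
import math


def trace(matrix):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def transpose(X):
    """Swap rows and columns."""
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    """Matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def sum_of_squares(X):
    """Add up the square of every entry of X."""
    return sum(x * x for row in X for x in row)


A = [[1, -2], [1, 1]]
B = [[6, 7], [11, -4]]
C = [[1, 2, 3], [4, 5, 6]]  # 2 x 3, so not square

# tr(M^T M) agrees with the sum of the squared entries of M.
print(trace(matmul(transpose(A), A)), sum_of_squares(A))  # 7 7
print(trace(matmul(transpose(B), B)), sum_of_squares(B))  # 222 222

# For a non-square matrix, C^T C is 3 x 3 and C C^T is 2 x 2,
# yet the two traces agree.
print(trace(matmul(transpose(C), C)))  # 91
print(trace(matmul(C, transpose(C))))  # 91

# The square root that mathematicians prefer, for the matrix A.
frobenius_A = math.sqrt(trace(matmul(transpose(A), A)))
print(frobenius_A)
```

The diagonal entry in position \(j\) of \(A^TA\) is column \(j\) of \(A\) dotted with itself, which is why the trace collects every squared entry exactly once.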

Finally, let’s see how the trace is related to the transpose. We actually don’t have to formally compute anything. Recall from the previous section that the diagonals of \(A\) and \(A^T\) are identical; therefore, \(\tr(A) = \tr(A^T)\text{.}\) That, we know for sure, isn’t a coincidence.

We now formally state what equalities are true when considering the interaction of the trace with other matrix operations.

Theorem 6.2.3. Properties of the Matrix Trace.

Let \(A\) and \(B\) be \(n\times n\) matrices and let \(k\) be a scalar. Then:

\(\displaystyle \tr(A+B) = \tr(A) + \tr(B) \)

\(\displaystyle \tr(A-B) = \tr(A) - \tr(B)\)

\(\displaystyle \tr(kA) = k\cdot \tr(A)\)

\(\displaystyle \tr(AB) = \tr(BA)\)

\(\displaystyle \tr(A^T) = \tr(A)\)
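
Of these properties, \(\tr(AB) = \tr(BA)\) is the least obvious, so here is a brief sketch of why it holds. The notation is an assumption made for this sketch: \(a_{ij}\) and \(b_{ij}\) denote the entries of \(A\) and \(B\text{,}\) in the spirit of Definition 6.2.1. The \(i\)th diagonal entry of \(AB\) is \(\sum_{j=1}^n a_{ij}b_{ji}\text{,}\) and the \(j\)th diagonal entry of \(BA\) is \(\sum_{i=1}^n b_{ji}a_{ij}\text{.}\) Therefore

\begin{align*} \tr(AB) \amp = \sum_{i=1}^n\sum_{j=1}^n a_{ij}b_{ji}\\ \amp = \sum_{j=1}^n\sum_{i=1}^n b_{ji}a_{ij}\\ \amp = \tr(BA)\text{.} \end{align*}

The middle step only reorders a finite sum of ordinary numbers, which does not change its value.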

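One of the key things to note here is what this theorem does not say. Apart from the identity \(\tr(AB) = \tr(BA)\text{,}\) it says nothing about how the trace relates to matrix multiplication; that is, we can’t figure out what \(\tr(AB)\) is just by knowing what \(\tr(A)\) and \(\tr(B)\) are. The theorem also says nothing about how the trace relates to inverses. The reason for the silence in these areas is that there simply is no general relationship: two pairs of matrices with matching traces can have products with different traces, and two matrices with the same trace can have inverses with different traces.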
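A concrete counterexample makes this silence easier to believe. The matrices `I2`, `P`, `Q` and `D` below are not from the text; they were chosen for this Python sketch because every one of them has trace 2. The helpers `trace` and `matmul` are likewise hypothetical, and `fractions.Fraction` from the standard library keeps the arithmetic exact.

```python
from fractions import Fraction


def trace(matrix):
    """Sum of the diagonal entries of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(X, Y):
    """Matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


I2 = [[1, 0], [0, 1]]
P = [[2, 0], [0, 0]]
Q = [[0, 0], [0, 2]]

# Products: both pairs have traces 2 and 2, but the products differ in trace.
print(trace(I2), trace(I2), trace(matmul(I2, I2)))  # 2 2 2
print(trace(P), trace(Q), trace(matmul(P, Q)))      # 2 2 0

# Inverses: I2 and D both have trace 2. I2 is its own inverse, and a
# diagonal matrix with nonzero diagonal is inverted entry by entry.
D = [[Fraction(1, 2), 0], [0, Fraction(3, 2)]]
D_inv = [[2, 0], [0, Fraction(2, 3)]]

print(matmul(D, D_inv) == I2)  # True
print(trace(I2), trace(I2))    # 2 2   (a matrix and its inverse)
print(trace(D), trace(D_inv))  # 2 8/3 (a matrix and its inverse)
```

Since the same input traces lead to different outputs, no formula in terms of \(\tr(A)\) and \(\tr(B)\) alone could produce \(\tr(AB)\) or \(\tr(A^{-1})\text{.}\)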

We end this section by again wondering why anyone would care about the trace of a matrix. One reason mathematicians are interested in it is that it can give a measurement of the “size” of a matrix.

Consider the following \(2 \times 2\) matrices:

\begin{equation*} A = \bbm 1 \amp -2\\1 \amp 1\ebm \ \text{and}\ B = \bbm 6\amp 7\\11\amp -4\ebm\text{.} \end{equation*}

These matrices have the same trace, yet \(B\) clearly has bigger elements in it. So how can we use the trace to determine a “size” of these matrices? We can consider \(\tr(A^TA)\) and \(\tr(B^TB)\text{.}\)

\begin{align*} \tr(A^TA) \amp = \tr\left(\bbm 1\amp 1\\-2\amp 1\ebm\bbm 1\amp -2\\1\amp 1\ebm\right)\\ \amp = \tr\left( \bbm 2\amp -1\\-1\amp 5\ebm\right)\\ \amp = 7\\ \tr(B^TB) \amp = \tr\left(\bbm 6\amp 11\\7\amp -4\ebm\bbm 6\amp 7\\11\amp -4\ebm\right)\\ \amp = \tr\left(\bbm 157 \amp -2\\-2 \amp 65\ebm \right)\\ \amp = 222\text{.} \end{align*}

Our concern is not how to interpret what this “size” measurement means, but rather to demonstrate that the trace (along with the transpose) can be used to give (perhaps useful) information about a matrix.
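For readers who want to repeat the computation, here is a short Python sketch. The function name `size_squared` is invented for this illustration and is not standard terminology; matrices are assumed to be lists of rows.

```python
def size_squared(M):
    """Return tr(M^T M), computed as the diagonal of the product M^T M."""
    Mt = [list(col) for col in zip(*M)]  # transpose of M
    n = len(Mt)                          # M^T M is n x n
    # Diagonal entry i of M^T M is row i of M^T dotted with column i of M.
    return sum(sum(Mt[i][k] * M[k][i] for k in range(len(M))) for i in range(n))


A = [[1, -2], [1, 1]]
B = [[6, 7], [11, -4]]

print(A[0][0] + A[1][1], B[0][0] + B[1][1])  # 2 2  (equal traces)
print(size_squared(A))                       # 7
print(size_squared(B))                       # 222
```

Only the diagonal of the product is computed here, since the off-diagonal entries never enter the trace.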

Exercises

Exercise Group.

Find the trace of the given matrix.

1.

\(\bbm 6\amp 5\\ 2\amp 10\\ 3\amp 3\ebm\)

2.

\(\bbm 1\amp -5\\ 9\amp 5\ebm\)

3.

\(\bbm 7\amp 5\\ -5\amp -4\ebm\)

4.

\(\bbm -10\amp 6\amp -7\amp -9\\ -2\amp 1\amp 6\amp -9\\ 0\amp 4\amp -4\amp 0\\ -3\amp -9\amp 3\amp -10\ebm\)

5.

\(\bbm -4\amp 1\amp 1\\ -2\amp 0\amp 0\\ -1\amp -2\amp -5\ebm\)

6.

\(\bbm 5\amp 2\amp 2\amp 2\\ -7\amp 4\amp -7\amp -3\\ 9\amp -9\amp -7\amp 2\\ -4\amp 8\amp -8\amp -2\ebm\)

7.

\(\bbm 0\amp -3\amp 1\\ 5\amp -5\amp 5\\ -4\amp 1\amp 0\ebm\)

8.

\(\bbm -2\amp -3\amp 5\\ 5\amp 2\amp 0\\ -1\amp -3\amp 1\ebm\)

9.

\(\bbm -3\amp -10\\ -6\amp 4\ebm\)

10.

\(\bbm 4\amp 2\amp -1\\ -4\amp 1\amp 4\\ 0\amp -5\amp 5\ebm\)

11.

\(\bbm 2\amp 6\amp 4\\ -1\amp 8\amp -10\ebm\)

12.

\(\bbm -6\amp 0\\ -10\amp 9\ebm\)

13.

A matrix \(A\) that is skew symmetric.

14.

\(I_4\)

15.

\(I_n\)

Exercise Group.

Verify Theorem 6.2.3 by:

Showing that \(\tr(A)+\tr(B) = \tr(A+B)\)

Showing that \(\tr(AB) = \tr(BA)\)

16.

\(A = \bbm -8\amp -10\amp 10\\ 10\amp 5\amp -6\\ -10\amp 1\amp 3\ebm\text{,}\) \(B = \bbm -10\amp -4\amp -3\\ -4\amp -5\amp 4\\ 3\amp 7\amp 3\ebm\)

17.

\(A = \bbm -10\amp 7\amp 5\\ 7\amp 7\amp -5\\ 8\amp -9\amp 2\ebm\text{,}\) \(B = \bbm -3\amp -4\amp 9\\ 4\amp -1\amp -9\\ -7\amp -8\amp 10\ebm\)

18.

\(A = \bbm 0\amp -8\\ 1\amp 8\ebm\text{,}\) \(B = \bbm -4\amp 5\\ -4\amp 2\ebm\)

19.

\(A = \bbm 1\amp -1\\ 9\amp -6\ebm\text{,}\) \(B = \bbm -1\amp 0\\ -6\amp 3\ebm\)
