In the previous section, we learned about an operation we can perform on matrices, namely the transpose. Given a matrix \(\tta\text{,}\) we can “find the transpose of \(\tta\text{,}\)” which is another matrix. In this section we learn about a new operation called the trace. It is a different type of operation than the transpose. Given a matrix \(\tta\text{,}\) we can “find the trace of \(\tta\text{,}\)” which is not a matrix but rather a number. We formally define it here.
Definition 6.2.1. The Trace.
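Let \(\tta\) be an \(n\times n\) matrix. The trace of \(\tta\text{,}\) denoted \(\tr(\tta)\text{,}\) is the sum of the diagonal elements of \(\tta\text{.}\) That is, \(\tr(\tta) = a_{11} + a_{22} + \cdots + a_{nn}\text{,}\) where \(a_{ii}\) denotes the entry in row \(i\) and column \(i\) of \(\tta\text{.}\)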
To find the trace of \(\tta\text{,}\) note that the diagonal elements of \(\tta\) are 1 and 4. Therefore, \(\tr(\tta) = 1+4 = 5\text{.}\)
We see that the diagonal elements of \(\ttb\) are 1, 8 and -5, so \(\tr(\ttb) = 1+8-5 = 4\text{.}\)
The matrix \(\ttc\) is not a square matrix, and our definition states that we must start with a square matrix. Therefore \(\tr(\ttc)\) is not defined.
Finally, the diagonal of \(\tti_4\) consists of four 1s. Therefore \(\tr(\tti_4) = 4\text{.}\)
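The definition translates directly into a few lines of code. The following is a minimal sketch in plain Python, not part of the original text. The function name `trace` and the matrices below are assumptions made for illustration: only the diagonals (1, 4 and 1, 8, -5) and the fact that \(\ttc\) is not square come from the example above, while the off-diagonal entries are made up.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix.

    The matrix is given as a list of rows. A ValueError is raised for a
    non-square input, mirroring the requirement in the definition.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


# Hypothetical matrices: only the diagonals are taken from the text;
# the off-diagonal entries are invented for this sketch.
A = [[1, 2],
     [3, 4]]
B = [[1, 2, 0],
     [3, 8, 1],
     [-2, 7, -5]]
C = [[1, 2, 3],
     [4, 5, 6]]  # 2 x 3, so not square
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(trace(A))   # 5
print(trace(B))   # 4
print(trace(I4))  # 4

try:
    trace(C)
except ValueError as err:
    print("tr(C) is not defined:", err)
```

Raising an error for the non-square case, rather than summing whatever diagonal entries happen to exist, keeps the code faithful to the definition.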
Now that we have defined the trace of a matrix, we should think like mathematicians and ask some questions. The first questions that come to mind should be along the lines of “How does the trace work with other matrix operations?” (Recall that we asked a similar question once we learned about the transpose.) We should think about how the trace works with matrix addition, scalar multiplication, matrix multiplication, matrix inverses, and the transpose.
We’ll give a theorem that will formally tell us what is true in a moment, but first let’s play with two sample matrices and see if we can see what will happen. Let
So we notice that \(\tr(\tta+\ttb) = \tr(\tta) + \tr(\ttb)\text{.}\) This probably isn’t a coincidence.
How does the trace work with scalar multiplication? If we multiply \(\tta\) by 4, then the diagonal elements will be 8, 0 and 12, so \(\tr(4\tta) = 20\text{.}\) Is it a coincidence that this is 4 times the trace of \(\tta\text{?}\)
Let’s move on to matrix multiplication. How will the trace of \(\tta\ttb\) relate to the traces of \(\tta\) and \(\ttb\text{?}\) Let’s see:
It isn’t exactly clear what the relationship is among \(\tr(\tta)\text{,}\) \(\tr(\ttb)\) and \(\tr(\tta\ttb)\text{.}\) Before moving on, let’s find \(\tr(\ttb\tta)\text{:}\)
Therefore \(\tr(\ttai) = 6/17\text{.}\) Again, the relationship isn’t clear.
Finally, let’s see how the trace is related to the transpose. We actually don’t have to formally compute anything. Recall from the previous section that the diagonals of \(\tta\) and \(\ttat\) are identical; therefore, \(\tr(\tta) = \tr(\ttat)\text{.}\) That, we know for sure, isn’t a coincidence.
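The observations about addition, scalar multiplication, and the transpose can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names (`trace`, `add`, `scale`, `transpose`) are hypothetical, and the two matrices are made up, except that the diagonal of `A` is 2, 0, 3 as implied by the scalar multiplication discussion above. A check like this suggests a pattern but does not prove it.

```python
def trace(m):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    return sum(m[i][i] for i in range(len(m)))


def add(x, y):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(k, x):
    """Multiply every entry of x by the scalar k."""
    return [[k * a for a in row] for row in x]


def transpose(x):
    """Swap rows and columns."""
    return [list(col) for col in zip(*x)]


# Hypothetical sample matrices; only the diagonal of A (2, 0, 3) is
# implied by the text, everything else is invented for this sketch.
A = [[2, 1, 3],
     [2, 0, -1],
     [3, -1, 3]]
B = [[2, 0, 1],
     [-1, 2, 0],
     [0, 2, -1]]

print(trace(A), trace(B))   # 5 3
print(trace(add(A, B)))     # 8, which is 5 + 3
print(trace(scale(4, A)))   # 20, which is 4 * 5
print(trace(transpose(A)))  # 5, the same as trace(A)
```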
We now formally state what equalities are true when considering the interaction of the trace with other matrix operations.
Theorem 6.2.3. Properties of the Matrix Trace.
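Let \(\tta\) and \(\ttb\) be \(n\times n\) matrices and let \(k\) be a scalar. Then:
1. \(\tr(\tta+\ttb) = \tr(\tta)+\tr(\ttb)\)
2. \(\tr(k\tta) = k\cdot\tr(\tta)\)
3. \(\tr(\tta\ttb) = \tr(\ttb\tta)\)
4. \(\tr(\ttat) = \tr(\tta)\)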
One of the key things to note here is what this theorem does not say. It does not tell us how to compute \(\tr(\tta\ttb)\) from \(\tr(\tta)\) and \(\tr(\ttb)\) alone, and it says nothing about how the trace relates to inverses. The reason for the silence in these areas is that no such general relationship exists: two pairs of matrices can have the same traces while their products have different traces, and two matrices with the same trace can have inverses with different traces.
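A small experiment makes the missing relationships concrete. The sketch below is an illustration, not part of the original text. All matrices and helper names (`trace`, `matmul`, `inverse_2x2`) are assumptions chosen for the demonstration. It also checks, on one pair of matrices, the standard fact that \(\tr(\tta\ttb) = \tr(\ttb\tta)\) even when \(\tta\ttb \neq \ttb\tta\text{.}\)

```python
from fractions import Fraction


def trace(m):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    return sum(m[i][i] for i in range(len(m)))


def matmul(x, y):
    """Matrix product of x (m x n) and y (n x p)."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def inverse_2x2(m):
    """Exact inverse of an invertible 2 x 2 matrix, using fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical matrices; each of I2, P and Q has trace 2.
I2 = [[1, 0], [0, 1]]
P = [[2, 0], [0, 0]]
Q = [[0, 0], [0, 2]]
print(trace(matmul(I2, I2)))  # 2
print(trace(matmul(P, Q)))    # 0, although the factors again have trace 2

# Hypothetical matrices D1 and D2, both with trace 4.
D1 = [[1, 0], [0, 3]]
D2 = [[2, 0], [0, 2]]
print(trace(inverse_2x2(D1)))  # 4/3
print(trace(inverse_2x2(D2)))  # 1

# The products XY and YX differ, yet their traces agree.
X = [[1, 2], [3, 4]]
Y = [[0, 1], [5, -2]]
print(matmul(X, Y) == matmul(Y, X))                # False
print(trace(matmul(X, Y)), trace(matmul(Y, X)))    # 5 5
```

Exact fractions are used for the inverses so that the comparison of traces is not blurred by floating-point rounding.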
We end this section by again wondering why anyone would care about the trace of a matrix. One reason mathematicians are interested in it is that it can give a measurement of the “size” of a matrix.
These matrices have the same trace, yet \(\ttb\) clearly has bigger elements in it. So how can we use the trace to determine a “size” of these matrices? We can consider \(\tr(\ttat\tta)\) and \(\tr(\ttbt\ttb)\text{.}\)
Our concern is not how to interpret what this “size” measurement means, but rather to demonstrate that the trace (along with the transpose) can be used to give (perhaps useful) information about a matrix.
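To see why \(\tr(\ttat\tta)\) behaves like a “size,” it helps to notice that the \(j\)th diagonal entry of \(\ttat\tta\) is the sum of the squares of the entries in column \(j\) of \(\tta\text{,}\) so \(\tr(\ttat\tta)\) is the sum of the squares of all entries of \(\tta\text{.}\) The sketch below checks this on two made-up matrices that share the same trace. The matrices and the helper name `size` are assumptions for illustration, not taken from the text.

```python
def trace(m):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    return sum(m[i][i] for i in range(len(m)))


def transpose(x):
    """Swap rows and columns."""
    return [list(col) for col in zip(*x)]


def matmul(x, y):
    """Matrix product of x (m x n) and y (n x p)."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def size(m):
    """The 'size' measurement tr(M^T M) discussed in the text."""
    return trace(matmul(transpose(m), m))


# Hypothetical matrices with the same trace but very different entries.
A = [[1, 2],
     [3, 4]]
B = [[1, 20],
     [-30, 4]]

print(trace(A), trace(B))  # 5 5
print(size(A))             # 30   = 1 + 4 + 9 + 16
print(size(B))             # 1317 = 1 + 400 + 900 + 16

# The same numbers obtained as a plain sum of squared entries.
print(sum(a * a for row in A for a in row))  # 30
print(sum(b * b for row in B for b in row))  # 1317
```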