Let \(A\) be an \(m\times n\) matrix. The transpose of \(A\text{,}\) denoted \(A^T\text{,}\) is the \(n\times m\) matrix whose columns are the respective rows of \(A\text{.}\)
If we write \(A=[a_{ij}]\) to emphasize the entries of \(A\text{,}\) then the transpose of \(A\) is the matrix \(A^T = [a^T_{ij}]\) where \(a^T_{ij} = a_{ji}\text{;}\) that is, the \((i,j)\)-entry of \(A^T\) is the \((j,i)\)-entry of \(A\text{.}\) Examples will make this definition clear.
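For instance, writing a \(2\times 3\) matrix with generic entries (the letters below are just placeholders), the rows become the columns of the \(3\times 2\) transpose:
\begin{equation*}
\begin{bmatrix} a &amp; b &amp; c \\ d &amp; e &amp; f \end{bmatrix}^T = \begin{bmatrix} a &amp; d \\ b &amp; e \\ c &amp; f \end{bmatrix}\text{.}
\end{equation*}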
Note that \(A\) is a \(2\times 3 \) matrix, so \(A^T\) will be a \(3 \times 2\) matrix. By the definition, the first column of \(A^T\) is the first row of \(A\text{;}\) the second column of \(A^T\) is the second row of \(A\text{.}\) Therefore,
We find each transpose using the definition without explanation. Make note of the dimensions of the original matrix and the dimensions of its transpose.
Notice that with matrix \(B\text{,}\) when we took the transpose, the diagonal did not change. We can see what the diagonal is below where we rewrite \(B\) and \(B^T\) with the diagonal in bold. We'll follow this by a definition of what we mean by "the diagonal of a matrix," along with a few other related definitions.
The diagonals of \(A\) and \(A^T\) are the same, consisting of the entries 1, 4 and 6. The diagonals of \(B\) and \(B^T\) are also the same, consisting of the entries 3, 7 and \(-1\text{.}\) Finally, the diagonals of \(C\) and \(C^T\) are the same, consisting of the entries 1, 4 and 6.
The matrix \(B\) is diagonal. By their definitions, we can also see that \(B\) is both upper and lower triangular. Likewise, \(I_4\) is diagonal, as well as upper and lower triangular.
Make note of the definitions of diagonal and triangular matrices. We specify that a diagonal matrix must be square, but triangular matrices don't have to be. ("Most" of the time, however, the ones we study are.) Also, as we mentioned before in the example, by definition a diagonal matrix is also both upper and lower triangular. Finally, notice that by definition, the transpose of an upper triangular matrix is a lower triangular matrix, and vice-versa.
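For instance, with entries chosen here only for illustration, the following matrices are, respectively, diagonal, upper triangular and lower triangular:
\begin{equation*}
\begin{bmatrix} 2 &amp; 0 &amp; 0 \\ 0 &amp; 5 &amp; 0 \\ 0 &amp; 0 &amp; 1 \end{bmatrix},\qquad
\begin{bmatrix} 2 &amp; 3 &amp; 4 \\ 0 &amp; 5 &amp; 7 \\ 0 &amp; 0 &amp; 1 \end{bmatrix},\qquad
\begin{bmatrix} 2 &amp; 0 &amp; 0 \\ 3 &amp; 5 &amp; 0 \\ 4 &amp; 7 &amp; 1 \end{bmatrix}\text{.}
\end{equation*}
The first matrix, being diagonal, is also both upper and lower triangular; notice too that the transpose of the second matrix is the third.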
There are many questions to probe concerning the transpose operation. The first set of questions we'll investigate involves the matrix arithmetic we learned in the last chapter. We do this investigation by way of examples, and then summarize what we have learned at the end.
It looks like "the sum of the transposes is the transpose of the sum." (This is kind of fun to say, especially when said fast. Regardless of how fast we say it, we should think about this statement. The "is" represents "equals." The stuff before "is" equals the stuff afterwards.) This should lead us to wonder how the transpose works with multiplication.
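Before we do, here is a quick entry-by-entry check of why the sum rule holds in general, not just in the example above. If \(A\) and \(B\) have the same size, then
\begin{equation*}
\big((A+B)^T\big)_{ij} = (A+B)_{ji} = a_{ji} + b_{ji} = \big(A^T\big)_{ij} + \big(B^T\big)_{ij}\text{,}
\end{equation*}
so the \((i,j)\)-entries of \((A+B)^T\) and \(A^T+B^T\) agree.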
We may have suspected that \((AB)^T = A^TB^T\text{.}\) We saw that this wasn't the case; not only were the two not equal, the second product wasn't even defined! Oddly enough, though, we saw that \((AB)^T = B^TA^T\text{.}\) (Then again, maybe this isn't all that "odd." It is reminiscent of the fact that, when invertible, \((AB)^{-1} = B^{-1}A^{-1}\text{.}\)) To help understand why this is true, look back at the work above and confirm the steps of each multiplication.
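Alternatively, we can track a single entry in general. If \(A\) is \(m\times n\) and \(B\) is \(n\times p\text{,}\) then
\begin{equation*}
\big((AB)^T\big)_{ij} = (AB)_{ji} = \sum_{k=1}^{n} a_{jk}b_{ki} = \sum_{k=1}^{n} \big(B^T\big)_{ik}\big(A^T\big)_{kj} = \big(B^TA^T\big)_{ij}\text{.}
\end{equation*}
Note that \(A^TB^T\) may not even be defined, since \(A^T\) is \(n\times m\) and \(B^T\) is \(p\times n\text{.}\)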
It seems that "the inverse of the transpose is the transpose of the inverse." (Again, we should think about this statement. The part before "is" states that we take the transpose of a matrix, then find the inverse. The part after "is" states that we find the inverse of the matrix, then take the transpose. Since these two statements are linked by an "is," they are equal.)
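Here is one way to see the connection, using the rule for the transpose of a product: if \(A\) is invertible, then
\begin{equation*}
A^T\big(A^{-1}\big)^T = \big(A^{-1}A\big)^T = I^T = I\text{,}
\end{equation*}
which says that \(\big(A^{-1}\big)^T\) is the inverse of \(A^T\text{;}\) that is, \(\big(A^T\big)^{-1} = \big(A^{-1}\big)^T\text{.}\)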
We have just looked at some examples of how the transpose operation interacts with matrix arithmetic operations. (These examples don't prove anything, other than that it worked in these specific examples.) We now give a theorem that tells us that what we saw wasn't a coincidence, but rather is always true.
We included in the theorem two ideas we haven't discussed yet. First, that \((kA)^T = kA^T\text{.}\) This is probably obvious: it doesn't matter whether you multiply by the scalar before or after taking the transpose; the result is the same.
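Entry by entry, this just says
\begin{equation*}
\big((kA)^T\big)_{ij} = (kA)_{ji} = k\,a_{ji} = k\big(A^T\big)_{ij}\text{.}
\end{equation*}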
The second "new" item is that \((A^T)^T = A\text{.}\) That is, if we take the transpose of a matrix, then take its transpose again, what do we have? The original matrix.
Now that we know some properties of the transpose operation, we are tempted to play around with it and see what happens. For instance, if \(A\) is an \(m\times n\) matrix, we know that \(A^T\) is an \(n\times m\) matrix. So no matter what matrix \(A\) we start with, we can always perform the multiplication \(AA^T\) (and also \(A^TA\)) and the result is a square matrix!
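A quick dimension count confirms this:
\begin{equation*}
\underbrace{A}_{m\times n}\,\underbrace{A^T}_{n\times m}\ \text{is}\ m\times m\text{,}\qquad \underbrace{A^T}_{n\times m}\,\underbrace{A}_{m\times n}\ \text{is}\ n\times n\text{.}
\end{equation*}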
Another thing to ask ourselves as we "play around" with the transpose: suppose \(A\) is a square matrix. Is there anything special about \(A+A^T\text{?}\) The following example has us try out these ideas.
Let's look at the matrices we've formed in this example. First, consider \(AA^T\text{.}\) Something seems to be nice about this matrix; look at the location of the 6's, the 5's and the 3's. More precisely, let's look at the transpose of \(AA^T\text{.}\) We should notice that if we take the transpose of this matrix, we have the very same matrix. That is,
Look at the next part of the example; what do we notice about \(A+A^T\text{?}\) We should see that it, too, is symmetric. Finally, consider the last part of the example: do we notice anything about \(A-A^T\text{?}\)
We should immediately notice that it is not symmetric, although it does seem "close." Instead of it being equal to its transpose, we notice that this matrix is the opposite of its transpose. We call this type of matrix skew symmetric. (Some mathematicians use the term antisymmetric.) We formally define these matrices here.
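For instance (the entries below are chosen only for illustration), the matrix
\begin{equation*}
\begin{bmatrix} 0 &amp; 2 &amp; -5 \\ -2 &amp; 0 &amp; 3 \\ 5 &amp; -3 &amp; 0 \end{bmatrix}
\end{equation*}
is skew symmetric: its transpose is the opposite of the matrix itself. Notice that its diagonal entries are 0; they must be, since each diagonal entry has to equal its own opposite.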
So why was \(AA^T\) symmetric in our previous example? Did we just luck out? (Of course not.) Let's take the transpose of \(AA^T\) and see what happens.
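Using the properties of the transpose from above,
\begin{equation*}
\big(AA^T\big)^T = \big(A^T\big)^T A^T = AA^T\text{,}
\end{equation*}
so \(AA^T\) is equal to its own transpose; it is symmetric.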
We have just proved that no matter what matrix \(A\) we start with, the matrix \(AA^T\) will be symmetric. Nothing in our string of equalities even demanded that \(A\) be a square matrix; it is always true.
We can do a similar proof to show that as long as \(A\) is square, \(A+A^T\) is a symmetric matrix. (Why do we say that \(A\) has to be square?) We'll instead show here that if \(A\) is a square matrix, then \(A-A^T\) is skew symmetric.
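Again using the properties of the transpose,
\begin{equation*}
\big(A-A^T\big)^T = A^T - \big(A^T\big)^T = A^T - A = -\big(A-A^T\big)\text{,}
\end{equation*}
so \(A-A^T\) is the opposite of its transpose; that is, it is skew symmetric.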
We'll take what we learned from Example 6.1.10 and put it in a box. (We've already proved most of this is true; the rest we leave for the Exercises.)
There are two answers, and each addresses both of these questions. First, we are interested in the transpose of a matrix and symmetric matrices because they are interesting. One particularly interesting thing about symmetric and skew symmetric matrices is this: consider the sum of \((A+A^T)\) and \((A-A^T)\text{:}\)
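\begin{equation*}
\big(A+A^T\big) + \big(A-A^T\big) = A + A^T + A - A^T = 2A\text{.}
\end{equation*}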
This gives us an idea: if we were to multiply both sides of this equation by \(\frac12\text{,}\) then the right hand side would just be \(A\text{.}\) This means that
\begin{equation*}
A = \underbrace{\frac12(A+A^T)}_{\text{symmetric}}\ +\ \underbrace{\frac12(A-A^T)}_{\text{skew symmetric}}\text{.}
\end{equation*}
That is, any square matrix \(A\) can be written as the sum of a symmetric matrix and a skew symmetric matrix. That's interesting.
The second reason we care about them is that they are very useful and important in various areas of mathematics. The transpose of a matrix turns out to be an important operation; symmetric matrices have many nice properties that make solving certain types of problems possible.
Most of this text focuses on the preliminaries of matrix algebra, and the actual uses are beyond our current scope. One easy-to-describe example is curve fitting. Suppose we are given a large set of data points that, when plotted, look roughly quadratic. How do we find the quadratic that "best fits" this data? The solution can be found using matrix algebra, specifically a matrix called the pseudoinverse. If \(A\) is a matrix, the pseudoinverse of \(A\) is the matrix \(A^\dagger = (A^TA)^{-1}A^T\) (assuming that the inverse exists). We aren't going to worry about what all the above means; just notice that it has a cool-sounding name and the transpose appears twice.
In the next section we'll learn about the trace, another matrix operation that is relatively simple to compute but can lead to some deep results.