Section 4.1 Matrix Addition and Scalar Multiplication
As mentioned above, a matrix is a construction that allows us to organize information in a tabular form. For example, we may be interested in the following (made up) data involving crop yields on several Southern Alberta farms, given in tabular form:
Someone in charge of compiling data on farms and crops probably already has a program or spreadsheet set up with all of the farms and crops pre-defined; what they are interested in are the numbers giving the crop yields. Thus, when they enter their data into the computer, they are probably more interested in the array
As long as we're consistent and always assign each farm to the same row, and each crop to the same column, we can dispense with the labels and work directly with the data.
The array above is our first example of a matrix, which we now define. The definition of matrix is remarkable only in how unremarkable it seems: it is simply a way of organizing information (usually numbers) into an array.
The horizontal lines of numbers form rows and the vertical lines of numbers form columns. A matrix with \(m\) rows and \(n\) columns is said to be an \(m\times n\) matrix ("an \(m\) by \(n\) matrix", or a matrix of size \(m\times n\)).
That is, \(a_{32}\) means "the number in the third row and second column." To save space, we will sometimes use the shorthand notation \(A = [a_{ij}]\) to denote a matrix \(A\) with entries \(a_{ij}\text{.}\) If we need to specify the size, we can also write \(A=[a_{ij}]_{m\times n}\text{.}\)
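For instance, in the (hypothetical) \(3\times 2\) matrix
\begin{equation*}
A = \bbm 1\amp 2\\3\amp 4\\5\amp 6\ebm\text{,}
\end{equation*}
we have \(a_{32} = 6\text{:}\) the entry in the third row and second column.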
In particular, we can obtain row or column vectors by isolating any row or column of a given \(m\times n\) matrix. For example, given our matrix of crop data in Equation (4.1.1), we might be interested only in Farm B, in which case we would want the row vector
Continuing with our farming example, suppose that in addition to the matrix \(A\) of 2014 crop yields above we also have data for 2015 crop yields given by
We might be interested in quantities such as the total crop yields over two years. If we arrange these totals into a matrix \(T\text{,}\) it seems like it should be reasonable to define matrix addition in such a way that we can write
\begin{equation*}
T = A + B\text{.}
\end{equation*}
This leads to two questions. First, how do we define matrix addition in order to ensure this outcome? Second, and perhaps more fundamentally, what do we mean by "\(=\)" in the context of matrices? Let us tackle the second question first.
Now we move on to describing how to add two matrices together. To start off, take a wild stab: how do you think we should add our matrices of crop data above? Well, if we want the sum to represent the total yields for each crop, then it stands to reason that to add the two matrices, we add together each of the corresponding crop yields within them:
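This entrywise rule is easy to sketch in code. The following is a minimal illustration, assuming each matrix is stored as a list of rows; the yield numbers used here are made up rather than taken from the table above.

```python
# A minimal sketch of entrywise matrix addition, assuming each matrix
# is stored as a list of rows; the yield numbers below are made up.
def matrix_add(A, B):
    # Addition is only defined when the two matrices have the same size.
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    # Add corresponding entries.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[48, 10], [20, 30]]  # hypothetical 2014 yields
B = [[41, 12], [22, 28]]  # hypothetical 2015 yields
T = matrix_add(A, B)
print(T)  # [[89, 22], [42, 58]]
```

Note that the size check comes first: as we will see below, the sum of two matrices of different sizes is simply not defined.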
For another example, suppose we wanted to know the total production for each farm in 2014, assuming that our matrices represent all of the crops each farm produces. This would be obtained by simply calculating the total for each row in the matrix \(A\text{.}\) Another way to accomplish the same task is as follows: let
The column vector \(Y\) then contains the total yields for each farm. Notice that although we only defined the sum of two matrices in Definition 4.1.5, it makes sense to add any number of matrices, and there is no need to add parentheses.
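For example, with three small (made-up) \(2\times 1\) column vectors,
\begin{equation*}
\bbm 1\\2\ebm + \bbm 3\\4\ebm + \bbm 5\\6\ebm = \bbm 1+3+5\\2+4+6\ebm = \bbm 9\\12\ebm\text{,}
\end{equation*}
and the result is the same no matter how we group the terms.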
It stands to reason that scalar multiplication should be defined in exactly the same way for row and column vectors; after all, in most cases these are just different ways of writing down the same mathematical object. Thus, in order to multiply a row or column vector by a scalar, we should multiply each entry in the vector by that scalar. From here, it's not too much of a stretch to conclude that the same definition is reasonable for matrices in general.
Let \(A = [a_{ij}]\) be an \(m\times n\) matrix and let \(k\) be a scalar. The scalar multiple of \(A\) by \(k\text{,}\) denoted \(kA\text{,}\) is defined by
Referring one last time to our farming data, we could imagine that a fertilizer company is advertising a new product they claim will increase crop yields by 30%. Increasing the yield of each crop by 30% amounts to multiplying each of the entries in the matrices \(A\) or \(B\) by a factor of 1.3; according to Definition 4.1.6, this is the same as forming the scalar multiples \(1.3A\) and \(1.3B\text{.}\)
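A scalar multiple is just as easy to sketch in code as a sum. Here is a minimal illustration, again assuming each matrix is stored as a list of rows, with made-up yield numbers:

```python
# A minimal sketch of scalar multiplication, assuming each matrix is
# stored as a list of rows; the yield numbers below are made up.
def scalar_mul(k, A):
    # Multiply every entry of A by the scalar k.
    return [[k * a for a in row] for row in A]

A = [[50, 10], [20, 30]]      # hypothetical crop yields
boosted = scalar_mul(1.3, A)  # a 30% increase in every yield
```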
Since we have two yearsβ worth of crop data, we could also ask for the average yield for each crop on each farm. For example, Farm A produced 48 tonnes of corn in 2014, and 41 tonnes of corn in 2015. The two-year average for corn on Farm A is thus
Notice that we can obtain the average for each entry by dividing each entry in \(T=A+B\) by 2, which is the same thing as multiplying by \(\frac{1}{2}\text{.}\) Our matrix of averages is
with similar considerations for the other entries. Alternatively, we could have obtained the average by first multiplying each of \(A\) and \(B\) by the scalar \(\frac{1}{2}\text{,}\) and then adding the results. That is,
Expressions such as \(\frac12 A+\frac12 B\) that use both addition and scalar multiplication together occur frequently in Linear Algebra, and are known as linear combinations. In general, a linear combination can be formed from any number of matrices, as long as they're all of the same size.
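For example, with two made-up \(2\times 2\) matrices,
\begin{equation*}
3\bbm 1\amp 0\\2\amp 1\ebm - 2\bbm 0\amp 1\\1\amp 1\ebm = \bbm 3\amp 0\\6\amp 3\ebm + \bbm 0\amp -2\\-2\amp -2\ebm = \bbm 3\amp -2\\4\amp 1\ebm
\end{equation*}
is a linear combination with coefficients \(3\) and \(-2\text{.}\)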
\(A + C\) is not defined. If we look at our definition of matrix addition, we see that the two matrices need to be the same size. Since \(A\) and \(C\) have different sizes, we don't even attempt the addition; we simply say that the sum is not defined.
Our example raised a few interesting points. Notice how \(A + B = B + A\text{.}\) We probably aren't surprised by this, since we know that when dealing with numbers, \(a+b = b+a\text{.}\) Also, notice that \(5A+5B = 5(A+B)\text{.}\) In our example, we were careful to compute each of these expressions following the proper order of operations; knowing these are equal allows us to compute similar expressions in the most convenient way.
In fact, this is a special matrix. We define \(\tto\text{,}\) which we read as "the zero matrix," to be the matrix of all zeros. We should be careful; this previous "definition" is a bit ambiguous, for we have not stated what size the zero matrix should be. Is \(\bbm 0\amp 0\\0\amp 0\\ \ebm\) the zero matrix? How about \(\bbm 0\amp 0\ebm\text{?}\)
Let's not get bogged down in semantics. If we ever see \(\tto\) in an expression, we will usually know right away what size \(\tto\) should be; it will be the size that allows the expression to make sense. If \(A\) is a \(3\times 5\) matrix, and we write \(A + \tto\text{,}\) we'll simply assume that \(\tto\) is also a \(3\times 5\) matrix. If we are ever in doubt, we can add a subscript; for instance, \(\tto_{2\times 7}\) is the \(2\times7\) matrix of all zeros.
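The same convention shows up in code: we build a zero matrix of whatever size the expression at hand requires. A small sketch, again assuming matrices are stored as lists of rows:

```python
# A small sketch: the zero matrix acts as an additive identity.
def zero_matrix(m, n):
    # Build an m x n matrix of all zeros.
    return [[0] * n for _ in range(m)]

A = [[1, 2, 3], [4, 5, 6]]  # a 2 x 3 example
Z = zero_matrix(2, 3)       # the matching 2 x 3 zero matrix
# Adding Z entry by entry leaves A unchanged.
total = [[a + z for a, z in zip(ra, rz)] for ra, rz in zip(A, Z)]
print(total == A)  # True
```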
Be sure that this last property makes sense; it says that if we multiply any matrix by the number 0, the result is the zero matrix, or \(\tto\text{.}\) (You now have more than one kind of zero to keep track of!)
It's important to understand that since matrix addition and scalar multiplication are defined in terms of the entries of our matrices, the properties in Theorem 4.1.10 follow directly from the properties of real number arithmetic in Section 1.2. For example, to prove item 1 above, let \(A=[a_{ij}]\) and \(B=[b_{ij}]\) be \(m\times n\) matrices. We then have
\begin{equation*}
A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = [b_{ij}] + [a_{ij}] = B + A\text{,}
\end{equation*}
where the middle equality holds because addition of real numbers is commutative.
Our matrix properties identified \(\tto\) as the Additive Identity; i.e., if you add \(\tto\) to any matrix \(A\text{,}\) you simply get \(A\text{.}\) This is similar in notion to the fact that for all numbers \(a\text{,}\) \(a+0 = a\text{.}\) A Multiplicative Identity would be a matrix \(I\) where \(I\times A = A\) for all matrices \(A\text{.}\) (What would such a matrix look like? A matrix of all 1s, perhaps?) However, in order for this to make sense, we'll need to learn to multiply matrices together, which we'll do in the next section.