Matrices#



Definition of a Matrix#

A matrix is a rectangular array of numbers.

An \(m \times n\) matrix contains \(m\) rows and \(n\) columns.

Matrix Notation#

Given an \(m \times n\) matrix \(\mathbf{A}\), the notation \(A_{ij}\) denotes the entry of \(\mathbf{A}\) located at the intersection of the \(i\)-th row and the \(j\)-th column.

\( \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \\ \end {bmatrix} \)

For example, if

\( \mathbf{A} = \begin{bmatrix*}[r] 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ \end {bmatrix*} \)

then

\(A_{12} = 2, A_{34} = 12, A_{31} = 9\)
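The same indexing convention can be sketched in plain Python, storing the matrix as a list of rows. The `entry` helper is purely illustrative; it translates the 1-based \(A_{ij}\) notation used here into Python's 0-based list indices.

```python
# A 3x4 matrix stored as a list of rows.
A = [
    [1,  2,  3,  4],
    [5,  6,  7,  8],
    [9, 10, 11, 12],
]

def entry(M, i, j):
    """Return M_ij using the 1-based row/column notation from the text."""
    return M[i - 1][j - 1]

print(entry(A, 1, 2))  # 2
print(entry(A, 3, 4))  # 12
print(entry(A, 3, 1))  # 9
```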

Diagonal entries of a matrix#

Entries \(A_{11}, A_{22}, A_{33}, \dots\) are called the diagonal entries of the matrix.

Definition of a Zero Matrix#

A matrix with all entries equal to zero is called a zero matrix.

Definition of a Square Matrix#

A matrix having the same number of rows and columns is called a square matrix.

Definition of an Identity Matrix#

An identity matrix \(\mathbf{I_n}\) is the square matrix with \(n\) rows and \(n\) columns whose diagonal entries are all equal to \(1\) and whose off-diagonal entries are all equal to \(0\).
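As a quick sketch in plain Python (the `identity` name is illustrative, not from any library), \(\mathbf{I_n}\) can be built directly from the definition: \(1\) on the diagonal (where the row index equals the column index), \(0\) elsewhere.

```python
def identity(n):
    """Build the n x n identity matrix I_n as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```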


Matrix Addition#

Let \(\mathbf{A}\) and \(\mathbf{B}\) be two \(m \times n\) matrices.

Then \(\mathbf{A + B}\) is the \(m \times n\) matrix with entries \((A + B)_{ij} = A_{ij} + B_{ij}\).

The addition of matrices of unequal size is not defined.

Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 2 & 1 & 0 & 3 \\ -1 & 0 & 2 & 4 \\ 4 & -2 & 7 & 0 \\ \end {bmatrix*} \)

\( \mathbf{B} = \begin{bmatrix*}[r] -4 & 3 & 5 & 1 \\ 2 & 2 & 0 & -1 \\ 3 & 2 & -4 & 5 \\ \end {bmatrix*} \)

\( \mathbf{A + B} = \begin{bmatrix*}[r] 2 + (-4) & 1 + 3 & 0 + 5 & 3 + 1 \\ (-1) + 2 & 0 + 2 & 2 + 0 & 4 + (-1) \\ 4 + 3 & (-2) + 2 & 7 + (-4) & 0 + 5 \\ \end {bmatrix*} = \begin{bmatrix*}[r] -2 & 4 & 5 & 4 \\ 1 & 2 & 2 & 3 \\ 7 & 0 & 3 & 5 \\ \end {bmatrix*} \)
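The entrywise rule can be sketched in plain Python (the `mat_add` name is illustrative); run on the matrices above, it reproduces the sum just computed.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("addition of matrices of unequal size is not defined")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 1, 0, 3], [-1, 0, 2, 4], [4, -2, 7, 0]]
B = [[-4, 3, 5, 1], [2, 2, 0, -1], [3, 2, -4, 5]]
print(mat_add(A, B))
# [[-2, 4, 5, 4], [1, 2, 2, 3], [7, 0, 3, 5]]
```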


Scalar Multiplication#

Let \(\mathbf{A}\) be an \(m \times n\) matrix and let \(r\) be a scalar.

Then \(r\mathbf{A}\) is the \(m \times n\) matrix with entries \((rA)_{ij} = r \cdot A_{ij}\).

Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 2 & 1 & 0 & 3 \\ -1 & 0 & 2 & 4 \\ 4 & -2 & 7 & 0 \\ \end {bmatrix*} \)

\( 2\mathbf{A} = \begin{bmatrix*}[r] 2 \cdot 2 & 2 \cdot 1 & 2 \cdot 0 & 2 \cdot 3 \\ 2 \cdot -1 & 2 \cdot 0 & 2 \cdot 2 & 2 \cdot 4 \\ 2 \cdot 4 & 2 \cdot -2 & 2 \cdot 7 & 2 \cdot 0 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 4 & 2 & 0 & 6 \\ -2 & 0 & 4 & 8 \\ 8 & -4 & 14 & 0 \\ \end {bmatrix*} \)
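In plain Python the entrywise rule is a one-liner (the `scalar_mul` name is illustrative); it reproduces the result above.

```python
def scalar_mul(r, A):
    """Multiply every entry of the matrix A by the scalar r."""
    return [[r * a for a in row] for row in A]

A = [[2, 1, 0, 3], [-1, 0, 2, 4], [4, -2, 7, 0]]
print(scalar_mul(2, A))
# [[4, 2, 0, 6], [-2, 0, 4, 8], [8, -4, 14, 0]]
```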


Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 2 & 1 & 0 & 3 \\ -1 & 0 & 2 & 4 \\ 4 & -2 & 7 & 0 \\ \end {bmatrix*} \)

\( \mathbf{B} = \begin{bmatrix*}[r] -4 & 3 & 5 & 1 \\ 2 & 2 & 0 & -1 \\ 3 & 2 & -4 & 5 \\ \end {bmatrix*} \)

Compute \(2\mathbf{A} - \mathbf{B}\).

\(2\mathbf{A} - \mathbf{B} = 2\mathbf{A} + (-1)\mathbf{B}\)

\( 2\mathbf{A} = \begin{bmatrix*}[r] 2 \cdot 2 & 2 \cdot 1 & 2 \cdot 0 & 2 \cdot 3 \\ 2 \cdot -1 & 2 \cdot 0 & 2 \cdot 2 & 2 \cdot 4 \\ 2 \cdot 4 & 2 \cdot -2 & 2 \cdot 7 & 2 \cdot 0 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 4 & 2 & 0 & 6 \\ -2 & 0 & 4 & 8 \\ 8 & -4 & 14 & 0 \\ \end {bmatrix*} \)

\( (-1)\mathbf{B} = \begin{bmatrix*}[r] -1 \cdot -4 & -1 \cdot 3 & -1 \cdot 5 & -1 \cdot 1 \\ -1 \cdot 2 & -1 \cdot 2 & -1 \cdot 0 & -1 \cdot -1 \\ -1 \cdot 3 & -1 \cdot 2 & -1 \cdot -4 & -1 \cdot 5 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 4 & -3 & -5 & -1 \\ -2 & -2 & 0 & 1 \\ -3 & -2 & 4 & -5 \\ \end {bmatrix*} \)

\( 2\mathbf{A} - \mathbf{B} = \begin{bmatrix*}[r] 4 & 2 & 0 & 6 \\ -2 & 0 & 4 & 8 \\ 8 & -4 & 14 & 0 \\ \end {bmatrix*} + \begin{bmatrix*}[r] 4 & -3 & -5 & -1 \\ -2 & -2 & 0 & 1 \\ -3 & -2 & 4 & -5 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 4 + 4 & 2 + (-3) & 0 + (-5) & 6 + (-1) \\ -2 + (-2) & 0 + (-2) & 4 + 0 & 8 + 1 \\ 8 + (-3) & -4 + (-2) & 14 + 4 & 0 + (-5) \\ \end {bmatrix*} = \begin{bmatrix*}[r] 8 & -1 & -5 & 5 \\ -4 & -2 & 4 & 9 \\ 5 & -6 & 18 & -5 \\ \end {bmatrix*} \)


Properties of matrix addition and scalar multiplication#

Let \(\mathbf{A, B, C}\) be \(m \times n\) matrices and let \(s, t\) be scalars.

Matrix addition is commutative.

\(\mathbf{A + B = B + A}\)

Matrix addition is associative.

\(\mathbf{(A + B) + C = A + (B + C)}\)

The zero matrix is the additive identity element.

\(\mathbf{A + 0 = A}\)

Scalar multiplication distributes over matrix addition.

\(s(\mathbf{A + B}) = s \mathbf{A} + s \mathbf{B}\)

Scalar multiplication distributes over the addition of scalars.

\((s + t)\mathbf{A} = s \mathbf{A} + t \mathbf{A}\)

Scalar multiplication is compatible with the multiplication of scalars.

\(s (t \mathbf{A}) = (st)\mathbf{A}\)


Matrix Transpose#

The transpose of an \(m \times n\) matrix \(\mathbf{A}\) is the \(n \times m\) matrix \(\mathbf{A}^\top\) such that

\((\mathbf{A}^\top)_{ij} = (\mathbf{A})_{ji}\)

The \(k\)-th row of \(\mathbf{A}^\top\) is the \(k\)-th column of \(\mathbf{A}\).

Example#

\( \mathbf{A} = \begin{bmatrix} \textcolor{red}{1} & \textcolor{orange}{0} & \textcolor{yellow}{2} \\ \textcolor{red}{3} & \textcolor{orange}{4} & \textcolor{yellow}{1} \\ \end {bmatrix} \)

\( \mathbf{A}^\top = \begin{bmatrix} \textcolor{red}{ 1} & \textcolor{red}{ 3} \\ \textcolor{orange}{0} & \textcolor{orange}{4} \\ \textcolor{yellow}{2} & \textcolor{yellow}{1} \\ \end {bmatrix} \)
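Because the \(k\)-th row of \(\mathbf{A}^\top\) is the \(k\)-th column of \(\mathbf{A}\), the transpose can be sketched in plain Python with the built-in `zip`, which regroups the rows into columns (the `transpose` name is illustrative).

```python
def transpose(A):
    """The k-th row of the result is the k-th column of A."""
    return [list(col) for col in zip(*A)]

A = [[1, 0, 2], [3, 4, 1]]
print(transpose(A))  # [[1, 3], [0, 4], [2, 1]]
```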


Matrix Multiplication#

The product \(\mathbf{AB}\) of two matrices \(\mathbf{A, B}\) is defined iff the number of columns of \(\mathbf{A}\) is equal to the number of rows of \(\mathbf{B}\).

If \(\mathbf{A}\) is an \(m \times n\) matrix and \(\mathbf{B}\) is an \(n \times p\) matrix then their product \(\mathbf{AB}\) is an \(m \times p\) matrix.

The entry \((\mathbf{AB})_{ij}\) located at the intersection of the \(i\)-th row and the \(j\)-th column of the matrix \(\mathbf{AB}\) is computed using the entries of the \(i\)-th row of \(\mathbf{A}\) and the entries of the \(j\)-th column of \(\mathbf{B}\) according to the rule

\((\mathbf{AB})_{ij} = \sum_{k = 1}^{n} A_{ik} \cdot B_{kj}\)

The entry of \(\mathbf{AB}\) located at the intersection of the \(i\)-th row and \(j\)-th column is the dot product of the \(i\)-th row of \(\mathbf{A}\) with the \(j\)-th column of \(\mathbf{B}\).

Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 1 & 0 & 2 \\ 3 & -5 & 1 \\ \end {bmatrix*} \)

\( \mathbf{B} = \begin{bmatrix*}[r] -1 & 2 \\ 2 & 0 \\ 1 & -3 \\ \end {bmatrix*} \)

Compute \(\mathbf{AB}\).

\( \begin{aligned} & \underset{\mathbf{B}}{ \begin{bmatrix*}[r] \textcolor{blue}{-1} & \textcolor{purple}{ 2} \\ \textcolor{blue}{ 2} & \textcolor{purple}{ 0} \\ \textcolor{blue}{ 1} & \textcolor{purple}{-3} \\ \end {bmatrix*}} \\ \underset{\mathbf{A}}{ \begin{bmatrix*}[r] \textcolor{red}{ 1} & \textcolor{red}{ 0} & \textcolor{red}{ 2} \\ \textcolor{orange}{3} & \textcolor{orange}{-5} & \textcolor{orange}{1} \\ \end {bmatrix*}} & \underset{\mathbf{AB}}{ \begin{bmatrix*}[r] \textcolor{red}{ 1} \cdot \textcolor{blue}{ -1} + \textcolor{red}{ 0} \cdot \textcolor{blue}{ 2} + \textcolor{red}{ 2} \cdot \textcolor{blue}{ 1} & \textcolor{red}{ 1} \cdot \textcolor{purple}{2} + \textcolor{red}{ 0} \cdot \textcolor{purple}{0} + \textcolor{red}{ 2} \cdot \textcolor{purple}{-3} \\ \textcolor{orange}{3} \cdot \textcolor{blue}{ -1} + \textcolor{orange}{-5} \cdot \textcolor{blue}{ 2} + \textcolor{orange}{1} \cdot \textcolor{blue}{ 1} & \textcolor{orange}{3} \cdot \textcolor{purple}{2} + \textcolor{orange}{-5} \cdot \textcolor{purple}{0} + \textcolor{orange}{1} \cdot \textcolor{purple}{-3} \\ \end {bmatrix*}} \\ \end {aligned} \\ \)

\( \begin{aligned} & \underset{\mathbf{B}}{ \begin{bmatrix*}[r] -1 & 2 \\ 2 & 0 \\ 1 & -3 \\ \end {bmatrix*}} \\ \underset{\mathbf{A}}{ \begin{bmatrix*}[r] 1 & 0 & 2 \\ 3 & -5 & 1 \\ \end {bmatrix*}} & \underset{\mathbf{AB}}{ \begin{bmatrix*}[r] 1 & -4 \\ -12 & 3 \\ \end {bmatrix*}} \\ \end {aligned} \\ \)
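The row-times-column rule translates directly into plain Python (the `mat_mul` name is illustrative); run on the matrices from this example, it reproduces the product computed above.

```python
def mat_mul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj; needs #cols(A) == #rows(B)."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 0, 2], [3, -5, 1]]
B = [[-1, 2], [2, 0], [1, -3]]
print(mat_mul(A, B))  # [[1, -4], [-12, 3]]
```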

Why is matrix multiplication defined in this way?#

\( \begin{bmatrix} \text{---} & \mathbf{u_1}^\top & \text{---} \\ \text{---} & \mathbf{u_2}^\top & \text{---} \\ & \vdots & \\ \text{---} & \mathbf{u_m}^\top & \text{---} \\ \end {bmatrix} \begin{bmatrix} \vert & \vert & & \vert \\ \mathbf{v_1} & \mathbf{v_2} & \dots & \mathbf{v_p} \\ \vert & \vert & & \vert \\ \end {bmatrix} = \begin{bmatrix} \mathbf{u_1 \cdot v_1} & \mathbf{u_1 \cdot v_2} & \dots & \mathbf{u_1 \cdot v_p} \\ \mathbf{u_2 \cdot v_1} & \mathbf{u_2 \cdot v_2} & \dots & \mathbf{u_2 \cdot v_p} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{u_m \cdot v_1} & \mathbf{u_m \cdot v_2} & \dots & \mathbf{u_m \cdot v_p} \\ \end {bmatrix} \)

The dot (inner) product of two vectors \(\mathbf{u, v} \in \mathbb{R}^n\) is the scalar

\(\mathbf{u \cdot v} = u_1v_1 + u_2v_2 + \dots + u_nv_n = \sum_{k = 1}^n u_kv_k = \begin{bmatrix} u_1 & u_2 & \dots & u_n \\ \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \\ \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \\ \end{bmatrix}^\top \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \\ \end{bmatrix} = \mathbf{u^\top v} \)
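A minimal Python sketch of the dot product (the `dot` name is illustrative); applied to the first row of \(\mathbf{A}\) and the first column of \(\mathbf{B}\) from the example above, it gives the \((1, 1)\) entry of \(\mathbf{AB}\).

```python
def dot(u, v):
    """Dot product of two equal-length vectors: the sum of u_k * v_k."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

print(dot([1, 0, 2], [-1, 2, 1]))  # 1
```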


Properties of matrix multiplication#

For any scalar \(s\) and any matrices \(\mathbf{A, B, C, D}\) such that

\(\# \,\,\text{col}\,\, \mathbf{A} = \# \,\,\text{row}\,\, \mathbf{B} = \# \,\,\text{row}\,\, \mathbf{C}\)

and

\(\# \,\,\text{col}\,\, \mathbf{B} = \# \,\,\text{col}\,\, \mathbf{C} = \# \,\,\text{row}\,\, \mathbf{D}\)

in other words

\(\mathbf{A}_{[m \times p]}, \mathbf{B}_{[p \times q]}, \mathbf{C}_{[p \times q]}, \mathbf{D}_{[q \times n]}\)

the following hold.

Matrix multiplication is associative.

\(\mathbf{A(BD) = (AB)D}\)

Matrix multiplication distributes over matrix addition, on the left and on the right.

\(\mathbf{A(B + C) = AB + AC}\)

\(\mathbf{(B + C)D = BD + CD}\)

Scalar multiplication is compatible with matrix multiplication.

\(s (\mathbf{AB}) = (s \mathbf{A}) \mathbf{B} = \mathbf{A} (s \mathbf{B})\)

The identity matrix is the multiplicative identity element.

\(\mathbf{I_mA = A = AI_n}\) when \(\mathbf{A}\) is an \(m \times n\) matrix.

Note#

Square matrices \(\mathbf{A, B}\) of the same size can be multiplied in two ways, as \(\mathbf{AB}\) and as \(\mathbf{BA}\).

But note that in most cases \(\mathbf{AB \ne BA}\).

Given three matrices \(\mathbf{A, B, C}\) such that

\(\# \,\,\text{col}\,\, \mathbf{A} = \# \,\,\text{row}\,\, \mathbf{B} = \# \,\,\text{row}\,\, \mathbf{C}\)

in other words

\(\mathbf{A}_{[m \times p]}, \mathbf{B}_{[p \times q]}, \mathbf{C}_{[p \times r]}\)

it can happen that \(\mathbf{AB = AC}\) while \(\mathbf{B \ne C}\) and \(\mathbf{A \ne 0}\)

and it can happen that \(\mathbf{AB = 0}\) while \(\mathbf{A \ne 0}\) and \(\mathbf{B \ne 0}\).

Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 0 & 1 \\ 0 & 2 \\ \end {bmatrix*}, \mathbf{B} = \begin{bmatrix*}[r] 1 & 1 \\ 3 & 4 \\ \end {bmatrix*}, \mathbf{C} = \begin{bmatrix*}[r] 2 & 5 \\ 3 & 4 \\ \end {bmatrix*} \)

Compute \(\mathbf{AB}\).

\( \begin{bmatrix*}[r] 0 & 1 \\ 0 & 2 \\ \end {bmatrix*} \begin{bmatrix*}[r] 1 & 1 \\ 3 & 4 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 3 & 4 \\ 6 & 8 \\ \end {bmatrix*} \)

Compute \(\mathbf{BA}\).

\( \begin{bmatrix*}[r] 1 & 1 \\ 3 & 4 \\ \end {bmatrix*} \begin{bmatrix*}[r] 0 & 1 \\ 0 & 2 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 0 & 3 \\ 0 & 11 \\ \end {bmatrix*} \)

Compute \(\mathbf{AC}\).

\( \begin{bmatrix*}[r] 0 & 1 \\ 0 & 2 \\ \end {bmatrix*} \begin{bmatrix*}[r] 2 & 5 \\ 3 & 4 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 3 & 4 \\ 6 & 8 \\ \end {bmatrix*} \)
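Both failures (of commutativity and of cancellation) can be checked in plain Python with a small multiplication helper (the `mat_mul` name is illustrative), using the matrices from this example.

```python
def mat_mul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, 1], [0, 2]]
B = [[1, 1], [3, 4]]
C = [[2, 5], [3, 4]]
print(mat_mul(A, B) == mat_mul(B, A))  # False: AB != BA
print(mat_mul(A, B) == mat_mul(A, C))  # True: AB == AC even though B != C
```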


Properties of matrix transposition#

If \(\mathbf{A}\) is an \(m \times n\) matrix then \(\mathbf{A}^\top\) is an \(n \times m\) matrix.

\((\mathbf{A}^\top)^\top = \mathbf{A}\)

\(\mathbf{(A + B)}^\top = \mathbf{A}^\top + \mathbf{B}^\top\)

\((s \mathbf{A})^\top = s \mathbf{A}^\top\)

\((\mathbf{AB})^\top = \mathbf{B^\top A^\top}\)

Example#

\( \mathbf{A} = \begin{bmatrix*}[r] 0 & 2 & 0 \\ 3 & 0 & 1 \\ \end {bmatrix*} \)

\( \mathbf{B} = \begin{bmatrix*}[r] 2 & 5 \\ 3 & 4 \\ -2 & 4 \\ \end {bmatrix*} \)

Compute \(\mathbf{AB}\).

\( \mathbf{AB} = \begin{bmatrix*}[r] 0 & 2 & 0 \\ 3 & 0 & 1 \\ \end {bmatrix*} \begin{bmatrix*}[r] 2 & 5 \\ 3 & 4 \\ -2 & 4 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 6 & 8 \\ 4 & 19 \\ \end {bmatrix*} \)

Compute \((\mathbf{AB})^\top\).

\( \mathbf{AB} = \begin{bmatrix*}[r] 6 & 8 \\ 4 & 19 \\ \end {bmatrix*} \implies (\mathbf{AB})^\top = \begin{bmatrix*}[r] 6 & 4 \\ 8 & 19 \\ \end {bmatrix*} \)

also

\( (\mathbf{AB})^\top = \mathbf{B^\top A^\top} \)

(see below)

Compute \(\mathbf{A^\top B^\top}\).

\( \mathbf{A} = \begin{bmatrix*}[r] 0 & 2 & 0 \\ 3 & 0 & 1 \\ \end {bmatrix*} \implies \mathbf{A}^\top = \begin{bmatrix*}[r] 0 & 3 \\ 2 & 0 \\ 0 & 1 \\ \end {bmatrix*} \)

\( \mathbf{B} = \begin{bmatrix*}[r] 2 & 5 \\ 3 & 4 \\ -2 & 4 \\ \end {bmatrix*} \implies \mathbf{B}^\top = \begin{bmatrix*}[r] 2 & 3 & -2 \\ 5 & 4 & 4 \\ \end {bmatrix*} \)

\( \mathbf{A^\top B^\top} = \begin{bmatrix*}[r] 0 & 3 \\ 2 & 0 \\ 0 & 1 \\ \end {bmatrix*} \begin{bmatrix*}[r] 2 & 3 & -2 \\ 5 & 4 & 4 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 15 & 12 & 12 \\ 4 & 6 & -4 \\ 5 & 4 & 4 \\ \end {bmatrix*} \)

Compute \(\mathbf{B^\top A^\top}\).

\( \mathbf{B^\top A^\top} = \begin{bmatrix*}[r] 2 & 3 & -2 \\ 5 & 4 & 4 \\ \end {bmatrix*} \begin{bmatrix*}[r] 0 & 3 \\ 2 & 0 \\ 0 & 1 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 6 & 4 \\ 8 & 19 \\ \end {bmatrix*} \)
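The identity \((\mathbf{AB})^\top = \mathbf{B^\top A^\top}\) can be verified numerically on this example with the two plain-Python helpers sketched earlier (`transpose` and `mat_mul` are illustrative names).

```python
def transpose(A):
    """The k-th row of the result is the k-th column of A."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, 2, 0], [3, 0, 1]]
B = [[2, 5], [3, 4], [-2, 4]]
print(transpose(mat_mul(A, B)))             # [[6, 4], [8, 19]]
print(mat_mul(transpose(B), transpose(A)))  # [[6, 4], [8, 19]]
```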


Powers of a square matrix#

The powers of a square matrix are defined in the following manner.

\( \begin{aligned} \mathbf{A^2} &= \mathbf{AA} \\ \mathbf{A^3} &= \mathbf{AAA} \\ \mathbf{A^k} &= \underbrace{\mathbf{AA \dots A}}_{k \,\,\,\text{times}} \\ \end {aligned} \)

Example#

\( \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 3 & 2 \\ \end {bmatrix} \)

Compute \(\mathbf{A^3}\).

\( \mathbf{A^3} = \mathbf{AAA} = \begin{bmatrix} 1 & 0 \\ 3 & 2 \\ \end {bmatrix} \begin{bmatrix} 1 & 0 \\ 3 & 2 \\ \end {bmatrix} \begin{bmatrix} 1 & 0 \\ 3 & 2 \\ \end {bmatrix} = \begin{bmatrix*}[r] 1 & 0 \\ 9 & 4 \\ \end {bmatrix*} \begin{bmatrix*}[r] 1 & 0 \\ 3 & 2 \\ \end {bmatrix*} = \begin{bmatrix*}[r] 1 & 0 \\ 21 & 8 \\ \end {bmatrix*} \)
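The definition of \(\mathbf{A^k}\) as \(k\)-fold repeated multiplication can be sketched in plain Python (the `mat_pow` name is illustrative); it reproduces the \(\mathbf{A^3}\) just computed.

```python
def mat_mul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_pow(A, k):
    """A^k by repeated multiplication, for k >= 1."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

A = [[1, 0], [3, 2]]
print(mat_pow(A, 3))  # [[1, 0], [21, 8]]
```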