10 MATRIX-VECTOR AND MATRIX-MATRIX MULTIPLICATION, SCALAR PRODUCT

10.1. Matrix-vector product

Definition

We define the matrix-vector product between an m \times n matrix A and an n-vector x, denoted by Ax, as the m-vector with i-th component

    \[(Ax)_{i}=\sum_{j=1}^{n}A_{ij}x_{j}, i=1,\dots,m.\]

Consider a symbolic example with n=2 and m=3. We have y=Ax, that is:

    \[y_{1}=A_{11}x_{1}+A_{12}x_{2},\]

    \[y_{2}=A_{21}x_{1}+A_{22}x_{2},\]

    \[y_{3}=A_{31}x_{1}+A_{32}x_{2}.\]
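The definition can be checked numerically. Below is a quick NumPy sketch with arbitrary illustrative values for A and x (not taken from the text):

```python
import numpy as np

# A hypothetical 3x2 matrix and 2-vector, matching the symbolic example (m=3, n=2).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([7.0, 8.0])

y = A @ x  # matrix-vector product

# Component-wise check against the definition (Ax)_i = sum_j A_ij x_j.
y_manual = np.array([sum(A[i, j] * x[j] for j in range(2)) for i in range(3)])
```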

Interpretation as linear combinations of columns

If the columns of A are given by the vectors a_{i}, i=1,\dots,n so that A=(a_{1},\dots, a_{n}), then Ax can be interpreted as a linear combination of these columns, with weights given by the vector x:

    \[Ax=\sum_{i=1}^{n}x_{i}a_{i}.\]

In the above symbolic example, we have y=Ax, that is:

    \[ y = x_{1} \begin{pmatrix} A_{11} \\ A_{21} \\ A_{31} \end{pmatrix} + x_{2} \begin{pmatrix} A_{12} \\ A_{22} \\ A_{32} \end{pmatrix}.\]
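The column interpretation can be verified directly: weighting each column of A by the corresponding entry of x reproduces Ax. A minimal NumPy sketch with illustrative values:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([7.0, 8.0])

# Ax as a linear combination of the columns of A, with weights from x.
combo = x[0] * A[:, 0] + x[1] * A[:, 1]
```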



Interpretation as scalar products with rows

Alternatively, if the rows of A are the row vectors a_{i}^{T}, i=1,\dots,m:

    \[A = \begin{pmatrix} a_{1}^{T} \\ \vdots \\ a_{m}^{T} \end{pmatrix},\]

then Ax is the vector with elements a_{i}^{T}x, i=1,\dots,m:

    \[Ax = \begin{pmatrix} a_{1}^{T}x \\ \vdots \\ a_{m}^{T}x \end{pmatrix}.\]

In the above symbolic example, we have y=Ax, that is:

    \[y_{1} =\begin{pmatrix} A_{11} \\ A_{12} \end{pmatrix}^{T} \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} = A_{11}x_{1} + A_{12}x_{2}\]

    \[y_{2} =\begin{pmatrix} A_{21} \\ A_{22} \end{pmatrix}^{T} \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} = A_{21}x_{1} + A_{22}x_{2}\]

    \[y_{3} =\begin{pmatrix} A_{31} \\ A_{32} \end{pmatrix}^{T} \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} = A_{31}x_{1} + A_{32}x_{2}\]
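The row interpretation says each component of Ax is a scalar product between a row of A and x. A short NumPy check (values are illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([7.0, 8.0])

# The i-th component of Ax is the scalar product a_i^T x of the i-th row with x.
y_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])
```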

Left product

If z \in \mathbb{R}^{m}, then the notation z^{T}A is the row vector of size n equal to the transpose of the column vector A^{T}z \in \mathbb{R}^{n}. That is:

    \[(z^{T}A)_{j} = \sum_{i=1}^{m} A_{ij}z_{i}, j=1,\dots,n.\]
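The identity z^{T}A = (A^{T}z)^{T} can be sketched in NumPy; note that 1-D NumPy arrays carry no row/column orientation, so both sides come out as the same array (values are illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
z = np.array([1.0, -2.0, 3.0])  # z in R^m, m = 3

left = z @ A          # the left product z^T A, of size n = 2
via_transpose = A.T @ z  # A^T z, whose transpose equals z^T A
```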

Example: Return to the network example, involving an m \times n incidence matrix. We note that, by construction, the columns of A sum to zero, which can be compactly written as 1^{T}A=0, or A^{T}1=0.
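To illustrate the identity 1^{T}A=0, here is a small hypothetical incidence matrix (for an assumed 3-node, 3-edge directed graph; each column has one +1 at the edge's start node and one -1 at its end node):

```python
import numpy as np

# Hypothetical incidence matrix: nodes index the rows, edges the columns.
A = np.array([[ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0],
              [ 0.0, -1.0,  1.0]])
ones = np.ones(3)

col_sums = ones @ A  # the left product 1^T A: sums each column of A
```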

10.2. Matrix-matrix product

Definition

We can extend the matrix-vector product to the matrix-matrix product, as follows. If A \in \mathbb{R}^{m \times n} and B \in \mathbb{R}^{n \times p}, the notation AB denotes the m \times p matrix with i, j element given by

    \[(AB)_{ij} = \sum_{k=1}^{n}A_{ik}B_{kj}.\]

Transposing a product changes the order, so that (AB)^{T}=B^{T}A^{T}.
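Both the product definition and the transpose rule (AB)^{T}=B^{T}A^{T} are easy to confirm numerically. A sketch with randomly generated matrices of compatible sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # m x n
B = rng.standard_normal((4, 2))  # n x p

C = A @ B  # (AB)_ij = sum_k A_ik B_kj, an m x p matrix

# Transposing a product reverses the order of the factors.
lhs = (A @ B).T
rhs = B.T @ A.T
```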

Column-wise interpretation

If the columns of B are given by the vectors b_{i}, with i=1, \dots, p, so that B=[b_{1}, \dots, b_{p}], then AB can be written as

    \[AB = A \begin{pmatrix}b_{1} & \dots & b_{p} \end{pmatrix} = \begin{pmatrix}Ab_{1} & \dots & Ab_{p} \end{pmatrix}.\]

In other words, AB results from transforming each column b_{i} of B into Ab_{i}.

Row-wise interpretation

The matrix-matrix product can also be interpreted as an operation on the rows of A. Indeed, if A is given by its rows a_{i}^{T}, i = 1, \dots, m then AB is the matrix obtained by transforming each one of these rows via B, into a_{i}^{T}B, i = 1, \dots, m:

    \[AB = \begin{pmatrix}a_{1}^{T} \\ \vdots \\ a_{m}^{T}\end{pmatrix}B= \begin{pmatrix}a_{1}^{T}B \\ \vdots \\ a_{m}^{T}B\end{pmatrix}.\]

(Note that each a_{i}^{T}B is indeed a row vector, according to our matrix-vector rules.)
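Dually, the row-wise interpretation builds AB by pushing each row of A through B:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# The i-th row of AB is the row a_i^T times B.
rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])
```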

10.3. Block Matrix Products

Matrix algebra generalizes to blocks, provided block sizes are consistent. To illustrate this, consider the matrix-vector product between an m \times n matrix A and an n-vector x, where A, x are partitioned in blocks, as follows:

    \[A = \begin{pmatrix} A_{1} & A_{2} \end{pmatrix}, \quad x = \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}\]

where A_{i} is m \times n_{i}, x_{i} \in \mathbb{R}^{n_{i}}, i=1,2, and n_{1}+n_{2}=n. Then Ax = A_{1}x_{1} + A_{2}x_{2}.

Symbolically, it’s as if we formed the ‘‘scalar’’ product between the ‘‘row vector’’ (A_{1}, A_{2}) and the ‘‘column vector’’ (x_{1}, x_{2})!
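The block matrix-vector rule Ax = A_{1}x_{1} + A_{2}x_{2} can be sketched with randomly chosen blocks of consistent sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n1, n2 = 4, 2, 3
A1 = rng.standard_normal((m, n1))
A2 = rng.standard_normal((m, n2))
x1 = rng.standard_normal(n1)
x2 = rng.standard_normal(n2)

A = np.hstack([A1, A2])        # m x (n1 + n2)
x = np.concatenate([x1, x2])   # (n1 + n2)-vector

block_sum = A1 @ x1 + A2 @ x2  # block formula for Ax
```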

Likewise, if an n \times p matrix B is partitioned into two blocks B_{i} of size n_{i} \times p, i=1,2, with n_{1}+n_{2}=n, then

    \[AB = \begin{pmatrix} A_{1} & A_{2} \end{pmatrix} \begin{pmatrix} B_{1} \\ B_{2} \end{pmatrix} = A_{1}B_{1} + A_{2}B_{2}.\]

Again, symbolically we apply the same rules as for the scalar product — except that now the result is a matrix.
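The block matrix-matrix rule AB = A_{1}B_{1} + A_{2}B_{2} checks out the same way (blocks are randomly generated with consistent inner dimensions):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n1, n2, p = 3, 2, 4, 5
A1 = rng.standard_normal((m, n1))
A2 = rng.standard_normal((m, n2))
B1 = rng.standard_normal((n1, p))
B2 = rng.standard_normal((n2, p))

A = np.hstack([A1, A2])  # m x (n1 + n2)
B = np.vstack([B1, B2])  # (n1 + n2) x p

block_prod = A1 @ B1 + A2 @ B2  # block formula for AB
```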

Example: Gram matrix.

Finally, we can consider so-called outer products. Assume matrix A is partitioned row-wise and matrix B is partitioned column-wise. Therefore, we have:

    \[A = \begin{pmatrix} A_{1} \\ A_{2} \end{pmatrix}, \quad B =  \begin{pmatrix} B_{1} & B_{2} \end{pmatrix} .\]

The dimensions of these matrices should be consistent such that A_{1}, A_{2} are of dimensions m_{1} \times n and m_{2} \times n respectively and B_{1}, B_{2} are of dimensions n \times p_{1} and n \times p_{2} respectively. The dimensions of the resultant matrices A_{1}B_{1}, A_{1}B_{2}, A_{2}B_{1}, A_{2}B_{2} will be m_{1} \times p_{1}, m_{1} \times p_{2}, m_{2} \times p_{1}, m_{2} \times p_{2} respectively.

Then the product C = AB can be expressed in terms of the blocks, as follows:

    \[C = AB = \begin{pmatrix} A_{1} \\ A_{2} \end{pmatrix} \begin{pmatrix} B_{1} & B_{2} \end{pmatrix} = \begin{pmatrix} A_{1}B_{1} & A_{1}B_{2} \\ A_{2}B_{1} & A_{2}B_{2} \end{pmatrix} .\]
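The 2 \times 2 block structure of C = AB can be assembled and checked with NumPy's block facilities (block sizes are arbitrary but consistent):

```python
import numpy as np

rng = np.random.default_rng(5)
m1, m2, n, p1, p2 = 2, 3, 4, 2, 3
A1 = rng.standard_normal((m1, n))
A2 = rng.standard_normal((m2, n))
B1 = rng.standard_normal((n, p1))
B2 = rng.standard_normal((n, p2))

A = np.vstack([A1, A2])  # A partitioned row-wise
B = np.hstack([B1, B2])  # B partitioned column-wise

# Assemble C = AB from its four blocks A_i B_j.
C_blocks = np.block([[A1 @ B1, A1 @ B2],
                     [A2 @ B1, A2 @ B2]])
```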

10.4. Trace, scalar product

Trace

The trace of a square n \times n matrix A, denoted by {\bf Tr}A, is the sum of its diagonal elements:

    \[{\bf Tr}A = \sum_{i=1}^{n}A_{ii}.\]

Some important properties:

  • Trace of transpose: The trace of a square matrix is equal to that of its transpose.
  • Commutativity under trace: for any two matrices A \in \mathbb{R}^{m \times n} and B \in \mathbb{R}^{n \times m}, we have

    \[{\bf Tr}(AB) = {\bf Tr}(BA).\]
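Both properties are worth verifying numerically; note that AB and BA need not even have the same size for the commutativity property to hold. A NumPy sketch with random matrices:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

t_ab = np.trace(A @ B)        # trace of a 3x3 matrix
t_ba = np.trace(B @ A)        # trace of a 5x5 matrix -- yet the same value
t_transpose = np.trace((A @ B).T)  # trace is invariant under transpose
```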

Scalar product between matrices

We can define the scalar product between two m \times n matrices A, B via

    \[\langle A,B \rangle = {\bf Tr}(A^{T}B) = \sum_{i=1}^{m}\sum_{j=1}^{n}A_{ij}B_{ij}.\]

The above definition is symmetric: we have

    \[\langle A,B \rangle = {\bf Tr}(A^{T}B) = {\bf Tr}\left((A^{T}B)^{T}\right) = {\bf Tr}(B^{T}A) = \langle B,A \rangle.\]

Our notation is consistent with the definition of the scalar product between two vectors, where we simply view a vector in \mathbb{R}^{n} as a matrix in \mathbb{R}^{n \times 1}. We can interpret the matrix scalar product as the vector scalar product between two long vectors of length mn each, obtained by stacking all the columns of A, B on top of each other.
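All three views of the matrix scalar product — the trace formula, the element-wise sum, and the scalar product of the column-stacked vectors — agree, as a quick NumPy sketch shows (order="F" stacks columns, matching the description above):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

inner = np.trace(A.T @ B)                 # <A, B> = Tr(A^T B)
elementwise = np.sum(A * B)               # sum_ij A_ij B_ij
stacked = A.flatten(order="F") @ B.flatten(order="F")  # column-stacked vectors
```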

License


Linear Algebra and Applications Copyright © 2023 by VinUniversity is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, except where otherwise noted.
