Expressing matrix multiplications using rank one matrices
Matrix multiplication is a fundamental operation in many areas of computer science. In this article, we will explore a new technique for performing it. We will first build some intuition by going through some examples. We will then formally prove that the new technique gives the same result as standard matrix multiplication. Finally, we will discuss its possible applications.
Prerequisites
I am assuming that you are familiar with linear algebra at the level of a typical undergraduate course. In particular, I make use of matrix addition, matrix multiplication, and the concept of the rank of a matrix.
Examples
Suppose we multiply two $2 \times 2$ matrices in the standard way.
$$ \begin{align*} & \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 5 & 7 \\ 6 & 8 \end{bmatrix} \\ \\ = & \begin{bmatrix} 1 \cdot 5 + 3 \cdot 6 & 1 \cdot 7 + 3 \cdot 8 \\ 2 \cdot 5 + 4 \cdot 6 & 2 \cdot 7 + 4 \cdot 8 \end{bmatrix} = \begin{bmatrix} 23 & 31 \\ 34 & 46 \end{bmatrix}. \end{align*} $$
In the first operand, we can look at the columns separately: $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ 4 \end{bmatrix}$.
Similarly, we can express the second operand in terms of its rows: $\begin{bmatrix} 5 & 7 \end{bmatrix}$ and $\begin{bmatrix} 6 & 8 \end{bmatrix}$.
Now multiply the first column of the first operand with the first row of the second operand, multiply the second column with the second row, and add the two results:
$$ \begin{align*} & \begin{bmatrix} 1 \\ 2 \end{bmatrix} \begin{bmatrix} 5 & 7 \end{bmatrix} + \begin{bmatrix} 3 \\ 4 \end{bmatrix} \begin{bmatrix} 6 & 8 \end{bmatrix} \\ \\ = & \begin{bmatrix} 5 & 7 \\ 10 & 14 \end{bmatrix} + \begin{bmatrix} 18 & 24 \\ 24 & 32 \end{bmatrix} = \begin{bmatrix} 23 & 31 \\ 34 & 46 \end{bmatrix}. \end{align*} $$
If we see these two column-times-row multiplications, each one produces a rank one matrix, and their sum is exactly the result of the standard multiplication. In other words, we have expressed the product as a sum of rank one matrices.
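To double-check the arithmetic above, here is a minimal NumPy sketch (my own addition, not part of the derivation) that computes the two column-times-row products and compares their sum with the standard product:

```python
import numpy as np

A = np.array([[1, 3],
              [2, 4]])
B = np.array([[5, 7],
              [6, 8]])

# Column of A times the matching row of B: each product is a rank one matrix.
term_1 = np.outer(A[:, 0], B[0, :])   # [[ 5  7], [10 14]]
term_2 = np.outer(A[:, 1], B[1, :])   # [[18 24], [24 32]]

print(term_1 + term_2)   # [[23 31], [34 46]]
print(A @ B)             # [[23 31], [34 46]] -- same result
```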
We are now going to prove that this result holds for any pair of matrices whose dimensions allow them to be multiplied. Before moving on to the proof, we will describe the notation we will follow in the proof.
Notation
A matrix will be denoted by a capital letter with its dimensions on the subscript. For example, $A_{m \times n}$ and $B_{n \times p}$.
An element of a particular matrix is denoted by the corresponding lowercase letter with the element's index as the subscript. For example, the $(i, j)$-th element of $A_{m \times n}$ will be denoted by $a_{i,j}$.
If we want to abstractly refer to an element inside a matrix, we use the element notation inside square brackets. For example, to express that all elements of $C_{m \times p}$ equal the sum of its indices, we write $$ [c_{i,j}] = [i + j]. $$
Theorem
We first define the concepts we are going to need in the proof.
Definition 1: Matrix-matrix addition
If we have two matrices $A_{m \times n}$ and $B_{m \times n}$, then their sum is $$ C_{m \times n} = A_{m \times n} + B_{m \times n} $$ where $$ [c_{i,j}] = [a_{i,j} + b_{i,j}]. $$
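A quick numeric illustration of this definition (my own example, not from the original text):
$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix} = \begin{bmatrix} 11 & 22 \\ 33 & 44 \end{bmatrix}. $$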
Definition 2: Matrix-matrix multiplication
If we have two matrices $A_{m \times n}$ and $B_{n \times p}$, then their product is $$ C_{m \times p} = A_{m \times n}B_{n \times p} $$ where $$ [c_{i,j}] = \left[\sum_{k=1}^{n} a_{i,k} \cdot b_{k,j}\right]. $$
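If it helps to see this definition as code, here is a minimal, deliberately unoptimized sketch in NumPy (the function name matmul_by_definition is my own):

```python
import numpy as np

def matmul_by_definition(A, B):
    """Multiply A (m x n) by B (n x p) exactly as in Definition 2."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            # c_{i,j} = sum over k of a_{i,k} * b_{k,j}
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C
```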
Let’s now state and prove our theorem.
Theorem: Matrix-matrix multiplication in terms of rank one matrices
If we have two matrices $A_{m \times n}$ and $B_{n \times p}$, then their product is $$ C_{m \times p} = A_{m \times n}B_{n \times p} $$ where $$ C_{m \times p} = \sum_{k=1}^{n} \left( \begin{bmatrix} a_{1,k} \\ a_{2,k} \\ \vdots \\ a_{m,k} \end{bmatrix} \begin{bmatrix} b_{k,1} & b_{k,2} & \cdots & b_{k,p} \end{bmatrix} \right). $$
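Before proving the theorem, here is what its statement looks like computationally: a minimal NumPy sketch (the function name matmul_rank_one is my own) that builds $C$ as a sum of $n$ column-times-row products and checks the result against NumPy's built-in multiplication.

```python
import numpy as np

def matmul_rank_one(A, B):
    """Compute the product of A (m x n) and B (n x p) as a sum of n rank one matrices."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for k in range(n):
        # Outer product of the k-th column of A with the k-th row of B.
        C += np.outer(A[:, k], B[k, :])
    return C

# Sanity check against the standard multiplication.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))
assert np.allclose(matmul_rank_one(A, B), A @ B)
```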
Proof
We start by writing down the definition of matrix multiplication (Definition 2): $$ [c_{i,j}] = \left[\sum_{k=1}^{n} a_{i,k} \cdot b_{k,j}\right]. $$
We can write down the complete matrix $C_{m \times p}$ as
$$ \begin{align*} C_{m \times p} = \begin{bmatrix} \sum_{k=1}^{n} a_{1,k} \cdot b_{k,1} & \cdots & \sum_{k=1}^{n} a_{1,k} \cdot b_{k,p} \\ \vdots & \ddots & \vdots \\ \sum_{k=1}^{n} a_{m,k} \cdot b_{k,1} & \cdots & \sum_{k=1}^{n} a_{m,k} \cdot b_{k,p} \end{bmatrix}. \end{align*} $$ Every element of this matrix is a sum of $n$ terms, so, using the definition of matrix addition (read from right to left), we can split $C_{m \times p}$ into one matrix per value of $k$ and write
$$ \begin{align*} C_{m \times p} = \begin{bmatrix} a_{1,1} \cdot b_{1,1} & \cdots & a_{1,1} \cdot b_{1,p} \\ \vdots & \ddots & \vdots \\ a_{m,1} \cdot b_{1,1} & \cdots & a_{m,1} \cdot b_{1,p} \end{bmatrix} + \begin{bmatrix} a_{1,2} \cdot b_{2,1} & \cdots & a_{1,2} \cdot b_{2,p} \\ \vdots & \ddots & \vdots \\ a_{m,2} \cdot b_{2,1} & \cdots & a_{m,2} \cdot b_{2,p} \end{bmatrix} + \begin{bmatrix} a_{1,3} \cdot b_{3,1} & \cdots & a_{1,3} \cdot b_{3,p} \\ \vdots & \ddots & \vdots \\ a_{m,3} \cdot b_{3,1} & \cdots & a_{m,3} \cdot b_{3,p} \end{bmatrix} + \cdots + \begin{bmatrix} a_{1,n} \cdot b_{n,1} & \cdots & a_{1,n} \cdot b_{n,p} \\ \vdots & \ddots & \vdots \\ a_{m,n} \cdot b_{n,1} & \cdots & a_{m,n} \cdot b_{n,p} \end{bmatrix}. \end{align*} $$ The above sum can be written compactly as $$ \begin{align*} C_{m \times p} = \sum_{k=1}^{n} \begin{bmatrix} a_{1,k} \cdot b_{k,1} & \cdots & a_{1,k} \cdot b_{k,p} \\ \vdots & \ddots & \vdots \\ a_{m,k} \cdot b_{k,1} & \cdots & a_{m,k} \cdot b_{k,p} \end{bmatrix}. \end{align*} $$ Each term in the summation can be expressed as a multiplication of the column vector $\begin{bmatrix} a_{1,k} & a_{2,k} & \cdots & a_{m,k} \end{bmatrix}^{T}$, the $k^{th}$ column of $A_{m \times n}$, with the row vector $\begin{bmatrix} b_{k,1} & b_{k,2} & \cdots & b_{k,p} \end{bmatrix}$, the $k^{th}$ row of $B_{n \times p}$: $$ \begin{align*} \begin{bmatrix} a_{1,k} \cdot b_{k,1} & \cdots & a_{1,k} \cdot b_{k,p} \\ \vdots & \ddots & \vdots \\ a_{m,k} \cdot b_{k,1} & \cdots & a_{m,k} \cdot b_{k,p} \end{bmatrix} = \begin{bmatrix} a_{1,k} \\ a_{2,k} \\ \vdots \\ a_{m,k} \end{bmatrix} \begin{bmatrix} b_{k,1} & b_{k,2} & \cdots & b_{k,p} \end{bmatrix}, \end{align*} $$ and each such column-times-row product is a rank one matrix.
We can thus finally write $$ \begin{align*} C_{m \times p} = \sum_{k=1}^{n} \left( \begin{bmatrix} a_{1,k} \\ a_{2,k} \\ \vdots \\ a_{m,k} \end{bmatrix} \begin{bmatrix} b_{k,1} & b_{k,2} & \cdots & b_{k,p} \end{bmatrix} \right), \end{align*} $$ which is exactly the statement of the theorem. This completes the proof.
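The name "rank one" is justified: each column-times-row term has rank at most one (exactly one whenever neither the column nor the row is all zeros). A small check, again my own addition:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))

# Each term in the sum is a rank one matrix (rank zero only in the
# degenerate case where the column or the row is all zeros).
for k in range(A.shape[1]):
    term = np.outer(A[:, k], B[k, :])
    print(k, np.linalg.matrix_rank(term))   # prints 1 for every k here
```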
Discussion
What are the advantages of viewing multiplication this way? One area where matrix multiplications are heavily used is matrix factorizations. So it might be worth investigating what matrix factorizations look like in terms of sums of rank one matrices. In fact, we can imagine hardware that operates natively on rank one matrices and explore what a compiler for that hardware would look like. We can then see whether the “human way of thinking” about matrix factorizations can be translated into sums of rank one matrices. This is the topic I will explore in follow-up articles.
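As a small taste of the factorization angle (my own illustration, using the well-known singular value decomposition rather than anything from this article): the SVD already writes a matrix as a weighted sum of rank one matrices, $M = \sum_{k} \sigma_k u_k v_k^{T}$, so it is a natural starting point for the follow-up.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 4))

# SVD: M = sum over k of sigma_k * outer(u_k, v_k).
U, s, Vt = np.linalg.svd(M, full_matrices=False)
reconstruction = sum(s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(len(s)))
assert np.allclose(M, reconstruction)
```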