
> You can't model matrix multiplication as a convolution

Hmm. I'm not convinced yet...

The "convolution" of two functions M and N -- let's notate it as M <> N, because ASCII -- is itself a function: at each index `i`, we "sum" the "products" of M and N over pairs of indices that sum to `i`, i.e. `(M <> N)(i) = sum over a+b=i of M(a)*N(b)`. So if `i` is 2 (and we start indexing at 0, for sanity), we have `(M <> N)(2) = M(0)*N(2) + M(1)*N(1) + M(2)*N(0)`.

https://betterexplained.com/articles/intuitive-convolution/
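As a quick sanity check, here's a minimal sketch of that definition in Python (the function name `conv` and the example arrays are just illustrative):

```python
def conv(M, N):
    """Discrete convolution: out[i] sums M(a)*N(b) over all a+b == i."""
    out = [0] * (len(M) + len(N) - 1)
    for a, m in enumerate(M):
        for b, n in enumerate(N):
            out[a + b] += m * n
    return out

M = [1, 2, 3]
N = [4, 5, 6]
# index 2 of the result is M(0)*N(2) + M(1)*N(1) + M(2)*N(0) = 6 + 10 + 12 = 28
print(conv(M, N))  # [4, 13, 28, 27, 18]
```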

Now consider a matrix multiplication M * N. We can think of M as an array of row vectors, and N as an array of column vectors; in both cases, they're functions from (a subset of) the naturals to vectors.

To instantiate convolution on our matrices, we can take multiplication as dot-product, and addition as a formal (uninterpreted) sum. (So, technically, we need to lift the elements of our input matrices to these sums as well, but there's a canonical way to do this.) Then M <> N is an array of formal sums of vectors; each formal sum gives an anti-diagonal band of elements in the matrix product, and the whole array runs along the main diagonal.
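Here's a rough sketch of that construction in Python, with "addition" kept formal by collecting terms in a list rather than summing them; all names are illustrative, and this is just my reading of the idea, not a claim about anyone's library:

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def conv_matmul(M, N):
    """Convolve M's rows against N's columns, with product = dot product
    and addition left as a formal (uninterpreted) sum of tagged terms."""
    rows = M                 # M as an array of row vectors
    cols = list(zip(*N))     # N as an array of column vectors
    out = [[] for _ in range(len(rows) + len(cols) - 1)]
    for a, r in enumerate(rows):
        for b, c in enumerate(cols):
            # each term of the formal sum at index a+b is one dot product,
            # which is exactly entry (a, b) of the ordinary product M * N
            out[a + b].append(((a, b), dot(r, c)))
    return out

M = [[1, 2], [3, 4]]
N = [[5, 6], [7, 8]]
bands = conv_matmul(M, N)
# bands[1] collects the anti-diagonal entries (0,1) and (1,0) of M * N:
# [((0, 1), 22), ((1, 0), 43)]
```

Each `bands[k]` is one formal sum, and its terms are precisely the anti-diagonal band `a + b = k` of the matrix product, which matches the description above.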

There are almost certainly faults in the remaining details -- for example, it's not really clear that this is associative yet, due to the formal sum -- but this seems to cast matrix multiplication in the shape of a convolution.


