Reading the paper, they're saying convolutions are powerful enough to express any possible architecture. But that's just computational universality - it doesn't mean those architectures are themselves convolutions.
Every matrix multiply can be viewed as a 1x1 convolution, just like every convolution can be viewed as an (implicit, for kernels larger than 1x1) matrix multiply.
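For the 1x1 case the equivalence is easy to check directly. A minimal NumPy sketch (toy shapes, names made up for illustration) showing that a 1x1 convolution over a feature map is one matrix multiply applied at every spatial position:

    import numpy as np

    rng = np.random.default_rng(0)
    H, W, C_in, C_out = 4, 4, 3, 5          # spatial size and channel counts
    x = rng.standard_normal((H, W, C_in))   # input feature map (HWC layout)
    w = rng.standard_normal((C_in, C_out))  # 1x1 kernel == plain weight matrix

    # "Convolution" with a 1x1 kernel: apply w independently at each pixel.
    conv_out = np.empty((H, W, C_out))
    for i in range(H):
        for j in range(W):
            conv_out[i, j] = x[i, j] @ w

    # Matrix-multiply view: flatten pixels into rows, multiply once, reshape.
    matmul_out = (x.reshape(-1, C_in) @ w).reshape(H, W, C_out)

    assert np.allclose(conv_out, matmul_out)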
I'm not sure this is particularly enlightening, but it's probably one of the small steps of understanding required to truly "get" the underlying math.
Edit: I should have said that every matrix multiply against fixed weights is a convolution. The matrix multiply in the standard attention block (queries times keys) doesn't really make sense to view as a convolution, since both operands are activations computed from the input rather than a fixed kernel.
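To make the distinction concrete, here's a rough NumPy sketch (toy sizes, hypothetical names): the projections by W_q and W_k are fixed-weight matmuls, i.e. 1x1 convolutions along the sequence, while the query-times-keys product multiplies two input-dependent quantities, so there's no fixed kernel to slide:

    import numpy as np

    rng = np.random.default_rng(1)
    T, d = 6, 8                        # sequence length, model width
    x = rng.standard_normal((T, d))    # token activations
    W_q = rng.standard_normal((d, d))  # fixed, learned weights
    W_k = rng.standard_normal((d, d))

    # Weight matmuls: the same fixed kernel applied at every position,
    # so these are 1x1 convolutions along the sequence.
    q = x @ W_q
    k = x @ W_k

    # Attention scores: activations multiplied by activations. Both
    # operands change with the input - not a convolution in any
    # ordinary sense.
    scores = q @ k.T                   # shape (T, T)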