Architecture matters because, while deep learning can in theory fit a curve with a single huge layer (the universal approximation theorem), the amount of compute and data needed to get there is prohibitive. A good architecture is what turns the theoretical possibility of deep learning finding the right N-dimensional curve into a practical reality.
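As a minimal sketch of that point (using PyTorch; the layer sizes are arbitrary and only illustrative), here are two networks with exactly the same parameter count but very different architectures. "Capacity" alone doesn't determine how easily a network learns:

```python
# Two models with identical parameter counts (34,561 each) but different shapes.
# Sizes are illustrative only; the point is that parameter count alone does not
# pin down the architecture.
import torch.nn as nn

wide_shallow = nn.Sequential(      # one huge hidden layer
    nn.Linear(10, 2880),
    nn.ReLU(),
    nn.Linear(2880, 1),
)

deep_narrow = nn.Sequential(       # several smaller layers
    nn.Linear(10, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
)

def n_params(model):
    return sum(p.numel() for p in model.parameters())

print(n_params(wide_shallow), n_params(deep_narrow))  # 34561 34561
```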
Another thing about architecture is that we inherently bias it with the way we structure the data. For instance, take a dataset of (car) traffic patterns. If you only track the date as a feature, you miss that traffic follows not just a day-of-year pattern but also a holiday pattern. You could learn this with deep learning given enough data, but if we bake it into the dataset, a _much_ simpler model can learn it, much faster.
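A quick sketch of what "baking it into the dataset" could look like (pandas; the column names, numbers, and holiday list are made up for illustration): derive a day-of-year feature and an explicit holiday flag from the raw date, so the model doesn't have to rediscover the holiday effect on its own.

```python
# Hypothetical traffic dataset: derive day-of-year and an explicit holiday flag
# from a raw date column. Values and the holiday calendar are illustrative only.
import pandas as pd

traffic = pd.DataFrame({
    "date": pd.to_datetime(["2023-07-03", "2023-07-04", "2023-07-05"]),
    "vehicle_count": [10450, 6120, 10980],
})

holidays = {pd.Timestamp("2023-07-04")}  # stand-in for a real holiday calendar

traffic["day_of_year"] = traffic["date"].dt.dayofyear
traffic["is_holiday"] = traffic["date"].isin(holidays).astype(int)

print(traffic)
```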
So, architecture matters. Data/feature representation matters.
I second that thought. There is a pretty well-cited paper from the late eighties called "Multilayer Feedforward Networks are Universal Approximators". It shows that a feedforward network with a single hidden layer containing a finite number of neurons can approximate any continuous function. For non-continuous functions, additional layers are needed.
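A toy illustration of the single-hidden-layer result (PyTorch; the hidden size, learning rate, and step count are arbitrary choices, not values from the paper): one hidden layer is enough to fit a smooth continuous function like sin reasonably well.

```python
# Fit sin(x) with a single hidden layer. Hyperparameters are arbitrary;
# this only demonstrates the approximation capability, not an optimal setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-3.0, 3.0, 512).unsqueeze(1)
y = torch.sin(x)

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()

print(f"final MSE: {loss.item():.6f}")  # small after training
```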