> Can someone explain this in a way so that an ordinary mortal computer scientist can understand it?
I'll try.
Instead of the common approach of searching over image pixels to minimize e.g. a denoising objective, they realize they can instead search over the weights of an image generator network so that the generated image matches the objective.
They argue that the structure of the network then constitutes some prior knowledge over what a natural image should look like.
My (probably wrong) interpretation: since a convolutional neural network essentially works by looking for spatial patterns at different resolutions, their optimization process boils down to finding the high- and mid-resolution patterns that best match the input image, and then re-using those patterns to fill in the missing regions (or to replace "noise" that doesn't fit the extracted patterns).
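Here's a minimal sketch of that idea, assuming PyTorch; the tiny ConvNet, the fixed random input, and the plain MSE loss are illustrative choices on my part, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

# Corrupted target image (e.g. a noisy observation), shape 1x3xHxW.
H, W = 64, 64
corrupted = torch.rand(1, 3, H, W)

# Fixed random input: it never changes; only the weights are optimized.
z = torch.randn(1, 32, H, W)

# A small convolutional generator f_theta (stand-in for their architecture).
generator = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    out = generator(z)                      # generated image
    loss = ((out - corrupted) ** 2).mean()  # match the corrupted target
    loss.backward()
    opt.step()
    # Stopping early matters here: the network tends to fit the broad
    # structure of the image before it fits the noise.

restored = generator(z).detach()  # the reconstruction is read off the generator
```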