By structure, we are simply referring to the number of layers, the number of neurons in each layer, and the specific connections between neurons in each pair of neighboring layers, right?
So in this paper, they carefully chose a certain structure, set the weights randomly, and then what happened after that? I understand that they did not then train it with a training data set, but I'm not quite getting what they did with the single distorted image.
Well given that it's CNNs, you're leaving out weight sharing.
So by structure you should also include the demand that the prediction of any NxN patch of the image should be roughly equal to the prediction of any other NxN patch of the image.
So in this paper, they carefully chose a certain structure, set the weights randomly, and then what happened after that? I understand that they did not then train it with a training data set, but I'm not quite getting what they did with the single distorted image.