Architecture: two convolutional layers (6 and 12 feature maps, 5x5 filters), one fully connected layer (120 neurons), Tanh activations, average pooling (padding excluded), negative log-likelihood cost.
Hyperparameters: learning rate = 0.20, minibatch size = 100, dropout = 0.0, L2 lambda = 0.0, momentum = 0.0, normal initialization.
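For concreteness, here is a minimal sketch of that setup in PyTorch. PyTorch itself is an assumption (the original framework isn't named), as are the 28x28 single-channel MNIST-style input, the 2x2 pooling windows, and the 10-class output layer, none of which are spelled out above.

```python
import torch
import torch.nn as nn

# Hypothetical PyTorch version of the network described above.
# Assumed (not stated in the text): 28x28 grayscale input, 2x2 average
# pooling after each conv layer, and a final 10-class output layer.
model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),            # 6 feature maps, 5x5 filters -> 24x24
    nn.Tanh(),
    nn.AvgPool2d(2, count_include_pad=False),  # average pooling, padding excluded
    nn.Conv2d(6, 12, kernel_size=5),           # 12 feature maps, 5x5 filters -> 8x8
    nn.Tanh(),
    nn.AvgPool2d(2, count_include_pad=False),  # -> 4x4
    nn.Flatten(),
    nn.Linear(12 * 4 * 4, 120),                # fully connected layer, 120 neurons
    nn.Tanh(),
    nn.Linear(120, 10),                        # assumed 10-class output
    nn.LogSoftmax(dim=1),
)

loss_fn = nn.NLLLoss()                         # negative log-likelihood cost
optimizer = torch.optim.SGD(model.parameters(), lr=0.20,
                            momentum=0.0, weight_decay=0.0)
```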
Training for 10 epochs: CPU 172 s, GPU 14 s (roughly a 12x speedup).
Decreasing the minibatch size to 20 images: CPU 318 s, GPU 38 s (roughly 8x).
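As a rough sketch of how such a timing comparison could be reproduced (again assuming PyTorch, the model and loss defined above, and a hypothetical `train_loader`):

```python
import time
import torch

def time_training(model, loader, device, epochs=10, lr=0.20):
    """Time `epochs` epochs of plain SGD training on the given device."""
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.NLLLoss()
    start = time.time()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
    if device.type == "cuda":
        torch.cuda.synchronize()  # finish pending GPU work before stopping the clock
    return time.time() - start

# cpu_time = time_training(model, train_loader, torch.device("cpu"))
# gpu_time = time_training(model, train_loader, torch.device("cuda"))
```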
So yeah, the CPU code your professors wrote was really crappy.