
I just did a quick test for LeNet-5, using Theano + cuDNN 5, on a Xeon E5-1620 v2 @ 3.7 GHz (CPU) and a Maxwell Titan X (GPU):

Two conv layers (6 and 12 feature maps, 5x5 filters), one fully connected layer (120 neurons), activation=tanh, pooling=average (excluding padding), cost=negative log likelihood.

Learning rate=0.20, minibatch size=100, dropout=0.0, L2 lambda=0.0, momentum=0.0, initialization=normal.
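
For anyone who wants to reproduce this, here is a rough sketch of that network in Theano. It is not the exact script I ran; the 28x28 MNIST input and the 10-way softmax output sitting on top of the 120-unit layer are assumptions, and the std of the normal init is a guess:

    import numpy as np
    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv2d
    from theano.tensor.signal.pool import pool_2d

    rng = np.random.RandomState(1234)
    floatX = theano.config.floatX

    def normal_weights(shape, std=0.1):
        # "initialization=normal"; std=0.1 is a guess, not the value I used
        return theano.shared(rng.normal(0, std, size=shape).astype(floatX))

    x = T.tensor4('x')   # (batch, 1, 28, 28) MNIST images
    y = T.ivector('y')   # class labels

    # Conv layer 1: 6 feature maps, 5x5 filters, tanh, 2x2 average pooling
    W1 = normal_weights((6, 1, 5, 5))
    b1 = theano.shared(np.zeros(6, dtype=floatX))
    h1 = T.tanh(conv2d(x, W1) + b1.dimshuffle('x', 0, 'x', 'x'))
    p1 = pool_2d(h1, (2, 2), ignore_border=True, mode='average_exc_pad')

    # Conv layer 2: 12 feature maps, 5x5 filters
    W2 = normal_weights((12, 6, 5, 5))
    b2 = theano.shared(np.zeros(12, dtype=floatX))
    h2 = T.tanh(conv2d(p1, W2) + b2.dimshuffle('x', 0, 'x', 'x'))
    p2 = pool_2d(h2, (2, 2), ignore_border=True, mode='average_exc_pad')  # -> (batch, 12, 4, 4)

    # Fully connected layer: 120 tanh units, then a 10-way softmax (assumed)
    W3 = normal_weights((12 * 4 * 4, 120))
    b3 = theano.shared(np.zeros(120, dtype=floatX))
    h3 = T.tanh(T.dot(p2.flatten(2), W3) + b3)
    W4 = normal_weights((120, 10))
    b4 = theano.shared(np.zeros(10, dtype=floatX))
    p_y = T.nnet.softmax(T.dot(h3, W4) + b4)

    # Negative log likelihood, plain SGD (lr=0.20, no momentum/L2/dropout)
    cost = -T.mean(T.log(p_y)[T.arange(y.shape[0]), y])
    params = [W1, b1, W2, b2, W3, b3, W4, b4]
    grads = T.grad(cost, params)
    lr = np.asarray(0.20, dtype=floatX)
    updates = [(p, p - lr * g) for p, g in zip(params, grads)]

    train_step = theano.function([x, y], cost, updates=updates)

CPU vs. GPU is then just THEANO_FLAGS=device=cpu vs. device=gpu (or device=cuda on the newer gpuarray backend); Theano picks up cuDNN automatically when it is installed.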

Training for 10 epochs:

CPU: 172 sec, GPU: 14 sec.

When decreasing batch size to 20 images, the numbers are: CPU: 318 sec, GPU: 38 sec.
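
To put that in perspective, it's roughly a 12x speedup at batch size 100 and about 8x at batch size 20. A quick check of the per-epoch numbers (10 epochs, as above):

    # Per-epoch times and speedups implied by the totals above
    for label, cpu, gpu in [('batch=100', 172.0, 14.0), ('batch=20', 318.0, 38.0)]:
        print('%s: CPU %.1f s/epoch, GPU %.1f s/epoch, %.1fx speedup'
              % (label, cpu / 10, gpu / 10, cpu / gpu))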

So yeah, the CPU code your professors wrote was really crappy.



