CS:4980:006 Deep Learning Assignment 2: Due 9/13/2018


In this assignment, you will study mlnnSGD.py, a Python program for creating general multi-layer neural networks, and its application mnist.py to the MNIST data set mnist.pkl.gz.

  1. Use the network architecture (784, 60, 30, 10) for the MNIST data set. With a mini-batch size of 100 and 30 epochs, find the minimal error rate on the test set by varying the learning rate. Report that learning rate and the associated error rate.
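As a reference point while you run the real code, the following is a minimal, self-contained sketch of the setup above (it is NOT the code in mlnnSGD.py): a (784, 60, 30, 10) sigmoid network taking one mini-batch SGD step on random stand-in data, so you can see exactly where the learning rate `eta` enters the update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 60, 30, 10]
# Common convention: weights[i] maps layer i to layer i+1,
# so weights[i] has shape (sizes[i+1], sizes[i]).
weights = [rng.standard_normal((m, n)) / np.sqrt(n)
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros((m, 1)) for m in sizes[1:]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_batch(x, y, eta):
    """One SGD step on a mini-batch; each column of x is one example."""
    # Forward pass, keeping every layer's activations for backprop.
    acts = [x]
    for w, b in zip(weights, biases):
        acts.append(sigmoid(w @ acts[-1] + b))
    # Backward pass for the quadratic cost; sigma'(z) = a * (1 - a).
    delta = (acts[-1] - y) * acts[-1] * (1 - acts[-1])
    for i in reversed(range(len(weights))):
        grad_w = delta @ acts[i].T
        grad_b = delta.sum(axis=1, keepdims=True)
        if i > 0:
            # Propagate the error before touching weights[i].
            delta = (weights[i].T @ delta) * acts[i] * (1 - acts[i])
        weights[i] -= eta / x.shape[1] * grad_w
        biases[i] -= eta / x.shape[1] * grad_b

# Stand-in for one mini-batch of 100 MNIST images and one-hot labels:
x = rng.random((784, 100))
y = np.eye(10)[:, rng.integers(0, 10, 100)]
train_batch(x, y, eta=3.0)
```

Varying `eta` in the last call is the experiment: too small and the cost barely moves in 30 epochs; too large and the updates overshoot.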

  2. In the for loop of backprop(self, x, y) in mlnnSGD.py, the following variables are used: theta, ad, delta, y_hat, delta_hat, and nabla_w[i]. Please list the shapes of these variables for all values of i in the network from the previous exercise.
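As a starting point, you can tabulate the shapes that follow from the architecture alone. The sketch below assumes the common convention that nabla_w[i] has the shape of the weight matrix connecting layer i to layer i+1 and that delta is a column vector of errors for layer i+1; the exact meanings of theta, ad, y_hat, and delta_hat are specific to mlnnSGD.py, so verify every shape against that code.

```python
# Shapes implied by the (784, 60, 30, 10) architecture, under the
# assumed convention described above (not read from mlnnSGD.py).
sizes = [784, 60, 30, 10]
for i in range(len(sizes) - 1):
    print(f"i={i}: nabla_w[{i}] shape {(sizes[i + 1], sizes[i])}, "
          f"delta shape {(sizes[i + 1], 1)}")
```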

  3. The second hidden layer in the previous exercise has 30 neurons. Please use each value from 25 to 35 for the number of neurons in this layer, with the number of epochs set to 10 (other parameters unchanged), and record the error rate in each case. What conclusions can you draw from this experiment?
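The experiment loop can be organized as below. Note that `train_and_test` is only a placeholder for however you invoke mlnnSGD.py and mnist.py; it is not a real function in those files.

```python
# Build one architecture per hidden-layer width and record its error rate.
results = {}
for n in range(25, 36):          # 25, 26, ..., 35 inclusive
    sizes = [784, 60, n, 10]
    # results[n] = train_and_test(sizes, epochs=10, batch_size=100)
    results[n] = None            # replace with the measured error rate
print(sorted(results))
```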

  4. Please implement the Cross Entropy cost function in mlnnSGD.py and compare it with the Sum of Squared Errors (Quadratic Cost) function on the example in the first exercise. Which function allows you to reduce the number of epochs (and by how much) without increasing the error rate of the final network on the test set?
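As a hedged sketch of what the comparison involves, here are the two cost functions for a single example, together with the output-layer error term (delta) each one produces for sigmoid outputs. The function names are illustrative, not taken from mlnnSGD.py.

```python
import numpy as np

def quadratic_cost(a, y):
    # C = 0.5 * ||a - y||^2
    return 0.5 * np.sum((a - y) ** 2)

def cross_entropy_cost(a, y):
    # C = -sum[ y ln(a) + (1 - y) ln(1 - a) ]
    return -np.sum(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

def quadratic_delta(a, y):
    # delta = (a - y) * sigma'(z); for a sigmoid, sigma'(z) = a(1 - a),
    # so the gradient nearly vanishes when the output saturates near 0 or 1.
    return (a - y) * a * (1.0 - a)

def cross_entropy_delta(a, y):
    # The sigma'(z) factor cancels: delta = a - y, so a confidently
    # wrong, saturated output still receives a large gradient.
    return a - y

a = np.array([0.98, 0.02])   # confident but wrong output
y = np.array([0.0, 1.0])
print(quadratic_delta(a, y))       # tiny: learning slows down
print(cross_entropy_delta(a, y))   # large: learning proceeds
```

This cancellation is why cross-entropy typically needs fewer epochs to reach a given error rate: early in training, when many outputs are badly wrong, the quadratic cost's updates are damped by the a(1 - a) factor while cross-entropy's are not.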

Please submit everything required in the ICON dropbox for Assignment 2 before the deadline.

Thank you!