CS:4980:006 Deep Learning Assignment 6: Due 10/18/2018


In this assignment, we will study the residual neural network for image recognition. The original code is provided here. The modified Python program can be found here. The archive can be uncompressed into a few Python files on a Linux machine with the command
       tar xvfz cifar10-py.tgz
The dataset is CIFAR-10. To save time, the dataset has already been loaded onto the Argon cluster at "/nfsscratch/cifar_10_data/".

To run the program, you may copy the script cifar10.sh into your working directory on Argon (make it executable), create a subdirectory called "result", and submit a job as follows:

       qsub cifar10.sh
If the job number is xxxxxx, then check the files cifar.oxxxxxx and cifar.exxxxxx for the output of cifar10.sh once the job is done. You may use
       qstat -u HAWKID
to check the status of your jobs, where HAWKID is your HawkID.

If you prefer to run the code on a different machine, you may copy /nfsscratch/cifar10_data.tgz from the Argon machine, or run the following command on the new machine:

  python generate_cifar10_tfrecords.py --data-dir=${PWD}/cifar-10-data
Then pass ${PWD}/cifar-10-data as the data directory to cifar10_main.py:
  python cifar10_main.py --data-dir=${PWD}/cifar-10-data \
                         --job-dir=${PWD}/result \
                         --num-gpus=1 \
                         --train-steps=1000

Here are your tasks:

  1. Run the program five times, starting with train_steps=2000 and increasing it by 2000 in each subsequent run, so that the final run uses 10,000 train steps. Record the accuracy for each run.

  2. Study the code files cifar10_model.py and model_base.py. The default architecture uses three stages of block stacking built from _residual_v1 in model_base.py (a generic sketch of such a block is given after this list for orientation). Increase the number of stages from 3 to 4 and run the program again with 10,000 train steps. Compare the accuracy with that from Problem 1. You need to create a new output directory for each new architecture.

  3. Replace the residual block (defined by _residual_v1) by the bottleneck block (defined by _bottleneck_residual_v2 in model_base.py), and repeat the experiments in Problems 1 and 2 with this new block.

  4. Count the number of parameters (i.e., scalars) in the architectures used in Problems 1-3 (there are two architectures in Problem 3); a sketch for counting them programmatically follows this list.
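
For orientation on Problems 2 and 3, below is a minimal, generic sketch of a ResNet v1 basic block (two 3x3 convolutions plus a shortcut connection), written against the TensorFlow 1.x layers API. It only illustrates the general structure; the function name and arguments are made up, and the actual _residual_v1 and _bottleneck_residual_v2 in model_base.py may differ in details. A bottleneck block replaces the two 3x3 convolutions with a 1x1 - 3x3 - 1x1 sequence that first reduces and then restores the channel count.

  import tensorflow as tf

  def basic_residual_block_v1(x, filters, strides, training):
      # Hypothetical sketch of a ResNet v1 basic block; not the code in model_base.py.
      shortcut = x
      # First 3x3 convolution, followed by batch normalization and ReLU.
      y = tf.layers.conv2d(x, filters, kernel_size=3, strides=strides,
                           padding='same', use_bias=False)
      y = tf.layers.batch_normalization(y, training=training)
      y = tf.nn.relu(y)
      # Second 3x3 convolution and batch normalization; no activation before the addition.
      y = tf.layers.conv2d(y, filters, kernel_size=3, strides=1,
                           padding='same', use_bias=False)
      y = tf.layers.batch_normalization(y, training=training)
      # Project the shortcut with a 1x1 convolution if the spatial size or channel count changes.
      if strides != 1 or int(shortcut.shape[-1]) != filters:
          shortcut = tf.layers.conv2d(shortcut, filters, kernel_size=1,
                                      strides=strides, padding='same', use_bias=False)
          shortcut = tf.layers.batch_normalization(shortcut, training=training)
      return tf.nn.relu(y + shortcut)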

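For Problem 4, you can count parameters by hand from the layer shapes (for example, a 3x3 convolution from 16 to 16 channels with no bias term contributes 3*3*16*16 = 2,304 weights), or you can sum the sizes of the trainable variables in the graph. The sketch below assumes the TensorFlow 1.x API that this code is built on; the helper name is made up for illustration, and it must be called after the model graph has been constructed.

  import numpy as np
  import tensorflow as tf

  def count_trainable_parameters():
      # Sum the number of scalars over all trainable variables in the default graph.
      return int(sum(np.prod(v.get_shape().as_list())
                     for v in tf.trainable_variables()))
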
Please submit everything required, including the modified code, the output of a sample run, the accuracy reports, and the parameter counts, to the ICON dropbox for Assignment 6 before the deadline.

Thank you!