CSCE 636 Deep Learning

Instructor: Prof. Anxiao (Andrew) Jiang. Email: ajiang@cse.tamu.edu

Time and Location: 7/3/2024 to 8/7/2024, Monday through Friday (summer 2024), online

TA: Jeffrey Hykin. Email: jeff.hykin+csce636@tamu.edu 

Office Hours:

Dr. Jiang's office hour: 11:30am-12:00pm on Tuesdays and Fridays, via zoom https://us04web.zoom.us/j/77071104472?pwd=haVa6MqqVQFYRgD9mXxQZAUGcJTTBq.1  (Meeting ID: 770 7110 4472, Passcode: 2VwU5M)

TA's office hour: 3:00-3:30pm on Mondays and Wednesdays, and 2:00-2:30pm on Thursdays, via zoom: https://us04web.zoom.us/j/78863134572?pwd=BTeF17nskfDjOxibfrpg0vb0bfnRWC.1 

Grading and Requirements:

The final grade is based on homework and mini projects. Homework: 30%. Mini-projects: 70%.

Homework Policy: An electronic copy should be submitted in https://canvas.tamu.edu

Mini Project Policy: There will be a set of mini projects in the semester. Each mini project will have its own grading method (to be specified in the project details). Honor code for the project: (1) a student should complete each project independently, without collaborating with others; (2) a student should spend no more than eight hours on each project (excluding the time for computers to do computing).

For both homework and projects, no late submission will be accepted unless there is a valid reason under TAMU policy (such as sick leave with a doctor's note) that is approved by Dr. Jiang.

Textbook:

Computing resources:

1. Google CoLab: an open and free Jupyter notebook environment by Google that runs in the cloud and allows us to use CPU, GPU and TPU resources. It requires no setup. It's a good resource for anyone who wants to do swift experiments in deep learning.

2. HPRC: You can apply for an account at TAMU HPRC (High Performance Research Computing). It has CPU and GPU resources.

Lecture Schedule:

7/3/2024 (Wednesday): What is Deep Learning (See Lecture 1 video and slides. Read Chapter 1 of textbook.) The Mathematical Building Blocks of Deep Learning (See Lecture 2 video for Part 1, video for Part 2 and slides. Read Chapter 2 of textbook.)       

7/4/2024 (Thursday): holiday, no class.    

7/5/2024 (Friday): Introduction to Keras and TensorFlow (See Lecture 3 video and slides. Read Chapter 3 of textbook.)          

7/8/2024 (Monday): Getting Started with Neural Networks: Classification and Regression (See Lecture 4 video for Part 1, video for Part 2, video for Part 3 and slides. Read Chapter 4 of textbook.)        

7/9/2024 (Tuesday): Fundamentals of Machine Learning (See Lecture 5 video and slides. Read Chapter 5 of textbook.)    

7/10/2024 (Wednesday): Working with Keras: A Deep Dive (See Lecture 6 video for Part 1, video for Part 2, video for Part 3 and slides. Read Chapters 6 and 7 of textbook.)

7/11/2024 (Thursday):  Introduction to Deep Learning for Computer Vision (See Lecture 7 video for Part 1, video for Part 2, video for Part 3 and slides. Read Chapter 8 of textbook.)    

7/12/2024 (Friday):  Advanced Deep Learning for Computer Vision (See Lecture 8 video for Part 1, video for Part 2 and slides. Read Chapter 9 of textbook.)        

7/15/2024 (Monday): Advanced Deep Learning for Computer Vision (See Lecture 8 video for Part 3, video for Part 4, video for Part 5 and  slides. Read Chapter 9 of textbook.)    

7/16/2024 (Tuesday):  Deep Learning for Timeseries (See Lecture 9 video for Part 1 and slides. Read Chapter 10 of textbook.)      

7/17/2024 (Wednesday):  Deep Learning for Timeseries (See Lecture 9 video for Part 2, video for Part 3 and slides. Read Chapter 10 of textbook.)     

7/18/2024 (Thursday): Deep Learning for Text (See Lecture 10 video for Part 1 and slides. Read Chapter 11 of textbook.)    

7/19/2024 (Friday):  Deep Learning for Text (See Lecture 10 video for Part 2 and slides. Read Chapter 11 of textbook.)     

7/22/2024 (Monday): Deep Learning for Text (See Lecture 10 video for Part 3, video for Part 4 and   slides. Read Chapter 11 of textbook.)     

7/23/2024 (Tuesday): Deep Learning for Text (See Lecture 10 video for Part 5, video for Part 6, video for Part 7 and slides. Read Chapter 11 of textbook.)

7/24/2024 (Wednesday): Generative Deep Learning (See Lecture video on auto-encoder. Read Chapter 12 of textbook.)    

7/25/2024 (Thursday): Generative Deep Learning (See Lecture video on VAE and GAN. Read Chapter 12 of textbook.)     

7/26/2024 (Friday): Generative Deep Learning (See Lecture video on GAN. Read Chapter 12 of textbook.)    

7/29/2024 (Monday):   Generative Deep Learning (See Lecture video on GAN. Read Chapter 12 of textbook.)     

7/30/2024 (Tuesday):  Best Practices for the Real World (See Lecture video on tuning hyperparameters. Read Chapter 13 of textbook.)      

7/31/2024 (Wednesday):  Transfer Learning (See Lecture video on transfer learning)       

8/1/2024 (Thursday):  Transfer Learning (See Lecture video on transfer learning)     

8/2/2024 (Friday): Ensemble Learning (See Lecture video on ensemble learning, Part 1 of 3)      

8/5/2024 (Monday):  Ensemble Learning (See Lecture video on ensemble learning, Part 2 of 3)       

8/6/2024 (Tuesday):   Ensemble Learning (See Lecture video on ensemble learning, Part 3 of 3)      

Homework and Projects:

Check out the Jupyter notebook for Chapter 2 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb. Then:

1) For the neural network in the Jupyter notebook for the MNIST task (which has 2 layers, where the first layer has 512 neurons and the second layer has 10 neurons), adjust the size of the first layer (namely, the number of neurons in the first layer) as: 16, 32, 64, 128, 256, 512. For each size, train the neural network and record its test accuracy (namely, accuracy on the test set). Draw a figure, where the x-axis is the size of the first layer, and the y-axis is the test accuracy. Submit your figure.

2) For the neural network in the Jupyter notebook for the MNIST task (which has 2 layers, where the first layer has 512 neurons and the second layer has 10 neurons), adjust the number of layers to be: 2, 3, 4, 5. (The last layer has to have size 10 since we have 10 classes. But you can decide on the sizes of the other layers.) For each case, train the neural network and record its test accuracy. Draw a figure, where the x-axis is the number of layers, and the y-axis is the test accuracy. Submit your figure, and specify the sizes of your layers for each case.

1) Check out the Jupyter notebook for Chapter 3 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-keras-and-tf.ipynb. Then, use the "GradientTape API" to find the derivative of the function f(x) = sin(x) for x = 0, 0.1, 0.2 and 0.3. Submit your Jupyter notebook that shows both the code and the result you got.

2) Check out the Jupyter notebook for Chapter 4 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_getting-started-with-neural-networks.ipynb. Then, for the task "Classifying movie reviews: A binary classification example", tune the hyper-parameters of the model (such as changing the number of layers, changing the sizes of layers, changing the optimizer, changing the learning rate, etc.), and see if you can improve the model's performance. Submit a Jupyter notebook where you clearly show the code with the best hyper-parameters that you have found, along with its performance on training, validation and test sets.  
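For the GradientTape exercise in part 1), it may help to have reference values to compare against. Since the derivative of sin(x) is cos(x), the sketch below prints cos(x) at the four points next to a central finite-difference estimate. It uses only the Python standard library and is not a substitute for the GradientTape solution; the helper name central_difference and the step size h are illustrative choices, not part of the assignment.

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference (h is an assumed step size)."""
    return (f(x + h) - f(x - h)) / (2 * h)

points = [0.0, 0.1, 0.2, 0.3]

# Analytic derivative of sin(x) is cos(x); these are the reference values.
reference = {x: math.cos(x) for x in points}

# Independent numerical estimate, useful as a cross-check.
estimate = {x: central_difference(math.sin, x) for x in points}

for x in points:
    print(f"x={x:.1f}  cos(x)={reference[x]:.6f}  finite diff={estimate[x]:.6f}")
```

The values computed with GradientTape should agree with cos(x) up to floating-point precision.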

1) Check out the Jupyter notebook for Chapter 5 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb . Answer the question: what regularization techniques were mentioned in that Jupyter notebook? 

2) The MNIST dataset has 60,000 training images and 10,000 test images. Each image is a 28x28 array, where each array element is between 0 and 255. The images have 10 labels: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

 We now create a new dataset of about 30,000 training images, about 5,000 test images and 5 labels (which are 0, 1, 2, 3, 4) as follows. First, randomly pair up the training images of label 0 with the training images of label 1, to get about 6,000 such pairs. (The actual number may be a little less than 6,000, because the number of images for some label may be less than -- although close to -- 6,000. In this case, just get as many pairs as you can.) Then, for each pair (A,B) where A is an image of label 0 and B is an image of label 1, we create a new image of size 28x28, where each element's value is the "average" of the two corresponding pixel values in A and B. (So the new image is a "mixture" of the two original images.) This way we create about 6,000 new "mixture" images for training. In a similar way, we create about 1,000 new "mixture" images for testing. We give all these about 6,000+1,000=7,000 new "mixture" images the new label 0. Then, in the same way, we create about 6,000 new training images and about 1,000 new test images by mixing the original images of label "2" and "3", and give them the new label 1; create about 6,000 new training images and about 1,000 new test images by mixing the original images of label "4" and "5", and give them the new label 2; create about 6,000 new training images and about 1,000 new test images by mixing the original images of label "6" and "7", and give them the new label 3; create about 6,000 new training images and about 1,000 new test images by mixing the original images of label "8" and "9", and give them the new label 4.    

    Your task: submit your code that creates the above new dataset; then for each of the 5 new labels, randomly select 2 images of that label from your new dataset, and display them in your submitted Jupyter notebook. 

3) Design a neural network model to classify the 5 classes in the new dataset, and optimize it by tuning its hyper-parameters and trying the regularization techniques we have learned (such as L1 regularization, L2 regularization, dropout).

    Your task: For your final (namely, optimized) neural network model, submit its code, and show the model's performance (including loss value and accuracy) for training, validation and testing. (For training performance and validation performance, illustrate them using figures, where the x-axis is the number of training epochs, and the y-axis is the loss or accuracy. For testing performance, just show the values of loss and accuracy.) Also, answer the questions: in this process of optimizing your model, did you observe underfitting? Did you observe overfitting? Did you try the regularization techniques? Did they help?
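The pixel-averaging step used to build the "mixture" dataset in part 2) can be sketched as follows. This is a minimal illustration on random stand-in arrays rather than the real MNIST data; the function name mix_pairs and the small array sizes are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for MNIST: random uint8 "images" that carry only labels 0 and 1.
images = rng.integers(0, 256, size=(20, 28, 28), dtype=np.uint8)
labels = np.array([0] * 11 + [1] * 9)

def mix_pairs(images, labels, label_a, label_b, rng):
    """Randomly pair images of two labels and average them pixel by pixel."""
    idx_a = rng.permutation(np.flatnonzero(labels == label_a))
    idx_b = rng.permutation(np.flatnonzero(labels == label_b))
    n = min(len(idx_a), len(idx_b))            # get as many pairs as possible
    a = images[idx_a[:n]].astype(np.float32)   # cast first to avoid uint8 overflow
    b = images[idx_b[:n]].astype(np.float32)
    return (a + b) / 2.0

mixed = mix_pairs(images, labels, 0, 1, rng)
new_labels = np.zeros(len(mixed), dtype=np.int64)  # mixtures of (0, 1) get new label 0
print(mixed.shape, mixed.min(), mixed.max())
```

Casting to a float type before adding matters: summing two uint8 arrays directly would wrap around for pixel values whose sum exceeds 255.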

1) Check out the Jupyter notebook for Chapter 8 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_intro-to-dl-for-computer-vision.ipynb. It uses the convolutional base of VGG16 for an image classification task, and also tries fine tuning. Your task: use another existing trained neural network (which is different from VGG16, such as ResNet) for the same task, and also try fine tuning. Submit your complete code, draw figures on the training/validation performance, and show the testing performance.

Check out the Jupyter notebooks for Chapter 9 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part01_image-segmentation.ipynb, https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part02_modern-convnet-architecture-patterns.ipynb, and https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part03_interpreting-what-convnets-learn.ipynb. Then:

1) Try the code on image segmentation. Show the segmentation result on a randomly selected image.

2) Try the code on "Visualizing intermediate activations". Show the visualization result on a randomly selected image from the cats-and-dogs dataset. 

3) Try the code on "Visualizing convnet filters". Show the visualization result for a randomly selected filter.

4) Try the code on "Visualizing heatmaps of class activation". Show the visualization result on a randomly selected image from the cats-and-dogs dataset. 

Check out the Jupyter notebook for Chapter 10 at https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_dl-for-timeseries.ipynb. It has tried 7 methods for the temperature prediction problem: Try 1 (A common-sense, non-machine learning baseline method), Try 2 (A fully connected neural network), Try 3 (1-d convolutional neural networks), Try 4 (Recurrent neural network), Try 5 (LSTM with recurrent dropout), Try 6 (stacking RNN layers), Try 7 (Bidirectional RNN).

Your task: use the above 7 methods to predict the temperature in 48 hours (instead of 24 hours). In the Jupyter notebook, include your code as well as the performance of the 7 methods.  
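One way to think about the change from 24 to 48 hours is as a change in the offset between the start of an input window and its target. The sketch below assumes the setup of the referenced Chapter 10 notebook (one record every 10 minutes, one sample kept per hour, a 120-hour input window); these values and the helper name target_delay are assumptions and should be checked against the notebook.

```python
# Assumed values, to be verified against the Chapter 10 notebook:
sampling_rate = 6       # keep one record out of every 6, i.e. one sample per hour
sequence_length = 120   # number of hourly samples in each input window

def target_delay(hours_ahead):
    """Offset, in raw records, from the start of a window to its prediction target."""
    return sampling_rate * (sequence_length + hours_ahead - 1)

delay_24h = target_delay(24)
delay_48h = target_delay(48)
print(delay_24h, delay_48h, delay_48h - delay_24h)
```

Under these assumptions, predicting 24 hours further ahead shifts the target by 24 x 6 = 144 raw records.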

This project is on image classification for a noisy MNIST dataset. When we add noise to images in the MNIST dataset, the digits in the images become more and more difficult for humans to recognize. For example, the images here have increasingly large noise levels. However, interestingly, deep neural networks can still be trained to recognize them relatively well.

Your task is to train a good hand-written-digit recognition classifier for the noisy images. Here are the train_images and train_labels. (You can download them, and then use pickle.load(open(path, 'rb')) to open them. They are both tensorflow tensors.)

You should submit two things:

1) Place all your code in a single Jupyter notebook, and submit it in Canvas. You should accompany your code with clear/detailed explanations, so that we can understand the methods you used.

2) Email your trained model to our project email address: jeff.hykin+csce636project@tamu.edu  (Please make sure to include your name and UIN in the email.) We need to be able to test your model easily in Google CoLab by running a simple line "test_loss, test_acc = model.evaluate(test_images, test_labels)" on our holdout test set (test_images, test_labels). (If you need to preprocess the images before running the trained model, please include those preprocessing functions in your Jupyter notebook, and explain clearly how to run your code. It needs to be simple to run your code.)

Method of grading: your grade for this project will be equal to: 

test_acc  x 100

For example, if your test accuracy is 0.62, then the grade is 62; and if your test accuracy is 0.9, then the grade is 90. However, if the code in your Jupyter notebook is incomplete, the grade will be 0; and if the explanations in the Jupyter notebook are not clear, then 5 points will be taken away.
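The grading rule above can be written out as a short function. This is only an illustration of the stated arithmetic; the function and parameter names are hypothetical, and how the rule is applied in practice is up to the instructor.

```python
def noisy_mnist_grade(test_acc, code_complete=True, explanations_clear=True):
    """Grade = test_acc x 100, with the two penalties described in the text."""
    if not code_complete:
        return 0.0                 # incomplete code: grade is 0
    grade = test_acc * 100
    if not explanations_clear:
        grade -= 5                 # unclear explanations: 5 points taken away
    return grade

grade_clear = noisy_mnist_grade(0.62)
grade_unclear = noisy_mnist_grade(0.9, explanations_clear=False)
grade_incomplete = noisy_mnist_grade(0.9, code_complete=False)
print(grade_clear, grade_unclear, grade_incomplete)
```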

This project is on machine translation for two "artificial" languages: an "Input Language" and an "Output Language". We want to build a model to translate the texts in the "Input Language" to texts in the "Output Language". For example, a text in the "Input Language" can be "a g b f a f a e a k a j c f b f c d a k a k c e b g a h a k b d b f b f b d c d " , and its translation to the "Output Language" is "b f c f b f c d a j e f g c e b g a k i j b d b f a k l m b f b d a h ed ee ef a k k eg a k h eh a e ei c d a f ej ek a g d el".

As a training dataset, here is a list of 112,000 texts in the Input Language, and here is the list of 112,000 corresponding texts in the Output Language.  (After downloading them, you can use pickle.load(open(path,'rb')) to open them. They are two lists of strings in Python.)

Your task is to design and train a good model, which can take texts in the "Input language" as its input, translate them to the corresponding texts in the "Output Language" and output them. (Note: the final output of your code should not include extra tokens such as [start] or [end]. So if your model's output includes such extra tokens, you need to write a few more lines of code to remove them.)
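Removing the extra tokens can take only a few lines. The sketch below assumes the model's raw output is a single string with tokens separated by spaces; the function name strip_special_tokens is hypothetical.

```python
SPECIAL_TOKENS = {"[start]", "[end]"}

def strip_special_tokens(text):
    """Drop [start]/[end] markers and rejoin the remaining tokens with single spaces."""
    return " ".join(tok for tok in text.split() if tok not in SPECIAL_TOKENS)

raw_output = "[start] b f c f b f c d [end]"
clean_output = strip_special_tokens(raw_output)
print(clean_output)
```

Note that this version also normalizes whitespace (it removes leading and trailing spaces and collapses repeated spaces). Whether that matches the formatting of the ground-truth texts is an assumption worth checking against the training data.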

Your trained model will be evaluated using a testing set (which is shown below, and is similar to the above training set). Its performance will be measured by its "test accuracy", defined as follows. A text in the Output Language that is output by your model is considered "correct" if it matches the ground-truth text EXACTLY. (That is, if the two texts have any difference, the text output by your model will be considered "incorrect".) Then the "test accuracy" of your model is defined as the ratio between "the number of correct texts output by your model" and "the total number of samples in the test set".
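The exact-match definition above can be sketched in a few lines for checking a model on held-out data. The function name exact_match_accuracy and the toy strings are illustrative only; this is not the grading script.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predicted texts that equal the ground-truth text exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("the test set is empty")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

references = ["b f c d", "a k l m", "a e ei c d"]
predictions = ["b f c d", "a k l", "a e ei c d"]   # the second text is missing a token

accuracy = exact_match_accuracy(predictions, references)
print(accuracy)
```

Because the comparison is on whole strings, a single missing or extra token (or a stray space) makes the entire text count as incorrect.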

You should submit THREE things:

1) Place all your code in a single Jupyter notebook, and submit it in Canvas. You should accompany your code with clear/detailed explanations, so that we can understand the methods you used. Submit it by Thursday 8/1/2024.

2) Email your trained model to our project email address jeff.hykin+csce636project@tamu.edu. (Please make sure to include your name and UIN in the email.) We need to be able to test your model very easily in Google CoLab by running a simple line of code on our test set (described below). You need to specify in your Jupyter notebook what that single line of code is that can do the above. Submit it by Thursday 8/1/2024.

3) As the test dataset for this project, here is a list of 1,000 texts in the Input Language (to be posted on Friday 8/2/2024). Please run your trained model to get the 1,000 corresponding texts in the Output Language (as a list of 1,000 strings in Python), and email them to our project email address jeff.hykin+csce636project@tamu.edu by Sunday 8/4/2024. (You can use pickle.dump(yourList, open(path,'wb')) to save your list in a file, and then attach it to your email.) Please note that you must use the model you have already submitted to get those 1,000 texts in the Output Language. (We will use your model to verify the above submitted texts. If any change is detected, the whole project will receive 0 points.)
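Saving the list with pickle and reading it back to confirm it round-trips can be sketched as below. The file name translations.pkl and the two placeholder strings are assumptions for illustration; a with block is used so the file is closed before it is attached to an email.

```python
import os
import pickle
import tempfile

# Placeholder for the list of 1,000 output strings produced by the model.
translations = ["b f c d", "a k l m"]

# Hypothetical file name, written to a temporary directory for this example.
path = os.path.join(tempfile.mkdtemp(), "translations.pkl")

with open(path, "wb") as f:
    pickle.dump(translations, f)

# Read the file back to confirm it contains the same list of strings.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(len(restored), "texts restored; identical:", restored == translations)
```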

Method of grading: your grade for this project will be equal to:

test_accuracy x 100

For example, if your test accuracy is 0.75, then the grade is 75; and if your test accuracy is 0.93, then the grade is 93. However, if the code in your Jupyter notebook is incomplete, the grade will be 0; if the explanations in the Jupyter notebook are not clear, then 5 points will be taken away; and if we need to make any (even simple) modification to your code in order to run it for testing (instead of simply running a single line of code following your instruction), then some points will be taken away (depending on how serious the modification needs to be). If the model is changed in any way after its submission to produce the results for the test set, the project will receive no points.

Statement: The Americans with Disabilities Act (ADA) is a federal anti-discrimination statute that provides comprehensive civil rights protection for persons with disabilities. Among other things, this legislation requires that all students with disabilities be guaranteed a learning environment that provides for reasonable accommodation of their disabilities. If you believe you have a disability requiring an accommodation, please contact Disability Services, currently located in the Disability Services building at the Student Services at White Creek complex on west campus or call 979-845-1637. For additional information, visit http://disability.tamu.edu.

“An Aggie does not lie, cheat or steal, or tolerate those who do.” See http://aggiehonor.tamu.edu