Online Deep Learning Course For Humanists
Project Description
This is a Google Summer of Code 2019 Project with Red Hen Lab.
The goal of the project is to design and develop an online course that teaches deep learning to students in the humanities and social sciences. The course will cover basic deep learning theory and include labs and case studies from multimodal communication.
Project Mentors: Francis Steen, Mark Turner and Rajesh Kasturirangan.
Source code: GitHub Repository.
Course Outline
Background Knowledge
Programming
Math
- Basic matrix algebra, calculus, and statistics.
- What is Artificial Intelligence?
- What is Machine Learning?
- What is Deep Learning? Why do we use it?
- Relationship: AI -> machine learning -> deep learning
Set up the environment needed for this course, including Anaconda, TensorFlow, and Jupyter Lab.
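Once the tools are installed, a short script can confirm that Python can find them. The following is a minimal sketch, assuming the import names `tensorflow` and `jupyterlab`; the helper `check_environment` is a hypothetical name introduced here for illustration.

```python
import importlib.util
import sys

# Import names assumed for the tools listed above (Anaconda ships Python itself).
REQUIRED_PACKAGES = ["tensorflow", "jupyterlab"]


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


print("Python", sys.version.split()[0])
for name, found in check_environment().items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

Any package reported as missing can then be installed from within the Anaconda environment before starting the labs.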
1. Neuron and Perceptron
- Biological neuron model
- Artificial neuron
- Mathematical form
- Question and data set
- Linear classifier
- Implement a perceptron
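The items above (mathematical form, data set, linear classifier, implementation) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the course's own code: the toy data set (logical AND), the learning rate, and the function names are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Linear classifier: output 1 if w . x + b > 0, otherwise 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=0.1, epochs=20):
    """Perceptron learning rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy data set (an assumption for this sketch): the logical AND of two inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

AND is linearly separable, so a single perceptron can learn it; the same code fails on XOR, which motivates the multilayer perceptron.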
2.3 Multilayer Perceptron
- XOR problem
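A single perceptron cannot separate the XOR cases with one straight line, but two layers of the same neuron can. The sketch below uses hand-picked weights (an assumption for illustration, not learned values) so the role of the hidden layer is easy to see.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0


def neuron(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(x1, x2):
    # Hidden layer: one neuron computes OR, the other computes AND.
    h_or = neuron([1, 1], -0.5, [x1, x2])
    h_and = neuron([1, 1], -1.5, [x1, x2])
    # Output layer: fire when OR is true but AND is not.
    return neuron([1, -1], -0.5, [h_or, h_and])


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_network(x1, x2))
```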
Chapter 3 Neural Networks
3.1 Two layer Neural Network
The architecture:
- Nodes
- Input/Output
- Layer
- Input Layer
- Output Layer
- Hidden Layer: Why do we call it a hidden layer?
- Connection
- Fully connected
- Weights
Activation function
- Why do we need this? Non-linearities
- Some common activation functions
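Three widely used activation functions are sigmoid, tanh, and ReLU. The sketch below defines them in plain Python; which functions the course will actually cover is an assumption. Without such a non-linearity, stacked layers would collapse into a single linear map.

```python
import math


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive values through and clips negative values to 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.3f}  tanh={tanh(z):+.3f}  relu={relu(z):.1f}")
```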
3.2 Forward Propagation
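Forward propagation through a two-layer network amounts to two rounds of "weighted sum, then activation". The sketch below is a hedged illustration: the layer sizes (2 inputs, 3 hidden nodes, 1 output), the weight values, and the choice of sigmoid are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(weights, biases, inputs):
    """One fully connected layer: each row of `weights` feeds one output node."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(x, W1, b1, W2, b2):
    """Input layer -> hidden layer -> output layer."""
    hidden = [sigmoid(z) for z in dense(W1, b1, x)]
    output = [sigmoid(z) for z in dense(W2, b2, hidden)]
    return hidden, output


# Hypothetical weights: 2 inputs -> 3 hidden nodes -> 1 output node.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.7, -0.5, 0.2]]
b2 = [0.05]

hidden, output = forward([1.0, 0.5], W1, b1, W2, b2)
print("hidden activations:", hidden)
print("network output:", output)
```

Each call to `dense` is a matrix-vector multiplication written out with explicit loops.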
3.3 Backward Propagation
- Math: Matrix Multiplication
Make an animation video.
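Backward propagation applies the chain rule layer by layer, from the output back to the input. The following is a minimal sketch under stated assumptions: a single training example, a squared-error loss, sigmoid activations, one output node, and no biases. All names and weight values are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, W2):
    """hidden = sigmoid(W1 x), output = sigmoid(W2 . hidden); biases omitted."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)))
    return hidden, output


def loss(x, target, W1, W2):
    _, output = forward(x, W1, W2)
    return 0.5 * (output - target) ** 2


def backward(x, target, W1, W2):
    """Return the gradients of the loss with respect to W1 and W2."""
    hidden, output = forward(x, W1, W2)
    # Error signal at the output node (the sigmoid derivative is o * (1 - o)).
    delta_out = (output - target) * output * (1.0 - output)
    grad_W2 = [delta_out * h for h in hidden]
    # Send the error signal back through W2 to each hidden node.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_W1, grad_W2


# Hypothetical example: 2 inputs -> 2 hidden nodes -> 1 output node.
x, target = [1.0, 0.5], 1.0
W1 = [[0.5, -0.2], [0.1, 0.4]]
W2 = [0.3, -0.6]

grad_W1, grad_W2 = backward(x, target, W1, W2)
print("gradient for W1:", grad_W1)
print("gradient for W2:", grad_W2)
```

The list comprehensions are matrix products written out element by element; a matrix library would express the same steps more compactly.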
Chapter 4 Learning from Data: Training Neural Networks
4.1 Example
- Error loss: the empirical loss measures the total loss over the dataset.
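As a small illustration, the empirical loss can be computed by applying a per-example loss to every prediction and combining the results. The sketch assumes a squared-error loss and averages over the examples (the total divided by the dataset size); both choices, and the sample numbers, are assumptions for this example.

```python
def squared_error(prediction, target):
    """Loss for a single example."""
    return (prediction - target) ** 2


def empirical_loss(predictions, targets, loss_fn=squared_error):
    """Average the per-example loss over the whole data set."""
    losses = [loss_fn(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


# Made-up predictions and targets for three examples.
predictions = [0.9, 0.2, 0.7]
targets = [1.0, 0.0, 1.0]
print("empirical loss:", empirical_loss(predictions, targets))
```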
4.2 Loss Optimization
- The loss function is a function of the weights
4.3 Gradient Descent
- Greedy algorithm
- Like hiking down a mountain (make an animation video)
- Local minimum
Make an animation video.
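The hiking analogy and the local-minimum problem can both be seen on a one-dimensional loss curve. The sketch below uses a hypothetical loss with two valleys, chosen only for illustration: gradient descent always steps downhill, so the valley it ends in depends on where it starts.

```python
def loss(w):
    """A made-up loss curve with a shallow valley and a deeper one."""
    return w**4 - 3 * w**2 + w


def gradient(w):
    """Derivative of the loss with respect to the weight w."""
    return 4 * w**3 - 6 * w + 1


def gradient_descent(start, learning_rate=0.01, steps=1000):
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)  # take a small step downhill
    return w


right = gradient_descent(start=2.0)   # ends in the shallow valley (local minimum)
left = gradient_descent(start=-2.0)   # ends in the deeper valley
print(f"start at  2.0 -> w = {right:.3f}, loss = {loss(right):.3f}")
print(f"start at -2.0 -> w = {left:.3f}, loss = {loss(left):.3f}")
```

The learning rate and step count are arbitrary values that happen to converge for this curve; a rate that is too large would overshoot the valley.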
4.4 Optimization
Physical Experiment
- Every student acts as a neuron
- Mock Forward Propagation
- Mock Backward Propagation
Chapter 5 Make your own neural network to classify handwritten digits
In this chapter, the student will learn how to teach the computer to classify handwritten digits using the MNIST dataset in Python.
Dataset: The dataset I chose for this part is the MNIST (Modified National Institute of Standards and Technology) dataset, which has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST (National Institute of Standards and Technology), which provides over 800,000 images of handwritten digits from 3,600 writers. The digits have been size-normalized and centered in a fixed-size image.
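One possible shape for this chapter's code is sketched below using the Keras API bundled with TensorFlow. The network architecture (one hidden layer of 128 nodes), the optimizer, and the epoch count are assumptions for illustration, not the course's final design. The function names are hypothetical.

```python
def scale_pixels(images):
    """Map pixel values from the 0-255 range to 0.0-1.0."""
    return images / 255.0


def build_model():
    """A small fully connected classifier for 28x28 grayscale digit images."""
    import tensorflow as tf  # imported here so scale_pixels works without TensorFlow

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784 inputs
        tf.keras.layers.Dense(128, activation="relu"),    # hidden layer
        tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit 0-9
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def train_and_evaluate(epochs=5):
    """Download MNIST, train on the 60,000 training images, test on the 10,000 test images."""
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    model = build_model()
    model.fit(scale_pixels(x_train), y_train, epochs=epochs)
    _, accuracy = model.evaluate(scale_pixels(x_test), y_test)
    return accuracy
```

Calling `train_and_evaluate()` in a Jupyter Lab cell downloads the data on first use and returns the test accuracy.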
Experiment
Run Chapter 5's code on Raspberry Pi.
Project
NLP Project using Red Hen Lab's News Dataset
Classify Hand Gesture Pose in Art
In this part, the student will learn how to classify Hand Gesture Pose in Art using TensorFlow.
Dataset: we will use the Rijksmuseum dataset, the Fototeca Zeri data, or the Bibliotheca Hertziana dataset.
Audio Recognition of Simple words
In this part, the student will learn how to build a basic speech recognition network that recognizes ten different words: "yes", "no", "up", "down", "left", "right", "on", "off", "stop", or "go". Real speech and audio recognition systems are much more complex than classifying handwritten digits.
Dataset: we will use the Speech Commands dataset, which consists of over 105,000 WAVE audio files of people saying thirty different words. This data was collected by Google and released under a CC-BY license.
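Before any network can be trained, each spoken word needs a numeric label and each WAVE file needs to be turned into a list of numbers. The sketch below shows one way to do both with the Python standard library. It assumes 16-bit mono recordings; the helper name `read_wave` is hypothetical.

```python
import struct
import wave

# The ten target words, each mapped to an integer class label.
WORDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
WORD_TO_INDEX = {word: index for index, word in enumerate(WORDS)}


def read_wave(path):
    """Read a 16-bit mono WAVE file; return (sample_rate, samples scaled to [-1, 1])."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono audio")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    # WAVE stores samples as little-endian signed 16-bit integers.
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return sample_rate, [s / 32768.0 for s in samples]


print(WORD_TO_INDEX)
```

The resulting sample list is the raw input; a recognition network would typically work on features derived from it rather than on the samples directly.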
Exploring Sentiment in Literature
In this part, the student will learn how to use a Convolutional Neural Network and a Recurrent Neural Network to train models for exploring sentiment in literature.
Dataset: we will choose three books (one positive, one negative, and one neutral) from Project Gutenberg, which offers over 58,000 free eBooks.
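Both convolutional and recurrent networks expect numbers rather than words, so the book text first has to be converted into fixed-length sequences of integers. The following is a minimal preprocessing sketch in plain Python; the tokenization rule, the reserved indices (0 for padding, 1 for unknown words), and the sequence length are assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts, max_words=10000):
    """Assign an integer to each of the most frequent words (0 and 1 are reserved)."""
    counts = Counter(token for text in texts for token in tokenize(text))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocabulary, length=8):
    """Turn a passage into exactly `length` integers, truncating or padding with 0."""
    ids = [vocabulary.get(token, 1) for token in tokenize(text)][:length]
    return ids + [0] * (length - len(ids))


passage = "It was the best of times, it was the worst of times."
vocabulary = build_vocabulary([passage])
print(encode("It was the age of wisdom", vocabulary))
```

Words that never appeared when the vocabulary was built ("age" and "wisdom" here) are mapped to the unknown index 1.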