Advanced Applied Deep Learning
Lecture Course
Sheng Yun Wu
Objective:
To introduce students to the fundamental concepts of deep learning, the structure of neural networks, and the motivation behind convolutional neural networks (CNNs). By the end of the week, students will understand the basics of neural networks and their importance in image processing tasks.
Lecture 1: Introduction to Deep Learning and its Applications
What is Deep Learning?
Difference between machine learning (ML) and deep learning (DL).
Overview of popular deep learning frameworks (TensorFlow, PyTorch).
Deep learning applications: Image classification, object detection, natural language processing (NLP), speech recognition, self-driving cars, healthcare, etc.
Why is Deep Learning Powerful?
Use of large datasets and high computational power (GPUs, TPUs).
Ability to automatically extract features from raw data, eliminating the need for manual feature engineering.
Hierarchical feature learning: from basic patterns to complex abstractions.
Historical Context of Neural Networks
Brief history of neural networks and the key milestones that led to modern deep learning (e.g., multi-layer perceptrons, backpropagation, AlexNet).
Lecture 2: Artificial Neural Networks (ANNs)
Introduction to Neural Networks
Structure of an artificial neural network (ANN): Neurons, layers (input, hidden, output), and weights.
The mathematical operation of a neuron: a weighted sum of the inputs plus a bias, passed through an activation function.
Common activation functions: Sigmoid, Tanh, and ReLU (Rectified Linear Unit).
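The neuron operation and the three activation functions listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not course-provided code: the function names and the input, weight, and bias values are assumptions chosen so the arithmetic is easy to check by hand.

```python
import math

def sigmoid(z):
    """Map z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Map z to the open interval (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Pass positive values through unchanged, clamp negatives to 0."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Illustrative values only (assumed, not from the course material).
x = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.1
# Pre-activation: z = 0.5*1.0 + (-0.25)*2.0 + 0.1 = 0.1
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, neuron(x, w, b, fn))
```

The same pre-activation value z gives a different output under each activation, which is one reason the choice of activation function affects what a network can learn.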
Forward Propagation
How information flows through a network from input to output.
Computing each layer's output as a linear combination of inputs and weights plus a bias, followed by an activation function.
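A forward pass can be sketched as a loop over layers, where each layer applies the weighted sum and activation to the previous layer's output. The sketch below assumes a hypothetical network with 2 inputs, 2 hidden neurons, and 1 output; the helper names (`dense`, `forward`) and all weights are made up for illustration.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Pass x through each (weights, biases, activation) layer in order."""
    a = x
    for weights, biases, activation in layers:
        a = dense(a, weights, biases, activation)
    return a

# Hypothetical network with hand-picked weights (assumed values).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),   # hidden layer, 2 neurons
    ([[1.0, -2.0]], [0.0], sigmoid),                  # output layer, 1 neuron
]
output = forward([2.0, 1.0], layers)
# Hidden activations: relu(2 - 1) = 1.0 and relu(1 + 0.5 - 1) = 0.5
# Output: sigmoid(1*1.0 - 2*0.5) = sigmoid(0) = 0.5
print(output)
```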
Training a Neural Network: Backpropagation and Gradient Descent
Overview of the loss function (e.g., Mean Squared Error, Cross-Entropy Loss).
Backpropagation algorithm: How the gradient of the loss is propagated backward through the network, layer by layer, using the chain rule.
Gradient descent: Minimizing the loss function by repeatedly updating the weights in the direction of the negative gradient.
Learning rate and its role in convergence.
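The training loop can be illustrated on the smallest possible case: a single linear neuron y = w*x + b trained with Mean Squared Error. With only one neuron, backpropagation reduces to a single application of the chain rule, so this is a simplified sketch rather than a full multi-layer implementation. The toy data, the function names, and the learning-rate values are assumptions chosen for illustration.

```python
def mse(w, b, xs, ys):
    """Mean Squared Error of the predictions w*x + b against the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b, xs, ys):
    """Partial derivatives of the MSE with respect to w and b (chain rule)."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def train(xs, ys, learning_rate, epochs):
    """Plain (full-batch) gradient descent starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        dw, db = gradients(w, b, xs, ys)
        w -= learning_rate * dw   # step against the gradient
        b -= learning_rate * db
    return w, b

# Toy data generated from y = 2x + 1 (assumed for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = train(xs, ys, learning_rate=0.05, epochs=2000)
print("small learning rate:", w, b, mse(w, b, xs, ys))

# A learning rate that is too large for this problem makes the loss grow.
w_bad, b_bad = train(xs, ys, learning_rate=0.3, epochs=50)
print("large learning rate:", w_bad, b_bad, mse(w_bad, b_bad, xs, ys))
```

With the smaller learning rate the parameters approach w = 2 and b = 1; with the larger one each update overshoots the minimum and the loss increases, which is the convergence issue the learning rate controls.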
Practical Session: Building a Simple Fully Connected Neural Network
Objective: Implement a simple neural network using TensorFlow/Keras.
Dataset: Use the MNIST dataset (handwritten digits).
Download the dataset using TensorFlow/Keras.
Preprocess the data (normalize pixel values between 0 and 1).
Build a fully connected neural network (Multi-Layer Perceptron, MLP) with one hidden layer to classify digits.
Train and evaluate the model.
Key Steps:
Load and preprocess the MNIST dataset.
Define the network architecture using Keras (input layer, hidden layer, and output layer).
Compile the model using a suitable optimizer (e.g., Adam) and a cross-entropy loss function suited to multi-class classification (e.g., categorical cross-entropy).
Train the model on the training data and evaluate its accuracy on the test data.
Visualize training history (accuracy and loss over epochs).
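The key steps above might look roughly like the following in TensorFlow/Keras. This is one possible sketch, not the official session solution: the function names, the 128 hidden units, the 5 epochs, and the 10% validation split are assumptions. Because MNIST labels load as integers, the sketch uses the sparse variant of categorical cross-entropy.

```python
import tensorflow as tf

def load_mnist():
    """Download MNIST and scale pixel values from [0, 255] to [0, 1]."""
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0
    return (x_train, y_train), (x_test, y_test)

def build_mlp(hidden_units=128, activation="relu"):
    """MLP with one hidden layer: 28x28 input -> hidden -> 10 class scores."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),                      # 28x28 image -> 784 vector
        tf.keras.layers.Dense(hidden_units, activation=activation),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",         # integer labels 0..9
        metrics=["accuracy"],
    )
    return model

def train_and_evaluate(epochs=5):
    """Train on the training set, report test accuracy, print the history."""
    (x_train, y_train), (x_test, y_test) = load_mnist()
    model = build_mlp()
    history = model.fit(
        x_train, y_train, epochs=epochs, validation_split=0.1, verbose=2
    )
    test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
    print(f"test loss: {test_loss:.4f}, test accuracy: {test_accuracy:.4f}")
    # history.history maps each metric name to a list with one value per epoch.
    for epoch, (loss, acc) in enumerate(
        zip(history.history["loss"], history.history["accuracy"]), start=1
    ):
        print(f"epoch {epoch}: loss={loss:.4f}, accuracy={acc:.4f}")
    return model, history
```

Calling `train_and_evaluate()` downloads the dataset on first use and trains the model. The per-epoch lists in `history.history` can be passed to a plotting library of your choice to draw the accuracy and loss curves.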
Assignment for Week 1-2:
Reading:
Chapter 1 of "Advanced Applied Deep Learning" by Umberto Michelucci.
Focus on the basics of neural networks and how they can be applied to classification tasks.
Coding Task:
Implement a fully connected neural network for MNIST classification using TensorFlow/Keras.
Experiment with different numbers of hidden neurons and activation functions to observe their impact on performance.
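When comparing hidden-layer sizes, it can help to know how many trainable parameters each configuration has. The helper below is a small sketch with an assumed name (`mlp_parameter_count`). It counts weights and biases for a fully connected network, taking 784 inputs (28 x 28 pixels) and 10 outputs as in the MNIST setup; the hidden sizes tried are arbitrary examples.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases for a fully connected network with these layer sizes."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Example hidden-layer sizes (assumed); 784 inputs and 10 outputs match MNIST.
for hidden in (32, 128, 512):
    print(hidden, mlp_parameter_count([784, hidden, 10]))
```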
Discussion Question:
What are the limitations of fully connected neural networks when it comes to high-dimensional data, such as images?
Summary of Key Concepts:
Deep learning and its applications in real-world tasks.
Structure and function of artificial neural networks (ANNs).
Forward propagation, backpropagation, and gradient descent.
Hands-on implementation of a simple fully connected neural network for digit classification.
This first week sets the stage for deeper exploration of CNNs in the following weeks by building foundational knowledge in neural networks and introducing the basic process of training a model on image data.