EECS E6691 - Topics in Data-Driven Analysis and Computation
Advanced Deep Learning
Columbia University Course
Zoran Kostic, Ph.D., Dipl. Ing., Professor of Professional Practice, zk2172(at)columbia.edu
Electrical Engineering Department, Data Sciences Institute, Columbia University in the City of New York
Course in a nutshell:
Advanced theory and practice of Deep Learning. Applications and projects.
Description: Advanced (Second) Course on Deep Learning
Bulletin Description: Regularized autoencoders, sparse coding and predictive sparse decomposition, denoising autoencoders, representation learning, manifold perspective on representation learning, structured probabilistic models for deep learning, Monte Carlo methods, training and evaluating models with intractable partition functions, restricted Boltzmann machines, approximate inference, deep belief networks, deep learning in speech and object recognition.
Detailed Description for Spring 2024
EECS E6691 Advanced Deep Learning (TOPICS DATA-DRIVEN ANAL & COMP)
Spring 2024, 3 credits
Professor Zoran Kostic zk2172 (at) columbia.edu
A second-level, seminar-style course in which students study advanced topics in deep learning. Students must have previously taken a first course in deep learning. The course consists of: (i) studying state-of-the-art architectural and modeling concepts, (ii) systematic review of recent literature and reproduction of the results, (iii) pursuing novel research ideas, (iv) participating in local and potentially public contests on Kaggle or elsewhere, (v) class presentation(s) of paper studies during the semester, (vi) a final project, and (vii) quizzes during lecture time. The course addresses topics beyond the material covered in a first deep learning course (such as the Columbia course ECBM E4040), with applications of interest to students. Example topics are object detection and tracking, smart-city and medical applications, use of spectral-domain processing, applications of transformers, and capsule networks.
Students entering the course must have prior experience with deep learning and neural network architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memories (LSTMs), and autoencoders. They need a working knowledge of Python and its libraries, Jupyter notebooks, and TensorFlow, both on local machines and on a cloud platform (such as Google Cloud Platform, GCP), as well as GitHub or a similar code-hosting tool. The framework and associated tools that are the focus of this course are PyTorch and Google Cloud. The course will leverage the infrastructure and Python coding templates from the ECBM E4040 assignments (the first deep learning course by Prof. Kostic). Students must be self-sufficient learners and take an active role during classroom activities.
Semester class assignments (paper reviews) will consist of reading, coding, and presentations. Every week, several student groups will give presentations reviewing selected papers from recent conferences such as NeurIPS and ICLR, including the students' results in reproducing the papers, followed by open discussion. Quizzes are part of the class time.
The final project is a group project (up to 3 students). The topic will be selected by students or by the instructor. It needs to be documented in a conference-style report and the code deposited in a GitHub repository. The code needs to be documented and instrumented so the instructor can run it after downloading from the repository. A Google Slides presentation of the project, suitable for a poster version, is required. Students will present the project at the end of the semester using the slides.
Prerequisites
(i) Machine Learning (taken previously, or in parallel with this course).
(ii) ECBM E4040 Neural Networks and Deep Learning, or an equivalent neural network/DL university course taken for academic credit. Although the quality of online ML and DL courses (Coursera, Udacity, edX) is outstanding, many students complete their hands-on coding assignments only superficially and therefore do not gain the practical coding skills essential for this advanced course. Therefore, online courses are not accepted as prerequisites.
(iii) The course requires an excellent theoretical background in probability, statistics, and linear algebra.
Students are strongly advised to drop the class if they do not have an adequate theoretical background and/or previous experience with programming deep learning models. It is strongly advised (the instructor’s requirement) that students take no more than 12 credits of coursework (including this course) during the semester in which this course is taken.
Registration
Enrollment is limited to several dozen students. The instructor's permission is required to register. Students interested in the course need to add their name to the SSOL waitlist and may be asked to complete a questionnaire. The instructor will move students off the SSOL waitlist.
Grading
Assignments: 2-4 per semester
Paper review presentations: 2-3 student presentations + code + discussions
Project (proposal slide presentation + final slide presentation + final report + code repository)
Exam and Quizzes
Assignment submission policy:
A total of 4 late days is allowed across all assignments.
Late days do not apply to the final project (report, slides, presentations, or code).
Content
Analytical study and software design
Several assignments in Python and PyTorch
A significant project
Deeper exploration of deep learning
Syllabus (2024 Spring)
Attention Fundamentals
Transformer Architecture
Building the Transformer
Vision Transformer
Segmentation Transformer
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
End-to-End Object Detection with Transformers: DETR
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT
Generative Models
Diffusion Models: DDPM and DALL·E 2
Diffusion Models: From LDMs to Stable Diffusion
Scalable Diffusion Models with Transformers
CLIP: Contrastive Language-Image Pre-training - Learning Transferable Visual Models From Natural Language Supervision
Mamba SSM: Linear-Time Sequence Modeling with Selective State Spaces
LLaMA and LLaMA adapter
Unifying LLMs and Knowledge Graphs
Frameworks for Autonomous Language Agents
Stealing Secrets from a Production Language Model
Graph Neural Networks
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
YOLOv3 to YOLOv9
Multiresolution Pyramids: HOG, SIFT, Scattering Networks
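The syllabus begins with attention fundamentals, the building block of the transformer architectures listed above. As a flavor of the material, here is a minimal NumPy sketch of scaled dot-product attention (Vaswani et al., 2017); this is an illustrative sketch, not course-provided code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable row-wise softmax
    return weights @ V                               # (n_q, d_v) attention-weighted values

# Toy example: 2 query vectors attending over 3 key/value pairs
Q = np.array([[1.0, 0.0], [0.0, 1.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
V = np.array([[1.0], [2.0], [3.0]])
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 1)
```

In the lectures this operation appears as the core of multi-head attention; PyTorch, the course's main framework, provides an optimized built-in equivalent.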
Organization
Lectures:
Presentation of material by instructors and guest lecturers
Student Presentations:
Every student contributes several presentations on subjects of interest (may be done in groups)
Assignments:
A combination of analytical and programming assignments
Exam and Quizzes:
During the class time
Project:
Team-based
Students with complementary backgrounds
Significant design
Reports and presentations to the Columbia and NYC communities
Best projects could qualify for publications and/or funding
In-person attendance is mandatory
Prerequisites:
Required: knowledge of linear algebra, probability and statistics, programming, machine learning, first course in deep learning.
Prerequisite courses: ECBM E4040 or similar
Time:
Spring 2026
Spring 2025
Spring 2024 - https://doc.sis.columbia.edu/#subj/EECS/E6691-20241-001/
Spring 2023
Spring 2022 - Advanced Deep Learning (EECS E6691 TPC - Topics in Data-driven Analysis and Computation)
Spring 2021 - Advanced Deep Learning (EECS E6691 TPC - Topics in Data-driven Analysis and Computation)
Project Areas
Smart cities
Medical Applications
Autonomous vehicles
Environmental
Physical data analytics
Finance
Books, Tools and Resources
BOOKS:
Tools/Software platform:
PyTorch as the main framework; also Google TensorFlow, Google Cloud, Python, and Bitbucket
2025 Spring Projects
Automated Brain Aneurysm Detection in Time-of-Flight Magnetic Resonance Angiography (TOF-MRA)
YOLO-based model for baggage security screening
Self-Supervised Contrastive Pre-Training for Time Series via Time-Frequency Consistency
Ordering Message Passing to Deal with Heterophily and Over-Smoothing
SFT Dataset for Self-Debugging in Code-Generation LLMs
Optimizing MoE Routers: Design, Implementation, and Evaluation in Transformer Models
VeriClaim: Interpretable Fact Verification for Claims
Enhancing Generalizable Reasoning in LLMs via GRPO on Diverse Reasoning Tasks
Predicting Human Actions from Pose Estimation Using Deep Learning
LLMs with an Opinion
Autoencoder prediction of NYC real estate value
Reduced-order Neural Surrogate Modeling of Fluid Physics
Comparing Audio Tagging Models with fMRI Data
Automated Quantum Circuit Builder
Multimodal Skin Diseases Analysis
Visual Reasoning Model with Chain of Thought for Robust Vision-Language Understanding
2024 Spring Projects
AudioMamba: Mamba Architecture for Audio Classification
Music Generation with Music transformer and LSTM
COVID-19 Forecasting using Spatio-Temporal Graph Attention Networks
ConMamba: Convolution-augmented Mamba for Speech Recognition
Diminished Reality for Emerging Applications in Medicine through Inpainting
Efficient Deep Learning Investigation
Real-time Automatic Face Anonymization in Video Streams
Development of Discrete Prompt Learning via Evolutionary Search
Heart Disease Detection Using Transformer Models
3D Tumor Segmentation with U-Net: Analyzing MRI Scans for Medical Insights
Low-Light Raw Image Enhancement
Direct Preference Optimization and Proximal Policy Optimization on Small Language Models
3D Scene Reconstruction using Neural Radiance Fields and Structure from Motion
Dynamic Video Generation from Static Comic Panels
Small Object Detection using improved YOLO
Predictive Modeling of Tennis Player Poses and Ball Trajectory
VMamba: Visual State Space Model
2023 Spring Projects
Object Recognition and Seq2Seq Models for Handwritten Equation Recognition and LaTeX Translation
Deep Learning for Soccer Pass Receiver Prediction in Broadcast Images
Automatic Person Removal Pipeline
AI Photographic Assistant: Implementation of Deep Learning Photographic Tools
The Development of Athena: Leveraging GPT-3.5 for Adaptive and Interactive Intelligent Tutoring for Personalized Learning
What’s Happened, Happened: Leveraging Past Decisions for Improved Interactive Image Segmentation
ASL Detection Correction and Completion
A hierarchical attention based model for biopsy classification
Enhancing Indoor Bouldering Experience for Color-Blind Climbers: A Deep Learning Approach for Route Identification on Climbing Walls
SwagGAN (StyleGAN for Fashion)
Learning Multi-scale Visual Representation via Language Description for Segmentation
Deep Learning For Financial Time Series
Classifying neuron cell types within Drosophila melanogaster using graph convolutional networks
SAM Based Cell Blood Classification Model
Instance-Level Image Retrieval While Navigating Interactive 3D Environments
Gaze and head redirection model based on StyleGAN3
2022 Spring Projects
Pix2Pix Image-to-Image Translation with Conditional Adversarial Networks
Jump rope counter
Black Box Adversarial Attack with Style Information
Learning Signed Distance Function for 3D Shape Representation (DeepSDF)
Representation learning without any labeled data
Comparison of Self-Supervised Models for Music Classification
Subcellular localization of proteins using deep learning
Image Descriptions Generator
Predicting remaining surgery duration
Vision Transformer
Adversarial Audio Synthesis
PlaNet - Latent Dynamics from Pixels
RecoNET: Understanding What happens in Videos
2021 Spring Projects
Multi-Graph Graph Attention Network for Music Recommendation
3D Facial Reenactment from 2D Video
Temporal Fusion Transformers for Time Series Forecasting
Deep Reinforcement Learning for Environmental Policy
Stochastic & Split Convolutions for High Resolution Image Classification (Integrated with EfficientNet)
Forecasting Corn Yields with Semiparametric CNNs
Pose Estimation + Instance Segmentation
Speaker Independent Speech Separation
Generalized Autoencoder-based Transfer Learning for Structural Damage Assessment
End-to-end object detection with Transformers
Exploring latent space of InfoGAN
2018-2020 Projects
See list of projects under E6040 link
Course sponsored by equipment and financial contributions of:
NVIDIA GPU Education Center, Google Cloud, IBM Bluemix, AWS Educate, Atmel, Broadcom (WICED platform), Intel (Edison IoT platform), Silicon Labs.
PREVIOUS SEMESTERS
Detailed Description for Spring 2023
Instructor: Dr. Mehmet Kerem Turkcan mkt2126 (at) columbia.edu
This is an advanced-level course in which students study topics in deep learning. Students are required to have previously taken a first course in deep learning. The course consists of: (i) lectures on state-of-the-art architectural and modeling concepts, (ii) assignments, (iii) an exam, and (iv) a final project. The course addresses topics beyond the material covered in a first course on deep learning (such as ECBM E4040), with applications of interest to students. In 2023, the main subject of the lectures will be object detection.
Students entering the course must have prior experience with deep learning and neural network architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memories (LSTMs), and autoencoders. They need a working knowledge of Python and its libraries, Jupyter notebooks, and TensorFlow, both on local machines and on Google Cloud, as well as GitHub or similar code-hosting tools. The framework and associated tools that are the focus of this course are PyTorch and Google Cloud. Students must be self-sufficient learners and take an active role during classroom activities.
There will be a few (3-4) assignments throughout the semester, focusing on coding. In the second half of the course, there will be a midterm exam consisting of multiple-choice questions.
Final projects need to be documented in a conference-style report, with the code deposited in a GitHub repository. The code needs to be documented and instrumented so that the instructor can run it after downloading it from the repository. A Google Slides presentation of the project, suitable for a poster presentation, is required.
Prerequisites
(i) Machine Learning (taken previously, or in parallel with this course).
(ii) ECBM E4040 Neural Networks and Deep Learning, or an equivalent neural network/DL university course taken for academic credit.
(iii) The course requires an excellent theoretical background in probability, statistics, and linear algebra.
Students are strongly advised to drop the class if they do not have an adequate theoretical background and/or previous experience with programming deep learning models. It is strongly advised (the instructor's requirement) that students take no more than 12 credits of coursework (including this course and project courses) during the semester in which this course is taken.
Registration
Enrollment is limited to several dozen students. The instructor's permission is required to register. Students interested in the course need to add their name to the SSOL waitlist and MUST also complete the questionnaire. The instructor will move students off the SSOL waitlist after reviewing the questionnaire.
(Tentative) Grading for the course (2023 Spring)
Assignments: 30%
Midterm Exam (Delivered at Week 11): 30%
Project (Final report & Code Repository): 40%
(Potential) Class Contribution: weight to be determined