CS691: Special Topic in Computer Science
Advanced Machine Learning
Course Details
Instructor: Prof. Anirban Dasgupta
TA: Shriraj Sawant (sawant_shriraj@iitgn.ac.in)
This is a broad introduction to machine learning. The course is designed as a follow-up to ES 654 (Machine Learning). However, it can be taken independently, provided the required background in probability and linear algebra is picked up.
Prerequisites/Expected knowledge: No formal prerequisites. Students are expected to be reasonably confident in probability, linear algebra, and programming, and to know the basics of supervised and unsupervised learning (content taught in the first half of ES 654).
Class Timings: As per timetable.
Office Hours: By Appointment
List of Topics
Overview of supervised classification models: Support Vector Machines (SVMs), kernels, the Representer Theorem, kernelizing perceptrons, SVMs with kernels, Kernel Principal Component Analysis; deep network models (CNNs, RNNs).
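To make the kernel idea concrete, here is a minimal sketch (with a hand-picked 2-D example) showing that the degree-2 polynomial kernel computes an inner product in an explicit feature space without ever constructing that space:

```python
import math

# The "kernel trick" in miniature: k(x, z) = (x . z)^2 equals an inner
# product in the explicit feature space
#   phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2),
# but is computed without ever building phi. Inputs are illustrative.

def poly_kernel(x, z):
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

x, z = (1.0, 2.0), (3.0, 0.5)
lhs = poly_kernel(x, z)                             # kernel evaluation
rhs = sum(a * b for a, b in zip(phi(x), phi(z)))    # explicit feature map
# lhs == rhs up to floating point (both equal 16.0 here)
```

The same identity is what lets kernelized perceptrons and SVMs operate implicitly in high- or infinite-dimensional feature spaces.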
Optimization: perceptrons, gradient descent, stochastic gradient descent, convex optimization, duality, and Karush-Kuhn-Tucker (KKT) conditions; various optimizers, momentum, hyperparameter tuning, and an introduction to non-convex optimization techniques, e.g., alternating minimization.
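The optimizers in this unit all build on plain gradient descent; a minimal sketch, using an assumed one-dimensional quadratic objective:

```python
# Vanilla gradient descent on f(x) = (x - 3)^2 (illustrative objective).
# The gradient is f'(x) = 2 * (x - 3); the minimizer is x* = 3.

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Iterate x <- x - lr * grad(x) for `steps` iterations."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

x_star = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
# x_star converges toward 3.0; each step contracts the error by (1 - 2*lr)
```

Stochastic gradient descent replaces `grad(x)` with an unbiased estimate from a random minibatch; momentum adds a decaying running average of past gradients to the update.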
Generalization, model selection, boosting, and computational learning theory: risk minimization, Vapnik-Chervonenkis (VC) dimension, sample complexity, weak and strong learning, AdaBoost, gradient boosting for tree-based models.
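A single round of the AdaBoost reweighting rule can be sketched on toy data (the mistake pattern below is made up for illustration):

```python
import math

# One round of AdaBoost reweighting. `mistakes[i]` is True where the
# weak learner misclassified example i; weights start uniform.

def adaboost_round(weights, mistakes):
    eps = sum(w for w, m in zip(weights, mistakes) if m)   # weighted error
    alpha = 0.5 * math.log((1 - eps) / eps)                # learner weight
    new_w = [w * math.exp(alpha if m else -alpha)          # up/down-weight
             for w, m in zip(weights, mistakes)]
    z = sum(new_w)                                         # normalizer
    return alpha, [w / z for w in new_w]

weights = [0.25] * 4                  # uniform start over 4 examples
alpha, new_weights = adaboost_round(weights, [True, False, False, False])
# after reweighting, the misclassified examples carry total weight 1/2
```

The final classifier is a weighted vote of the weak learners, each weighted by its alpha.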
Bayesian Machine Learning: topic models, Latent Dirichlet Allocation, graphical models (including Hidden Markov Models and Conditional Random Fields), Markov Chain Monte Carlo methods, variational inference, and Gaussian processes (correlation, inference, regression).
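As a taste of MCMC, a minimal Metropolis sampler targeting a standard normal (the proposal width and iteration count are illustrative choices, not recommendations):

```python
import math
import random

# Metropolis sampling from N(0, 1) via its unnormalized log-density.
random.seed(1)

def log_target(x):
    return -0.5 * x * x  # log-density of N(0, 1), up to a constant

samples, x = [], 0.0
for _ in range(20000):
    prop = x + random.uniform(-1.0, 1.0)     # symmetric random-walk proposal
    diff = log_target(prop) - log_target(x)
    if diff >= 0 or random.random() < math.exp(diff):
        x = prop                             # accept; otherwise keep x
    samples.append(x)

mean = sum(samples) / len(samples)                 # close to 0
var = sum(s * s for s in samples) / len(samples)   # close to 1
```

The same accept/reject idea, with cleverer proposals, underlies the samplers used for posterior inference in topic models and other graphical models.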
Efficiency and scalability: random projection and its applications, random features, hash kernels, randomized algorithms for scalability (e.g., dimension reduction and sampling-based techniques); applications to efficient matrix/tensor factorization, nearest neighbors, model compression, and pruning; parallel and distributed settings (Hogwild!, parameter server).
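A Gaussian random projection in the Johnson-Lindenstrauss spirit can be sketched in a few lines (the dimensions below are arbitrary choices):

```python
import math
import random

# Project d-dimensional points to k dimensions with i.i.d. N(0, 1/k)
# entries; pairwise Euclidean distances are approximately preserved.
random.seed(0)
d, k = 1000, 200
R = [[random.gauss(0, 1 / math.sqrt(k)) for _ in range(d)] for _ in range(k)]

def project(x):
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in R]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

x = [random.gauss(0, 1) for _ in range(d)]
y = [random.gauss(0, 1) for _ in range(d)]
ratio = dist(project(x), project(y)) / dist(x, y)
# ratio is close to 1 with high probability, shrinking d from 1000 to 200
```

The projection is data-oblivious, which is what makes it attractive for streaming and distributed settings.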
Reinforcement learning: Markov Decision Processes, value iteration, Q-learning
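Value iteration can be illustrated on a tiny hypothetical MDP (all transition and reward numbers below are made up):

```python
# Value iteration on a 2-state, 2-action MDP with deterministic
# transitions. P[s][a] = (next_state, reward); gamma is the discount.
P = {0: {"stay": (0, 0.0), "go": (1, 1.0)},
     1: {"stay": (1, 2.0), "go": (0, 0.0)}}
gamma = 0.9

V = {0: 0.0, 1: 0.0}
for _ in range(200):  # iterate the Bellman optimality update to convergence
    V = {s: max(r + gamma * V[s2] for (s2, r) in P[s].values()) for s in P}

# Fixed point: V[1] = 2 / (1 - 0.9) = 20, V[0] = 1 + 0.9 * 20 = 19.
# Optimal policy: "go" from state 0, then "stay" in state 1 forever.
```

Q-learning estimates the same quantities action-by-action from sampled transitions, without access to `P`.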
Causality and fairness: Simpson's paradox, structural causal models, introduction to randomized experiments and counterfactuals; introduction to notions of group and individual fairness in classification, clustering, and regression settings.
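Simpson's paradox can be demonstrated numerically; the counts below follow the classic kidney-stone example (success, total) per (severity, arm):

```python
# A treatment that wins within every subgroup can still lose in aggregate
# when subgroup sizes differ across arms (confounding by severity).
data = {
    ("mild",   "treatment"): (81, 87),     # 93% success
    ("mild",   "control"):   (234, 270),   # 87%
    ("severe", "treatment"): (192, 263),   # 73%
    ("severe", "control"):   (55, 80),     # 69%
}

def rate(arm, group=None):
    s = t = 0
    for (g, a), (succ, tot) in data.items():
        if a == arm and (group is None or g == group):
            s, t = s + succ, t + tot
    return s / t

# Within each severity level the treatment wins...
assert rate("treatment", "mild") > rate("control", "mild")
assert rate("treatment", "severe") > rate("control", "severe")
# ...yet it loses once the groups are pooled:
assert rate("treatment") < rate("control")
```

Structural causal models make precise when the stratified comparison, rather than the pooled one, answers the causal question.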
Machine learning engineering: building an ML pipeline for the real world (data collection, metrics, model building, model deployment, updating), with use cases from companies.
Resources
Christopher Bishop. Pattern Recognition and Machine Learning. Springer, 2006. Made available by the author at https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Kevin Murphy. Machine Learning, A Probabilistic Perspective. The MIT Press, 2012.
References:
Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014. Made available by the authors at http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning
Moritz Hardt and Benjamin Recht. Patterns, Predictions, and Actions: A Story about Machine Learning. arXiv preprint arXiv:2102.05242 (2021). https://mlstory.org/
Andriy Burkov. Machine Learning Engineering. Published via LeanPub at https://leanpub.com/MLE
Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell. Causal Inference in Statistics: A Primer. Wiley, 2016.
Apart from these, lecture notes and videos will also be made available.
Learning Outcomes
By the end of the course, students will understand the fundamentals and principles of various advanced machine learning algorithms and will be equipped to design new algorithms for novel settings. The course is expected to take students from being “users” of ML toolboxes to being “designers” of new ones. There will also be a significant emphasis on hands-on implementation of concepts learned in class.