Date and Lecture | Topics | Readings
Week 1: 04/01/2025 (Tuesday)
Course overview and introduction
Two-class classification and multi-class classification
Probability theory (by Matthew Shum)
Introduction to probability by C.M. Grinstead and J.L. Snell
A few useful things to know about machine learning (Pedro Domingos)
04/03/2025 (Thursday)
Multi-class classification formulation
Pegasos, cutting plane algorithm
J. Friedman, T. Hastie, and R. Tibshirani, 1999, "Additive Logistic Regression: a Statistical View of Boosting".
K. Crammer and Y. Singer, 2001, "On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines".
T. G. Dietterich and G. Bakiri, 1995. "Solving multiclass learning problems via error-correcting output codes".
Softmax function
V. Franc and S. Sonnenburg, 2008. "Optimized Cutting Plane Algorithm for Support Vector Machines".
S. Shalev-Shwartz, Y. Singer, N. Srebro, 2007. "Pegasos: Primal Estimated sub-GrAdient SOlver for SVM".
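The Pegasos paper above optimizes the SVM objective by stochastic sub-gradient descent. A minimal sketch of that update, with a made-up toy dataset and hyperparameters (an illustration, not code from the paper or the course):

```python
import numpy as np

def pegasos_train(X, y, lam=0.1, n_iters=1000, seed=0):
    """Pegasos: stochastic sub-gradient descent on
    lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i <w, x_i>)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)                  # pick one example at random
        eta = 1.0 / (lam * t)                # step size 1/(lambda * t)
        if y[i] * X[i].dot(w) < 1:           # hinge loss is active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                                # only the regularizer contributes
            w = (1 - eta * lam) * w
        norm = np.linalg.norm(w)             # optional projection step
        if norm > 1.0 / np.sqrt(lam):
            w *= (1.0 / np.sqrt(lam)) / norm
    return w

# toy usage: two separable Gaussian blobs with labels +1 / -1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+2, 1, (50, 2)), rng.normal(-2, 1, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])
w = pegasos_train(X, y)
print("training accuracy:", np.mean(np.sign(X.dot(w)) == y))
```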
Structured prediction
Softmax function and cross-entropy
G. Tsoumakas and I. Katakis, 2007. "Multi-label Classification: An Overview."
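Since the softmax function and cross-entropy appear as topics here, a small numerically stable reference implementation (illustrative only, not from the course materials):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the max subtracted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class labels."""
    n = probs.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 0.3]])
labels = np.array([0, 2])
p = softmax(logits)
print(p.sum(axis=1))             # each row sums to 1
print(cross_entropy(p, labels))  # scalar loss
```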
Structured prediction
I. Tsochantaridis, T. Joachims, T. Hofmann and Y. Altun, 2005. "Large Margin Methods for Structured and Interdependent Output Variables".
B. Taskar, C. Guestrin and D. Koller, 2003. "Max-Margin Markov Networks".
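The structured-SVM and max-margin Markov network readings above both rely on an efficient argmax over structured outputs; for linear-chain models that inference step is Viterbi decoding. A toy sketch with random score matrices (my own illustration, not code from either paper):

```python
import numpy as np

def viterbi(unary, pairwise):
    """Best label sequence under the sum of per-position scores unary[t, y]
    and transition scores pairwise[y_prev, y]."""
    T, K = unary.shape
    score = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = unary[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + pairwise + unary[t][None, :]  # K x K
        back[t] = cand.argmax(axis=0)   # best previous label for each current label
        score[t] = cand.max(axis=0)
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):       # backtrack from the last position
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# toy example: 5 positions, 3 labels, random scores
rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(5, 3)), rng.normal(size=(3, 3))))
```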
Auto-context, fixed-point model, graphical models and summary of structured prediction
J. Lafferty, A. McCallum, and F. Pereira, 2001. "Conditional random fields: Probabilistic models for segmenting and labeling sequence data".
Week 4: 04/22/2025 (Tuesday)
Auto-context, fixed-point model
Z. Tu and X. Bai, 2010. "Auto-context and Its Application to High-level Vision Tasks and 3D Brain Image Segmentation".
Q. Li, J. Wang, D. Wipf, and Z. Tu, "Fixed-Point Model for Structured Labeling", 2013.
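A rough sketch of the auto-context / fixed-point idea in the two papers above: repeatedly retrain a classifier whose input is the original features plus the previous round's predicted label probabilities at neighboring positions. The 1-D toy labeling problem and the use of scikit-learn's LogisticRegression are assumptions made for illustration, not the papers' implementations:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def context_features(probs):
    """Predicted class probabilities of the left and right neighbors (1-D context)."""
    left = np.vstack([probs[:1], probs[:-1]])
    right = np.vstack([probs[1:], probs[-1:]])
    return np.hstack([left, right])

def train_auto_context(X, y, n_rounds=3, n_classes=2):
    """Train a cascade of classifiers, each seeing the previous round's output."""
    models = []
    probs = np.full((len(X), n_classes), 1.0 / n_classes)  # uniform prior at round 0
    for _ in range(n_rounds):
        feats = np.hstack([X, context_features(probs)])
        clf = LogisticRegression(max_iter=1000).fit(feats, y)
        probs = clf.predict_proba(feats)
        models.append(clf)
    return models

# toy 1-D labeling problem: noisy observations of a piecewise-constant label signal
rng = np.random.default_rng(0)
y = (np.arange(200) % 40 < 20).astype(int)
X = (y + rng.normal(0, 0.8, size=200)).reshape(-1, 1)
models = train_auto_context(X, y)
```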
Week 5: 04/29/2025 (Tuesday)
Hidden Markov model
John Jumper, et al., "Highly accurate protein structure prediction with AlphaFold", Nature, 2021.
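For the hidden Markov model topic, a minimal forward-algorithm sketch; the 2-state transition and emission matrices below are invented for illustration:

```python
import numpy as np

def hmm_forward(pi, A, B, obs):
    """Forward algorithm: returns log P(obs) under an HMM with initial
    distribution pi, transitions A[i, j] = P(z_t=j | z_{t-1}=i), and
    emissions B[i, k] = P(x_t=k | z_t=i). Rescaled to avoid underflow."""
    alpha = pi * B[:, obs[0]]
    log_prob = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        log_prob += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_prob

# toy 2-state, 3-symbol HMM
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
print(hmm_forward(pi, A, B, obs=[0, 1, 2, 2, 0]))
```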
05/01/2025 (Thursday)
Recurrent neural networks
The Unreasonable Effectiveness of Recurrent Neural Networks (Andrej Karpathy, blog post).
"Finding structure in time", Jeff Elman.
"Long Short-Term Memory", Sepp Hochreiter, Jürgen Schmidhuber.
Week 6: 05/06/2025 (Tuesday)
Recurrent neural networks
" A Critical Review of Recurrent Neural Networks for Sequence Learning ", Zachary C. Lipton, John Berkowitz, Charles Elkan.
J Mao, W Xu, Y Yang, J Wang, Z Huang, A Yuille, " Deep captioning with multimodal recurrent neural networks (m-rnn) ", ICLR 2015.
K Xu, J Ba, R Kiros, K Cho, A Courville, R Salakhutdinov, R Zemel, Y Bengio, " Show, Attend and Tell: Neural Image Caption Generation with Visual Attention ", ICML 2015.
I Sutskever, O Vinyals, QV Le, " Sequence to sequence learning with neural networks ", NeurIPS 2014.
van den Oord, S Dieleman, H Zen, K Simonyan, O Vinyals, A Graves, N Kalchbrenner, A Senior, K Kavukcuoglu, " WaveNet:a generative model for raw audio ", arxiv 2016
05/08/2025 (Thursday)
Attention based models
Transformers, Graph neural networks
K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, Y. Bengio, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", ICML 2015.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention Is All You Need", NeurIPS 2017.
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", NAACL-HLT 2019.
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko, "End-to-End Object Detection with Transformers", ECCV 2020.
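A compact sketch of the scaled dot-product attention at the core of "Attention Is All You Need" (single head, no masking, random toy tensors; an illustration, not the paper's code):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)   # each row is a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 16))   # 4 queries, d_k = 16
K = rng.normal(size=(6, 16))   # 6 keys
V = rng.normal(size=(6, 32))   # 6 values, d_v = 32
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.sum(axis=1))  # (4, 32); each row of weights sums to 1
```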
Transformers
Sparse Representations
B. Olshausen and D. Field, 1996. "Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images", Nature.
EJ Candès, T Tao, 2006. "Near-optimal signal recovery from random projections: Universal encoding strategies?", IEEE Trans. on Information Theory.
EJ Candès, X Li, Y Ma, J Wright, 2011. "Robust principal component analysis?", Journal of the ACM.
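The sparse-representation readings above revolve around L1-regularized reconstruction. A minimal iterative soft-thresholding (ISTA) sketch for min_z 0.5*||x - Dz||^2 + lam*||z||_1, with a random toy dictionary (my own illustration, not code from these papers):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x, D, lam=0.1, n_iters=200):
    """Iterative soft-thresholding for min_z 0.5*||x - D z||^2 + lam*||z||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iters):
        grad = D.T @ (D @ z - x)           # gradient of the smooth term
        z = soft_threshold(z - grad / L, lam / L)
    return z

# toy problem: a 5-sparse code under a random 64 x 256 dictionary
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256)) / np.sqrt(64)
z_true = np.zeros(256)
z_true[rng.choice(256, 5, replace=False)] = rng.normal(size=5)
x = D @ z_true
z_hat = ista(x, D, lam=0.05)
print("nonzeros in estimate:", np.sum(np.abs(z_hat) > 1e-3))
```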
Week 8: 05/20/2025 (Tuesday)
Weakly-supervised learning
T. G. Dietterich, R. H. Lathrop, T. Lozano-Perez. "Solving the multiple instance problem with axis-parallel rectangles". Artificial Intelligence 1997.
C Zhang, JC Platt, PA Viola, "Multiple instance boosting for object detection", NeurIPS 2006.
05/22/2025 (Thursday)
Semi-supervised learning
X. Zhu, "Semi-supervised learning literature survey", technical report, 2005.
M Belkin, P Niyogi, V Sindhwani, "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples", JMLR 2006.
Kihyuk Sohn, et al., "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence", NeurIPS, 2020.
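A schematic of the confidence-thresholded pseudo-labeling step behind FixMatch-style consistency training, written directly on arrays of predicted probabilities; the threshold and toy numbers are illustrative, and the real method couples this step with weak/strong augmentation inside a deep network:

```python
import numpy as np

def pseudo_label_loss(weak_probs, strong_probs, threshold=0.95):
    """Cross-entropy of strongly-augmented predictions against hard pseudo-labels
    taken from weakly-augmented predictions, kept only where the model is confident."""
    conf = weak_probs.max(axis=1)
    labels = weak_probs.argmax(axis=1)        # hard pseudo-labels
    mask = conf >= threshold                  # keep only confident examples
    if not mask.any():
        return 0.0
    picked = strong_probs[mask, labels[mask]]
    return float(-np.mean(np.log(picked + 1e-12)))

# toy predictions for 4 unlabeled examples (rows sum to 1)
weak = np.array([[0.97, 0.02, 0.01],
                 [0.40, 0.35, 0.25],
                 [0.02, 0.96, 0.02],
                 [0.60, 0.30, 0.10]])
strong = np.array([[0.90, 0.05, 0.05],
                   [0.30, 0.40, 0.30],
                   [0.10, 0.85, 0.05],
                   [0.50, 0.30, 0.20]])
print(pseudo_label_loss(weak, strong))  # only rows 0 and 2 pass the 0.95 threshold
```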
Week 9: 05/27/2025 (Tuesday)
Slides (self-supervised)
Self-supervised learning
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick, "Momentum Contrast for Unsupervised Visual Representation Learning", CVPR 2020.
Jean-Bastien Grill, et al., "Bootstrap your own latent: A new approach to self-supervised Learning", arXiv:2006.07733, 2020.
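A small sketch of the InfoNCE-style contrastive loss used in MoCo-flavored self-supervised learning: each query embedding should match its own positive key against the other keys in the batch. The embeddings below are random stand-ins, not features from a trained encoder:

```python
import numpy as np

def info_nce(queries, keys, temperature=0.07):
    """InfoNCE loss: queries[i] and keys[i] are two views of the same image;
    all other keys in the batch act as negatives."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature                  # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))      # positives sit on the diagonal

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 128))              # "query" embeddings of 8 images
z2 = z1 + 0.1 * rng.normal(size=(8, 128))   # slightly perturbed "key" views
print(info_nce(z1, z2))
```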
05/29/2025 (Thursday)
Slides (semi-supervised)
Generative modeling
Zhuowen Tu, "Learning Generative Models via Discriminative Approaches", CVPR, 2007.
Ian Goodfellow et al., "Generative adversarial networks", NeurIPS, 2014.
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, Surya Ganguli, "Deep Unsupervised Learning using Nonequilibrium Thermodynamics", ICML 2015.
Week 10: 06/03/2025 (Tuesday)
Generative modeling
DP Kingma, M Welling, "Auto-Encoding Variational Bayes", ICLR 2014.
Diffusion Models
Large language models
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", NAACL-HLT 2019.
Tom B. Brown et al., "Language Models are Few-Shot Learners", NeurIPS, 2020.
Long Ouyang et al., "Training language models to follow instructions with human feedback", 2022.
Alec Radford et al., "Learning Transferable Visual Models From Natural Language Supervision", 2021.
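For the diffusion-model topic above, a brief sketch of the DDPM-style forward noising step q(x_t | x_0) and the noise-prediction target a denoiser would be trained on; the linear beta schedule and toy shapes are assumptions made for illustration:

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps   # (noisy sample, the noise a trained denoiser should predict)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.normal(size=(4, 4))           # toy "image"
xt, eps = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)
print(xt.shape)
```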