Date and Lecture
Topics
Readings
Course overview and introduction
Review of supervised learning techniques
Probability theory (by Matthew Shum)
Introduction to probability by C.M. Grinstead and J.L. Snell
A few useful things to know about machine learning (Pedro Domingos)
Learning Python
Two-class classification and multi-class classification
J. Friedman, T. Hastie, and R. Tibshirani, 1999, "Additive Logistic Regression: a Statistical View of Boosting".
K. Crammer and Y. Singer, 2001, "On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines".
T. G. Dietterich and G. Bakiri, 1995, "Solving multiclass learning problems via error-correcting output codes".
Multi-class classification formulation
Pegasos, cutting plane algorithm
V. Franc and S. Sonnenburg, 2008. Optimized Cutting Plane Algorithm for Support Vector Machines.
S. Shalev-Shwartz, Y. Singer, N. Srebro, 2007. Pegasos: Primal Estimated sub-GrAdient SOlver for SVM.
Softmax function
Multi-label classification
Softmax function and cross-entropy
G. Tsoumakas and I. Katakis, 2007. "Multi-label Classification: An Overview."
Structured prediction: Structural SVM and Max-Margin Markov Networks
I. Tsochantaridis, T. Joachims, T. Hofmann and Y. Altun, 2005. Large Margin Methods for Structured and Interdependent Output Variables.
B. Taskar, C. Guestrin and D. Koller, 2003. Max-Margin Markov Networks.
Markov random fields (MRFs) and conditional random fields (CRFs)
J. Lafferty, A. McCallum, and F. Pereira, 2001. "Conditional random fields: Probabilistic models for segmenting and labeling sequence data".
Auto-context, fixed-point model, graphical models and summary of structured prediction
Z. Tu and X. Bai, 2010. "Auto-context and Its Application to High-level Vision Tasks and 3D Brain Image Segmentation".
Q. Li, J. Wang, D. Wipf, and Z. Tu, 2013. "Fixed-Point Model for Structured Labeling".
Auto-context, fixed-point model
Hidden Markov model
The Unreasonable Effectiveness of Recurrent Neural Networks
"Finding structure in time", Jeff Elman.
"Long Short-Term Memory", Sepp Hochreiter, Jürgen Schmidhuber.
Week 6: 11/10/20 (Tuesday)
Recurrent neural networks
"A Critical Review of Recurrent Neural Networks for Sequence Learning", Zachary C. Lipton, John Berkowitz, Charles Elkan.
J Mao, W Xu, Y Yang, J Wang, Z Huang, A Yuille, "Deep captioning with multimodal recurrent neural networks (m-rnn)", ICLR 2015.
K Xu, J Ba, R Kiros, K Cho, A Courville, R Salakhutdinov, R Zemel, Y Bengio, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", ICML 2015.
I Sutskever, O Vinyals, QV Le, "Sequence to sequence learning with neural networks", NeurIPS 2014.
A van den Oord, S Dieleman, H Zen, K Simonyan, O Vinyals, A Graves, N Kalchbrenner, A Senior, K Kavukcuoglu, "WaveNet: a generative model for raw audio", arXiv 2016.
Attention based models
K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, Y. Bengio, "Show, attend and tell: Neural image caption generation with visual attention", ICML 2015.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention Is All You Need", NeurIPS 2017.
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", NAACL-HLT 2019.
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko, "End-to-End Object Detection with Transformers", ECCV 2020.
BERT model
11/19/20 (Thursday)
Compressive sensing and robust principal component analysis
B. Olshausen and D. Field, 1996. "Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images", Nature.
EJ Candes, T Tao, 2006. "Near-optimal signal recovery from random projections: Universal encoding strategies?", IEEE Trans. on Information Theory.
Week 8: 11/24/20 (Tuesday)
Compressive sensing and robust principal component analysis
EJ Candès, X Li, Y Ma, J Wright, 2011. "Robust principal component analysis?", Journal of the ACM.
11/26/20 (Thursday)
No class
Weakly-supervised learning
Semi-supervised learning
T. G. Dietterich, R. H. Lathrop, T. Lozano-Perez, "Solving the multiple instance problem with axis-parallel rectangles", Artificial Intelligence 1997.
C Zhang, JC Platt, PA Viola, "Multiple instance boosting for object detection", NeurIPS 2006.
X. Zhu, "Semi-supervised learning literature survey", technical report, 2005.
M Belkin, P Niyogi, V Sindhwani, "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples", JMLR 2006.
Convolutional Neural Networks
Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, L. Jackel, "Backpropagation applied to handwritten zip code recognition", Neural Computation, 1989.
A Krizhevsky, I Sutskever, GE Hinton, "Imagenet classification with deep convolutional neural networks", NeurIPS 2012.
K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition", ICLR 2015.
K He, X Zhang, S Ren, J Sun, "Deep Residual Learning for Image Recognition", CVPR 2016.
C. Szegedy, S. Ioffe, V. Vanhoucke, A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning", arXiv 2016.
S. Ioffe, C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", arXiv 2015.
S. Ruder, An overview of gradient descent optimization algorithms
Graph Neural Networks
W. Hamilton, Z. Ying, and J. Leskovec, "Inductive representation learning on large graphs", Advances in neural information processing systems, 2017.
T. Kipf and M. Welling, "Semi-Supervised Classification with Graph Convolutional Networks", ICLR 2017.
F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model", IEEE Transactions on Neural Networks, 2008.
J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, "Graph Neural Networks: A Review of Methods and Applications", arXiv:1812.08434, 2018.
Generative adversarial networks and Introspective neural networks
Z. Tu, "Learning Generative Models via Discriminative Approaches", CVPR 2007.
I Goodfellow, J Pouget-Abadie, M Mirza, B Xu, D Warde-Farley, S Ozair, A Courville, Y Bengio, "Generative Adversarial Networks", NeurIPS 2014.
DP Kingma, M Welling, "Auto-encoding variational bayes", ICLR 2014.
A Radford, L Metz, S Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", ICLR 2016.
M Arjovsky, S Chintala, L Bottou, "Wasserstein GAN", ICML 2017.
K Lee, W Xu, F Fan, Z Tu, "Wasserstein Introspective Neural Networks", CVPR 2018.
12/10/20