Synopsis
This course is designed for first-year PhD students in applied mathematics, statistics, and engineering who are interested in learning from data. It covers advanced topics in statistical machine learning, with an emphasis on integrating statistical models and algorithms for statistical inference. The course first makes connections among classical topics and then moves on to modern topics, including a statistical view of deep learning. Various applications will be discussed, such as computer vision, human genetics, and text mining.
Note: On the one hand, this course can be challenging for some non-math students, as some homework requires mathematical derivation. On the other hand, it can be challenging for some math students, as it requires coding. If you are still interested, then let's suffer to learn! Of course, students are also welcome to audit.
Lecture information
Fall 2025, Tuesday and Thursday, 3:00PM - 4:20PM, Rm 2406 (Lift 17-18), Main Academic Building, HKUST.
Introduction. [Note]
Illustrating the Practical Value of Statistics: A Simple Example [pdf]
Suggested reading:
Computer Age Statistical Inference [book], Part I: Classic Statistical Inference.
Ten Statistical Ideas that Changed the World [link]
Lecture 1. James-Stein Estimator and Empirical Bayes. [Lecture note] (see the code sketch after the reading list below)
Ref: Stein's Unbiased Risk Estimate (SURE) [link]
Suggested reading:
Empirical Bayes: Concepts and Methods [link] Very nice review!!
Empirical Bayes: Ideas and Applications [link] Very nice talk!
Bayesian lens and Bayesian blinker [link]
Tractable Evaluation of Stein’s Unbiased Risk Estimator with Convex Regularizers [link]
ebnm: An R Package for Solving the Empirical Bayes Normal Means Problem Using a Variety of Prior Families. [link]
Understanding Diffusion Models: A Unified Perspective [link] Very hot topic!!!
Diffusion Posterior Sampling for General Noisy Inverse Problems (Tweedie's formula in AI) [link]
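To get a quick feel for the shrinkage idea, here is a minimal simulation sketch of the James-Stein estimator for the Gaussian normal-means problem (the setup and variable names are illustrative, not taken from the lecture note):

import numpy as np

rng = np.random.default_rng(0)
p = 50                                  # number of means; JS dominates the MLE for p >= 3
theta = rng.normal(0, 1, size=p)        # true means
z = rng.normal(theta, 1)                # one observation per mean, z_i ~ N(theta_i, 1)

# James-Stein: shrink the MLE z toward zero by a data-driven factor
theta_js = (1 - (p - 2) / np.sum(z**2)) * z

print("MLE total squared error:", np.sum((z - theta)**2))
print("JS  total squared error:", np.sum((theta_js - theta)**2))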
Lecture 2. Linear mixed models. [Lecture note]
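As a warm-up, a minimal random-intercept example using statsmodels (the simulated data and all names are illustrative):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate grouped data: y_ij = beta0 + beta1 * x_ij + u_i + e_ij,
# with a random intercept u_i ~ N(0, sigma_u^2) for each group i.
rng = np.random.default_rng(1)
n_groups, n_per = 20, 30
group = np.repeat(np.arange(n_groups), n_per)
u = rng.normal(0, 1.0, n_groups)                 # random intercepts
x = rng.normal(size=n_groups * n_per)
y = 1.0 + 2.0 * x + u[group] + rng.normal(0, 0.5, n_groups * n_per)
df = pd.DataFrame({"y": y, "x": x, "group": group})

# Fit the random-intercept linear mixed model y ~ x with one random effect per group
model = smf.mixedlm("y ~ x", df, groups=df["group"])
result = model.fit()
print(result.summary())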
Lecture 3. Explicit and implicit regularization in supervised learning. [Lecture note]
Ref: Additive logistic regression: a statistical view of boosting [link]
Ref: Greedy function approximation: A gradient boosting machine. [link]
Ref: Boosting as a Regularized Path to a Maximum Margin Classifier [link]
Suggested reading: Gradient and Newton Boosting for Classification and Regression [link]
Suggested reading: Gaussian Process Boosting [link]
Suggested reading: Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success [link]
Suggested reading: Why do tree-based models still outperform deep learning on tabular data? [link]
Suggested reading: Tabular Data: Deep Learning is Not All You Need [link]
Interview with Jerome H. Friedman about Gradient Boosting [link] Very interesting interview with many stories.
Code examples [GradientBoostingDemo][GradientBoosting_sklearn][GradientBoosting_rpart]
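In the same spirit as the demos above, a minimal scikit-learn sketch showing shrinkage (learning_rate) and subsampling acting as regularization knobs (toy data, illustrative settings):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Toy regression problem: y = sin(x) + noise
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, 500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Small learning rate + shallow trees + row subsampling = implicit regularization
gbm = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=2, subsample=0.8, random_state=0)
gbm.fit(X_tr, y_tr)
print("test MSE:", mean_squared_error(y_te, gbm.predict(X_te)))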
Lecture 4. The Expectation-Maximization (EM) algorithm and its extensions. [Lecture note 1][Lecture note 2]
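A minimal sketch of EM for a two-component univariate Gaussian mixture (simulated data; the initialization and names are illustrative):

import numpy as np

# Simulate a two-component 1-D Gaussian mixture
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# Initialize (pi, mu, sigma), then alternate E and M steps
pi, mu, sigma = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(100):
    # E-step: posterior responsibility of component 2 for each point
    p1 = (1 - pi) * normal_pdf(x, mu[0], sigma[0])
    p2 = pi * normal_pdf(x, mu[1], sigma[1])
    r = p2 / (p1 + p2)
    # M-step: update mixing weight, means, and standard deviations
    pi = r.mean()
    mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
    sigma = np.array([np.sqrt(np.average((x - mu[0])**2, weights=1 - r)),
                      np.sqrt(np.average((x - mu[1])**2, weights=r))])

print(pi, mu, sigma)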
Lecture 5. Variational Inference. [Lecture note 1][Lecture note 2] (see the ELBO identity after the references below)
Ref: Variational Inference: A Review for Statisticians. [link]
Ref: Advances in Variational Inference. [arXiv link][PAMI version]
Ref: Covariance, robustness and Variational Bayes. [link]
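For orientation, the key identity behind variational inference (as in the Blei et al. review above), in LaTeX:

\log p(x) \;=\; \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q)} \;+\; \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)

Since the left side does not depend on q, maximizing the ELBO over a tractable family of distributions q is equivalent to minimizing the KL divergence to the exact posterior.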
Lecture 6. False discovery rate. [Lecture note]
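A minimal sketch of the Benjamini-Hochberg step-up procedure (the function name and toy data are illustrative):

import numpy as np

def benjamini_hochberg(pvals, alpha=0.1):
    """Return a boolean mask of rejections under the BH procedure."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    # Find the largest k with p_(k) <= alpha * k / m, then reject the k smallest
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Toy example: 900 nulls (uniform p-values) mixed with 100 signals
rng = np.random.default_rng(4)
p = np.concatenate([rng.uniform(size=900), rng.beta(0.1, 10, size=100)])
print("rejections:", benjamini_hochberg(p).sum())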
Lecture 7. Matrix factorization. [Lecture note] (see the SVD sketch after the references below)
Principal Component Analysis. A review article on PCA, published in Nature Reviews. [link]
Low-Rank Modeling and Its Applications in Image Analysis. [The Matlab code to produce the results presented in this paper]
Empirical Bayes Matrix Factorization [link]
Sparse Bayesian methods for low-rank matrix estimation. [link]
Genes mirror geography within Europe [link]
Joint modelling of confounding factors and prominent genetic regulators provides increased accuracy in genetical genomics studies [link]
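A minimal sketch of low-rank approximation via the truncated SVD, the building block behind PCA and many of the models above (toy data, illustrative names):

import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 40))
Xc = X - X.mean(axis=0)                 # center columns, as in PCA

# Best rank-r approximation X_r = U_r S_r V_r^T (Eckart-Young theorem)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
r = 5
X_r = (U[:, :r] * s[:r]) @ Vt[:r]
scores = Xc @ Vt[:r].T                  # principal component scores

print("rank-5 reconstruction error:", np.linalg.norm(Xc - X_r))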
Lecture 8. Latent Dirichlet Allocation and the PSD (Pritchard-Stephens-Donnelly) model. (see the code sketch after the references below)
Ref: Pritchard, Stephens, and Donnelly (2000). Inference of population structure using multilocus genotype data. Genetics 155.
Ref: Section 13.5, Modeling Population Admixture, in Computer Age Statistical Inference by Efron and Hastie (2016).
Ref: Finding scientific topics. PNAS [link] (Gibbs sampling for topic models)
Ref: Latent Dirichlet Allocation [link]
Suggested reading
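For a hands-on feel, a minimal scikit-learn sketch of fitting LDA topics on a small text corpus (the corpus choice and settings are illustrative; the dataset is downloaded on first use):

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = fetch_20newsgroups(remove=("headers", "footers", "quotes")).data[:2000]
vec = CountVectorizer(max_features=5000, stop_words="english")
X = vec.fit_transform(docs)             # document-term count matrix

lda = LatentDirichletAllocation(n_components=10, random_state=0)
doc_topics = lda.fit_transform(X)       # per-document topic proportions

# Print the top words of each fitted topic
vocab = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(f"topic {k}:", " ".join(vocab[topic.argsort()[-8:][::-1]]))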
Lecture 9. Generative adversarial networks. [Lecture note]
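For reference, the original minimax objective from Goodfellow et al. (2014), in LaTeX:

\min_G \max_D \; V(D, G) \;=\; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

The discriminator D is trained to separate real data from samples, while the generator G is trained to fool it; at the game's optimum the generator distribution matches the data distribution.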
Lecture 10. Variational inference in deep learning. [Lecture note 1][Lecture note 2] (see the sketch after the reading list below)
Ref: The principles of diffusion models [link]
Suggested reading:
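A minimal numerical sketch of the reparameterization trick and the closed-form Gaussian KL term used in variational autoencoders (the values are illustrative):

import numpy as np

# Reparameterization trick: sample z ~ N(mu, sigma^2) as z = mu + sigma * eps,
# eps ~ N(0, I), so gradients can flow through mu and sigma.
rng = np.random.default_rng(6)
mu, log_var = np.array([0.5, -1.0]), np.array([0.0, -2.0])
eps = rng.standard_normal(2)
z = mu + np.exp(0.5 * log_var) * eps

# KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior, in closed form
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
print(z, kl)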
Reference books
Bishop C. (2006) Pattern Recognition and Machine Learning [link]
Hastie T., Tibshirani R., and Friedman J. (2009) The Elements of Statistical Learning [link]
Efron B. and Hastie T. (2016) Computer-Age Statistical Inference [link]
Kevin Patrick Murphy. (2022) Probabilistic Machine Learning: An Introduction [link]
Kevin Patrick Murphy. (2023) Probabilistic Machine Learning: Advanced Topics [link]
John Winn, Christopher M. Bishop, Thomas Diethe, John Guiver and Yordan Zaykov. Model-based machine learning [link for early access]
Simon J.D. Prince (2023) Understanding Deep Learning. [link]
Bishop C. and Bishop H. (2024) Deep Learning: Foundations and Concepts. [link]
Grading policy: Assignment (60%) + Project (40%)
Assignment (60%): posted on Canvas
Project (40%)
To be posted.