MATH 5472. Computer-Age Statistical Inference and its applications
Synopsis
This course is designed for PhD students (year 1) in applied mathematics, statistics, and engineering who are interested in learning from data. It covers advanced topics in statistical machine learning, with emphasis on the integration of statistical models and algorithms for statistical inference. The course first makes connections among classical topics and then moves on to modern topics, including a statistical view of deep learning. Various applications will be discussed, such as computer vision, human genetics, and text mining.
Note: On the one hand, this course can be challenging for some non-math students, as some homework requires mathematical derivation. On the other hand, it can be challenging for some math students, as it requires coding. If you are still interested, then let's suffer to learn together! Of course, students are also welcome to audit.
Lecture information
Tuesday and Thursday, 03:00PM - 04:20PM, Room 5506 (lifts 25-26), Main Academic Building, HKUST.
Introduction. [Note]
Lecture 1. James-Stein Estimator and Empirical Bayes. [Lecture note] (a toy code sketch follows the reading list)
Ref: Stein's Unbiased Risk Estimate (SURE) [link]
Ref: Empirical Bayes and missing species (Efron's book "Large-Scale Inference. Empirical Bayes Methods for Estimation, Testing, and Prediction." Section 11.5)
Suggested Reading:
Empirical Bayes estimation of normal means, accounting for uncertainty in estimated standard errors. [link]
Newton-Stein Method: An Optimization Method for GLMs via Stein's Lemma [link]
How Biased is the Apparent Error of an Estimator Tuned by SURE? [link]
Tractable Evaluation of Stein’s Unbiased Risk Estimator with Convex Regularizers [link]
Flexible signal denoising via flexible empirical Bayes shrinkage [link]
Understanding Diffusion Models: A Unified Perspective [link] Very hot topic!!!
Empirical Bayes: Concepts and Methods [link] Very nice review!!
Empirical Bayes: Ideals and applications [talk by Prof. Efron]
Bayesian lens and Bayesian blinker [link]
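To make the shrinkage idea concrete, here is a toy numerical illustration of the James-Stein estimator (a minimal sketch in plain numpy; the normal-means setup and all names are illustrative, and this is not the course demo code):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 50
theta = rng.normal(0, 2, size=p)   # true means
z = theta + rng.normal(size=p)     # one observation per mean: z_i ~ N(theta_i, 1)

# James-Stein: shrink the MLE toward zero by a data-driven factor
shrink = 1 - (p - 2) / np.sum(z ** 2)
theta_js = shrink * z

print("MSE of MLE:        ", np.mean((z - theta) ** 2))
print("MSE of James-Stein:", np.mean((theta_js - theta) ** 2))
```

For p >= 3 the James-Stein estimate has smaller total squared error than the MLE; the positive-part variant additionally clips the shrinkage factor at zero.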
Lecture 2. Linear mixed models. [Lecture note] (a code sketch of the evidence approximation follows the reading list)
R package: Variance Component Model [link]
Ref: PRML, Chapters 2, 3, and 6. (Gaussian distribution, Bayesian linear model, and Gaussian process)
Suggested reading
Ridge Regularization: an Essential Concept in Data Science [link] Very nice review!
A unified framework for cross-population trait prediction by leveraging the genetic correlation of polygenic traits. [link] A state-of-the-art algorithm for linear mixed models, with applications to large-scale real datasets (e.g., millions of features from millions of samples).
Bishop (2006) Chapter 3.5.2. A fixed point algorithm for evidence approximation (Empirical Bayes).
Bayesian Model Selection, the Marginal Likelihood, and Generalization. [link] The claims of this paper are quite debatable.
A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression. [link]
A Study of Error Variance Estimation in Lasso Regression [link]
Bayesian Lasso [link]
Pure Fourier series animation montage [link]
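As a concrete companion to the Bishop Chapter 3.5.2 item above, here is a minimal sketch of the fixed-point evidence approximation (empirical Bayes) for the Bayesian linear model; the simulated data, initializations, and variable names are illustrative assumptions, not course code:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)   # true noise precision = 4

alpha, beta = 1.0, 1.0                      # prior precision, noise precision
eig = np.linalg.eigvalsh(X.T @ X)
for _ in range(100):
    # posterior of the weights given the current (alpha, beta)
    S_inv = alpha * np.eye(d) + beta * X.T @ X
    m = beta * np.linalg.solve(S_inv, X.T @ y)
    # gamma = effective number of well-determined parameters
    lam = beta * eig
    gamma = np.sum(lam / (alpha + lam))
    # fixed-point updates that maximize the marginal likelihood
    alpha = gamma / (m @ m)
    beta = (n - gamma) / np.sum((y - X @ m) ** 2)

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```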
Lecture 3. Explicit and implicit regularization in supervised learning. [Lecture note] (a gradient boosting sketch follows the code examples)
Ref: Additive logistic regression: a statistical view of boosting [link]
Ref: Greedy function approximation: A gradient boosting machine. [link]
Ref: Boosting as a Regularized Path to a Maximum Margin Classifier [link]
Ref: Evidence Contrary to the Statistical View of Boosting (with discussion) [link]
Ref: A General Framework for Fast Stagewise Algorithms [link]
Ref: Statistical Modeling: The Two Cultures [link]
Suggested reading: Recent papers discussing Breiman's Two Cultures [Observational Studies]
Suggested reading: Gradient and Newton Boosting for Classification and Regression [link]
Suggested reading: Gaussian Process Boosting [link]
Suggested reading: Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success [link]
Code examples [GradientBoostingDemo][GradientBoosting_sklearn][GradientBoosting_rpart]
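In the same spirit as the demos above, a minimal from-scratch sketch of least-squares gradient boosting with shrinkage (scikit-learn regression trees on simulated data; an illustrative toy, not the linked course code):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=300)

nu, n_trees = 0.1, 100                  # learning rate, boosting rounds
F = np.full_like(y, y.mean())           # initial constant fit
trees = []
for _ in range(n_trees):
    r = y - F                           # negative gradient of squared loss = residuals
    tree = DecisionTreeRegressor(max_depth=2).fit(X, r)
    F += nu * tree.predict(X)           # shrunken stagewise update
    trees.append(tree)

def predict(X_new):
    return y.mean() + nu * sum(t.predict(X_new) for t in trees)
```

The small learning rate nu plays the regularization role discussed in the references above: many weakly shrunken stagewise steps trace out a regularized path, much like an explicit penalty.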
Lecture 4. The Expectation-Maximization (EM) algorithm and its extension. [lecture note 1][lecture note 2] (an EM sketch for Gaussian mixtures follows the reading list)
Ref: Liu C., Rubin D.B. and Wu Y.N. 1998. Parameter expansion to accelerate EM: the PX-EM algorithm.
Ref: Lewandowski A., Liu C. and Vander Wiel, S. 2010. Parameter Expansion and Efficient Inference.
Ref: Learning From Crowds [link]
Ref: Methods for correcting inference based on outcomes predicted by machine learning [link]
Suggested Reading:
The art of data augmentation. [link]
Using Redundant Parameterizations to Fit Hierarchical Models [link]
Majorization-minimization algorithms in signal processing, communications, and machine learning [link]
Auxiliary Deep Generative Models [link] This is related to auxiliary-variable MCMC.
Data Augmentation for Bayesian Deep Learning [link]
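A minimal sketch of EM for a two-component Gaussian mixture, the canonical example behind Lecture 4 (plain numpy/scipy; the simulated data and initial values are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])

pi, mu, sd = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(50):
    # E-step: responsibility of component 1 for each point
    p1 = pi * norm.pdf(x, mu[0], sd[0])
    p2 = (1 - pi) * norm.pdf(x, mu[1], sd[1])
    g = p1 / (p1 + p2)
    # M-step: weighted maximum-likelihood updates
    pi = g.mean()
    mu = np.array([np.sum(g * x) / g.sum(),
                   np.sum((1 - g) * x) / (1 - g).sum()])
    sd = np.sqrt(np.array([np.sum(g * (x - mu[0]) ** 2) / g.sum(),
                           np.sum((1 - g) * (x - mu[1]) ** 2) / (1 - g).sum()]))

print(pi, mu, sd)   # should approach 0.5, (-2, 2), (1, 1)
```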
Lecture 5. Variational Inference. [Lecture note] (a CAVI sketch follows the reading list)
Suggested reading
Variational Inference: A Review for Statisticians. [link]
Advances in Variational Inference. [arXiv link][PAMI version]
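A minimal sketch of coordinate-ascent variational inference (CAVI) for a univariate Gaussian with unknown mean and precision, with a factorized q(mu)q(tau) as in Bishop's Chapter 10 example; the prior hyperparameters and data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(1.5, 2.0, size=100)    # unknown mean 1.5, precision 0.25
N, xbar = len(x), x.mean()

# prior: mu | tau ~ N(mu0, (lam0 * tau)^-1), tau ~ Gamma(a0, b0)
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

E_tau = 1.0                            # initialize E_q[tau]
for _ in range(50):
    # update q(mu) = N(mu_N, 1 / lam_N)
    mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
    lam_N = (lam0 + N) * E_tau
    # update q(tau) = Gamma(a_N, b_N) using expectations under q(mu)
    a_N = a0 + (N + 1) / 2
    b_N = b0 + 0.5 * (np.sum((x - mu_N) ** 2) + N / lam_N
                      + lam0 * ((mu_N - mu0) ** 2 + 1 / lam_N))
    E_tau = a_N / b_N

print(f"E_q[mu] = {mu_N:.3f}, E_q[tau] = {E_tau:.3f}")
```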
Lecture 6. False discovery rate. [Lecture note]
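A minimal sketch of the Benjamini-Hochberg step-up procedure for FDR control at level q (plain numpy; the toy p-value mixture is an illustrative assumption, not course code):

```python
import numpy as np

def bh_reject(pvals, q=0.1):
    """Reject the k smallest p-values, where k is the largest i
    with p_(i) <= q * i / m (Benjamini-Hochberg step-up rule)."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = np.nonzero(np.sort(pvals) <= q * np.arange(1, m + 1) / m)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        reject[order[: below[-1] + 1]] = True
    return reject

# toy mixture: 90 true nulls (uniform p-values) and 10 signals
rng = np.random.default_rng(5)
p = np.concatenate([rng.uniform(size=90), rng.uniform(0, 1e-3, size=10)])
print(bh_reject(p, q=0.1).sum(), "rejections")
```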
Lecture 7. Matrix factorization. [Lecture note] (a matrix completion sketch follows the reading list)
Suggested reading
Principal Component Analysis. A review article on PCA, which appeared in Nature Reviews. [link]
Low-Rank Modeling and Its Applications in Image Analysis. [The Matlab code to produce the results presented in this paper]
Empirical Bayes Matrix Factorization [link]
Sparse Bayesian methods for low-rank matrix estimation. [link]
Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares.
Genes mirror geography within Europe [link]
Joint modelling of confounding factors and prominent genetic regulators provides increased accuracy in genetical genomics studies [link]
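Related to the matrix completion item above, a minimal soft-impute-style sketch: iterate an SVD with soft-thresholded singular values on the matrix whose missing entries are filled in by the current estimate. The simulated low-rank data and penalty level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, r = 100, 80, 3
M = rng.normal(size=(n, r)) @ rng.normal(size=(r, m))  # true low-rank matrix
mask = rng.uniform(size=(n, m)) < 0.5                  # observed entries

Z = np.zeros((n, m))
lam = 2.0                                              # soft-threshold level
for _ in range(100):
    filled = np.where(mask, M, Z)         # observed values + current guesses
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    Z = (U * np.maximum(s - lam, 0.0)) @ Vt   # shrink singular values

err = np.linalg.norm((Z - M)[~mask]) / np.linalg.norm(M[~mask])
print("relative error on missing entries:", err)
```

Soft-thresholding the singular values is the proximal step for a nuclear-norm penalty, which is what makes this relaxation of the rank constraint convex.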
Lecture 8. Latent Dirichlet Allocation and the PSD (Pritchard-Stephens-Donnelly) model. [lecture note] (an LDA usage sketch follows the references)
Ref: Latent Dirichlet Allocation [link]
Ref: Finding scientific topics. PNAS [link] (Gibbs sampling for topic models)
Ref: Non-negative matrix factorization algorithms greatly improve topic model fits. arXiv:2105.13440.
Ref: Inference of population structure using multilocus genotype data. Genetics. 155.
Ref: Section 13.5 modeling population admixture. Computer age statistical inference by Efron and Hastie. 2016.
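A minimal usage sketch of variational LDA via scikit-learn's LatentDirichletAllocation (the toy corpus below is made up for illustration; the references above use Gibbs sampling and NMF-based fits instead):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["gene expression genotype population",
        "population structure admixture genotype",
        "topic model document word inference",
        "word document topic corpus inference"]

vec = CountVectorizer()
X = vec.fit_transform(docs)                       # document-term count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

theta = lda.transform(X)                          # per-document topic proportions
for k, phi_k in enumerate(lda.components_):       # per-topic word weights
    top = phi_k.argsort()[::-1][:4]
    print(f"topic {k}:", [vec.get_feature_names_out()[i] for i in top])
```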
Lecture 9. Generative adversarial networks. [lecture note]
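A minimal GAN sketch on a one-dimensional toy distribution, assuming PyTorch; the architectures, optimizer settings, and the non-saturating generator loss are standard illustrative choices, not the course implementation:

```python
import torch
import torch.nn as nn

# toy GAN: the generator learns to mimic samples from N(3, 1)
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # logits
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(2000):
    real = 3 + torch.randn(64, 1)
    fake = G(torch.randn(64, 8))
    # discriminator step: real -> 1, fake -> 0 (generator frozen via detach)
    loss_d = bce(D(real), ones) + bce(D(fake.detach()), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # generator step: non-saturating loss, try to fool the discriminator
    loss_g = bce(D(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())   # should approach 3
```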
Lecture 10. Variational inference in deep learning. [lecture note 1][lecture note 2]
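A minimal variational autoencoder sketch with the reparameterization trick and a Gaussian ELBO, assuming PyTorch; the architecture and toy data are illustrative assumptions, not the course implementation:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=20, d_z=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * d_z))
        self.dec = nn.Sequential(nn.Linear(d_z, 64), nn.ReLU(),
                                 nn.Linear(64, d_in))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)   # q(z|x) = N(mu, diag(e^logvar))
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.dec(z), mu, logvar

def neg_elbo(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=-1)          # Gaussian likelihood (up to const.)
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(dim=-1)  # KL(q || N(0, I))
    return (recon + kl).mean()

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(256, 20)                            # toy data
for step in range(500):
    x_hat, mu, logvar = model(x)
    loss = neg_elbo(x, x_hat, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()
```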
Case study [ppt]
Episode [You are very much ON TIME]
Reference books
Bishop C. (2006) Pattern Recognition and Machine Learning [link]
Efron B. and Hastie T. (2016) Computer-Age Statistical Inference [link]
Kevin Patrick Murphy. (2022) Probabilistic Machine Learning: An Introduction [link]
Kevin Patrick Murphy. (2022) Probabilistic Machine Learning: Advanced topics [link]
John Winn, Christopher M. Bishop, Thomas Diethe, John Guiver and Yordan Zaykov. Model-based machine learning [link for early access]
Grading policy: Assignments (60%) + Project (40%)
Assignments (60%)
Assignment 1 [pdf]
Assignment 2 [pdf]
Assignment 3 [pdf]
Assignment 4 [pdf]
Project (40%)
In this project, you will choose one paper from the "Project list" (to be posted). The purpose of the project is to learn to critically read and discuss papers in statistics and machine learning. These papers can be new and potentially influential works, or they can be older important works that you may not have seen in other classes. Please inform the instructor once you decide which paper to work on. No more than two students may work on the same paper, and topics are assigned on a first-come, first-served basis.
Requirement: each student needs to submit a report on his/her chosen paper. Rough format: an overview of the paper; simulations or examples illustrating its key results (based on your own implementation); and a summary of the main points. Your report should also include a GitHub link to your code so that your results can be easily reproduced. Aim for 6-10 pages. Click here for the LaTeX template. You can check the scribed notes of the journal club at CMU and use them as an example when preparing your own report.
Remark: There will be a grade discount if you use an existing implementation to reproduce the key results. If you do, you must state so explicitly in your report. Under the university's regulations on academic integrity, failing to make such a statement will incur a substantial penalty.
Deadline: Dec. 15, 2023
Please choose your project here [link]
Project list:
1. Weighted Low Rank Matrix Approximation and Acceleration.
2. Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares
3. Flexible signal denoising via flexible empirical Bayes shrinkage. Journal of Machine Learning Research 22(93): 1-28.
4. Empirical Bayes estimation of normal means, accounting for uncertainty in estimated standard errors. arXiv:1901.10679.
5. False discovery rates: a new deal. Biostatistics 18(2): 275-294.
6. Finding scientific topics. PNAS [link] (Gibbs sampling for topic models)
7. Non-negative matrix factorization algorithms greatly improve topic model fits. arXiv:2105.13440.
8. Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse [link]
9. varbvs: fast variable selection for large-scale regression. arXiv:1709.06597.
10. Empirical Bayes matrix factorization. Journal of Machine Learning Research 22(120): 1-40.
11. Variational Inference for Latent Variables and Uncertain Inputs in Gaussian Processes. Journal of Machine Learning Research [link]
12. Maximum Likelihood for Gaussian Process Classification and Generalized Linear Mixed Models under Case-Control Sampling. Journal of Machine Learning Research [link]
13. The Implicit Regularization of Stochastic Gradient Flow for Least Squares. International Conference on Machine Learning, 2020.
14. Generalizing RNA velocity to transient cell states through dynamical modelling. [link]
15. SPICEMIX enables integrative single-cell spatial modeling of cell identity [link]
16. ebnm: an R package for solving the empirical Bayes normal means problem using a variety of prior families. arXiv:2110.00152.
17. Latent Dirichlet Allocation [link]
18. Gaussian Process Boosting [link]
19. Diffusion Posterior Sampling For General Noisy Inverse Problems. [link]
20. Construction of a 3D whole organism spatial atlas by joint modelling of multiple slices with deep neural networks [link]
21. XMAP: Cross-population fine-mapping by leveraging genetic diversity and accounting for confounding bias [link]