MATH 4432 Statistical Machine Learning

Synopsis

This course is open to senior undergraduates in applied mathematics, statistics, and engineering who are interested in learning from data. It covers hot topics in statistical learning, also known as machine learning, illustrated with various applications.

Note: there is not enough time to cover deep learning methods. Those interested in that material may refer to my postgraduate course MATH 5472.

Reference

An Introduction to Statistical Learning: with Applications in R, by James, Witten, Hastie, and Tibshirani.

Syllabus (Fall, 2022, HKUST)

Introduction to the probabilistic world. [lecture note]

Lecture 1. Overview of statistical machine learning. [lecture note]

Lecture 2. Linear models. [lecture note]

Lecture 3. Classification. [lecture note]

A code example [logistic regression][pdf]

Lecture 4. Resampling. [lecture note]

Code examples [CV: logistic regression with variable screening][CV: LDA with variable screening]

Lecture 5. Model selection and regularization. [lecture note]

Code example [Ridge regression with CV]
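
For readers who want a feel for the idea before opening the linked example, here is a minimal sketch of choosing the ridge penalty by k-fold cross-validation. It is not the course's code: it assumes a single centred predictor with no intercept, so the ridge slope has the closed form sum(x*y) / (sum(x^2) + lambda), and the function names (`ridge_fit_1d`, `kfold_cv_error`, `select_lambda`) and the interleaved fold assignment are illustrative choices only.

```python
def ridge_fit_1d(x, y, lam):
    """Closed-form ridge slope for one centred predictor, no intercept."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


def kfold_cv_error(x, y, lam, k=5):
    """Mean squared prediction error of ridge over k interleaved folds."""
    n = len(x)
    total = 0.0
    for fold in range(k):
        # Observation i is held out in fold (i mod k); the rest are training data.
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        beta = ridge_fit_1d([x[i] for i in train], [y[i] for i in train], lam)
        total += sum((y[i] - beta * x[i]) ** 2 for i in test)
    return total / n


def select_lambda(x, y, grid, k=5):
    """Return the penalty in `grid` with the smallest CV error, plus all errors."""
    errors = {lam: kfold_cv_error(x, y, lam, k) for lam in grid}
    best = min(errors, key=errors.get)
    return best, errors


if __name__ == "__main__":
    # Toy data (made up for illustration): a noiseless line y = 2x.
    x = [i - 4.5 for i in range(10)]
    y = [2.0 * v for v in x]
    best, errors = select_lambda(x, y, grid=[0.0, 1.0, 10.0, 100.0])
    print("selected lambda:", best)
```

On noiseless data like this toy set, shrinkage can only hurt, so the smallest penalty should be selected; with noisy data and many predictors a nonzero penalty often wins.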

Lecture 6. Algorithm. [lecture note]

Code example [Coordinate descent for Lasso]
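
As a rough companion to the linked example, the sketch below shows cyclic coordinate descent for the Lasso in plain Python. It is an illustration under stated assumptions, not the course's implementation: the objective is taken to be (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1, the data are assumed centred so no intercept is fitted, and the names `soft_threshold` and `lasso_cd` are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200, tol=1e-10):
    """Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1.

    X is a list of rows; columns are assumed centred (no intercept).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Average inner product of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # no coordinate moved appreciably in a full sweep
    return beta


if __name__ == "__main__":
    # Toy orthogonal design (made up for illustration).
    X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    y = [3.0, -3.0, 0.5, -0.5]
    print(lasso_cd(X, y, lam=0.5))
```

Each coordinate update is a one-dimensional Lasso problem with a closed-form solution, which is why the inner step reduces to a single soft-thresholding operation. Updating the residual incrementally avoids recomputing X b from scratch after every coordinate.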

Lecture 7. Tree-based methods. [lecture note]

Code examples [GradientBoostingDemo][GradientBoosting_sklearn][GradientBoosting_rpart][GBM_rpart_tutorial]
Suggested reading: Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success [link]
Examples of Random Forest will be given in the tutorial.

More coding examples are available in the tutorial session.

Assignment

Assignment 1
Assignment 2
Assignment 3
Assignment 4

Final Exam

Policy

Assignment (40%) + Final Exam (60%)

Tutorials by our TA, Zhiwei Wang (HKUST PhD student)
