Synopsis
This course is open to senior undergraduates in applied mathematics, statistics, and engineering who are interested in learning from data. It covers hot topics in statistical learning (also known as machine learning), illustrated with a variety of applications.
Note: there is not enough time in this course to cover deep learning methods. Those interested in that topic may refer to my postgraduate course Math 5472.
Reference
An Introduction to Statistical Learning: with Applications in R, by James, Witten, Hastie, and Tibshirani.
Syllabus (Fall, 2022, HKUST)
Introduction to the probabilistic world. [lecture note]
Lecture 1. Overview of statistical machine learning. [lecture note]
Lecture 2. Linear models. [lecture note]
Lecture 3. Classification. [lecture note]
Code example [logistic regression][pdf]
Lecture 4. Resampling. [lecture note]
Lecture 5. Model selection and regularization. [lecture note]
Code example [Ridge regression with CV]
Lecture 6. Algorithm. [lecture note]
Code example [Coordinate descent for Lasso]
Lecture 7. Tree-based methods. [lecture note]
Code examples [GradientBoostingDemo][GradientBoosting_sklearn][GradientBoosting_rpart][GBM_rpart_tutorial]
Suggested reading: Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success [link]
Examples of random forests will be given in the tutorial.
More coding examples are available in the tutorial session.
Assignment
Assignment 3
Assignment 4
Final Exam
Policy
Assignment (40%) + Final Exam (60%)
Tutorials by our TA, Zhiwei Wang (HKUST PhD student)