Class time: W 0930-1215
Location: YC Liang Hall 104
Outline: 2025Fall_S3005_outline.pdf
Password: see Blackboard
Name: Kin Wai CHAN
Email: kinwaichan@cuhk.edu.hk
Office: LSB 115
Tel: 3943 7923
Office hour:
I have an open-door policy. Feel free to drop by anytime and ask me questions.
Cheuk Hin (Andy) CHENG
Email: andychengcheukhin@link.cuhk.edu.hk
Office: LSB G32
Tel: 3943 8535
Yi Ho (Henry) NGAN
Email: yihongan@link.cuhk.edu.hk
Office: LSB G30
Tel: 3943 8534
This course introduces a wide variety of nonparametric techniques for performing statistical inference and prediction, emphasizing both conceptual foundations and practical implementation. Basic theoretical justification is also provided. The content covers three broad themes: (i) rank-type and order-type methods for handling location, dispersion, correlation, distribution and regression problems, (ii) resampling-type procedures for testing and assessing precision, and (iii) smoothing-type techniques for estimation and prediction. Topics include the Wilcoxon signed-rank test, the Mann-Whitney rank sum test, Spearman’s rho, Kendall’s tau, the Kruskal-Wallis test, the Kolmogorov-Smirnov test, bootstrapping, the jackknife, subsampling, permutation tests, kernel methods, k-nearest neighbours, tree-based methods, classification, etc.
Note: No prerequisite but knowledge of Stat 2001, 2005 and 2006 is strongly recommended.
A self-contained set of lecture notes is the main reference. Complementary textbooks include
(Major) Bonnini, S., Corain, L., Marozzi, M., and Salmaso, L. (2014). Nonparametric Hypothesis Testing: Rank and Permutation Methods with Applications in R. Wiley.
(Major) Wasserman, L. (2006). All of Nonparametric Statistics. Springer.
(Minor) Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference. Springer.
(Minor) James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. Springer.
Upon finishing the course, students are expected to
appreciate the beauty of nonparametric methods;
apply a wide variety of nonparametric techniques to perform inference, prediction and learning tasks;
understand the pros and cons of parametric and nonparametric methods;
master the skills of deriving basic theoretical properties of nonparametric methods;
use computer programs to perform nonparametric statistical analysis for real-life problems.
There are three main assessment components, plus a bonus component.
a (out of 100) is the average score of approximately eight assignments with the lowest two scores dropped;
m (out of 100) is the score of the mid-term project;
f (out of 100) is the score of the final project; and
b (out of 2) is the bonus score, awarded to students who actively participate in class.
The total score t (out of 100) is given by
t = min{100, 0.3a + 0.2max(m,f) + 0.5f + b}
If min(t, f) < 30, the final letter grade will be handled on a case-by-case basis. Otherwise, your letter grade will be in the A range if t ≥ 85, at least in the B range if t ≥ 65, and at least in the C range if t ≥ 55.
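As a rough illustration of the formula above, the following Python sketch computes t from made-up scores. The function names and the sample scores are assumptions for illustration only, and the sketch assumes that a is the plain average of the assignment scores left after the two lowest are dropped. It is not an official grade calculator.

```python
def assignment_average(scores, n_dropped=2):
    """Average of the assignment scores after dropping the lowest n_dropped.

    Assumption: 'a' is a plain (unweighted) average of the remaining scores.
    """
    if len(scores) <= n_dropped:
        raise ValueError("need more scores than the number dropped")
    kept = sorted(scores)[n_dropped:]  # discard the lowest n_dropped scores
    return sum(kept) / len(kept)


def total_score(a, m, f, b=0.0):
    """t = min{100, 0.3a + 0.2max(m, f) + 0.5f + b}."""
    return min(100.0, 0.3 * a + 0.2 * max(m, f) + 0.5 * f + b)


# Hypothetical scores, for illustration only.
assignments = [90, 80, 70, 60, 100, 85, 75, 95]
a = assignment_average(assignments)  # 60 and 70 are dropped, so a = 87.5
t = total_score(a, m=60, f=80, b=1)  # 26.25 + 16 + 40 + 1 = 83.25
print(round(a, 2), round(t, 2))
```

Because the formula uses max(m, f), a final project score higher than the mid-term score replaces the mid-term score, so f then carries a weight of 0.7 in total.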
* For the most up-to-date information, please always refer to the course outline announced by the course instructor on Blackboard, which shall prevail over the above information if there is any discrepancy.
Introduction: history, philosophy, examples.
Statistical foundation: basic testing and estimation, statistical limiting theorems.
Location and scale problems: sign test, signed-rank test, rank sum test, Ansari–Bradley test.
Correlation problem: Spearman’s ρ, Kendall’s τ, Bergsma–Dassios’s correlation, Chatterjee correlation.
Distribution problem: Kolmogorov–Smirnov test, Cramér–von Mises test, Anderson–Darling test.
Permutation tests: ideas of randomization, examples of permutation tests.
Bootstrap and subsampling: different bootstrapping methods, jackknife, subsampling.
Density estimation: histogram, kernel method, bandwidth selection.
Nonparametric regression: Nadaraya–Watson kernel estimator, local polynomial estimator.
Other topics: (a) classification, (b) Bayesian nonparametrics, (c) rank-type regression, (d) k-nearest neighbor, ...
* Click (S3005/2025Fall/lecture) to download lecture notes (or click the individual links below).
* The finalized version of the notes will be uploaded one day before the lecture.
* All rights reserved by the authors. Re-distribution by any means is strictly prohibited.
Front matter
Part I: Philosophy and Foundation
Part II: Rank-type and order-type methods
Chapter 3: Location and scale problems
Chapter 4: Correlation problem
Chapter 5: Distribution problem
Part III: Resampling-type procedures
Part IV: Smoothing-type estimation and learning techniques
Appendices
Appendix A: Basic Mathematics
Appendix B: Basic Probability
Appendix C: Basic Statistics
Appendix D: Basic programming in R --- for students who want to review; read Lectures 2 and 3 in RMSC 1101
Appendix E: R code used throughout the course can be found in the lecture folder (this folder will be updated from time to time).
P.S.: Not all materials in the appendices are directly useful for this course. I will tell you which parts are useful when we need them.
* Click (S3005/2025Fall/A) to download assignments.
Assignment 1: concepts of nonparametric methods, theory of ranks, simulation experiments --- Due: 26 Sep (Fri) @1800
Remark: Any form of generative AI is NOT allowed to be used for the assignments.
* Click (S3005/2024Fall/inclassNote) to download in-class notes.
* In-class notes will be uploaded within one week after the lecture.
Lecture 1 (3 Sep) --- Review (indicator, estimation, testing), new statistics (rank, sign, order, ...), theory of ranks
Lecture 2 (10 Sep) --- Proof of Thm 3.2 (Eg 2.5), rank principle, 5 tests (sign, signed rank, rank sum, trend, A-B)
* Click (S3005/2025Fall/quiz) to download quizzes.
Start time: 17 October (Friday) @ 6:30 pm
Duration: 3 hours
Location: LSB LT 3 & 5 (tentative)
Scope: TBA
Instruction: TBA
Mock: TBA
Start time: 12 December (Friday) @ 6:30 pm
Duration: 3 hours
Location: LSB LT 2 & 5 (tentative)
Scope: TBA
Instruction: TBA
Mock: TBA