(Old Version)

STAT 3005

Nonparametric Statistics (2020-21 Fall)

Class Information

    • Class time: W 0930-1215

    • ZOOM ID: 987-4899-0267

    • ZOOM PW: See Blackboard

    • Outline

Instructor

    • Name: Kin Wai CHAN

    • Email: kinwaichan@cuhk.edu.hk

    • Office: LSB 115

    • Tel: 3943 7923

    • Office hour:
      (i) I have an open door policy. Feel free to drop by anytime and ask me questions.
      (ii) Because of the pandemic, you may also make an appointment with me for a ZOOM meeting.

Teaching Assistants

Man Fung Heman LEUNG

Hon Kiu James TO

Description

This course introduces a wide variety of nonparametric techniques for performing statistical inference and prediction, emphasizing both conceptual foundations and practical implementation. Basic theoretical justification is also provided. The content covers three broad themes: (i) rank-type and order-type methods for handling location, dispersion, correlation, distribution and regression problems, (ii) resampling-type procedures for testing and assessing precision, and (iii) smoothing-type techniques for estimation and prediction. Topics include the Wilcoxon signed-rank test, the Mann-Whitney rank sum test, Spearman’s rho, Kendall’s tau, the Kruskal-Wallis test, the Kolmogorov-Smirnov test, bootstrapping, the jackknife, subsampling, permutation tests, kernel methods, k-nearest neighbours, tree-based methods, classification, etc.

Note: There is no prerequisite, but knowledge of STAT 2001, 2005 and 2006 is strongly recommended.

Textbooks

A self-contained set of lecture notes is the main reference. Complementary textbooks include

Learning outcomes

Upon finishing the course, students are expected to

      1. appreciate the beauty of nonparametric methods;

      2. apply a wide variety of nonparametric techniques to perform inference, prediction and learning tasks;

      3. understand the pros and cons of parametric and nonparametric methods;

      4. master the skills in deriving basic theoretical properties of nonparametric methods;

      5. use computer programs to perform nonparametric statistical analysis for real life problems.

Assessment and Grading

There are three main assessment components, plus a bonus component.

      • a (out of 100) is the average score of approximately eight assignments with the lowest three scores dropped;

      • m (out of 100) is the score of the mid-term exam;

      • f (out of 100) is the score of the final exam; and

      • b (out of 2) is the number of bonus points, which are given to students who actively participate in class.

The total score t (out of 100) is given by

t = min{100, 0.3a + 0.2max(m,f) + 0.5f + b}

Your letter grade will be in the A range if t ≥ 85, at least in the B range if t ≥ 65, and at least in the C range if t ≥ 55. However, if f < 30, the final letter grade will be handled on a case-by-case basis.
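As an illustration of the formula and thresholds above, here is a minimal sketch in Python (chosen only for brevity; the course itself uses R). The function names total_score and grade_range are hypothetical. The label returned for t < 55 is an assumption, because the outline does not specify that case. The sketch also assumes more than three assignment scores are available.

```python
def total_score(assignments, m, f, b=0.0, n_drop=3):
    """Compute t = min{100, 0.3a + 0.2max(m, f) + 0.5f + b}.

    assignments: list of assignment scores (each out of 100)
    m, f: mid-term and final scores (each out of 100)
    b: bonus points (out of 2)
    n_drop: number of lowest assignment scores to drop (three in the outline)
    """
    if len(assignments) <= n_drop:
        # Assumption: the formula is only meaningful when some scores remain.
        raise ValueError("need more assignment scores than the number dropped")
    # a is the average of the scores that remain after dropping the lowest ones.
    kept = sorted(assignments)[n_drop:]
    a = sum(kept) / len(kept)
    return min(100.0, 0.3 * a + 0.2 * max(m, f) + 0.5 * f + b)


def grade_range(t, f):
    """Map the total score t and final score f to the ranges stated above."""
    if f < 30:
        return "case-by-case"
    if t >= 85:
        return "A range"
    if t >= 65:
        return "at least B range"
    if t >= 55:
        return "at least C range"
    # Assumption: the outline gives no rule below 55.
    return "not specified"


scores = [90, 80, 70, 60, 100, 50, 85, 95]
t = total_score(scores, m=60, f=80, b=1)
print(round(t, 2), grade_range(t, f=80))
```

In this example the three lowest assignment scores (50, 60, 70) are dropped, so a = 90. Because f > m, the final exam replaces the mid-term in the 0.2 term, which gives t = 27 + 16 + 40 + 1 = 84, i.e., at least in the B range.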

Important note: For the most up-to-date information, please always refer to the course outline announced by the course instructor on Blackboard, which shall prevail over the above information if there is any discrepancy.

Syllabus

Part I: Philosophy and Foundation

  1. Introduction: history, philosophy, examples.

  2. Statistical foundation: basic testing and estimation, statistical limiting theorems.

Part II: Rank-type and order-type methods

  1. Location and scale problems: sign test, signed-rank test, rank sum test, Siegel–Tukey test.

  2. Correlation problem: Spearman’s ρ, Kendall’s τ and Kendall’s τ*.

  3. Distribution problem: Kolmogorov–Smirnov test, Cramér–von Mises test, Anderson–Darling test.

  4. Regression problem: Kruskal–Wallis test, Jonckheere–Terpstra test, Friedman test.

Part III: Resampling-type procedures

  1. Bootstrap and subsampling: different bootstrapping methods, jackknife, subsampling.

  2. Permutation tests: ideas of randomization, examples of permutation tests.

Part IV: Smoothing-type estimation and learning techniques

  1. Density estimation: kernel method, k-nearest neighbor.

  2. Nonparametric regression: kernel method, penalized optimization method, tree-based method.

  3. Classification: density method, regression method, other methods.

Lecture Notes (Draft)

(All rights reserved by the authors. Redistribution by any means is strictly prohibited.)

Front matter

Part I: Philosophy and Foundation

Part II: Rank-type and order-type methods

Part III: Resampling-type procedures

Part IV: Smoothing-type estimation and learning techniques

Appendices (Optional)

P.S.: Not all of the material in the appendices is directly useful for this course. I will tell you which parts are useful when we need them.

Assignments

Quizzes

Mid-term project

    • Time: 5 pm 30 October (Friday) -- 5 pm 1 November (Sunday)

    • Duration: 48 hours

    • Scope: Chapter 1 -- Chapter 4

    • Instructions: The detailed instructions are stated on the first page of the question paper. Some highlights are listed below:

        • Read the instructions carefully before doing the exam.

        • There is one bonus question that is worth 10 points.

        • Complete the exam by yourself.

        • Consult and use any official course materials if you wish.

    • Submission

        • Compile your answers in a single ".pdf" file (i.e., not MS Word, JPEG, ZIP, etc.).

        • Sign the Honor Code, and attach it as a cover of your submitted file.

        • Name the document in the format S3005_M_sid_name.pdf, e.g., S3005_M_1155001234_ChanKinWai.pdf.

        • Submit to Blackboard.

        • You may submit your answers as many times as you wish; however, only the last submission will be graded.

    • Mock mid-term exam

    • (Real) mid-term exam

Final project

    • Time: 10 am, 11 December (Friday) -- 10 am, 14 December (Monday)

    • Duration: 72 hours

    • Scope: Chapter 1 -- Chapter 9

    • Instructions: The detailed instructions are stated on the first page of the question paper. Some highlights are listed below:

        • Read the instructions carefully before doing the exam.

        • There is one bonus question that is worth 10 points.

        • Complete the exam by yourself.

        • Consult and use any official course materials if you wish.

    • Submission

        • Compile the written part of your answers in a single ".pdf" file (i.e., not MS Word, JPEG, ZIP, etc.).

        • Submit your R code separately for each question.

        • Sign the Honor Code, and attach it as a cover of your submitted file.

        • Name the documents in the correct format. There are FOUR files you need to submit.

        • Submit to Blackboard.

        • You may submit your answers as many times as you wish; however, only the last submission will be graded.

    • Mock project

    • (Real) project