Applied Statistics
Syllabus (Fall 2017, HKUST)
- Chapter 0. Introduction to Statistics [Lecture notes]
- Suggested reading: The Art of R Programming: A Tour of Statistical Software Design, by Norman Matloff. 2011.
- Suggested reading: ggplot2: Elegant Graphics for Data Analysis (Use R!) 2nd edition, by Hadley Wickham. 2016.
- Chapter 1. Descriptive Statistics [Lecture notes]. [Reviewed Questions]
- Chapter 2. Probability [Lecture notes]. [Reviewed Questions] [Discussion]
- Suggested reading: Chapter 1 "A Murder Mystery" of the book "Model-based machine learning" by John Winn and Christopher Bishop.
- Suggested reading: The book "Thinking, Fast and Slow" by Daniel Kahneman.
- Chapter 3. Random Variable and Probability Distribution. [Lecture notes] [Reviewed Questions 1] [Reviewed Questions 2]
- Suggested reading: Chapter 2 "Assessing people's skills" of the book "Model-based machine learning" by John Winn and Christopher Bishop.
- One additional exercise for Normal distribution and Bayes Rule [ex_Normal Distribution][Solution]
- Chapter 4. Estimation. [Lecture notes] [Reviewed Questions]
- Chapter 5. Testing Hypothesis. [Lecture notes][Reviewed Questions]
- Chapter 6. Simple linear regression. [Lecture notes][Reviewed Questions]
- Chapter 7. Two-sample problems. [Lecture notes][Reviewed Question]
- Chapter 8. ANOVA (Analysis of Variance). [Lecture notes]
- Chapter 9. Goodness-of-fit test. [Lecture notes][Reviewed Questions]
- Epilogue: where to go from here
- Overview of Statistical learning [notes]
- An Introduction to Statistical Learning [On-line course by Hastie and Tibshirani]
- Learning from Data by Yaser S. Abu-Mostafa, Malik Magdon-Ismail and Hsuan-Tien Lin. [On-line course by Yaser S. Abu-Mostafa]
- Doing Bayesian Data Analysis. By John K. Kruschke.
- The book "Model-based machine learning" by John Winn and Christopher Bishop. [website]
Exam
- Mid-term Examination (LTG & LTJ, October 11, 7:30pm-9:30pm)
- Final Examination (TBA)
Grading
- Mid-term Examination 35% (0%)
- Final Examination 65% (100%)
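The two weighting schemes listed under Grading, combined with the policy of taking the higher total, can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `course_total` is hypothetical, both exams are assumed to be scored on the same 0-100 scale, and the result is the raw total before any curve is applied.

```python
from typing import Optional


def course_total(final: float, midterm: Optional[float] = None) -> float:
    """Raw course total: the higher of the two weighting schemes.

    Assumes both scores are on the same 0-100 scale (an assumption,
    not something the syllabus states).
    """
    # Scheme 2: mid-term 0%, final 100%.
    final_only = 1.00 * final
    if midterm is None:
        # A missed mid-term transfers its weight to the final exam.
        return final_only
    # Scheme 1: mid-term 35%, final 65%.
    weighted = 0.35 * midterm + 0.65 * final
    return max(weighted, final_only)
```

For example, a student with 60 on the mid-term and 90 on the final keeps the final-only total of 90, because the weighted total (79.5) is lower.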
Policy
- No make-up midterm examination will be given.
- The weight of a missed midterm exam will be transferred to the final exam.
- This course is essentially graded on a curve.
- The maximum of the total scores calculated by the above two weighting schemes will be taken to determine the student's grade.
- Supporting information here is from JBstatistics.
- 1. Discrete Probability Distributions
- 1.1 Introduction to Discrete Random Variables and Discrete Probability Distributions
- 1.2 Expected Value and Variance of Discrete Random Variables
- 1.3 Introduction to the Bernoulli Distribution
- 1.4 The Bernoulli Distribution: Deriving the Mean and Variance
- 1.5 An Introduction to the Binomial Distribution
- 1.6 Binomial/Not Binomial: Some Examples
- 1.7 The Binomial Distribution: Mathematically Deriving the Mean and Variance
- 1.8 An Introduction to the Hypergeometric Distribution
- 1.9 An Introduction to the Poisson Distribution
- 1.10 The Poisson Distribution: Mathematically Deriving the Mean and Variance
- 1.11 Discrete Probability Distributions: Example Problems (Binomial, Poisson, Hypergeometric, Geometric)
- 1.12 The Relationship Between the Binomial and Poisson Distributions
- 1.13 Proof that the Binomial Distribution tends to the Poisson Distribution
- 1.14 Introduction to the Geometric Distribution
- 1.15 Introduction to the Negative Binomial Distribution
- 1.16 Introduction to the Multinomial Distribution
- 2. Continuous Random Variables & Continuous Probability Distributions
- 2.1 An Introduction to Continuous Probability Distributions
- 2.2 Finding Probabilities and Percentiles for a Continuous Probability Distribution
- 2.3 Deriving the Mean and Variance of a Continuous Probability Distribution
- 2.4 Introduction to the Continuous Uniform Distribution
- 2.5 An Introduction to the Normal Distribution
- 2.6 Standardizing Normally Distributed Random Variables
- 2.7 The Normal Approximation to the Binomial Distribution
- 2.8 Normal Quantile-Quantile Plots
- 2.9 An Introduction to the Chi-Square Distribution
- 2.10 An Introduction to the t Distribution (Includes some mathematical details)
- 2.11 An Introduction to the F Distribution
- 3. Using Tables to Find Areas and Percentiles (Z, t, χ², F)
- 3.1 Finding Areas Using the Standard Normal Table (for tables that give the area to left of z)
- 3.2 Finding Percentiles Using the Standard Normal Table (for tables that give the area to left of z)
- 3.3 Finding Areas Using the Standard Normal Table (for tables that give the area between 0 and z)
- 3.4 Finding Percentiles Using the Standard Normal Table (for tables that give the area between 0 and z)
- 3.5 Using the t Table to Find Areas and Percentiles
- 3.6 R Basics: Finding Percentiles and Areas for the t Distribution
- 3.7 Using the F Table to Find Areas and Percentiles
- 3.8 Finding Percentiles and Areas for the F Distribution Using R
- 3.9 Using the Chi-square Table to Find Areas and Percentiles
- 4. Sampling Distributions
- 5. Confidence Intervals
- 5.1 Introduction to Confidence Intervals
- 5.2 Deriving a Confidence Interval for the Mean
- 5.3 Intro to Confidence Intervals for One Mean (Sigma Known)
- 5.4 Finding the Appropriate z Value for the Confidence Interval Formula
- 5.5 Confidence Intervals for One Mean: Interpreting the Interval
- 5.6 What Factors Affect the Margin of Error?
- 5.7 Confidence Intervals for One Mean: Sigma Not Known (t Method)
- 5.8 Intro to the t Distribution (non-technical)
- 5.9 Confidence Intervals for One Mean: Determining the Required Sample Size
- 5.10 Confidence Intervals for One Mean: Investigating the Normality Assumption
- 6. Hypothesis Testing
- 6.1 An Introduction to Hypothesis Testing
- 6.2 Z Tests for One Mean: Introduction
- 6.3 Z Tests for One Mean: The Rejection Region Approach
- 6.4 Z Tests for One Mean: The p-value
- 6.6 What is a p-value?
- 6.7 Type I Errors, Type II Errors, and the Power of the Test
- 6.8 One-Sided Test or Two-Sided Test?
- 6.9 Statistical Significance versus Practical Significance
- 6.10 The Relationship Between Confidence Intervals and Hypothesis Tests
- 6.11 Calculating Power and the Probability of a Type II Error (A One-Tailed Example)
- 6.12 Calculating Power and the Probability of a Type II Error (A Two-Tailed Example)
- 6.13 What Factors Affect the Power of a Z Test?
- 6.14 Hypothesis Testing in 17 Seconds
- 6.15 t Tests for One Mean: Introduction
- 6.16 t Tests for One Mean: An Example
- 6.17 t Tests for One Mean: Investigating the Normality Assumption
- 6.18 Hypothesis tests on one mean: t or z?
- 6.19 Using the t Table to Find the P-value in One-Sample t Tests
- 6.20 Finding Areas Under the t Distribution
- 7. Inference for Two Means
- 7.1 Inference for Two Means: Introduction
- 7.2 The Sampling Distribution of the Difference in Sample Means
- 7.3 Pooled-Variance t Tests and Confidence Intervals: Introduction
- 7.4 Pooled-Variance t Tests and Confidence Intervals: An Example
- 7.5 Welch (Unpooled Variance) t Tests and Confidence Intervals: Introduction
- 7.6 Welch (Unpooled Variance) t Tests and Confidence Intervals: An Example
- 7.7 Pooled or Unpooled Variance t Tests and Confidence Intervals?
- 7.8 An Introduction to Paired-Difference Procedures
- 7.9 An Example of a Paired-Difference t Test and Confidence Interval
- 7.10 Pooled-Variance t Procedures: Investigating the Normality Assumption
- 8. Inference for Proportions
- 9. Chi-square Tests
- 10. Variances
- 11. ANOVA
- 12. Regression
- 12.1 Introduction to Simple Linear Regression
- 12.2 Simple Linear Regression: The Least Squares Regression Line
- 12.3 Simple Linear Regression: Interpreting Model Parameters
- 12.4 Simple Linear Regression: Assumptions
- 12.5 Checking assumptions with residual plots
- 12.6 Inference on the slope (the formulas)
- 12.7 Inference on the Slope (An Example)
- 12.8 The Correlation Coefficient and Coefficient of Determination
- 12.9 Simple Linear Regression: An Example
- 12.10 Simple Linear Regression: Always Plot Your Data!
- 12.11 Simple Linear Regression: Transformations
- 12.12 Estimation and Prediction of the Response Variable in Simple Linear Regression
- 12.13 Leverage and Influential Points in Simple Linear Regression
- 12.14 The Pooled-Variance t Test as a Regression
- 13. Jimmy and Mr. Snoothouse