Applied Statistics and Computation Lectures

I created this webpage based on the lecture series I delivered for Applied Statistics - II (MAT 2302) in Semester II of the M.Sc. in Engineering Mathematics program in the Department of Mathematics, ICT Mumbai. I received requests from senior students in our department to share these lectures; hence this webpage. The course is a continuation of Applied Statistics - I (MAT 2301, mainly Probability Theory and an Introduction to Point Estimation). Its goal is to introduce students to statistical simulation and to build understanding of various statistical concepts through simulation. Since the course is part of a Mathematics program, computational ideas receive more emphasis than proofs of classical results.

Practitioners may find some of these ideas useful, such as simulating power functions, comparing test procedures, approximating sampling distributions, and the univariate and multivariate delta methods; these tools appear routinely in applications of nonlinear models to real-life problems. I am thankful to Khushboo Agarwal, PMRF fellow at IITB IEOR, for her wonderful support throughout this course and for raising several interesting and insightful questions. I am also thankful to the students, whose cooperation throughout this course has been phenomenal and whose timely discussions of several simulation concepts using R and Python have been exemplary. Despite being undergraduates in Mathematics, they have contributed immensely to the learning of Applied Statistics and its applications in computation.


References:

George Casella and Roger L. Berger, Statistical Inference, Second Edition, Duxbury, 2002.

Alexander M. Mood, Franklin A. Graybill and Duane C. Boes, Introduction to the Theory of Statistics, Third Edition, McGraw-Hill Education.

Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference, Springer Texts in Statistics.

Lectures on Applied Statistics and Computation


Lecture I (2 Hours) (Topic: Inequalities)

1) Hölder's Inequality

2) Minkowski's Inequality

3) Jensen's Inequality

4) Hoeffding's Inequality (a small numerical check follows this list)
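
Since Hoeffding's inequality is easy to check numerically, here is a minimal sketch (the Uniform(0,1) sample, n = 100, and t = 0.1 are my illustrative choices, not from the lecture):

set.seed(1)
n = 100; t = 0.1; B = 1e5
xbar = replicate(B, mean(runif(n)))  # X_i in [0,1], so Hoeffding applies
# Hoeffding: P(|Xbar - 1/2| >= t) <= 2 * exp(-2 * n * t^2)
c(empirical = mean(abs(xbar - 0.5) >= t), hoeffding_bound = 2 * exp(-2 * n * t^2))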

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the Tutorial: Click Here

Link to the Selected Solutions: Click Here

Lecture II (2 Hours) (Topic: A Gentle Introduction to R)

1) Vectors, matrices, and their operations

2) Writing mathematical functions in R and plotting

3) Probability integral transform and its demonstration using R (simulating an Exponential(3) random variable from Uniform(0,1) random numbers; a minimal sketch follows this list)

4) Demonstration of the Central Limit Theorem
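
A minimal sketch of the probability integral transform demonstration (the sample size below is my choice; the Exponential(3) target is from the list above). If U ~ Uniform(0,1), then X = -log(1 - U)/3 follows the Exponential(3) distribution:

set.seed(1)
n = 1e4
u = runif(n)
x = -log(1 - u) / 3  # inverse CDF of Exponential(rate = 3) applied to U
hist(x, breaks = 50, probability = TRUE, main = "Exponential(3) from Uniform(0,1)")
curve(dexp(x, rate = 3), add = TRUE, col = "red", lwd = 2)  # true density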

Link to the Video: Click Here

Link to the R Codes: Click Here


Lecture III (2 Hours) (Topic: Delta Method)

1) Results related to Convergence in Probability and Convergence in Distribution

2) Approximation of the mean and variance of functions of random variables using the Delta method (see the sketch after this list)

3) The Delta method as an extension of the Central Limit Theorem

4) Computation of confidence intervals using the Delta method
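
A minimal simulation sketch of item 2 (the Exponential model, g(x) = log(x), and the sample sizes are my assumptions for illustration). For X_i ~ Exponential(rate = lambda), the delta method gives Var(log(Xbar)) ≈ (g'(mu))^2 Var(Xbar) = 1/n:

set.seed(42)
lambda = 2; n = 200; B = 5000
g_hat = replicate(B, log(mean(rexp(n, rate = lambda))))  # g(Xbar) over B replications
# Delta method: Var(log(Xbar)) ~ (1/mu)^2 * (mu^2 / n) = 1/n, free of lambda
c(simulated = var(g_hat), delta_approx = 1 / n)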

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture IV (2 Hours) (Topic: Higher Order Delta Method)

1) Extension of the Delta Method to the multiparameter case

2) Higher-order Delta method and problems (a minimal second-order sketch follows this list)
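
A minimal sketch of the second-order delta method (the N(0,1) model and g(x) = x^2 are my choices, the standard case where g'(0) = 0 and the first-order term vanishes). Here n(g(Xbar) - g(0)) = n Xbar^2 converges to a chi-squared distribution with 1 degree of freedom rather than a normal:

set.seed(7)
n = 500; B = 5000
stat = replicate(B, n * mean(rnorm(n))^2)  # n * Xbar^2 over B replications
qqplot(qchisq(ppoints(B), df = 1), stat,
       xlab = "Chi-squared(1) quantiles", ylab = "n * Xbar^2 quantiles")
abline(0, 1, col = "red")  # points near this line support the chi-squared limit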

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the Tutorial: Click Here


Lecture V (2 Hours) (Topic: Monte Carlo Integration)

1) Computing the coverage probability of a confidence interval

2) Discussion on Statistical Simulation

3) Verification of the Delta method and the CLT using simulation

4) Monte Carlo Integration and error computation (see the sketch after this list)
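
A minimal Monte Carlo integration sketch with its error estimate (the integrand exp(-x^2) on [0, 1] and N = 1e5 are my assumptions):

set.seed(10)
N = 1e5
u = runif(N)
vals = exp(-u^2)            # h(U) for U ~ Uniform(0,1)
est = mean(vals)            # Monte Carlo estimate of the integral
se  = sd(vals) / sqrt(N)    # Monte Carlo standard error
c(estimate = est, std_error = se)
# Check against deterministic quadrature: integrate(function(x) exp(-x^2), 0, 1)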

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture VI (2 Hours) (Topic: Importance Sampling and Monte Carlo Integration)

1) Importance Sampling method for computing integrals

2) Optimal choice of g(.) for importance sampling

3) Implementation of Importance Sampling and comparison with the usual Monte Carlo integration (see the sketch after this list)

4) Control Variate method to reduce the variance of approximation
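
A minimal importance sampling sketch for item 3 (the target P(Z > 3), Z ~ N(0,1), and the Exponential(1) proposal shifted to start at 3 are my choices). Plain Monte Carlo rarely hits the tail event, while the importance sampler places all its draws there:

set.seed(11)
N = 1e5
y = 3 + rexp(N)              # draws from the proposal g, supported on (3, Inf)
w = dnorm(y) / dexp(y - 3)   # importance weights f(y) / g(y)
is_est = mean(w)             # importance sampling estimate of P(Z > 3)
plain  = mean(rnorm(N) > 3)  # usual Monte Carlo estimate, for comparison
c(importance = is_est, plain_MC = plain, exact = pnorm(3, lower.tail = FALSE))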

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture VII (2 Hours) (Topic: Accept-Reject Algorithm and Its Implementation)

1) Accept-Reject Algorithm and its proof

2) Optimal choice of c

3) Generalization of the Accept-Reject algorithm

4) Problem-solving using R (a minimal sketch follows this list)
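
A minimal accept-reject sketch (the Beta(2,2) target, the Uniform(0,1) proposal, and c = 1.5, the maximum of the Beta(2,2) density, are my illustrative choices):

set.seed(12)
c_bound = 1.5   # c = max of dbeta(x, 2, 2), attained at x = 0.5
y = runif(1e5)  # candidates from the proposal g = Uniform(0,1)
u = runif(1e5)
x = y[u <= dbeta(y, 2, 2) / (c_bound * dunif(y))]  # accept with prob f(y) / (c g(y))
length(x) / 1e5  # acceptance rate, close to 1/c = 2/3
hist(x, probability = TRUE, breaks = 40)
curve(dbeta(x, 2, 2), add = TRUE, col = "red", lwd = 2)  # target density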

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture VIII (2 Hours) (Topic: Metropolis and Gibbs Algorithm)

1) Accept-Reject Algorithm (review)

2) Metropolis Algorithm (a minimal random-walk sketch follows this list)

3) Gibbs Algorithm

4) Problem discussion
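
A minimal random-walk Metropolis sketch for item 2 (the N(0,1) target, the sd = 1 proposal, and the burn-in length are my choices):

set.seed(13)
n_iter = 1e4
x = numeric(n_iter)  # chain starts at x[1] = 0
for (t in 2:n_iter) {
  prop = x[t - 1] + rnorm(1)  # symmetric random-walk proposal
  log_alpha = dnorm(prop, log = TRUE) - dnorm(x[t - 1], log = TRUE)
  x[t] = if (log(runif(1)) < log_alpha) prop else x[t - 1]  # accept or stay
}
hist(x[-(1:1000)], probability = TRUE, breaks = 50)  # drop 1000 burn-in draws
curve(dnorm(x), add = TRUE, col = "red", lwd = 2)    # target density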

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture IX (2 Hours) (Topic: Loss function)

1) Recall: Method of Moments and Maximum Likelihood estimators; recall Mean Squared Error

2) Introduction to the Loss function

3) Risk function (Expected Loss) and its computation

4) Admissible Estimator

5) Minimax Estimator (a risk-comparison sketch follows this list)
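
A minimal risk-comparison sketch for items 3 and 5 (the Binomial(n, p) model with n = 20 is my choice). Under squared error loss, the sample proportion has risk p(1-p)/n, while the estimator (X + sqrt(n)/2)/(n + sqrt(n)) has constant risk, which is what makes it a minimax candidate:

n = 20
p = seq(0, 1, length.out = 200)
risk_prop = p * (1 - p) / n  # MSE of X/n
# MSE of (X + sqrt(n)/2) / (n + sqrt(n)): variance + squared bias = (n/4) / (n + sqrt(n))^2
risk_flat = (n * p * (1 - p) + n * (0.5 - p)^2) / (n + sqrt(n))^2
plot(p, risk_prop, type = "l", xlab = "p", ylab = "Risk (MSE)")
lines(p, risk_flat, col = "red")  # constant in p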

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture X (2 Hours) (Topic: Sufficient Statistics)

1) Sufficient Statistics

2) Factorization theorem

3) Examples

4) Jointly Sufficient Statistic

5) Concept of Minimal Sufficiency

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XI (2 Hours) (Topic: Minimal Sufficient Statistics and One-Parameter Exponential Family)

1) Lehmann-Scheffé (1953) theorem for obtaining minimal sufficient statistics; for a proof, see Casella and Berger (2002, page 281)

2) One parameter exponential family and minimal sufficient statistics

3) Cramer-Rao lower bound for unbiased estimators and connection to the exponential family

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XII (2 Hours) (Topic: Uniformly Minimum Variance Unbiased Estimator (UMVUE))

1) Proof of the Cramer-Rao lower bound for the variance of any unbiased estimator of a function of the parameter theta

2) Uses and limitations of the C-R bound for obtaining the best unbiased estimator

3) Complete family of distributions and complete statistics

4) Lehmann-Scheffé theorem for the UMVUE (an unbiased estimator that is a function of a complete sufficient statistic is the UMVUE)

5) Worked examples and problem discussion

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XIII (2 Hours) (Topic: Mid-Semester Solutions and Problem Discussion)

1) Mid Semester Solution

2) Worked example on the Lehmann-Scheffé theorem

Link to the Midsem Paper: Click Here

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XIV (2 Hours) (Topic: UMVUE and Related Theorems)

1) Uniqueness of UMVUE

2) The best unbiased estimator is uncorrelated with every unbiased estimator of zero

3) Solution of Important problems

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XV (2 Hours) (Topic: Location- and Scale-Invariant Estimators)

1) Location-invariant estimator

2) Scale-invariant estimator

3) Location and scale parameters

4) Pitman estimator for the location and scale parameters

5) Worked-out examples

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XVI (1 Hour) (Topic: Proofs Related to the MLE)

1) Properties of Maximum Likelihood estimator

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XVII (2 Hours) (Topic: Properties of the Maximum Likelihood Estimator and Computation)

1) Score function and Fisher Information

2) Asymptotic distribution of Maximum Likelihood Estimator

3) Confidence interval for the unknown parameter based on the distribution of MLE

4) Implementation in R (a minimal sketch follows this list)
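
A minimal sketch of item 4 (the Exponential(rate = lambda) model, true lambda = 2, and n = 100 are my assumptions). The MLE is found numerically, and a Wald-type confidence interval uses the Fisher information I_n(lambda) = n/lambda^2:

set.seed(21)
n = 100
x = rexp(n, rate = 2)
negloglik = function(lam) -sum(dexp(x, rate = lam, log = TRUE))
lam_hat = optimize(negloglik, interval = c(0.01, 20))$minimum  # numerical MLE (= 1/mean(x))
se_hat = lam_hat / sqrt(n)  # sqrt(1 / I_n(lambda_hat)) with I_n = n / lambda^2
c(MLE = lam_hat, lower = lam_hat - 1.96 * se_hat, upper = lam_hat + 1.96 * se_hat)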

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture XVIII (2 Hours) (Topic: Distribution of the MLE and the Fisher Information Matrix)

1) Maximum Likelihood estimator in a multidimensional parameter space

2) Fisher Information matrix

3) Multivariate Delta method and approximate distribution of functions of maximum likelihood estimators

4) Simulation using R and problem-solving (a minimal sketch follows this list)
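
A minimal sketch combining items 1-3 (the Gamma(shape a, rate b) model with true (a, b) = (2, 3), n = 500, and g(a, b) = a/b, the mean, are my assumptions). optim() returns a numerical Hessian that serves as the observed information matrix, and the multivariate delta method gives a standard error for g:

set.seed(25)
x = rgamma(500, shape = 2, rate = 3)
nll = function(par) -sum(dgamma(x, shape = par[1], rate = par[2], log = TRUE))
fit = optim(c(1, 1), nll, method = "L-BFGS-B", lower = c(1e-6, 1e-6), hessian = TRUE)
V = solve(fit$hessian)                                # inverse observed information
grad = c(1 / fit$par[2], -fit$par[1] / fit$par[2]^2)  # gradient of g(a, b) = a / b
se = sqrt(drop(t(grad) %*% V %*% grad))               # multivariate delta method
c(mean_hat = fit$par[1] / fit$par[2], std_error = se)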

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture XIX (2 Hours) (Topic: Empirical Cumulative Distribution Function and Bootstrap Distribution of Statistical Functionals)

1) Empirical Cumulative Distribution Function and Its properties

2) Bootstrap distribution of a statistic

3) Computing the variance of an estimator using bootstrapping (a short sketch follows the code below)

4) In the following code, change the value of n to visualize the empirical distribution function as an approximation to the Uniform(0,1) CDF.

set.seed(123)
n = 5  # increase n to see the ECDF approach the true CDF
x = round(runif(n), 2)
print(x)
plot(ecdf(x), xlim = c(0, 1))  # empirical CDF of the sample
points(x, rep(0, length(x)), cex = 1, pch = 19, col = "red")  # observed points
curve(punif(x), add = TRUE, col = "magenta", lwd = 2)  # true Uniform(0,1) CDF
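
And a minimal bootstrap sketch for item 3 (the Exponential(1) sample, the sample median as the statistic, and B = 2000 resamples are my choices):

set.seed(30)
x = rexp(50)
B = 2000
boot_med = replicate(B, median(sample(x, replace = TRUE)))  # bootstrap replicates
var(boot_med)  # bootstrap estimate of the variance of the sample median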

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XX (2 Hours) (Topic: Bootstrap Confidence Interval)

1) Normal-Based Bootstrap Confidence Interval

2) Pivotal Bootstrap Confidence Interval

3) Percentile Bootstrap Confidence Interval

4) Implementation using R (a minimal sketch follows this list)
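
A minimal sketch of the three intervals (the population median of an Exponential(1) sample, n = 100, B = 2000 resamples, and the 95% level are my assumptions):

set.seed(31)
x = rexp(100)
theta_hat = median(x)
boot = replicate(2000, median(sample(x, replace = TRUE)))
se = sd(boot)
normal_ci     = theta_hat + c(-1.96, 1.96) * se                  # normal-based
pivotal_ci    = 2 * theta_hat - quantile(boot, c(0.975, 0.025))  # pivotal
percentile_ci = quantile(boot, c(0.025, 0.975))                  # percentile
rbind(normal_ci, pivotal_ci, percentile_ci)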

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture XXI (2 Hours) (Topic: Testing of Hypothesis)

1) Testing of Hypothesis

2) Type I and Type II errors and the power function

3) Computing and plotting the power function for a given test rule, and its interpretation

4) Determining the size and level of a test

c_vals = c(0.1, 0.5, 0.8, 1)  # Reject H_0 if mean(X) > c
n = 10
mu_vals = seq(-1, 3, length.out = 500)
for (i in 1:length(c_vals)) {
  # Power function: beta(mu) = P(Xbar > c) when Xbar ~ N(mu, 1/n)
  beta_mu = 1 - pnorm(sqrt(n) * (c_vals[i] - mu_vals))
  if (i == 1)
    plot(mu_vals, beta_mu, col = i + 1, type = "l",
         xlab = expression(mu), ylab = expression(beta(mu)))
  else
    lines(mu_vals, beta_mu, col = i + 1)
}
# If you want a test of a given size alpha (say alpha = 0.1), where should c be?
alpha = 0.1
abline(h = alpha, col = "blue")

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XXII (2 Hours) (Topic: Simulation of the Power Function and Understanding the Performance of Testing Procedures)

1) Review of Testing of Hypothesis

2) Worked example on computing the size of a test and the power function

3) Wald test and examples

4) Simulation of the power function of the Wald test for H0: p = 0.5 against H1: p ≠ 0.5; R programs are developed, and power functions for varying sample sizes are investigated (a minimal sketch follows this list)
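
A minimal simulation sketch of item 4 (the level 0.05, n = 50, the grid of true p values, and B = 2000 replications are my choices):

set.seed(40)
n = 50; B = 2000
p_grid = seq(0.2, 0.8, by = 0.02)
power = sapply(p_grid, function(p) {
  mean(replicate(B, {
    p_hat = mean(rbinom(n, 1, p))
    se = sqrt(p_hat * (1 - p_hat) / n)
    abs((p_hat - 0.5) / se) > qnorm(0.975)  # Wald rejection rule for H0: p = 0.5
  }))
})
plot(p_grid, power, type = "l", xlab = "true p", ylab = "estimated power")
abline(h = 0.05, lty = 2)  # nominal level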

Link to the Video: Click Here

Link to the Writing Notes: Click Here

Link to the R Codes: Click Here


Lecture XXIII (2 Hours) (Topic: Wald Test)

1) Wald Test: Comparing two proportions from binomial populations

2) Wald Test: Comparing Means

3) Wald Test: Comparing Medians

4) Wald Test: Paired Test

5) Computing the standard error of the sample median using the bootstrap, to obtain the Wald test statistic for comparing the medians of two distributions (a minimal sketch follows this list)

6) Equivalence of testing of hypotheses and confidence intervals (simulation given as an exercise)
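
A minimal sketch of item 5 (the two Exponential samples, n = 100 each, and B = 2000 bootstrap resamples are my assumptions):

set.seed(45)
x = rexp(100, rate = 1); y = rexp(100, rate = 0.8)
B = 2000
se_x = sd(replicate(B, median(sample(x, replace = TRUE))))  # bootstrap SE of median(x)
se_y = sd(replicate(B, median(sample(y, replace = TRUE))))  # bootstrap SE of median(y)
W = (median(x) - median(y)) / sqrt(se_x^2 + se_y^2)         # Wald statistic
2 * pnorm(-abs(W))                                          # approximate two-sided p-value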

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XXIV (2 Hours) (Topic: The p-value, Its Computation, and the Permutation Test)

1) Concept of p-value and its computation

2) Distribution of the p-value when the null hypothesis is true and when the alternative hypothesis is true (simulation exercise given as homework)

3) p-value for the Wald test

4) Pearson's chi-square test and examples

5) Permutation test (an exact test); implementation in software is given as an exercise (a minimal two-sample sketch follows this list)
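
A minimal two-sample permutation-test sketch for item 5 (the normal samples, the statistic |mean(x) - mean(y)|, and 5000 random permutations are my choices; enumerating all permutations instead of a random subset makes the test exact):

set.seed(50)
x = rnorm(20); y = rnorm(25, mean = 0.5)
obs = abs(mean(x) - mean(y))  # observed test statistic
pooled = c(x, y)
perm = replicate(5000, {
  idx = sample(length(pooled), length(x))  # random relabelling under H0
  abs(mean(pooled[idx]) - mean(pooled[-idx]))
})
mean(perm >= obs)  # permutation p-value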

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XXV (2 Hours) (Topic: Likelihood Ratio Test)

1) Likelihood Ratio Test and Critical Region

2) Likelihood Ratio test and Sufficient Statistic

3) More examples from the normal distribution and development of the t-test

4) Equivalence of the Likelihood Ratio test and the Wald test for large samples (a minimal numerical sketch follows this list)
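
A minimal numerical sketch of item 4 (the Bernoulli model, H0: p = 0.5, n = 100, and true p = 0.6 are my assumptions). The statistic -2 log Lambda and the squared Wald statistic come out numerically close, and both are referred to the chi-squared(1) distribution:

set.seed(55)
n = 100
x = rbinom(n, 1, 0.6)
p_hat = mean(x)
loglik = function(p) sum(dbinom(x, 1, p, log = TRUE))
lrt  = -2 * (loglik(0.5) - loglik(p_hat))           # likelihood ratio statistic
wald = (p_hat - 0.5)^2 / (p_hat * (1 - p_hat) / n)  # squared Wald statistic
c(LRT = lrt, Wald = wald, chisq_crit_0.95 = qchisq(0.95, df = 1))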

Link to the Video: Click Here

Link to the Writing Notes: Click Here


Lecture XXVI (2 Hours) (Topic: Most Powerful Test and Neyman-Pearson Lemma)

1) Intersection-Union Test

2) Examples and derivation of the two-sided t-test

3) Most Powerful Test and the Neyman-Pearson Lemma

4) Monotone Likelihood Ratio and the Karlin-Rubin theorem for the existence of a UMP test for one-sided hypotheses

Link to the Video: Click Here

Link to the Writing Notes: Click Here