A collection of concepts and tools about statistics.
1. From Central Limit Theorem (CLT) to Confidence Interval
* 1) regardless of the distribution of the real population, the sample mean approximately follows a normal distribution, provided the sample size is large enough (a common rule of thumb is n > 30) and the population has a finite variance.
* 2) the normal distribution of the sample means (x-bar) has its own mean and standard deviation.
* 3) CLT says that this mean equals the mean of the population (mu).
* 4) and CLT says that the standard deviation of the sample means, known as the standard error (SE), equals the standard deviation of the population (sigma) divided by the square root of the sample size (n).
SE = sigma / sqrt(n)
* 5) in practice, the sd of the population (sigma) is usually not available, so we use the sd of the sample (s) in place of sigma as an approximation.
* 6) putting these together gives the formula for the Confidence Interval:
Confidence Interval = x-bar +/- z * s / sqrt(n)
* 7) z is the number of standard deviations of the sample mean (each of size s / sqrt(n)) that x-bar is allowed to deviate from the real population mean. z is determined by the confidence level: for example, z is about 1.96 for 95% and about 2.58 for 99%.
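* 8) the proper claim is: I am 95% confident that the population mean lies within x-bar +/- z * s / sqrt(n). In other words, if the sampling were repeated many times, about 95% of the intervals built this way would contain the population mean.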
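
The steps above can be checked numerically. The following is a minimal sketch in Python (standard library only), not a definitive implementation. The Exponential(1) population (mean 1, sd 1), the sample size of 50, the 2000 repetitions, and the function names `sample_means` and `confidence_interval` are all assumptions chosen for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def sample_means(n, trials, rng):
    """Draw `trials` samples of size n from an Exponential(1) population
    (a clearly non-normal population with mu = 1 and sigma = 1)
    and return the mean of each sample."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]


def confidence_interval(sample, confidence=0.95):
    """Return (low, high) for x-bar +/- z * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample sd, used in place of sigma
    # z for a two-sided interval, e.g. about 1.96 for 95%
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


rng = random.Random(0)  # fixed seed so the run is repeatable

# Steps 1) to 4): the sample means cluster around mu with sd sigma / sqrt(n).
means = sample_means(n=50, trials=2000, rng=rng)
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", 1.0 / math.sqrt(50))

# Steps 5) to 8): a 95% confidence interval from one sample.
one_sample = [rng.expovariate(1.0) for _ in range(50)]
low, high = confidence_interval(one_sample)
print("95% confidence interval:", (low, high))
```

The mean of the sample means should land close to 1 and their sd close to 1 / sqrt(50), even though the population itself is skewed. For small samples, a t multiplier is commonly used instead of z; this sketch keeps z to match the formula above.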
2. From Confidence Interval to Hypothesis Testing
* 1) state the null (H0) and alternative (H1) hypotheses: the alternative is usually the claim you believe; the null is the opposite side.
- Note: the null hypothesis is set up for the purpose of trying to reject it.
* 2) select a level of significance (alpha), e.g. 5%
* 3) identify the test statistic (e.g. a z or t statistic): R will help you.
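* 4) formulate a decision rule: reject H0 if the p-value is below the level of significance (equivalently, if the test statistic falls in the rejection region).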
* 5) take a sample and arrive at a decision:
- reject the null in favor of the alternative: the data give me enough evidence, at the chosen level of significance, to believe that my initial intuition is correct.
- do not reject the null: I do NOT have enough evidence to support my intuition. But NOTE, it doesn't mean my intuition was wrong either. It only means: no conclusion.
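
The five steps above can be sketched as a two-sided one-sample z-test. This is a minimal Python illustration using only the standard library, not the procedure R runs. The function name `one_sample_z_test`, the hypothesized mean of 10, and the simulated measurements are all assumptions made up for the example.

```python
import math
import random
import statistics
from statistics import NormalDist


def one_sample_z_test(sample, mu0, alpha=0.05):
    """Two-sided z-test of H0: mu == mu0 against H1: mu != mu0.

    Returns (z, p_value, reject_h0). Uses the sample sd in place of
    sigma, which is reasonable only for fairly large samples.
    """
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    # Step 3: test statistic = distance of x-bar from mu0 in standard errors
    z = (x_bar - mu0) / (s / math.sqrt(n))
    # Probability of a statistic at least this extreme if H0 were true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: decision rule, reject H0 when p_value < alpha
    return z, p_value, p_value < alpha


# Step 5: take a sample (simulated here, hypothetical data) and decide.
rng = random.Random(1)
measurements = [rng.gauss(10.5, 2.0) for _ in range(40)]

z, p_value, reject_h0 = one_sample_z_test(measurements, mu0=10.0, alpha=0.05)
print("z statistic:", z)
print("p-value:    ", p_value)
if reject_h0:
    print("Reject H0 at the 5% level.")
else:
    print("Do not reject H0: no conclusion.")
```

This also shows the link to the confidence interval: with the same s, the two-sided test at level alpha rejects H0 exactly when mu0 falls outside the (1 - alpha) interval x-bar +/- z * s / sqrt(n).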