Statistical Inference Basic Elements

Five basic elements of hypothesis tests

Inferential techniques based on frequentist (as opposed to Bayesian) methods have five basic elements, listed below.

For background, see the texts by Agresti, Mendenhall, Conover, and Montgomery, as well as NIST Hypothesis Tests.

Assumptions
Hypotheses
Test Statistic
P-value
Conclusion

Other components include:

Null Distribution (aka Reference Distribution - Montgomery)
Rejection Region (aka Critical Region) - based on the null distribution
Critical Value of the Test Statistic (see also this StatSoft blog entry - How to Find Critical Values for Statistical Tests)

The Test Matrix serves as a guide for discussing each of these elements when designing an experiment or a study.

References:

"Statistical Methods for the Social Sciences, Third Edition" by Alan Agresti and Barbara Finlay, chapter 6.
"Practical Nonparametric Statistics, Third Edition" by W. J. Conover, section 2.3.
See also this StatSoft blog entry - How to Interpret Statistical Analysis Results.

See the map of Hypothesis Tests

Assumptions

Assumptions cover the type(s) of data, the form of the population, and the method of sampling (in particular, the randomness of the sample).

Hypotheses (Null and Alternate)

Null hypothesis

The hypothesis that is directly tested. Usually a statement that there is no effect, difference, or change. A significance test analyzes the strength of the evidence against the null hypothesis.

Alternate hypothesis

A hypothesis that contradicts the null hypothesis. May also be known as the research hypothesis.

a-priori Hypothesis

Hypotheses are formulated before collecting or analyzing the data. See notes on a-priori hypotheses.

Hypothesis - other considerations

simple vs. composite hypotheses
two-tailed vs. one-tailed (lower or upper) tests
- defines the location of the rejection region of the null distribution
one-sided vs. two-sided tests
- (not always the same as one- and two-tailed; see Conover, pages 98 and 431)
- hypotheses for a two-sided test
  - H0: mu = mu0, Ha: mu <> mu0
- examples of null and alternate hypotheses for one-sided tests
  - H0: mu <= mu0, Ha: mu > mu0
  - H0: mu >= mu0, Ha: mu < mu0
Type I and Type II errors
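
As a minimal sketch of how the form of the alternate hypothesis determines which tail (or tails) of the null distribution is used, the Python function below computes a p-value for a z statistic under each form of Ha listed above. It assumes a standard normal null distribution; the function name z_test_p_value and the labels passed to `alternative` are illustrative choices, not taken from the references.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative="two-sided"):
    """p-value for a z statistic whose null distribution is assumed N(0, 1)."""
    null = NormalDist()  # standard normal null distribution (assumption)
    if alternative == "greater":    # Ha: mu > mu0, upper-tailed
        return 1.0 - null.cdf(z)
    if alternative == "less":       # Ha: mu < mu0, lower-tailed
        return null.cdf(z)
    if alternative == "two-sided":  # Ha: mu <> mu0, both tails
        return 2.0 * (1.0 - null.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


z = 1.8  # hypothetical observed z statistic
for alt in ("two-sided", "greater", "less"):
    print(f"{alt:>9}: p = {z_test_p_value(z, alt):.4f}")
```

The same observed statistic gives a different p-value under each alternative, which is one reason the hypotheses should be fixed before the data are examined.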

Test Statistic

The statistic calculated from the sample data and used to test the null hypothesis against the null distribution. It is typically based on a point estimate of a population parameter. It should also be accompanied by an interval estimate of the parameter, which reflects the standard error (uncertainty) of the estimate.

A good test statistic is one that is a sensitive indicator of whether the data agree or disagree with the null hypothesis.
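
The following is a small sketch, not a prescribed method, of a test statistic reported together with its point and interval estimates. It assumes a random sample from a normal population with known standard deviation, so that a z statistic applies. The function name one_sample_z and the sample values are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(sample, mu0, sigma, conf_level=0.95):
    """Return (point estimate, standard error, z statistic, interval estimate).

    Assumes a random sample from a normal population with known sigma.
    """
    n = len(sample)
    xbar = mean(sample)    # point estimate of mu
    se = sigma / sqrt(n)   # standard error of the point estimate
    z = (xbar - mu0) / se  # test statistic for H0: mu = mu0
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    interval = (xbar - z_crit * se, xbar + z_crit * se)
    return xbar, se, z, interval


# Hypothetical measurements, made up for illustration only.
sample = [10.2, 9.8, 10.4, 10.1, 10.5, 9.9, 10.3, 10.0, 10.6]
xbar, se, z, interval = one_sample_z(sample, mu0=10.0, sigma=0.3)
print(f"point estimate = {xbar:.3f}, standard error = {se:.3f}")
print(f"z = {z:.3f}, 95% interval = ({interval[0]:.3f}, {interval[1]:.3f})")
```

When sigma is not known, a t statistic with a t null distribution would normally replace the z statistic used here.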

Null Distribution

The distribution of the test statistic under the assumption that the null hypothesis is true.
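
One way to make the null distribution concrete is to simulate it. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and collects the resulting z statistics. The population values (mu0, sigma), sample size, and seed are arbitrary assumptions for illustration. The fraction of simulated statistics that fall beyond the two-sided 5% critical value estimates the Type I error rate.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def simulate_null_z(mu0, sigma, n, reps, seed=0):
    """Simulate z statistics from samples drawn with H0: mu = mu0 true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    se = sigma / sqrt(n)
    stats = []
    for _ in range(reps):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        stats.append((sum(sample) / n - mu0) / se)
    return stats


null_stats = simulate_null_z(mu0=10.0, sigma=0.3, n=9, reps=10_000)
z_crit = NormalDist().inv_cdf(0.975)  # two-sided critical value, alpha = 0.05
type1_rate = sum(abs(z) > z_crit for z in null_stats) / len(null_stats)

print(f"simulated mean = {mean(null_stats):.3f}, sd = {stdev(null_stats):.3f}")
print(f"estimated Type I error rate = {type1_rate:.3f}")
```

Under these assumptions the simulated statistics should look approximately standard normal, and the estimated error rate should be close to the chosen alpha.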

Rejection Region

Also known as the critical region. Together with the alpha level, it forms the basis for a decision rule.

The set of all points, or collection of test statistic values, in the sample space that result in the decision to reject the null hypothesis at a specific alpha level.

The Rejection Region is defined by the Critical Value of the Test Statistic and the Null Distribution.
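
As an illustration of how a critical value and the null distribution define the rejection region, the sketch below derives critical values from an assumed standard normal null distribution and applies the resulting decision rule. The names rejection_region and reject_null, and the `tail` labels, are hypothetical.

```python
from statistics import NormalDist


def rejection_region(alpha=0.05, tail="two"):
    """Return (lower_critical, upper_critical); None marks an unused tail."""
    null = NormalDist()  # assumed null distribution: N(0, 1)
    if tail == "two":
        upper = null.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    if tail == "upper":
        return (None, null.inv_cdf(1 - alpha))
    if tail == "lower":
        return (null.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


def reject_null(z, alpha=0.05, tail="two"):
    """Decision rule: reject H0 when z falls in the rejection region."""
    lower, upper = rejection_region(alpha, tail)
    in_lower = lower is not None and z <= lower
    in_upper = upper is not None and z >= upper
    return in_lower or in_upper


print(rejection_region(0.05, "two"))
print(reject_null(2.0), reject_null(1.5))
```

For a two-tailed test the alpha level is split evenly between the tails. For a one-tailed test all of it is placed in the tail named by the alternate hypothesis.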

p-value

The smallest significance level at which the null hypothesis would be rejected for the given observation. The p-value is calculated under the assumption that the null hypothesis is true (based on the null distribution of the test statistic).

The p-value is determined by the null distribution, the location of the rejection (critical) region within it (lower tail, upper tail, or both), and the calculated value of the test statistic.
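
The definition of the p-value as the smallest significance level at which the null hypothesis would be rejected can be checked numerically. The sketch below assumes an upper-tailed test with a standard normal null distribution and a made-up observed statistic. For each alpha it compares the critical-region decision with the rule "reject when p <= alpha".

```python
from statistics import NormalDist

null = NormalDist()  # assumed null distribution of the test statistic: N(0, 1)


def upper_tail_p_value(z_obs):
    """P(Z >= z_obs) under the null distribution."""
    return 1.0 - null.cdf(z_obs)


def rejects_at(z_obs, alpha):
    """Upper-tailed decision rule: reject if z_obs is in the critical region."""
    return z_obs >= null.inv_cdf(1.0 - alpha)


z_obs = 2.1  # hypothetical observed value of the test statistic
p_value = upper_tail_p_value(z_obs)
print(f"p-value = {p_value:.4f}")
for alpha in (0.001, 0.01, 0.02, 0.05, 0.10):
    print(f"alpha = {alpha:<5}  reject H0: {rejects_at(z_obs, alpha)}  "
          f"p <= alpha: {p_value <= alpha}")
```

The two columns agree at every alpha level, so the p-value marks the boundary between the alpha levels that reject and those that do not.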

Understand the difference between statistical significance and practical significance: a small p-value indicates that an effect is detectable, not that it is large enough to matter.
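
A short numerical sketch of this distinction: with a known sigma and a two-sided z test (both assumptions), the same small observed difference is far from significant in a small sample and highly significant in a very large one. The difference, sigma, and sample sizes are hypothetical values chosen only to show the effect.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(observed_diff, sigma, n):
    """Two-sided p-value for a z test of H0: difference = 0, sigma known."""
    z = observed_diff / (sigma / sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


diff = 0.01   # hypothetical observed mean difference, tiny relative to sigma
sigma = 1.0   # assumed known population standard deviation
for n in (100, 250_000):
    print(f"n = {n:>7}: p = {two_sided_p(diff, sigma, n):.2e}")
```

The effect size is identical in both cases. Only the sample size, and with it the standard error, has changed.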

Conclusion

Report the p-value along with a formal decision. It is also recommended to report point and interval estimates of the test statistic along with the p-value.

In some situations, the value of the test statistic may be reported instead of the p-value. One example would be reporting the value of Cpk (or Ppk). Another example would be reporting GR&R results. In these (and other similar) cases, both point and interval estimates of the test statistic are available and should be considered.
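
As one possible way to assemble such a conclusion, the sketch below bundles the point estimate, interval estimate, test statistic, p-value, and formal decision into a single report. It assumes a two-sided z test with known sigma. The class InferenceReport, the function two_sided_z_report, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean


@dataclass
class InferenceReport:
    estimate: float
    interval: tuple
    statistic: float
    p_value: float
    alpha: float

    @property
    def reject_null(self):
        return self.p_value <= self.alpha

    def summary(self):
        decision = "reject H0" if self.reject_null else "fail to reject H0"
        lo, hi = self.interval
        return (f"estimate = {self.estimate:.3f} "
                f"({1 - self.alpha:.0%} interval {lo:.3f} to {hi:.3f}), "
                f"z = {self.statistic:.2f}, p = {self.p_value:.4f}: "
                f"{decision} at alpha = {self.alpha}")


def two_sided_z_report(sample, mu0, sigma, alpha=0.05):
    """Two-sided z test of H0: mu = mu0, assuming known sigma."""
    null = NormalDist()
    xbar = mean(sample)
    se = sigma / sqrt(len(sample))
    z = (xbar - mu0) / se
    half_width = null.inv_cdf(1 - alpha / 2) * se
    p_value = 2.0 * (1.0 - null.cdf(abs(z)))
    return InferenceReport(xbar, (xbar - half_width, xbar + half_width),
                           z, p_value, alpha)


# Hypothetical measurements, made up for illustration only.
sample = [10.2, 9.8, 10.4, 10.1, 10.5, 9.9, 10.3, 10.0, 10.6]
report = two_sided_z_report(sample, mu0=10.0, sigma=0.3)
print(report.summary())
```

Reporting the interval alongside the decision lets the reader judge the size of the effect, not only whether it crossed the alpha threshold.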