Hotelling's T-squared test

The main idea...

Hotelling's T2 test (Hotelling, 1931) is the multivariate generalisation of Student's t test: the objects being tested are described by multiple response variables rather than a single one. A one-sample Hotelling's T2 test can be used to test whether a set of objects (which should be a sample from a single statistical population) has a mean vector equal to a hypothetical mean vector (Figure 1a). A two-sample Hotelling's T2 test can be used to test for a significant difference between the mean vectors (multivariate means) of two multivariate data sets (Figure 1b).
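For reference, the two statistics and their transformation to an F statistic can be written in their standard textbook form. The notation is an assumption introduced here, not taken from the text above: n (or n_1, n_2) objects, p response variables, sample mean vector(s) \bar{x}, and sample covariance matrix S computed with an n - 1 denominator.

```latex
% One-sample test of H0: mu = mu_0
\[
T^2 = n\,(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)^{\top}\,\mathbf{S}^{-1}\,(\bar{\mathbf{x}} - \boldsymbol{\mu}_0),
\qquad
\frac{n - p}{p\,(n - 1)}\,T^2 \sim F_{p,\;n - p}
\]

% Two-sample test of H0: mu_1 = mu_2, using the pooled covariance matrix S_p
\[
\mathbf{S}_p = \frac{(n_1 - 1)\,\mathbf{S}_1 + (n_2 - 1)\,\mathbf{S}_2}{n_1 + n_2 - 2},
\qquad
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^{\top}\,\mathbf{S}_p^{-1}\,(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)
\]
\[
\frac{n_1 + n_2 - p - 1}{p\,(n_1 + n_2 - 2)}\,T^2 \sim F_{p,\;n_1 + n_2 - p - 1}
\]
```

With a single response variable (p = 1), T2 reduces to the square of the corresponding Student's t statistic.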

Null hypothesis (one-sample)

The (multivariate) vector of means of a group of objects is equal to a hypothetical vector of means.

Null hypothesis (two-sample)

The (multivariate) vectors of means of two groups of objects are equal.

For testing more than two groups, consider multivariate analysis of variance (MANOVA).

Figure 1: Schematic illustrating the logic of a one- and two-sample Hotelling's T2 test in a simple, two-dimensional space. Linear combinations of the original variables are used to build a synthetic variable that best separates either a group from a hypothetical mean (μ0; a), or two groups of multivariate-normal data (b). In other words, the maximum possible T2 value is found. Points indicate the multivariate mean of each population and circles indicate multivariate dispersion. The significance of this separation may be tested by comparing transformed T2 values to an F-distribution.
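The comparison of a transformed T2 value to an F-distribution can be sketched in Python as follows. This is a minimal illustration using the standard pooled-covariance form of the test, not the implementation of any particular package. The function names `hotelling_one_sample` and `hotelling_two_sample` are hypothetical, and the data are assumed to be arranged with one object per row and one variable per column.

```python
import numpy as np
from scipy import stats


def hotelling_one_sample(x, mu0):
    """One-sample test of H0: the mean vector of x equals mu0.

    Returns (T2, F statistic, p-value).
    """
    x = np.asarray(x, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = x.shape
    diff = x.mean(axis=0) - mu0
    # Unbiased sample covariance; atleast_2d keeps the p = 1 case a matrix.
    s = np.atleast_2d(np.cov(x, rowvar=False))
    t2 = n * diff @ np.linalg.solve(s, diff)
    # Transform T2 so that it follows F(p, n - p) under H0.
    f_stat = (n - p) / (p * (n - 1)) * t2
    p_value = stats.f.sf(f_stat, p, n - p)
    return t2, f_stat, p_value


def hotelling_two_sample(x1, x2):
    """Two-sample test of H0: x1 and x2 share the same mean vector.

    Assumes (near) equal covariance matrices, which are pooled.
    Returns (T2, F statistic, p-value).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, p = x1.shape
    n2 = x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(s_pooled, diff)
    # Transform T2 so that it follows F(p, n1 + n2 - p - 1) under H0.
    df2 = n1 + n2 - p - 1
    f_stat = df2 / (p * (n1 + n2 - 2)) * t2
    p_value = stats.f.sf(f_stat, p, df2)
    return t2, f_stat, p_value


# Usage with simulated data (illustrative values only).
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(20, 2))
group_b = rng.normal(loc=[1.0, 0.5], scale=1.0, size=(25, 2))
print(hotelling_one_sample(group_a, [0.0, 0.0]))
print(hotelling_two_sample(group_a, group_b))
```

Both functions require more objects than variables (n > p, or n1 + n2 > p + 1), otherwise the covariance matrix cannot be inverted and the F transformation is undefined.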

Assumptions

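The variables of each data set follow a multivariate normal distribution. Each variable may be tested for univariate normality, although univariate normality of every variable is necessary but not sufficient for multivariate normality.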
The objects have been independently sampled.
In a two-sample test, the two data sets being tested have (near) equivalent variance-covariance matrices. Bartlett's test may be used to evaluate whether this assumption holds.
Each data set describes one population with one multivariate mean. No subpopulations exist within each data set.
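A per-variable normality screen, as suggested in the first assumption, might look like the sketch below. It applies the Shapiro-Wilk test (`scipy.stats.shapiro`) to each column. The choice of Shapiro-Wilk, the function name `univariate_normality`, and the 0.05 threshold are assumptions made for illustration. Passing this screen does not establish multivariate normality.

```python
import numpy as np
from scipy import stats


def univariate_normality(x, alpha=0.05):
    """Run a Shapiro-Wilk test on each column of x (objects in rows).

    Returns a list of (column index, W statistic, p-value, passed) tuples,
    where passed is True when normality is not rejected at level alpha.
    """
    x = np.asarray(x, dtype=float)
    results = []
    for j in range(x.shape[1]):
        w, p_value = stats.shapiro(x[:, j])
        results.append((j, float(w), float(p_value), bool(p_value >= alpha)))
    return results


# Usage with simulated data (illustrative only).
rng = np.random.default_rng(0)
data = rng.normal(size=(40, 3))
for column, w, p_value, passed in univariate_normality(data):
    print(f"variable {column}: W = {w:.3f}, p = {p_value:.3f}, passed = {passed}")
```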

Warnings

Hotelling's T2 test is sensitive to violations of the assumption of independently sampled objects. Any interdependence, and hence redundancy, will reduce the power of the test by reducing the effective sample size. Both time-series data and data sampled along some non-random spatial range may be autocorrelated, in which case the objects are not independent. Test for temporal or spatial autocorrelation before conducting Hotelling's T2 test.
Two-sample Hotelling's T2 tests are sensitive to violations of the assumption of equal variances and covariances. This is especially true if sample sizes differ between the two data sets being tested.
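As a rough first screen for temporal autocorrelation, the lag-1 autocorrelation of each variable can be computed as sketched below. This assumes the rows are ordered in time and evenly spaced. The function name `lag1_autocorrelation` is hypothetical, and the screen is a simple diagnostic rather than a formal test.

```python
import numpy as np


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation of each column of x (rows ordered in time)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a single series as one column
    centred = x - x.mean(axis=0)
    # Sum of products of neighbouring deviations over the total sum of squares.
    numerator = (centred[:-1] * centred[1:]).sum(axis=0)
    denominator = (centred ** 2).sum(axis=0)
    return numerator / denominator


# Usage: a steadily increasing series is strongly autocorrelated.
trend = np.arange(10.0)
print(lag1_autocorrelation(trend))
```

For independent observations, values are expected to fall roughly within ±2/sqrt(n) of zero. Values well outside that band suggest that the objects are not independent.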

Implementations

R
- The "Hotelling" package includes Hotelling's T2 test and a number of useful variants. James-Stein shrinkage estimators may be used when computing the test statistic. Permutation-based tests are also available, along with plotting functions.
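The idea behind a permutation-based two-sample test can be sketched in Python. This is an illustration of the general technique, not the R package's code. Group labels are shuffled repeatedly, T2 is recomputed for each shuffle, and the p-value is the proportion of permuted statistics at least as large as the observed one. The function names, the default of 999 permutations, and the fixed seed are assumptions made here.

```python
import numpy as np


def t2_two_sample(x1, x2):
    """Two-sample Hotelling's T2 statistic with a pooled covariance matrix."""
    n1, n2 = x1.shape[0], x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    return (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(s_pooled, diff)


def permutation_p_value(x1, x2, n_perm=999, seed=0):
    """Permutation p-value for the two-sample T2 statistic."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    rng = np.random.default_rng(seed)
    observed = t2_two_sample(x1, x2)
    pooled = np.vstack([x1, x2])
    n1 = x1.shape[0]
    exceed = 0
    for _ in range(n_perm):
        # Shuffle the objects and split them back into groups of the original sizes.
        order = rng.permutation(pooled.shape[0])
        permuted = t2_two_sample(pooled[order[:n1]], pooled[order[n1:]])
        if permuted >= observed:
            exceed += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)


# Usage with simulated data (illustrative only).
rng = np.random.default_rng(7)
group_a = rng.normal(size=(15, 2))
group_b = rng.normal(size=(15, 2)) + 1.0
print(permutation_p_value(group_a, group_b))
```

A permutation test of this kind does not rely on the F-distribution, so it is less dependent on multivariate normality, but it still assumes that objects are exchangeable between groups under the null hypothesis.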

References

Hotelling H (1931) The generalization of Student’s ratio. Ann Math Stat. 2(3):360–378.