The sample variance is even more sensitive to outliers than the sample mean. To illustrate the role of outliers, a random time series of length n = 60 (1901-1960) was generated from a normal distribution with zero mean and a variance that shifted in 1931 from one to six. This shift is easily detected using the target significance level p = 0.05, cutoff length l = 20 and tuning constant h = 6 (Fig. 1). Using h = 6 means that none of the data points in this series has a reduced weight. Note that, although the data value in 1946 is greater than 6, the scale of variation (standard deviation) in the second regime is about 2.5 times that in the first regime. When normalized by this greater scale, the data value in 1946 becomes less than 6, and it is not considered an outlier.
Figure 2 shows the same time series, but with an outlier in 1920. This outlier increases the sample variance for 1901-1930 from 0.97 to 2.20. As a result, the ratio of variances for 1931-1950 to 1901-1930 becomes less than the critical F-test value for the given p and l, and the change point in 1931 is not detected. The ratio of variances between the two regimes becomes statistically significant only in 1946.
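The effect of a single outlier on the variance ratio can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the code used to produce Figs. 1 and 2: the random seed, the helper names make_series and variance_ratio, and the outlier magnitude of 6.0 are all hypothetical, so the resulting numbers will differ from the 0.97 and 2.20 quoted above. The comparison with the critical F-test value is left out.

```python
import random
import statistics


def make_series(seed=0, start=1901, shift_year=1931, end=1960,
                var_before=1.0, var_after=6.0):
    """Zero-mean normal series whose variance shifts in shift_year."""
    rng = random.Random(seed)  # seed is arbitrary, chosen for repeatability
    series = {}
    for year in range(start, end + 1):
        variance = var_before if year < shift_year else var_after
        series[year] = rng.gauss(0.0, variance ** 0.5)
    return series


def variance_ratio(series, first, second):
    """Sample variance of the second period divided by that of the first."""
    v1 = statistics.variance([series[y] for y in range(first[0], first[1] + 1)])
    v2 = statistics.variance([series[y] for y in range(second[0], second[1] + 1)])
    return v2 / v1


clean = make_series()

# Same series with one contaminated value; 6.0 is a hypothetical magnitude.
contaminated = dict(clean)
contaminated[1920] = 6.0

ratio_clean = variance_ratio(clean, (1901, 1930), (1931, 1950))
ratio_contaminated = variance_ratio(contaminated, (1901, 1930), (1931, 1950))

print(f"variance ratio without outlier: {ratio_clean:.2f}")
print(f"variance ratio with outlier:    {ratio_contaminated:.2f}")
```

The outlier inflates the denominator of the ratio, so the second value printed is smaller than the first. Whether it falls below the critical F-test value depends on the particular realization.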
When the tuning constant is reduced to 2, the weight of the outlier decreases to 0.4, and the weighted variance for 1901-1930 is now only slightly greater than that in Fig. 1 (1.20 vs. 0.97). The ratio of variances for the regimes before and after 1931 again exceeds the critical F-test value, and the year 1931 is detected as a change point (Fig. 3).
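The role of the tuning constant can be sketched as follows. The sketch assumes a Huber-type weight function, in which a point receives full weight when its deviation, normalized by the regime scale, does not exceed h, and weight h/|z| otherwise. This form is an assumption consistent with the numbers in the text, not a formula quoted from it. Under this assumption, a weight of 0.4 at h = 2 corresponds to a point five scale units from the center. The function names, the zero center and the particular weighted-variance formula are likewise hypothetical choices for illustration.

```python
def huber_weight(value, scale, h, center=0.0):
    """Assumed Huber-type weight: 1 inside h scale units, h/|z| outside."""
    z = abs(value - center) / scale
    return 1.0 if z <= h else h / z


def weighted_variance(values, scale, h, center=0.0):
    """One possible weighted variance: weighted mean of squared deviations."""
    weights = [huber_weight(v, scale, h, center) for v in values]
    total = sum(w * (v - center) ** 2 for w, v in zip(weights, values))
    return total / sum(weights)


# A point five scale units from the center (hypothetical value).
print(huber_weight(5.0, scale=1.0, h=6.0))  # full weight at h = 6
print(huber_weight(5.0, scale=1.0, h=2.0))  # reduced weight at h = 2

# A value above 6 keeps full weight at h = 6 once it is normalized by the
# larger scale of a regime with variance 6 (7.0 is a hypothetical value).
print(huber_weight(7.0, scale=6.0 ** 0.5, h=6.0))
```

Lowering h from 6 to 2 leaves ordinary points untouched and reduces only the contribution of points more than two scale units from the center, which is why the weighted variance stays close to its uncontaminated value.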
Another example, a synthetic time series with three variance regimes (6, 1, and 6) and two change points, is presented in Fig. 4. Using h = 2 helps eliminate the effect of the outlier in 1931 and detect both regime shifts correctly. When h was increased to 3, no regime shifts were detected.
Fig. 4. A time series with three variance regimes that can be detected using h = 2, but not at h = 3 or higher.
Using h values lower than 1.5 is strongly discouraged, because it may lead to too many detected shifts downward (from higher to lower variance). The reason is that, when the Residual Sum of Squares Index (RSSI) is calculated for a potential change point, the weights for newly arrived data are assigned based on the projected smaller scale for the new regime. As a result, too many data points may receive excessively small weights, and the potential change point may therefore pass the F-test. It appears that h = 2 produces good results in most cases, and it is therefore used here as the default value for the tuning constant.
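The interaction between a small h and a projected smaller scale can be illustrated numerically. The sketch below again assumes a Huber-type weight (1 for |z| <= h, h/|z| otherwise), which is an assumption and not a formula taken from the text. The data values, the two scales and the helper name mean_weight are hypothetical.

```python
def huber_weight(z, h):
    """Assumed Huber-type weight for a normalized deviation z."""
    z = abs(z)
    return 1.0 if z <= h else h / z


def mean_weight(values, scale, h):
    """Average weight of the values when normalized by the given scale."""
    return sum(huber_weight(v / scale, h) for v in values) / len(values)


# Hypothetical newly arrived data with roughly unit scale.
values = [-1.8, -0.9, -0.4, 0.1, 0.6, 1.2, 2.1]

for h in (2.0, 1.5, 1.0):
    full = mean_weight(values, scale=1.0, h=h)       # actual scale
    projected = mean_weight(values, scale=0.5, h=h)  # projected smaller scale
    print(f"h = {h}: mean weight {full:.2f} (scale 1.0), "
          f"{projected:.2f} (scale 0.5)")
```

When the same data are normalized by a smaller projected scale, more points fall outside the h threshold and are downweighted, and the effect grows as h decreases.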