Shifts in mean

Click Regime test --> Run test to open the entry form (Fig. 1). The entire data range is automatically selected. You can select your own data range by clicking the button with the underscore symbol. It is recommended that you place your data in the "Data" worksheet.

Fig. 1. Entry form for calculating regime shifts.

Two parameters control the magnitude and scale of the regimes to be detected: the target significance level p and the cutoff length l. The target significance level is the level at which the null hypothesis that the mean values of two regimes are equal is tested using the two-tailed Student t-test. The lower the significance level, the larger the shift must be in order to be detected. The target significance level guarantees that detected shifts between regimes of l years or longer will be significant at least at this level. After all the regime shifts are detected, the program also calculates the actual significance levels for the shifts. In rare circumstances, when the regimes are shorter than l years but the shifts between them are large enough to be detected, the actual significance level may be slightly higher than the target one.

IMPORTANT: If your computer uses a comma (",") as a decimal separator instead of a dot ("."), please change the default target significance level in the opening form from 0.05 to 0,05 (or any other number with your local decimal separator).

The cutoff length is similar to the 100% cutoff point in filtering. The regimes that are longer than the cutoff length will all be detected. If the regimes are shorter than the cutoff length, the probability for them to be detected reduces proportionally to their length. Generally speaking, the shorter the cutoff length, the shorter the regimes that will be detected (and vice versa), but it's not always true. The reason is that the cutoff length also affects the critical magnitude of the shift between the regimes to be detected. For example, let's assume that the difference between the mean values of two regimes is statistically significant at the 0.01 level if the cutoff length is 10 years. But if the cutoff length is reduced to 5 years, the critical magnitude of the shift increases (for the same target significance level), and the regimes may not be detected. It is recommended to experiment with different significance levels and cutoff lengths to better understand their mutual effect on regime detection.
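The effect of the cutoff length on the critical magnitude can be illustrated with a minimal sketch. It assumes the critical difference formula described in Rodionov (2004), diff = t * sqrt(2 * variance / l), where t is the two-tailed Student t value at level p with 2l - 2 degrees of freedom. The function names are hypothetical, the variance is passed in as a plain number rather than estimated from the data, and the t value is computed numerically with the standard library only, so this is not the program's own code.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    c /= math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] plus 0.5 by symmetry."""
    h = x / steps
    s = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + s * h / 3

def t_critical(p, df):
    """Two-tailed critical value t with P(|T| > t) = p, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - p / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def critical_diff(p, l, variance):
    """Assumed formula: diff = t * sqrt(2 * variance / l), df = 2l - 2."""
    return t_critical(p, 2 * l - 2) * math.sqrt(2 * variance / l)

# Same significance level and variance, two cutoff lengths.
diff_l10 = critical_diff(0.05, 10, 1.0)
diff_l5 = critical_diff(0.05, 5, 1.0)
```

Under these assumptions diff_l5 is larger than diff_l10, which matches the statement above: shortening the cutoff length raises the shift magnitude needed for detection at the same target significance level.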

The program also requires Huber's weight parameter, which controls the weights assigned to outliers (see Handling outliers for more information). Because outliers are down-weighted, this parameter affects the calculated mean values of the regimes.
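As a rough illustration of how such a parameter can change a regime mean, the sketch below assumes the standard Huber weight function: a weight of 1 for values within h scale units of the center, and h divided by the normalized deviation beyond that. The function names, the sample data, and the choice of center and scale are hypothetical. The program's exact procedure is the one described in Handling outliers.

```python
def huber_weights(values, center, scale, h=1.0):
    """Weight 1 within h scale units of the center, h / deviation beyond."""
    weights = []
    for x in values:
        z = abs(x - center) / scale  # normalized deviation
        weights.append(1.0 if z <= h else h / z)
    return weights

def weighted_mean(values, weights):
    """Mean of values with the given non-negative weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical regime with one outlier (5.0); center and scale are assumed.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0]
w = huber_weights(data, center=1.0, scale=0.2, h=1.0)
plain = sum(data) / len(data)
robust = weighted_mean(data, w)
```

With these assumed inputs the outlier receives a small weight, so the weighted mean stays much closer to the bulk of the data than the unweighted mean does. A smaller h down-weights more points, and a larger h treats more of them as ordinary observations.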

The Red noise section of the entry form provides a choice of methods for estimating the lag-1 autocorrelation coefficient (r1). It also contains two checkboxes, Prewhitening and Filtered data output, which control how the data are processed and passed down for further analysis, as follows:

Prewhitening box: Not checked. Filtered data output box: Not checked (automatically; i.e., it cannot be checked if the prewhitening box is not checked). Using the chosen method, r1 is estimated and used to calculate the effective degrees of freedom for RSI and to adjust the significance level of the shifts for serial correlation. The residuals, i.e., deviations of the original data, x_t, from the means of each detected regime, are placed in the “ResM” worksheet.
Prewhitening box: Checked. Filtered data output box: Not checked. Red noise is filtered out of the original data to form a prewhitened time series x*_t = x_t - r1·x_{t-1}, which is then used to detect the timing of regime shifts in mean. However, the original data are used to calculate the mean values of the regimes and the deviations from them.
Prewhitening box: Checked. Filtered data output box: Checked. The prewhitened time series x*_t is used to detect regime shifts and to calculate the mean values of the regimes and the deviations from them. Those deviations are placed in the “ResM” worksheet for further analyses. This option is highly recommended if the final goal is to find regime shifts in the correlation coefficient between time series with high levels of red noise.
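The prewhitening step itself is a one-line filter, sketched below. The function names are hypothetical, and the simple sample estimator of r1 shown here is only a stand-in: the program offers its own estimation methods in the Red noise section, which may give different values, especially for short series.

```python
def lag1_autocorr(x):
    """Simple sample estimate of the lag-1 autocorrelation coefficient."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def prewhiten(x, r1):
    """Return x*_t = x_t - r1 * x_{t-1}; the result is one point shorter."""
    return [x[t] - r1 * x[t - 1] for t in range(1, len(x))]

# Small worked example with an assumed r1 of 0.5.
filtered = prewhiten([1.0, 2.0, 3.0, 4.0], 0.5)  # [1.5, 2.0, 2.5]
```

Note that the filtered series starts at the second observation, because the first value has no predecessor to subtract.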

After the entry form is filled and the "OK" button clicked, the program calculates for each time series the regime shift index (RSI), the mean value of the regimes with equal and unequal weights, regime length, final confidence levels for the shifts and the weights of the outliers. This information for each variable is placed in a separate worksheet (named [Name of the variable]M) along with the corresponding graphs. An example of the output is shown in Fig. 2. The program also calculates the combined RSI ("SumM" worksheet) and residuals after the stepwise trend is removed ("ResM" worksheet).

Fig. 2. An example of regime shifts in mean.

It is important to underscore that the method assumes independence and normal distribution of the data. Monte Carlo experiments have shown that STARS is quite robust to violations of these assumptions, performing better than Lanzante's non-parametric L-test (Rodionov, 2004). However, if a preliminary inspection of your data reveals a trend (or positive serial correlation), it is recommended to use the prewhitening procedure described in Section "Red noise estimation." Failure to do so may lead to spurious regime shifts. An example of such an incorrect use of the method can be found in this blog post. If a significant deterministic component such as a trend is present in a time series (as, for example, in mean global surface temperature), it is recommended to remove it first, before prewhitening, since leaving it in may lead to overestimation of the lag-1 autocorrelation coefficient.
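The overestimation of r1 caused by a trend can be seen in a small synthetic experiment. The sketch below is an illustration under stated assumptions, not part of the program: the series is an artificial linear trend plus white noise, the trend is removed by ordinary least squares, r1 is estimated with a simple sample estimator, and all function names are hypothetical.

```python
import random

def lag1_autocorr(x):
    """Simple sample estimate of the lag-1 autocorrelation coefficient."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def remove_linear_trend(x):
    """Subtract the least-squares straight line fitted against time."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = sum((t - t_mean) * (x[t] - x_mean) for t in range(n))
    slope /= sum((t - t_mean) ** 2 for t in range(n))
    return [x[t] - (x_mean + slope * (t - t_mean)) for t in range(n)]

# Synthetic data: linear trend plus independent Gaussian noise.
rng = random.Random(0)
series = [0.1 * t + rng.gauss(0, 1) for t in range(100)]

r1_raw = lag1_autocorr(series)
r1_detrended = lag1_autocorr(remove_linear_trend(series))
```

Although the noise here is serially independent, r1_raw comes out strongly positive because the trend alone makes neighboring values similar. After detrending, the estimate is close to zero, which is why the trend should be removed before r1 is estimated and the series prewhitened.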
