
ARIMA model explanation part 2 by Hyndman

Backshift notation

The backward shift operator $B$ is a useful notational device when working with time series lags:

$$B y_t = y_{t-1}.$$

(Some references use $L$ for "lag" instead of $B$ for "backshift".) In other words, $B$, operating on $y_t$, has the effect of shifting the data back one period. Two applications of $B$ to $y_t$ shift the data back two periods:

$$B(B y_t) = B^2 y_t = y_{t-2}.$$

For monthly data, if we wish to consider "the same month last year," the notation is $B^{12} y_t = y_{t-12}$.

The backward shift operator is convenient for describing the process of differencing. A first difference can be written as

$$y_t' = y_t - y_{t-1} = y_t - B y_t = (1 - B) y_t.$$

Note that a first difference is represented by $(1 - B)$. Similarly, if second-order differences have to be computed, then:

$$y_t'' = y_t - 2 y_{t-1} + y_{t-2} = (1 - 2B + B^2) y_t = (1 - B)^2 y_t.$$

In general, a $d$th-order difference can be written as

$$(1 - B)^d y_t.$$

Backshift notation is very useful when combining differences as the operator can be treated using ordinary algebraic rules. In particular, terms involving $B$ can be multiplied together. For example, a seasonal difference followed by a first difference can be written as

$$(1 - B)(1 - B^m) y_t = (1 - B - B^m + B^{m+1}) y_t = y_t - y_{t-1} - y_{t-m} + y_{t-m-1},$$

the same result we obtained earlier.
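As a quick check of this algebra, here is a minimal R sketch (the simulated series and the seasonal period m = 12 are illustrative assumptions, not from the text) confirming that diff() applied twice matches the expanded form $y_t - y_{t-1} - y_{t-m} + y_{t-m-1}$:

```r
# Check that (1 - B)(1 - B^m) y_t equals the expanded four-term form.
set.seed(1)
m <- 12                                    # seasonal period (assumed: monthly data)
y <- ts(cumsum(rnorm(60)), frequency = m)  # an arbitrary illustrative series

# Seasonal difference followed by a first difference: (1 - B)(1 - B^m) y_t
d1 <- diff(diff(y, lag = m))

# Expanded form: y_t - y_{t-1} - y_{t-m} + y_{t-m-1}
n  <- length(y)
t  <- (m + 2):n
d2 <- y[t] - y[t - 1] - y[t - m] + y[t - m - 1]

all.equal(as.numeric(d1), as.numeric(d2))  # TRUE: both give the same series
```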

Autoregressive models

In a multiple regression model, we forecast the variable of interest using a linear combination of predictors. In an autoregression model, we forecast the variable of interest using a linear combination of past values of the variable. The term autoregression indicates that it is a regression of the variable against itself.

Thus an autoregressive model of order $p$ can be written as

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + e_t,$$

where $c$ is a constant and $e_t$ is white noise. This is like a multiple regression but with lagged values of $y_t$ as predictors. We refer to this as an AR($p$) model.

Autoregressive models are remarkably flexible at handling a wide range of different time series patterns. The two series in Figure 8.5 show series from an AR(1) model and an AR(2) model. Changing the parameters $\phi_1, \dots, \phi_p$ results in different time series patterns. The variance of the error term $e_t$ will only change the scale of the series, not the patterns.

Figure 8.5: Two examples of data from autoregressive models with different parameters. Left: AR(1) with $y_t = 18 - 0.8 y_{t-1} + e_t$. Right: AR(2) with $y_t = 8 + 1.3 y_{t-1} - 0.7 y_{t-2} + e_t$. In both cases, $e_t$ is normally distributed white noise with mean zero and variance one.
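Series like those in Figure 8.5 can be simulated with stats::arima.sim(). A sketch (the seed and series length are arbitrary choices): arima.sim() generates a zero-mean process, so the constant $c$ is handled by adding the implied mean $c / (1 - \phi_1 - \cdots - \phi_p)$ afterwards.

```r
set.seed(2)

# AR(1): y_t = 18 - 0.8 y_{t-1} + e_t, implied mean 18 / (1 - (-0.8)) = 10
ar1 <- 10 + arima.sim(model = list(ar = -0.8), n = 100)

# AR(2): y_t = 8 + 1.3 y_{t-1} - 0.7 y_{t-2} + e_t, implied mean 8 / (1 - 1.3 + 0.7) = 20
ar2 <- 20 + arima.sim(model = list(ar = c(1.3, -0.7)), n = 100)

par(mfrow = c(1, 2))
plot(ar1, main = "AR(1)")
plot(ar2, main = "AR(2)")
```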

For an AR(1) model:

  • When $\phi_1 = 0$, $y_t$ is equivalent to white noise.
  • When $\phi_1 = 1$ and $c = 0$, $y_t$ is equivalent to a random walk.
  • When $\phi_1 = 1$ and $c \neq 0$, $y_t$ is equivalent to a random walk with drift.
  • When $\phi_1 < 0$, $y_t$ tends to oscillate between positive and negative values. (Each case is simulated in the sketch below.)
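A minimal sketch of these four cases using stats::filter(), which runs the recursion $y_t = \phi_1 y_{t-1} + x_t$; the drift value 0.2 and the series length are illustrative assumptions:

```r
set.seed(3)
e <- rnorm(200)  # white noise innovations

wn    <- e                                                # phi_1 = 0: white noise
rw    <- stats::filter(e, 1, method = "recursive")        # phi_1 = 1, c = 0: random walk
drift <- stats::filter(0.2 + e, 1, method = "recursive")  # phi_1 = 1, c = 0.2: random walk with drift
osc   <- stats::filter(e, -0.8, method = "recursive")     # phi_1 < 0: oscillating series

par(mfrow = c(2, 2))
plot.ts(wn); plot.ts(rw); plot.ts(drift); plot.ts(osc)
```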

We normally restrict autoregressive models to stationary data, and then some constraints on the values of the parameters are required.

  • For an AR(1) model: $-1 < \phi_1 < 1$.
  • For an AR(2) model: $-1 < \phi_2 < 1$, $\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$.

When $p \geq 3$, the restrictions are much more complicated. R takes care of these restrictions when estimating a model.
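For readers curious what these restrictions look like in general: stationarity requires all roots of $1 - \phi_1 z - \cdots - \phi_p z^p$ to lie outside the unit circle, which can be checked in R with polyroot(). The helper ar_stationary below is a hypothetical illustration, not a base R function:

```r
# Check the AR stationarity condition: all roots of
# 1 - phi_1 z - ... - phi_p z^p must lie outside the unit circle.
ar_stationary <- function(phi) {
  all(Mod(polyroot(c(1, -phi))) > 1)
}

ar_stationary(c(1.3, -0.7))  # TRUE:  the AR(2) from Figure 8.5
ar_stationary(c(1.3,  0.7))  # FALSE: violates phi_1 + phi_2 < 1
```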

Moving average models

Rather than use past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model.

$$y_t = c + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q},$$

where $e_t$ is white noise. We refer to this as an MA($q$) model. Of course, we do not observe the values of $e_t$, so it is not really regression in the usual sense.

Notice that each value of $y_t$ can be thought of as a weighted moving average of the past few forecast errors. However, moving average models should not be confused with the moving average smoothing we discussed in Chapter 6. A moving average model is used for forecasting future values, while moving average smoothing is used for estimating the trend-cycle of past values.

Figure 8.6: Two examples of data from moving average models with different parameters. Left: MA(1) with $y_t = 20 + e_t + 0.8 e_{t-1}$. Right: MA(2) with $y_t = e_t - e_{t-1} + 0.8 e_{t-2}$. In both cases, $e_t$ is normally distributed white noise with mean zero and variance one.

Figure 8.6 shows some data from an MA(1) model and an MA(2) model. Changing the parameters $\theta_1, \dots, \theta_q$ results in different time series patterns. As with autoregressive models, the variance of the error term $e_t$ will only change the scale of the series, not the patterns.
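As with the AR examples, series like those in Figure 8.6 can be simulated with stats::arima.sim(); a sketch with an arbitrary seed. R's convention adds the MA coefficients, matching the equation above, and the constant $c = 20$ in the MA(1) is simply the series mean, added after simulation.

```r
set.seed(4)

# MA(1): y_t = 20 + e_t + 0.8 e_{t-1}
ma1 <- 20 + arima.sim(model = list(ma = 0.8), n = 100)

# MA(2): y_t = e_t - e_{t-1} + 0.8 e_{t-2}
ma2 <- arima.sim(model = list(ma = c(-1, 0.8)), n = 100)

par(mfrow = c(1, 2))
plot(ma1, main = "MA(1)")
plot(ma2, main = "MA(2)")
```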

It is possible to write any stationary AR($p$) model as an MA($\infty$) model. For example, using repeated substitution, we can demonstrate this for an AR(1) model:

$$\begin{aligned}
y_t &= \phi_1 y_{t-1} + e_t \\
    &= \phi_1(\phi_1 y_{t-2} + e_{t-1}) + e_t \\
    &= \phi_1^2 y_{t-2} + \phi_1 e_{t-1} + e_t \\
    &= \phi_1^3 y_{t-3} + \phi_1^2 e_{t-2} + \phi_1 e_{t-1} + e_t \quad \text{etc.}
\end{aligned}$$

Provided $-1 < \phi_1 < 1$, the value of $\phi_1^k$ will get smaller as $k$ gets larger. So eventually we obtain

$$y_t = e_t + \phi_1 e_{t-1} + \phi_1^2 e_{t-2} + \phi_1^3 e_{t-3} + \cdots,$$

an MA($\infty$) process.
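This identity can be checked numerically in R. The sketch below (the parameter value and series length are arbitrary) builds an AR(1) series by recursion and recomputes its last value as the weighted sum $\sum_k \phi_1^k e_{t-k}$; the two agree up to floating-point rounding because the recursion starts from zero initial values:

```r
set.seed(5)
phi <- 0.8
n   <- 200
e   <- rnorm(n)

# AR form: y_t = phi * y_{t-1} + e_t, run as a recursive filter
y_ar <- stats::filter(e, phi, method = "recursive")

# MA form: y_n = e_n + phi e_{n-1} + phi^2 e_{n-2} + ...
k    <- 0:(n - 1)
y_ma <- sum(phi^k * e[n - k])

c(ar = y_ar[n], ma = y_ma)  # the two values agree
```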

The reverse result holds if we impose some constraints on the MA parameters. Then the MA model is called "invertible". That is, we can write any invertible MA($q$) process as an AR($\infty$) process.

Invertible models are not introduced simply to enable us to convert from MA models to AR models. They also have some mathematical properties that make them easier to use in practice.

The invertibility constraints are similar to the stationarity constraints.

  • For an MA(1) model: $-1 < \theta_1 < 1$.
  • For an MA(2) model: $-1 < \theta_2 < 1$, $\theta_1 + \theta_2 > -1$, $\theta_1 - \theta_2 < 1$.

More complicated conditions hold for $q \geq 3$. Again, R will take care of these constraints when estimating the models.
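Analogously to the stationarity check, invertibility requires all roots of $1 + \theta_1 z + \cdots + \theta_q z^q$ to lie outside the unit circle. The helper ma_invertible below is a hypothetical illustration, not a base R function:

```r
# Check the MA invertibility condition: all roots of
# 1 + theta_1 z + ... + theta_q z^q must lie outside the unit circle.
ma_invertible <- function(theta) {
  all(Mod(polyroot(c(1, theta))) > 1)
}

ma_invertible(0.8)         # TRUE:  the MA(1) from Figure 8.6
ma_invertible(c(-1, 0.8))  # TRUE:  the MA(2) from Figure 8.6
ma_invertible(1.25)        # FALSE: |theta_1| > 1 is not invertible
```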

