Variance Estimation for Dependent Data

The asymptotic variance (Σ) of the sample mean is central to inference for dependent data. Because Σ is the sum of infinitely many unknown autocovariances, constructing a good heteroscedasticity and autocorrelation consistent (HAC) estimator is well known to be difficult. Commonly used methods include subsampling (Meketon and Schmeiser, 1984) and kernel (Andrews, 1991) estimators. However, they are not well suited to more specialized settings, such as sequential updating or strongly dependent series; a sketch of both classical estimators follows.
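
As a point of reference, here is a minimal Python sketch of these two classical approaches for a univariate series: a non-overlapping batch means (subsampling) estimator and a Bartlett-kernel HAC estimator. The batch size l and truncation lag b below are illustrative choices, not the optimal tuning parameters studied in the literature.

    import numpy as np

    def batch_means_var(x, l):
        """Non-overlapping batch means (subsampling) estimate of Sigma:
        l times the sample variance of the batch means."""
        b = len(x) // l                      # number of full batches
        means = x[:b * l].reshape(b, l).mean(axis=1)
        return l * means.var(ddof=1)

    def bartlett_hac_var(x, b):
        """Bartlett-kernel HAC estimate of Sigma: weighted sum of the
        sample autocovariances up to lag b."""
        n = len(x)
        xc = x - x.mean()
        gamma = np.array([xc[:n - k] @ xc[k:] / n for k in range(b + 1)])
        w = 1.0 - np.arange(1, b + 1) / (b + 1)  # Bartlett weights
        return gamma[0] + 2.0 * (w @ gamma[1:])

    # Toy AR(1) chain with rho = 0.5, for which Sigma = 1 / (1 - rho)^2 = 4.
    rng = np.random.default_rng(0)
    rho, n = 0.5, 200_000
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    print(batch_means_var(x, l=500), bartlett_hac_var(x, b=200))

Both estimates should be close to the true value Σ = 4 for this chain.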

  1. Recursive Estimators

      • Half-width analysis (Flegal and Jones, 2010) has recently become a standard convergence diagnostic in Markov chain Monte Carlo (MCMC). It requires estimating Σ sequentially. However, classical estimators have no efficient updating formula: O(n) computational steps are needed to update an estimate of Σ when a new datum is appended to a dataset of size n. The first recursive estimator was proposed by Wu (2009); it is computationally efficient but sacrifices substantial statistical efficiency.

      • The first contribution of this project is the development of two new classes of estimators that nicely balance the tradeoff between statistical and computational efficiency; see Chan and Yau (2016) and Chan and Yau (2017). The estimators rely on recursive subsampling. Conventionally, subsamples of a common size l are constructed, where l increases with n; whenever l changes, all subsamples must be reconstructed, which is time-consuming. Our idea is to sequentially construct blocks of subsamples such that (i) the sizes of existing subsamples do not change with n and (ii) the size of the shortest subsample within each block increases monotonically across blocks. Properties (i) and (ii) ensure computational and statistical efficiency, respectively, where (ii) is what the existing estimator lacks; a sketch of this construction is given after this item. We proved that the new recursive estimators achieve the optimal L2 convergence rate and are uniformly more efficient than the existing method. Moreover, they are the only estimators that can be operated sequentially with the optimal subsample sizes.
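
        To make the mechanism concrete, below is a Python sketch of a recursive subsampling estimator under a hypothetical block rule: block m contains m subsamples, each of size m, so existing subsamples keep their sizes (property (i)) while subsample sizes grow across blocks (property (ii)). The published construction and weighting in Chan and Yau (2016) are more refined; this sketch only shows how the two properties permit O(1) work per new datum.

            import numpy as np

            class RecursiveSubsampleVar:
                """Sketch of a recursive subsampling estimator of Sigma
                under the hypothetical block rule described above."""

                def __init__(self):
                    self.n = 0              # observations seen so far
                    self.total = 0.0        # running sum of all observations
                    self.block = 1          # current block index m
                    self.done = 0           # subsamples completed in block m
                    self.cur_sum = 0.0      # sum of the subsample being filled
                    self.cur_len = 0
                    self.sums, self.lens = [], []  # completed subsample sums/sizes

                def update(self, x):        # O(1) work per new datum
                    self.n += 1
                    self.total += x
                    self.cur_sum += x
                    self.cur_len += 1
                    if self.cur_len == self.block:        # subsample complete
                        self.sums.append(self.cur_sum)
                        self.lens.append(self.cur_len)
                        self.cur_sum, self.cur_len = 0.0, 0
                        self.done += 1
                        if self.done == self.block:       # block m complete
                            self.block += 1
                            self.done = 0

                def estimate(self):         # combine centred subsample sums
                    xbar = self.total / self.n
                    S, L = np.asarray(self.sums), np.asarray(self.lens)
                    return np.sum((S - L * xbar) ** 2) / np.sum(L)

            # Usage: stream the AR(1) chain x from the previous sketch,
            # one datum at a time; the estimate can be queried at any point.
            est = RecursiveSubsampleVar()
            for xi in x:
                est.update(xi)
            print(est.estimate())           # approaches Sigma = 4 as n grows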

  2. High Order Accurate Estimators

      • The convergence rate of a non-parametric estimator of Σ can be much slower than the root-n rate. As in other non-parametric problems, the convergence rate depends on the strength of dependence of the time series. Although some estimators attain faster rates by exploiting the strength of dependence, no simple estimator can be operated with its asymptotically optimal parameters while remaining applicable across different strengths of dependence. This motivated me to develop a high-order corrected estimator of Σ that specializes to any particular strength of dependence.

      • Our idea is to iteratively correct the biases of lower-order estimators so that the corrected estimator converges at a faster rate after every bias correction; see Chan and Yau (2017). In particular, we proved that it has a uniformly smaller mean-squared error than state-of-the-art competitors. A one-step sketch of the correction idea follows.
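
        As a toy illustration of one correction step, the following Python sketch combines an overlapping batch means (OBM) estimator at two batch sizes. Under standard bias expansions, the expectation of the OBM estimator with batch size l is approximately Σ - c/l for some constant c depending on the autocovariances, so a Richardson-type combination of batch sizes l and 2l cancels the O(1/l) term. Iterating such corrections is the spirit of the approach; the published estimator in Chan and Yau (2017) is constructed differently.

            import numpy as np

            def obm_var(x, l):
                """Overlapping batch means estimate of Sigma with batch
                size l; its leading bias is of order 1 / l."""
                n = len(x)
                csum = np.concatenate(([0.0], np.cumsum(x)))
                means = (csum[l:] - csum[:-l]) / l   # all n - l + 1 batch means
                scale = n * l / ((n - l) * (n - l + 1))
                return scale * np.sum((means - x.mean()) ** 2)

            def one_step_corrected_var(x, l):
                """One bias-correction step: since E[obm_var(x, l)] is
                roughly Sigma - c / l, the combination below cancels the
                O(1/l) bias term, leaving a higher-order error."""
                return 2.0 * obm_var(x, 2 * l) - obm_var(x, l)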