Code

Any source code I'm willing to provide publicly is located here. It's mostly going to be R code.

Multiwayvcov

*** As of version 2.5-0, the sandwich package (which is available on CRAN) has essentially all of the functionality of multiwayvcov (see the vcovCL and vcovBS functions), due mostly to Achim Zeileis and Susanne Berger (I helped a little). I strongly recommend using the sandwich package for new code, as it's better in pretty much every way. I no longer use multiwayvcov for my work, and there are no plans for future releases of multiwayvcov, even if future versions of R break the package (this is unlikely but possible). If you have uses for multiwayvcov and sandwich cannot act as a replacement, I'd like to hear about it, so I can add that functionality to sandwich. ***
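
For readers moving to sandwich, a minimal sketch of the equivalent calls follows. It assumes sandwich (2.5-0 or later) and lmtest are installed and uses the PetersenCL data set shipped with sandwich; the object names are only illustrative.

```r
library(sandwich)
library(lmtest)

# Petersen's simulated panel as shipped with sandwich (columns: firm, year, x, y)
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)

# One-way clustering by firm
vc_firm <- vcovCL(m, cluster = ~ firm)

# Two-way clustering by firm and year
vc_both <- vcovCL(m, cluster = ~ firm + year)

# Cluster bootstrap (default resampling scheme), few replications to keep it fast
set.seed(1)
vc_boot <- vcovBS(m, cluster = ~ firm, R = 100)

# Coefficient table using the two-way clustered covariance matrix
coeftest(m, vcov. = vc_both)
```

The cluster argument takes a one-sided formula, much like the formula interface of cluster.vcov described further down.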

Note: The multiwayvcov package is not compatible with the plm package. If you need multi-way standard error clustering in a fixed effects model I recommend the lfe package. Future versions of sandwich will likely support plm in some form.

The multiwayvcov file below is an R package providing a function that implements Cameron, Gelbach, & Miller (2011) multi-way clustering of variance-covariance matrices.
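
As a reminder of what the estimator computes, the two-way case combines three one-way cluster-robust matrices, and the general case follows an inclusion-exclusion pattern. The notation below is my own shorthand rather than the paper's: \hat{V}_G is the one-way clustered estimate for dimension G, and G \cap H means clustering on the intersection of the two groupings.

```latex
% Two clustering dimensions G and H
\hat{V}_{G,H} = \hat{V}_{G} + \hat{V}_{H} - \hat{V}_{G \cap H}

% D clustering dimensions: sum over all non-empty subsets S of {1, ..., D},
% where \hat{V}_{S} clusters on the intersection of the dimensions in S
\hat{V} = \sum_{\emptyset \neq S \subseteq \{1,\dots,D\}} (-1)^{|S|+1} \, \hat{V}_{S}
```

Because terms are subtracted, the result is not guaranteed to be positive semi-definite in finite samples.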

The package also provides a function implementing pairwise, residual, and wild cluster bootstrapping. See the package's help page for cluster.boot for details.
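
A rough sketch of cluster.boot on the bundled Petersen data is shown here. It assumes, as I understand the help page, that boot_type accepts "xy" for the pairs bootstrap and "wild" for the wild cluster bootstrap; the number of replications R is kept small only to make the example quick.

```r
library(multiwayvcov)
library(lmtest)

data(petersen)
m1 <- lm(y ~ x, data = petersen)

set.seed(42)

# Pairs ("xy") bootstrap: resample whole firms with replacement
boot_xy <- cluster.boot(m1, petersen$firmid, R = 100, boot_type = "xy")

# Wild cluster bootstrap with the default weight distribution
boot_wild <- cluster.boot(m1, petersen$firmid, R = 100, boot_type = "wild")

# Either matrix can be passed to coeftest() like any other vcov
coeftest(m1, boot_wild)
```

In real work you would normally use many more replications than this.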

The package works with any model that the sandwich package's estfun() and bread() functions support (so anything built into R and most major contributed packages), but the statistical validity is up to you. The cluster.boot function does not depend upon estfun(), however.

Expect the package to be officially deprecated once sandwich includes all of the same functionality in the CRAN version. Current versions on R-Forge are already superior to multiwayvcov.

The code was adapted initially from Mahmood Arai's R code for two-way clustering described here. Very little remains of Arai's code, but some computations are similar. The equivalent functions in the sandwich package no longer have any of Arai's code.

Relative to his code, the cluster.vcov function I provide has the following enhancements:

  • Transparent handling of observations dropped due to missingness
  • Compatibility with the subset option in lm() via a formula-based notation thanks to David Hugh-Jones
  • Full multi-way (or n-way, or n-dimensional, or multi-dimensional) clustering, following Cameron, Gelbach, & Miller (2011)
  • Automatic use of White HC0 standard errors when appropriate (see Mitchell Petersen's suggestion or Ma (2014) for a description); this behavior can be switched on or off
  • Basic support for parallelization across model variables using the parallel package provided with R; there are no gains when there is only one right-hand-side variable
  • Configurable degrees-of-freedom correction (SAS clustering results can be reproduced by setting df_correction=FALSE)
  • Optionally force the variance-covariance matrix to be positive definite, following Cameron, Gelbach, & Miller (2011); special thanks to Björn Hagströmer for contributing his implementation
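
Several of the options listed above can be sketched as follows, using the Petersen data bundled with the package. The argument names use_white, df_correction and force_posdef are taken from the package's help page as I recall it, so check ?cluster.vcov before relying on them.

```r
library(multiwayvcov)

data(petersen)
m1 <- lm(y ~ x, data = petersen)

# Defaults: degrees-of-freedom correction on, White HC0 term chosen automatically
vc_default <- cluster.vcov(m1, ~ firmid + year)

# No degrees-of-freedom correction (the setting described above for matching SAS)
vc_nodf <- cluster.vcov(m1, ~ firmid + year, df_correction = FALSE)

# Force the result to be positive semi-definite
vc_posdef <- cluster.vcov(m1, ~ firmid + year, force_posdef = TRUE)

# Never substitute the White HC0 estimate for the intersection term
vc_nowhite <- cluster.vcov(m1, ~ firmid + year, use_white = FALSE)

# Compare the implied standard errors side by side
round(cbind(default = sqrt(diag(vc_default)),
            no_df   = sqrt(diag(vc_nodf)),
            posdef  = sqrt(diag(vc_posdef)),
            nowhite = sqrt(diag(vc_nowhite))), 4)
```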

By default, one-way and two-way clustering should match the results from Stata and Mitchell Petersen's Stata code, respectively.

A copy of Mitchell Petersen's test data set is also included in the package by permission, and can be accessed using data(petersen).
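
A quick way to try the data set is sketched below: fit the simple regression and compare conventional, firm-clustered, and firm-and-year-clustered standard errors. The column names firmid and year are those of the bundled data as I remember them; run str(petersen) to confirm.

```r
library(multiwayvcov)

data(petersen)
str(petersen)  # inspect the available columns

m1 <- lm(y ~ x, data = petersen)

se_ols  <- sqrt(diag(vcov(m1)))                              # conventional OLS
se_firm <- sqrt(diag(cluster.vcov(m1, ~ firmid)))            # one-way, by firm
se_both <- sqrt(diag(cluster.vcov(m1, ~ firmid + year)))     # two-way, firm and year

cbind(ols = se_ols, firm = se_firm, firm_year = se_both)
```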

The package is available on CRAN; install by running install.packages("multiwayvcov") at the R command line (recommended)

OR

by downloading the file below and, in R, running the command install.packages("C:/Downloads/multiwayvcov_1.2.3.tar.gz", repos = NULL) (or whatever your download path is).

Using the package goes something like this:

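library(multiwayvcov)
library(lmtest)

# dat is your data frame; cluster_var1, cluster_var2 and cluster_var3 are columns in it
reg <- lm(y ~ x, data = dat)

# Old way: pass the cluster variables as a matrix
reg$vcov <- cluster.vcov(reg, cbind(dat$cluster_var1, dat$cluster_var2, dat$cluster_var3))

# New way: name the cluster variables in a one-sided formula
reg$vcov <- cluster.vcov(reg, ~ cluster_var1 + cluster_var2 + cluster_var3)

# Coefficient tests using the clustered variance-covariance matrix
coeftest(reg, reg$vcov)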

NEWS (latest version):

multiwayvcov v1.2.3 (2016-05-05):
=================================

* Formula interface now correctly handles NA values

* df_correction now allows corrections for one-way clustering

* stata_fe_model_rank argument added to reproduce some Stata results