Mark (Shuai) Ma - Two-Way Cluster-Robust Standard Errors and SAS code

This page includes

1) SAS code (Download) that you can use to calculate finite-sample estimates of standard errors robust to two-way clustering in OLS regressions.

My code computes two-way clustered standard errors based on the formulas in Petersen (2009), Cameron and Miller (2011), and Thompson (2011). If you use this code, please add a footnote: "To obtain unbiased estimates in finite samples, the clustered standard errors are adjusted by (N-1)/(N-P) × G/(G-1), where N is the sample size, P is the number of independent variables, and G is the number of clusters." For details, please see my note.
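As a rough guide to what the footnote describes, the two-way estimator in the cited papers combines three one-way cluster-robust variance estimates. The symbols \hat{V} and c_G below are introduced here for illustration only, and applying the correction to each component with that component's own cluster count is an assumption; the research note gives the exact form used by the code.

```latex
\hat{V}_{\text{two-way}}
  = \hat{V}_{\text{firm}} + \hat{V}_{\text{time}} - \hat{V}_{\text{firm}\cap\text{time}},
\qquad
c_G = \frac{N-1}{N-P} \times \frac{G}{G-1}

% Worked example with N = 1000 and P = 5:
%   G = 50:  c_G = (999/995) x (50/49) \approx 1.0245
%   G = 10:  c_G = (999/995) x (10/9)  \approx 1.1156
```

The correction inflates the variance (not the standard error) by the factor c_G, so it matters most when one of the clustering dimensions has few clusters.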

To obtain the test results, first run the macro code, and then run the command "%REG2DSE(y=DV, x=INDV, firm=firmid, time=timeid, multi=0, dataset=A.data, output=A.results);". See the code for details.
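The two steps might look like the sketch below in a SAS session. The file and folder paths are hypothetical placeholders, and the libref A is assumed to point at the folder that holds your data; only the %REG2DSE call itself is taken from this page.

```sas
/* Step 1: run the macro definition.
   The path and file name are hypothetical; use wherever you saved the download. */
%include "C:\myproject\REG2DSE.sas";

/* Assumed setup: libref A points at the folder holding the data set "data". */
libname A "C:\myproject\data";

/* Step 2: call the macro, exactly as in the example command on this page. */
%REG2DSE(y=DV, x=INDV, firm=firmid, time=timeid, multi=0,
         dataset=A.data, output=A.results);

/* Inspect the output data set named in output= */
proc print data=A.results;
run;
```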

2) A research note (Download) on finite sample estimates of two-way cluster-robust standard errors.

The note explains the estimates you can get from SAS and STATA. Petersen (2009) and Thompson (2011) provide formulas for asymptotic estimates of two-way cluster-robust standard errors. However, to obtain unbiased estimates, two-way clustered standard errors need to be adjusted in finite samples (Cameron and Miller 2011). Finite-sample estimates of two-way cluster-robust standard errors can result in very different significance levels from the unadjusted asymptotic estimates. Yet researchers rarely explain which estimate they use, even though they all call their standard errors “two-way clustered standard errors”. My note explains the finite-sample adjustments provided in SAS and STATA and discusses several common mistakes a user can easily make.
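For readers who want to see how such an estimate can be assembled from standard SAS output, here is a minimal sketch that combines three one-way runs of "proc surveyreg" (clustered by firm, by time, and by the firm-time intersection). It is not the REG2DSE macro and may differ from it in detail. The variable and data set names follow the example call on this page, and the ODS column names (Parameter, Estimate, StdErr) are assumed; check them with "proc contents" on your installation.

```sas
/* Sketch only: an illustration, not the REG2DSE macro. */

/* 1) Standard errors clustered by firm */
ods output ParameterEstimates=pe_firm;
proc surveyreg data=A.data;
   cluster firmid;
   model DV = INDV;
run;

/* 2) Standard errors clustered by time */
ods output ParameterEstimates=pe_time;
proc surveyreg data=A.data;
   cluster timeid;
   model DV = INDV;
run;

/* 3) Standard errors clustered by the firm-time intersection
      (two CLUSTER variables define clusters by their combinations) */
ods output ParameterEstimates=pe_both;
proc surveyreg data=A.data;
   cluster firmid timeid;
   model DV = INDV;
run;

/* 4) Combine: V(two-way) = V(firm) + V(time) - V(firm and time).
      The three tables list the parameters in the same order,
      so a one-to-one merge without a BY statement lines them up. */
data twoway;
   merge pe_firm(keep=Parameter Estimate StdErr rename=(StdErr=se_firm))
         pe_time(keep=StdErr rename=(StdErr=se_time))
         pe_both(keep=StdErr rename=(StdErr=se_both));
   var_2way = se_firm**2 + se_time**2 - se_both**2;
   /* The combined variance can be negative in small samples;
      leave the standard error missing in that case. */
   if var_2way > 0 then do;
      se_2way = sqrt(var_2way);
      t_2way  = Estimate / se_2way;
   end;
run;

proc print data=twoway;
run;
```

The sketch stops at t-statistics on purpose: the p-values depend on a degrees-of-freedom choice that is not discussed on this page.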

and

3) Answers to a few questions I have received about Cluster-Robust Standard Errors.

Q i) How do I obtain the R-square using your SAS code?

A: I have updated my code. The current July 2014 version automatically reports the R-square in the output.

Q ii) How do I test a difference in coefficients or, in other words, conduct joint tests of coefficients?

A: A Wald F-test is provided by the CONTRAST statement in "proc surveyreg". Add the following example code after your MODEL statement: Contrast "Joint test" accural 1 cashflow -1 / e; This tests the hypothesis that accural*1 + cashflow*(-1) = 0, that is, that the two coefficients are equal. To my understanding, however, this only works for one-way cluster-robust SEs. I am not aware of any existing code that performs this test for two-way cluster-robust SEs. More work needs to be done!
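In context, the statement could sit in a one-way clustered regression like the sketch below. The dependent variable ret is a hypothetical name, the data set and cluster variable are borrowed from the macro example above, and accural and cashflow are the regressors from the example statement.

```sas
/* Sketch only: one-way clustering by firm with a Wald test of equal coefficients.
   "ret" is a hypothetical dependent variable. */
proc surveyreg data=A.data;
   cluster firmid;                  /* one-way clustering by firm */
   model ret = accural cashflow;
   /* H0: 1*accural + (-1)*cashflow = 0; the E option prints the contrast coefficients */
   contrast "Joint test" accural 1 cashflow -1 / e;
run;
```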

Q iii) Do I still need industry and year fixed effects when I already use two-way clustered standard errors?

A: Yes. Fixed effects remove the industry and year means from the regression residuals, but they do not remove the remaining covariances between residuals. Those covariances are what clustered standard errors account for.
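As an illustration of using both together, the sketch below adds industry and year fixed effects through a CLASS statement while clustering by firm in "proc surveyreg". It shows one-way clustering only; this page does not say whether the REG2DSE macro accepts class variables, so for the macro you may need to create the dummy variables yourself. The names industry and fyear are hypothetical.

```sas
/* Sketch only: fixed effects plus (one-way) clustered standard errors.
   "industry" and "fyear" are hypothetical variable names. */
proc surveyreg data=A.data;
   class industry fyear;                        /* fixed effects as class variables */
   cluster firmid;                              /* cluster-robust SEs by firm */
   model DV = INDV industry fyear / solution;   /* SOLUTION prints the estimates */
run;
```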

Q iv) Should I cluster by month, quarter or year (firm or industry or country)?

A: You should cluster at the most aggregated level at which the residuals could be correlated. This depends on the specific question you are looking at. For example, if you cluster the standard errors by month, you implicitly assume that residuals from the same month are correlated and that residuals from different months are not. If you think residuals from different months might be correlated, you need to cluster by a longer period, such as quarter, year or even two years. Similar logic applies to the decision of whether to cluster by firm, industry or country.
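Switching the time dimension is mostly a matter of building the cluster identifier. The sketch below assumes a hypothetical SAS date variable named date in A.data and writes to the WORK library; the macro call is the one from this page with only time=, dataset= and output= changed.

```sas
/* Sketch only: "date" is a hypothetical SAS date variable. */
data work.data_q;
   set A.data;
   month_id   = year(date)*100 + month(date);   /* e.g. 201403 for March 2014 */
   quarter_id = year(date)*10  + qtr(date);     /* e.g. 20141 for 2014 Q1 */
   year_id    = year(date);                     /* e.g. 2014 */
run;

/* Cluster by firm and by quarter instead of by month */
%REG2DSE(y=DV, x=INDV, firm=firmid, time=quarter_id, multi=0,
         dataset=work.data_q, output=work.results_q);
```

Comparing the results across month_id, quarter_id and year_id is a quick way to see how sensitive the standard errors are to the assumed correlation window.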

Q v) Why are the standard errors and t-statistics reported as "." (missing) for a few variables?

A: The likely reason is that the standard error is too small. SAS only keeps a certain number of digits (e.g., 8 digits) after the decimal point, so if the standard error is too small, the SAS output files cannot represent it. In that case, rescale the regression so that the standard error becomes larger, for example by multiplying the dependent variable by 1,000 (dividing the affected independent variable by 1,000 has the same effect on its coefficient), and rerun the analyses. The coefficient and its standard error are then both 1,000 times larger, so the standard error no longer vanishes and the t-value is unchanged. To report the coefficients on the original scale, divide the new coefficients by 1,000.
