The list of Stata commands below is continuously growing and improving thanks to very useful comments by users. I always welcome and appreciate comments, suggestions, and especially error reports concerning the program and help files.
Stata Commands: Table of Contents
Slides and FAQs about Clustering
Download the Slides on "Nonrobustness of the conventional cluster-robust inference: with three robust alternatives" presented at the 2023 Stata Symposium (November 9, 2023).
Download the Slides on "Clustering" used for the lecture at the Institute of Economics, Academia Sinica (November 30)
Some frequently asked questions regarding clustering
Q.1. Why can the cluster-robust (CR) standard errors (e.g., cluster() & vce(cluster), etc.) be non-robust?
A. Because of large clusters, e.g., California (about 10% of the U.S.). In particular, they are not robust with the 51 U.S. states. See the slides.
Q.2. How can I check if the conventional CR standard errors are robust given my data set?
A. option 1: log-log plot; option 2: Hill plot; option 3: ssc install testout. See the slides.
Q.3. If the conventional CR standard errors (e.g., cluster() & vce(cluster), etc.) are not robust with my data, then what can I do?
A. option 1: assume within-cluster CLT; option 2: reweighting; option 3: score subsampling. Also, see the slides.
Q.4. If I have included two-way fixed effects, then I do not need to compute two-way cluster standard errors, right?
A. You do still need to compute two-way cluster standard errors even if you have included two-way fixed effects. Also see the slides.
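As a rough illustration of the Hill plot mentioned in Q.2, the Python sketch below computes Hill estimates from a list of cluster sizes. It assumes the plot is applied to cluster sizes; it is not the testout implementation, and the sizes in `cluster_sizes` are made up. A Hill plot graphs these estimates against k, the number of largest clusters used.

```python
import math

def hill_estimates(sizes):
    """Return Hill estimates H_k for k = 1, ..., n-1 from positive cluster sizes."""
    x = sorted(sizes, reverse=True)           # largest cluster first
    logs = [math.log(v) for v in x]
    estimates = []
    running = 0.0
    for k in range(1, len(x)):
        running += logs[k - 1]                # sum of the k largest log sizes
        estimates.append(running / k - logs[k])  # mean log excess over the (k+1)-th largest
    return estimates

# Hypothetical cluster sizes (number of observations per cluster).
cluster_sizes = [16, 8, 4, 2, 1]
for k, h in enumerate(hill_estimates(cluster_sizes), start=1):
    print(k, round(h, 3))
```

Larger Hill estimates indicate a heavier tail in the cluster-size distribution. How to read the plot for a given data set is covered in the slides.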
Stata Commands for Average Treatment Effects
Stata Command for Average Treatment Effects
Download the manuscript and package forthcoming in The Stata Journal
robustate.ado : Under limited overlap, i.e., when the common support condition is only weakly satisfied, the naive inverse propensity score estimation method suffers from large variances (if not a lack of consistency or asymptotic normality). This command executes estimation and inference for the average treatment effect (ATE) robustly against limited overlap.
Installation: ssc install robustate
Reference: Sasaki, Y & T. Ura (2022) Estimation and Inference for Moments of Ratios with Robustness against Large Trimming Bias. Econometric Theory, 38 (1), pp. 66-112. Paper.
Frequently Asked Questions:
Q1. How does the "robustate" command compare with the existing IPW estimator such as the "teffects ipw" command?
A. "teffects ipw" tends to produce larger standard errors than "robustate". If the overlap is severely limited (i.e., if the tail index of the inverse propensity score is above 0.5), then the standard error for "teffects ipw" is not guaranteed to exist while that of "robustate" still exists.
Q2. How does the "robustate" command compare with the IPW estimation with trimming/truncating small propensity scores?
A. Trimmed and truncated estimators are biased for the average treatment effects (ATE), while the "robustate" estimator is de-biased and its standard error accounts for the effects of the de-biasing.
Q3. How does the "robustate" command compare with the matching estimators such as "teffects psmatch" and "teffects nnmatch" commands?
A. The matching estimators tend to be biased for the average treatment effects (ATE) when the overlap is limited, while the "robustate" estimator is de-biased and its standard error accounts for the effects of the de-biasing.
Q4. How does the "robustate" command compare with the overlap weighting approaches?
A. The "robustate" command estimates the average treatment effect (ATE), while the overlap weighting approaches estimate only weighted averages of treatment effects and hence in general fail to estimate the ATE.
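To make the comparison in Q1 and Q2 concrete, here is a minimal Python sketch of the naive IPW estimator of the ATE with optional trimming. It is not the robustate method (there is no de-biasing step), the function name `ipw_ate` is hypothetical, and the data are made up. It only shows how a single small propensity score can dominate the naive estimate and how trimming changes the answer.

```python
def ipw_ate(y, d, p, trim=0.0):
    """Naive IPW estimate of the ATE.

    y: outcomes, d: treatment indicators (0/1), p: propensity scores in (0, 1).
    Observations with p outside [trim, 1 - trim] are dropped (trimming).
    """
    terms = []
    for yi, di, pi in zip(y, d, p):
        if pi < trim or pi > 1.0 - trim:
            continue                          # trimmed observation
        terms.append(di * yi / pi - (1 - di) * yi / (1.0 - pi))
    return sum(terms) / len(terms)

# Hypothetical data: the third unit is treated despite a propensity score of 0.01.
y = [1.0, 2.0, 3.0, 4.0]
d = [1, 0, 1, 0]
p = [0.5, 0.5, 0.01, 0.5]

print(ipw_ate(y, d, p))             # dominated by the weight 1 / 0.01
print(ipw_ate(y, d, p, trim=0.05))  # trimming drops that unit and changes the estimate
```

The trimmed estimate averages over a different set of units, which is the source of the trimming bias described in Q2.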
Stata Commands for Cluster Robust Standard Errors
Stata Command for Cluster Robust Standard Errors
crhdreg.ado : This command executes double/debiased machine learning estimation of regression models and IV regression models under clustering. The sampling environments accommodated by this command include i.i.d. sampling, one-way cluster sampling, and two-way cluster sampling.
Installation: ssc install crhdreg
Reference: Chiang, H.D., K. Kato, Y. Ma & Y. Sasaki (2021) Multiway Cluster Robust Double/Debiased Machine Learning. Journal of Business & Economic Statistics, 40 (3), pp. 1046-1056. Paper.
Stata Command for Cluster Robust Standard Errors
xtregtwo.ado : This command executes estimation of linear panel regression models with standard errors robust to two-way clustering and untruncated serial correlation in common time effects.
Installation: ssc install xtregtwo
Reference: Chiang, H.D., B.E. Hansen, and Y. Sasaki (2022) Standard Errors for Two-Way Clustering with Serially Correlated Time Effects. arXiv:2201.11304. Paper.
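For readers new to two-way clustering, the Python sketch below computes the textbook two-way cluster-robust variance of a sample mean in a balanced panel: the sum of the unit-clustered and time-clustered terms minus the unclustered term. This is a simplified illustration without finite-sample corrections. It is not the xtregtwo estimator, which additionally accounts for serial correlation in common time effects. The function name and data are hypothetical.

```python
def twoway_cluster_var_of_mean(y):
    """Two-way cluster-robust variance of the sample mean.

    y[i][t] is a balanced panel given as a list of rows (units) by columns (periods).
    """
    n_units, n_periods = len(y), len(y[0])
    n_obs = n_units * n_periods
    mean = sum(sum(row) for row in y) / n_obs
    s = [[v - mean for v in row] for row in y]   # demeaned observations (scores)
    # Squared within-unit sums: allows arbitrary dependence over time within a unit.
    by_unit = sum(sum(row) ** 2 for row in s)
    # Squared within-period sums: allows arbitrary dependence across units in a period.
    by_time = sum(sum(s[i][t] for i in range(n_units)) ** 2 for t in range(n_periods))
    # Own squared terms are counted in both sums above, so subtract them once.
    by_cell = sum(v ** 2 for row in s for v in row)
    return (by_unit + by_time - by_cell) / n_obs ** 2

# Hypothetical 2 x 2 panel.
panel = [[1.0, 2.0],
         [3.0, 4.0]]
print(twoway_cluster_var_of_mean(panel))
```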
Stata Commands for Dynamic Panel
Stata Command for Dynamic Panel
cdecompose.ado : executes estimation of canonical permanent-transitory state space models: Yit=Uit+Vit, where Uit is a permanent component that follows Uit=Uit-1+Wit, and Vit is a transitory component that follows Vit = ρ1Vit-1 + ... + ρpVit-p + G(εit,...,εit-q). Use it to estimate the variance, skewness and kurtosis of the distributions of Uit and Vit.
Installation: ssc install cdecompose
Reference: Hu, Y., R. Moffitt, and Y. Sasaki (2019) Semiparametric Estimation of the Canonical Permanent‐Transitory Model of Earnings Dynamics. Quantitative Economics, 10 (4), pp. 1495-1536. Paper.
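The model structure can be seen in a small simulation. The Python sketch below generates one unit's path under a special case chosen purely for illustration: p = 1, q = 0, G(ε) = ε, and Gaussian shocks. None of these choices come from the command itself, and the function name and parameter values are hypothetical.

```python
import random

def simulate_unit(n_periods, rho, sd_w, sd_eps, rng):
    """Simulate Y_t = U_t + V_t with U_t = U_{t-1} + W_t and V_t = rho * V_{t-1} + eps_t."""
    u, v = 0.0, 0.0                            # initial conditions set to zero (assumption)
    path = []
    for _ in range(n_periods):
        u = u + rng.gauss(0.0, sd_w)           # permanent component: random walk
        v = rho * v + rng.gauss(0.0, sd_eps)   # transitory component: AR(1)
        path.append({"U": u, "V": v, "Y": u + v})
    return path

rng = random.Random(0)                         # fixed seed for reproducibility
for row in simulate_unit(5, rho=0.5, sd_w=1.0, sd_eps=1.0, rng=rng):
    print(round(row["Y"], 3), round(row["U"], 3), round(row["V"], 3))
```

In real data only Y is observed. U and V are printed here only because the data are simulated.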
Stata Command for Dynamic Panel
npss.ado : executes nonparametric estimation of heteroskedastic state space models: Yit=Uit+Vit and Uit=Uit-1+Wit, where Yit is observed (e.g., log earnings), Uit is unobserved (e.g., permanent component) and Vit is unobserved (e.g., transitory component). Use it to draw densities of Uit and Vit, and the conditional skedastic function of Uit+1 given Uit.
Installation: ssc install npss
Reference: Botosaru, I. and Y. Sasaki (2018) Nonparametric Heteroskedasticity in Persistent Panel Processes: An Application to Earnings Dynamics. Journal of Econometrics, 203 (2), pp. 283-296. Paper.
Stata Command for Dynamic Panel
Download the manuscript and package forthcoming in The Stata Journal
xtusreg.ado : Estimation of fixed-effect dynamic panel data models under unequally spaced time periods in panel data. Use it when your data are collected from surveys conducted with unequal time intervals (or when some variables of your interest are available only with unequal time intervals).
Installation: ssc install xtusreg
Reference: Sasaki, Y. and Y. Xin (2017) Unequal Spacing in Dynamic Panel Data: Identification and Estimation. Journal of Econometrics, 196 (2), pp. 320-330. Paper.
Stata Commands for Instrumental Variables
Stata Command for Instrumental Variables
testex.ado : A statistical test of the exclusion restriction of an instrumental variable (IV). Use it when you consider running an IV regression and want to test the exclusion restriction of the IV to ensure that it is a valid IV.
Installation: ssc install testex
Reference: D'Haultfoeuille, X., S. Hoderlein, & Y. Sasaki. (2021) Testing and Relaxing the Exclusion Restriction in the Control Function Approach. Journal of Econometrics, forthcoming. Paper
Stata Commands for Measurement Error
Stata Command for Measurement Error
dkdensity.ado : executes deconvolution kernel density estimation and a construction of its uniform confidence band. Use it when you have repeated measurements, X1i and X2i, of an unobserved latent variable Xi, and you want to draw the density function fX of Xi together with its uniform confidence band.
Installation: ssc install dkdensity
Reference: Kato, K. and Y. Sasaki (2018) Uniform Confidence Bands in Deconvolution with Unknown Error Distribution. Journal of Econometrics, 207 (1), pp. 129-161. Paper.
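A simple way to see why two measurements are informative about the latent variable: if X1 = X + e1 and X2 = X + e2, with errors uncorrelated with X and with each other (an assumption made for this sketch), then Cov(X1, X2) = Var(X). The Python sketch below checks this second-moment fact on made-up data constructed so that the errors are exactly orthogonal in the sample. It does not perform deconvolution density estimation, which is what the command does.

```python
def sample_cov(a, b):
    """Sample covariance with the n - 1 denominator."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (z - mean_b) for x, z in zip(a, b)) / (n - 1)

# Hypothetical latent values and measurement errors (unobserved in practice).
latent = [1.0, 2.0, 3.0, 4.0]
e1 = [0.5, -0.5, -0.5, 0.5]
e2 = [0.1, -0.3, 0.3, -0.1]

# The two observed repeated measurements.
x1 = [x + e for x, e in zip(latent, e1)]
x2 = [x + e for x, e in zip(latent, e2)]

print(sample_cov(x1, x2))          # matches the variance of the latent variable
print(sample_cov(latent, latent))  # variance of the latent variable
print(sample_cov(x1, x1))          # variance of one measurement is inflated by the error
```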
Stata Command for Measurement Error
kotlarski.ado : Executes deconvolution kernel density estimation and a robust construction of its uniform confidence band. Use it when you have repeated measurements, X1i and X2i, of an unobserved latent variable Xi, and you want to draw the density function fX of Xi together with its uniform confidence band without relying on the assumption of completeness or symmetric error distribution.
Installation: ssc install kotlarski
Reference: Kato, K., Y. Sasaki, and T. Ura (2021) Robust Inference in Deconvolution. Quantitative Economics, 12 (1), pp. 109-142. Paper
Stata Command for Measurement Error
npeivreg.ado : executes estimation of nonparametric errors-in-variables (EIV) regression and construction of its uniform confidence band. Use it when you have repeated measurements, X1i and X2i, of an unobserved independent variable Xi, and you want to draw the nonparametric regression function of Yi on Xi, together with its uniform confidence band.
Installation: ssc install npeivreg
Reference: Kato, K. and Y. Sasaki (2019) Uniform Confidence Bands for Nonparametric Errors-in-Variables Regression. Journal of Econometrics, 213 (2), pp. 516-555. Paper.
Stata Command for Measurement Error
reporterror.ado : Estimation of the probability masses of an unobserved discrete random variable using two measurements with possibly nonclassical and nonseparable measurement errors. Use it when there are two measurements (e.g., self-report and sibling-report of years of education) that are discrete.
Installation: ssc install reporterror
Reference: Hu, Y. and Y. Sasaki (2017) Identification of Paired Nonseparable Measurement Error Models. Econometric Theory, 33 (4), pp. 955-979. Paper.
Stata Commands for Outliers
Stata Command for Outliers
Download the manuscript and package in preparation for The Stata Journal (not submitted yet)
testout.ado : Diagnostic testing of outliers by statistical tests of the bounded first- and second-moment conditions for consistency and root-n asymptotic normality, respectively. A rejection of the test for consistency implies that the point estimates reported by regress or ivregress are not credible. A rejection of the test for root-n asymptotic normality implies that the standard errors reported by regress or ivregress are not credible.
Installation: ssc install testout
Reference: Sasaki, Y. and Y. Wang (2021) Diagnostic Testing of Finite Moment Conditions for the Consistency and Root-N Asymptotic Normality of the GMM and M Estimators. Journal of Business & Economic Statistics, forthcoming. Paper
Stata Commands for Production Function
Stata Command for Production Function
Download the manuscript and package in preparation for The Stata Journal (submitted)
robustpf.ado : Estimation of production functions robustly against errors in proxy variables. Use it when you suspect possible errors in proxy variables. Errors in proxies pave the way for solving the identification failure of traditional estimators pointed out by ACF. Hence, this robust estimation command overcomes the identification problem.
Installation: ssc install robustpf
Reference: Hu, Y., G. Huang, and Y. Sasaki (2020) Estimating Production Functions with Robustness Against Errors in the Proxy Variables. Journal of Econometrics, 215 (2), pp. 375-398. Paper.
Stata Commands for Quantiles & Percentiles
Stata Command for Quantiles & Percentiles
ecic.ado : Estimation and inference for quantile treatment effects (QTE) at extreme quantiles via changes in changes (CIC). Use it when you want to compute QTEs at extremal quantiles, such as the 0.1-th percentile or the 99.9-th percentile, which cannot be estimated well by the existing CIC estimator.
Installation: ssc install ecic
Reference: Sasaki, Y & Y. Wang (Forthcoming) Extreme Changes in Changes. Journal of Business & Economic Statistics. Paper.
Stata Command for Quantiles & Percentiles
exquantile.ado : Estimation and inference for (conditional) extremal quantiles. Use it when you want to compute (conditional) extremal quantiles, such as the (conditional) 0.1-th percentile or the (conditional) 99.9-th percentile, which cannot be estimated well by the standard quantile methods.
Installation: ssc install exquantile
Reference: Sasaki, Y & Y. Wang (2021) Fixed-k Inference for Conditional Extremal Quantiles. Journal of Business & Economic Statistics, 40 (2): 829-837. Paper.
Stata Command for Quantiles & Percentiles
itvalpctile.ado : Estimation of interval-valued percentiles (quantiles) for interval-valued data. Use it when you have interval-valued data as is often the case with survey and questionnaire responses and want to estimate its interval-valued percentiles (quantiles) together with their confidence sets.
Installation: ssc install itvalpctile
Reference: Beresteanu, A. and Y. Sasaki (2021) Quantile Regression with Interval Data. Econometric Reviews (Special Issue Honoring Cheng Hsiao), 40 (6): 562-583. Paper
Stata Commands for RDD & RKD
Stata Command for Quantile RDD
rdqte.ado : Estimation and robust inference for quantile treatment effects (QTE) in the regression discontinuity designs (RDD). Use it when you consider a sharp or fuzzy regression discontinuity design and you are interested in analyzing heterogeneous treatment effects of a binary treatment. The method is robust against large bandwidths and arbitrary functional forms.
Installation: ssc install rdqte
Reference: Chiang, H.D., Y.-C. Hsu, and Y. Sasaki (2019) Robust Uniform Inference for Quantile Treatment Effects in Regression Discontinuity Designs. Journal of Econometrics, 211 (2), pp. 589-618. Paper.
Stata Command for Quantile RKD
qrkd.ado : Estimation and robust inference for heterogeneous causal effects in the quantile regression kink designs (Quantile RKD). Use it when you consider a regression kink design and you are interested in analyzing heterogeneous causal effects of a continuous treatment. The method is robust against large bandwidths and arbitrary functional forms.
Stata Command for Quantile RKD
rkqte.ado : Estimation and robust inference for quantile treatment effects (QTE) in the regression kink designs (RKD). Use it when you consider a regression kink design and you are interested in analyzing heterogeneous treatment effects of a binary treatment. The method is robust against large bandwidths and arbitrary functional forms.
Stata Command for RDD
rdboot.ado : Estimation and inference for treatment effects in the sharp/fuzzy regression discontinuity designs (RDD) based on multiplier bootstrap and bias correction. Use it for estimation and inference under the sharp/fuzzy RDD, perhaps as a robustness check in addition to other alternative methods.