ivolsdec: Stata command for decomposing the IV-OLS gap

Overview

ivolsdec is a Stata command that decomposes the gap between OLS and IV estimates. It accounts for potential nonlinearity and heterogeneity in the causal relationship between treatment and outcome variables, based on the methodology developed in Ishimaru (2024).

The command provides insights into why IV and OLS estimates differ by decomposing their gap into three components:

Covariate weight difference: The difference in how the IV and OLS coefficients weight the covariates.
Treatment-level weight difference: The difference in how they place weight on treatment levels.
Marginal effect difference: The difference between the marginal effects identified by IV and OLS, which typically originates from endogeneity bias.

Reference: Ishimaru, S. (2024). Empirical Decomposition of the IV-OLS Gap with Heterogeneous and Nonlinear Effects. The Review of Economics and Statistics, 106(2), 505–520.

How to Use

Open Stata and install the package:

ssc install ivolsdec

Basic syntax:

ivolsdec y (x = z1 z2) w1 w2 w3, xnbasis(tx1 tx2)

Inputs:

y: outcome variable
x: treatment variable
z1, z2: instruments (one or many)
w1, w2, w3: covariates (one or many)
tx1, tx2: nonlinear transformations of x (one or many)
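
As a minimal sketch of how these inputs fit together, the lines below assume a dataset already in memory that contains y, x, z1, z2, and w1 to w3 under exactly those names. Using the square and cube of x for tx1 and tx2 is an illustrative assumption, not a recommendation from the paper.

```stata
* Illustrative only: build tx1 and tx2 as polynomial terms of the treatment x
generate tx1 = x^2
generate tx2 = x^3

* Decompose the IV-OLS gap using the basic syntax shown above
ivolsdec y (x = z1 z2) w1 w2 w3, xnbasis(tx1 tx2)
```

Any other nonlinear transformations of x could take the place of tx1 and tx2, as long as they are created before the command is called.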

To access the help file with detailed instructions and examples:

help ivolsdec

FAQ

How should I choose the basis function options?

The command allows you to specify three types of basis functions:

xnbasis: Captures nonlinear effects of the treatment variable.
- For continuous treatment: Consider using polynomials (e.g., price^2, price^3).
- For discrete treatment: Consider using polynomials or categorical indicators (e.g., i.educ).
wbasis: Specifies covariates that interact with the treatment and its nonlinear transformations.
xibasis: Defines treatment variable transformations that interact with covariates.

The choice of basis functions should reflect your understanding of potential nonlinearity and heterogeneity in your specific context. I recommend checking robustness across different specifications, since the command can capture nonlinearity and heterogeneity only to the extent specified by the basis functions.
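
One way to carry out such a robustness check is to rerun the command with a richer treatment basis and compare the three components across runs. The sketch below assumes the same variable names as the basic syntax; x2 and x3 are hypothetical variables created only for this illustration.

```stata
* Specification A: quadratic term only
generate x2 = x^2
ivolsdec y (x = z1 z2) w1 w2 w3, xnbasis(x2)

* Specification B: add a cubic term and compare the decomposition with A
generate x3 = x^3
ivolsdec y (x = z1 z2) w1 w2 w3, xnbasis(x2 x3)
```

If the decomposition changes substantially between the two runs, the coarser basis is probably missing nonlinearity that the richer one captures.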

Can I have multiple treatment (endogenous) variables?

No, the method is designed for a single treatment variable.

Does the command work with a binary treatment variable?

Yes. For binary treatments, use the binary option instead of specifying xnbasis(). Note that the treatment-level weight difference will be zero by construction.
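
For example, with a hypothetical 0/1 treatment variable d and the same instruments and covariates as in the basic syntax, the call might look like the sketch below. Check help ivolsdec for the exact usage of the option.

```stata
* d is a hypothetical binary (0/1) treatment; binary is used in place of xnbasis()
ivolsdec y (d = z1 z2) w1 w2 w3, binary
```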

Can I include fixed effects?

Yes. You can include fixed effects in either of two ways:

1. Factor-variable notation (e.g., i.city_id), or
2. Direct inclusion of dummy variables, since fixed effects regression is mechanically equivalent to dummy variable regression.

My regression has too many fixed effects!

If you're working with many fixed effects, you might encounter memory issues. Here are some strategies:

First, try increasing Stata's matrix size:

set matsize 11000

Second, consider the interaction structure. By default, the command accounts for the full interaction between the treatment variable and all covariates (including fixed effects). This might be:
- Computationally demanding due to large matrix size.
- Statistically challenging if you don't have enough observations per fixed effect level.

Here's an example of managing city fixed effects:

ivolsdec y (x=z1 z2) i.city_id w1 w2, xnbasis(tx1 tx2) // Default: Full interaction with city dummies (might be too demanding)

ivolsdec y (x=z1 z2) i.city_id w1 w2, xnbasis(tx1 tx2) wbasis(w1 w2) // Option 1: Disable interaction with city dummies

ivolsdec y (x=z1 z2) i.city_id w1 w2, xnbasis(tx1 tx2) wbasis(w1 w2 i.region_id) // Option 2: Use region-level interaction instead

Additional Resources

Paper: A link to the published version of the paper, which provides the theoretical foundation and empirical applications.
Preprint: A link to the ungated preprint version on arXiv.
Package: A link to the package information. You can get .ado and .sthlp files for offline installation.
