combinedata-code
Data and computer programs for “Optimally combining censored and uncensored datasets”
The Stata dataset is called “rev_data.dta” (download zipped dataset). It contains the following variables:
1. birthpl = State of Birth
2. yearat14 = Year in which the person turned 14
3. req_sch = Minimum Schooling Required Before Dropping out
4. ca = Required Schooling (accounting for age requirements)
5. enrolage = Age by which child must start school
6. drop_age = Earliest age at which a child can drop out of school
7. cl = Minimum Schooling required for a Work Permit
8. year = Census Year
9. perwt = Person Weight
10. age = Age at Census (in years)
11. agemarr = Age at first Marriage (in years)
12. educh = Years of Education
13. censored = Indicator variable = 1 if age at first marriage is censored
14. female = Indicator variable = 1 if female
15. white = Indicator variable = 1 if white
16. r_age = Age at Census (in quarters)
17. r_agemar = Age at First Marriage (in quarters)
See the data appendix in the paper for details on how “r_age” and “r_agemar” were created.
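For readers without Stata, the variable list above can be checked against the file with a short script. The following is a minimal sketch in Python, not part of the distributed programs. It assumes the unzipped file is available as “rev_data.dta” in the working directory and that pandas is installed. The names EXPECTED_COLUMNS, missing_columns, and load_rev_data are hypothetical.

```python
# Variable names as given in the list above.
EXPECTED_COLUMNS = [
    "birthpl", "yearat14", "req_sch", "ca", "enrolage", "drop_age", "cl",
    "year", "perwt", "age", "agemarr", "educh", "censored", "female",
    "white", "r_age", "r_agemar",
]


def missing_columns(columns):
    """Return the expected variable names that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def load_rev_data(path="rev_data.dta"):
    """Read the Stata file and fail loudly if a listed variable is absent.

    Assumption: pandas is installed. It is imported inside the function so
    that the column check above can be used without it.
    """
    import pandas as pd

    df = pd.read_stata(path)
    absent = missing_columns(df.columns)
    if absent:
        raise ValueError(f"variables not found in {path}: {absent}")
    return df
```

Calling load_rev_data() returns a DataFrame with one column per variable. The column check is kept separate so that it can also be applied to a file converted to another format.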
Stat/Transfer was used to convert “rev_data.dta” to a Gauss dataset named “rev_data.dat”. The subsamples for the application in the paper were created from “rev_data.dat” by the following programs (download zipped programs).
Dataloop (Gauss) code to create subsamples
Female subsample
r-women-60-80.prg (enriched dataset where the master sample is from 1960 and the refreshment sample is from 1980)
r-women-70-80.prg (enriched dataset where the master sample is from 1970 and the refreshment sample is from 1980)
r-women-80-only.prg (refreshment sample)
r-women-60-80-s55.prg (enriched dataset for robustness check where age at first marriage for unmarried individuals in the refreshment sample is imputed to be 55)
r-women-60-80-s65.prg (enriched dataset for robustness check where age at first marriage for unmarried individuals in the refreshment sample is imputed to be 65)
r-women-70-80-s55.prg
r-women-70-80-s65.prg
Male subsample
r-men-60-80.prg
r-men-70-80.prg
r-men-80-only.prg
r-men-60-80-s55.prg
r-men-60-80-s65.prg
r-men-70-80-s55.prg
r-men-70-80-s65.prg
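The s55 and s65 datasets differ from the main enriched datasets only in that age at first marriage for unmarried individuals in the refreshment sample is set to a fixed value. The following is a minimal sketch of that imputation step in Python, not a translation of the Gauss dataloop programs. It assumes each record is a dict, that the refreshment sample is identified by year == 1980, and that unmarried individuals carry a flag named “unmarried”. That flag and the function name impute_agemarr are hypothetical; the variable list above does not say how unmarried individuals are coded.

```python
def impute_agemarr(records, imputed_age, refresh_year=1980):
    """Return copies of `records` with agemarr imputed for unmarried
    individuals in the refreshment sample.

    Assumptions (not taken from the Gauss programs): rec["unmarried"] is a
    hypothetical 0/1 flag, and rec["year"] == refresh_year marks the
    refreshment sample.
    """
    out = []
    for rec in records:
        new = dict(rec)  # copy so the input list is left unchanged
        if new["year"] == refresh_year and new["unmarried"] == 1:
            new["agemarr"] = imputed_age
        out.append(new)
    return out


# Toy records for illustration only; these are not values from rev_data.
sample = [
    {"year": 1960, "unmarried": 0, "agemarr": 21},
    {"year": 1980, "unmarried": 1, "agemarr": None},
    {"year": 1980, "unmarried": 0, "agemarr": 24},
]
s55 = impute_agemarr(sample, 55)
s65 = impute_agemarr(sample, 65)
```

Only the second toy record changes. Records from the master sample and married individuals in the refreshment sample keep their recorded age at first marriage.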
Gauss code for the GMM60 and OLS80 estimators
GMM70 is implemented exactly like GMM60 except that it uses “r-men-70-80.dat” and “r-women-70-80.dat” for the male and female subsamples, respectively. The same applies to the robustness checks.
Female subsample
r-women-log-cadum-hausman.prg (GMM60)
r-ols-women-80-only.prg (OLS80)
r-women-log-cadum-s55-hausman.prg (robustness check)
r-women-log-cadum-s65-hausman.prg
Male subsample
r-men-log-cadum-hausman.prg
r-ols-men-80-only.prg
r-men-log-cadum-s55-hausman.prg
r-men-log-cadum-s65-hausman.prg
Stata code for Tobit
tobit.prg
Matlab code for simulations (download zipped programs)
c fixed (needs master_neg_tobit_loglik.m)
cfixed6_20p.m (n_R / n_M = 20%)
cfixed6_80p.m (n_R / n_M = 80%)
c random (needs master_neg_tobit_loglik_crandom.m)
crandom4_20p.m
crandom4_80p.m
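The simulation scripts depend on a negative Tobit log-likelihood, which an optimizer minimizes over the parameters. For orientation, here is a minimal sketch of such a function in Python. It is not a translation of master_neg_tobit_loglik.m, whose contents are not shown here. It assumes a linear latent model with normal errors and right-censoring at a known, fixed point c. The censoring direction, the argument layout, and the function names are all assumptions.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def neg_tobit_loglik(beta, sigma, X, y, censored, c):
    """Negative log-likelihood of a Tobit model right-censored at c.

    Assumed model: y* = x'beta + sigma * e with e ~ N(0, 1). When
    censored[i] is 1, only y*_i >= c is known; otherwise y_i = y*_i.
    """
    total = 0.0
    for xi, yi, di in zip(X, y, censored):
        mu = sum(b * x for b, x in zip(beta, xi))
        if di:
            # P(y* >= c) = 1 - Phi((c - mu) / sigma) = Phi((mu - c) / sigma)
            total += math.log(norm_cdf((mu - c) / sigma))
        else:
            # Log density of an uncensored observation
            total += norm_logpdf((yi - mu) / sigma) - math.log(sigma)
    return -total
```

A random censoring point could be handled by passing one value of c per observation instead of a scalar. That extension is not shown here.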