Data and the American Dream

  • The Data file* and 14 analysis files are here. A guide to the files is here.

  • The book's back matter, available here, includes Appendix A, which describes how to download the data from IPUMS-USA, install the free R and R Studio software, and use the R scripts and data linked above. It also contains a glossary, references to studies that use ACS microdata, and more.

  • The book's front matter, available here, includes the table of contents and preface.

* Please cite the data as: Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020.

Many more links and files can be found further down this page.

Term papers are ideal assignments for introductory econometrics courses, with virtues for both learning and assessment. My approach is to give students carefully selected examples of published research that use the methods they are learning. I provide them with well-organized R scripts and Stata do files that verify the analyses. Once they can explain the methods and findings of the studies, they can carry out reanalyses, reproductions, extensions, or original research inspired by the published research in the term paper assignment. It is learning econometrics by doing. (Here's a video of a presentation I made on this approach to teaching.)

I illustrate this approach in my book, Data and the American Dream: Contemporary Social Controversies and the American Community Survey. It will be published by Palgrave Macmillan in 2021, but I am already sharing on this page the R scripts, Stata do files, and data sets needed to run the replications. My hope is that, with these files, instructors can add some of the articles used as case studies in the book to their syllabi and use the files in their teaching. (For additional introductory econometrics teaching resources I've created, see here and here.)

In many cases I did not replicate the entire study; sometimes I replicated just one or two key results. One of my hopes for this book is that it will be used by students and independent learners who will improve upon what I have done and carry out their own replications of studies that use IPUMS-USA data. I have created this form where readers can share their files with me, and after I get a few submissions, I will update this page with links to the replications produced by the community of users of this book.

This is thus a dynamic webpage. I will continue to update the files until the book is published. After publication, the community of users can contribute more replication files, adding to the list of published research studies that can be used in the classroom to illustrate best practices in applied econometric research.

Matt Holian

March 23, 2021

Below I link to the studies (both published and working paper versions, when available) and to the replication files. I also list the course topics and learning objectives that I use each study to illustrate.


Study #1:

Winters, John V. "Is economics a good major for future lawyers? Evidence from earnings data." The Journal of Economic Education 47, no. 2 (2016): 187-191.

Working paper: Available.

Additional info on this replication: available.

Relates to the following Course Topics:

a. Descriptive statistics: counts, proportions, averages.

b. Hypothesis testing: the difference in means test

c. Regression: equivalence between difference in means tests and bivariate regression with a binary independent variable.

d. Best empirical practices: inflation adjustments and sample weighting
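
Topic (c) above can be checked numerically. The sketch below is not taken from the book's replication files, which are in R and Stata. It is a small Python illustration using made-up earnings figures and a hypothetical `econ_major` indicator. It shows that the slope from a bivariate regression on a binary variable equals the difference in group means, and that the intercept equals the mean of the group coded 0.

```python
from statistics import mean

# Hypothetical earnings (thousands of dollars) and a binary indicator.
# These numbers are invented for illustration; they are not ACS data.
earnings = [52.0, 61.0, 58.0, 70.0, 75.0, 66.0, 80.0]
econ_major = [0, 0, 0, 1, 1, 0, 1]


def ols_bivariate(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Group means computed directly.
mean_1 = mean([y for y, d in zip(earnings, econ_major) if d == 1])
mean_0 = mean([y for y, d in zip(earnings, econ_major) if d == 0])

# The same comparison expressed as a regression.
intercept, slope = ols_bivariate(econ_major, earnings)

print(f"difference in means: {mean_1 - mean_0:.4f}")
print(f"OLS slope:           {slope:.4f}")
print(f"mean of group 0:     {mean_0:.4f}")
print(f"OLS intercept:       {intercept:.4f}")
```

The equivalence here concerns the point estimates. The t-statistics also agree when the regression uses conventional (homoskedastic) standard errors and the difference in means test assumes equal variances. This sketch ignores sample weights, which the replication itself treats as a best practice.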

Additional files:

R Studio Cloud

Data and code here. Video on using this program here.


Analysis file (Do File). Data File (DTA)


Study #2:

Costa, Dora L., and Matthew E. Kahn. "Electricity consumption and durable housing: understanding cohort effects." American Economic Review, 101, no. 3 (2011): 88-92.

Working paper: Available.

Additional info on this replication: Available.

Course Topics

a. Regression control: Using control variables to estimate causal effects

b. Fixed effects

c. Clustered standard errors

d. Logarithmic transformations

Stata files: do file, and data files

New: R Studio Cloud program and data


Study #3:

Holian, Matthew J. "The impact of building energy codes on household electricity expenditures." Economics Letters 186 (2020): 108841.

Working paper: Available

Replication files: here

Course Topics

a. Regression control

b. Difference in differences

c. Replicating and extending previously published research


Study #4:

Orrenius, Pia M., and Madeline Zavodny. "The Impact of Temporary Protected Status on Immigrants' Labor Market Outcomes." American Economic Review 105, no. 5 (2015): 576-80.

Working paper: Available

Course Topics

a. Basic difference-in-differences

b. Basic difference-in-differences with control variables

c. Polynomial models

d. Fixed effects
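
The basic difference-in-differences idea in topic (a) reduces to arithmetic on four group means. The following Python sketch uses invented outcome values and hypothetical `treated` and `post` flags. It is not the study's data or code, which are in the Stata and R files linked below.

```python
from statistics import mean

# Hypothetical records: (treated group?, post period?, outcome).
# Outcome values are made up for illustration only.
records = [
    (0, 0, 10.0), (0, 0, 12.0),   # control group, before
    (0, 1, 11.0), (0, 1, 13.0),   # control group, after
    (1, 0, 9.0),  (1, 0, 11.0),   # treated group, before
    (1, 1, 13.0), (1, 1, 15.0),   # treated group, after
]


def cell_mean(rows, treated, post):
    """Mean outcome for one of the four group-by-period cells."""
    return mean([y for t, p, y in rows if t == treated and p == post])


def did_estimate(rows):
    """Change over time in the treated group minus change in the control group."""
    change_treated = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    change_control = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return change_treated - change_control


estimate = did_estimate(records)
print(f"difference-in-differences estimate: {estimate:.2f}")
```

In a regression of the outcome on `treated`, `post`, and their interaction, with no other controls, the coefficient on the interaction term equals this same quantity. Adding control variables or fixed effects, as in topics (b) and (d), changes the estimate and requires a regression rather than cell means.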

Stata files: do file, data file

New: R Studio Cloud program and data


Study #5:

Bailey, James, and Dhaval Dave. "The effect of the Affordable Care Act on entrepreneurship among older adults." Eastern Economic Journal 45, no. 1 (2019): 141-159.

Working paper: Unavailable

Course Topics

a. Basic difference-in-differences model

b. Fixed-effect difference-in-differences model


Study #6:

Comolli, Chiara Ludovica, and Fabrizio Bernardi. "The causal effect of the great recession on childlessness of white American women." IZA Journal of Labor Economics 4, no. 1 (2015): 21.

Working paper: (published version is open access)

Course Topics

a. Basic difference-in-differences

b. Basic difference-in-differences with control variables


Study #7:

Holian, Matthew J. "The impact of urban form on vehicle ownership." Economics Letters 186 (2020): 108763.

Working paper: Available

Course Topics

a. Instrumental variables
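
To illustrate why instrumental variables are needed, here is a small Python simulation. It is entirely hypothetical: the variable names, the data-generating process, and the true effect of 2.0 are assumptions chosen for the example, not anything from the study. An unobserved confounder `u` biases the ordinary least squares slope, while the simple IV estimator, cov(z, y) / cov(z, x), recovers something close to the true effect because the instrument `z` is unrelated to `u`.

```python
import random

random.seed(0)
N = 5000
TRUE_EFFECT = 2.0  # assumed causal effect of x on y in this simulation

z, x, y = [], [], []
for _ in range(N):
    zi = random.gauss(0, 1)            # instrument: affects x, unrelated to u
    ui = random.gauss(0, 1)            # unobserved confounder
    xi = zi + ui + random.gauss(0, 1)  # regressor depends on z and on u
    yi = TRUE_EFFECT * xi + 3.0 * ui + random.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


ols_slope = cov(x, y) / cov(x, x)  # biased upward because x is correlated with u
iv_slope = cov(z, y) / cov(z, x)   # simple IV estimator with one instrument

print(f"true effect: {TRUE_EFFECT}")
print(f"OLS slope:   {ols_slope:.3f}")
print(f"IV slope:    {iv_slope:.3f}")
```

With a single instrument and a single endogenous regressor, this ratio of covariances gives the same point estimate as two-stage least squares. The sketch omits control variables and standard errors.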

