Problems - Ch1

  1. Create the following simulated data.

    1. N = 100 (100 data points)

    2. x ~ U[0,1] (uniformly distributed between 0 and 1)

    3. u ~ N(0,2) (normally distributed with a mean of 0 and a variance of 2)

    4. y = a + b*x + u, where a = -2, b = 3.

  2. Run ordinary least squares on the data created in (1). Present estimates for a and b. Discuss why they are or are not close to the true values.

  3. Repeat (2) with N = 1000. Discuss how the estimates of a and b differ from (or resemble) those in (2).

  4. Download the data for "Using Geographic Variation in College Proximity to Estimate Returns to Schooling" by David Card (http://www.nber.org/papers/w4483). The data are available at http://davidcard.berkeley.edu/data_sets.html and also on Google Sheets.

    1. Plot log wages in 1976 against number of years of education in 1976.

    2. Run OLS of log wages in 1976 on number of years of education in 1976.

    3. Plot log wages in 1976 against work experience (measured as age minus years of education minus 6 years, that is, age - (education + 6)).

    4. Run OLS of log wages in 1976 on experience in 1976.

    5. If we are trying to understand how much an additional year of education affects wages, should we worry that people differ in how much work experience they have?
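
The simulation and regression in problems 1 to 3 can be sketched as follows. This is only one possible starting point, assuming Python with nothing beyond the standard library. The function names simulate and ols, the seed argument, and the use of closed-form one-regressor formulas (rather than a statistics package) are illustrative choices, not part of the problem statement. Note that u has a variance of 2, so the standard deviation passed to the random number generator is sqrt(2).

```python
import math
import random


def simulate(n, a=-2.0, b=3.0, var_u=2.0, seed=0):
    """Draw n observations from y = a + b*x + u (hypothetical helper)."""
    rng = random.Random(seed)
    # x ~ U[0, 1]
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    # u ~ N(0, var_u); gauss() expects a standard deviation, not a variance.
    u = [rng.gauss(0.0, math.sqrt(var_u)) for _ in range(n)]
    y = [a + b * xi + ui for xi, ui in zip(x, u)]
    return x, y


def ols(x, y):
    """Return (intercept, slope) of a least-squares fit with one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = sample covariance of (x, y) divided by sample variance of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = s_xy / s_xx
    # The fitted line passes through the point of sample means.
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat


if __name__ == "__main__":
    for n in (100, 1000):
        a_hat, b_hat = ols(*simulate(n))
        print(f"N={n}: a_hat={a_hat:.3f}, b_hat={b_hat:.3f}")
```

Changing the seed argument and rerunning gives a feel for how much the estimates move from sample to sample at each N, which is the comparison problems 2 and 3 ask for.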