L06 Contingency

Measuring Association & Testing for Independence in Contingency Tables

VIDEO of LECTURE (on my website)

YouTube Video: PIDE App Ect 2014 L06 Measuring Association

This lecture shows that regression is often a poor way to test for association, because its assumptions are frequently violated by the data. A new method for testing for association is presented which, to the best of my knowledge, is not currently in use anywhere. This method is based on Fisher's Exact Test for independence in contingency tables. Independence can also be tested with a standard Chi-Square test if there are large numbers of observations in each cell, but that procedure does not work well in small samples.
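To make the small-sample problem concrete, here is a minimal Python sketch of the Pearson Chi-Square test for a 2 x 2 table. The function name chi_square_2x2 and the example counts are hypothetical and are not taken from the lecture. The check on expected counts uses the common rule of thumb (at least 5 in every cell), which is an assumption here rather than something the lecture prescribes.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table of counts.

    Returns (statistic, p_value, expected_counts).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    # Expected count in each cell if rows and columns were independent.
    expected = [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]
    stat = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # Upper tail of a chi-square with 1 degree of freedom: P(X > s) = erfc(sqrt(s / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value, expected

# Hypothetical small sample: only 8 observations in total.
small = [[3, 1], [1, 3]]
stat, p_value, expected = chi_square_2x2(small)
too_small = any(e < 5 for row in expected for e in row)  # rule-of-thumb check
print(stat, p_value, too_small)
```

For this table every expected count is 2, so the large-sample approximation is not trustworthy. The Chi-Square p-value comes out near 0.16, while the two-sided Fisher exact p-value for the same table is 34/70, about 0.49.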

HOMEWORK:

Select THREE series from the World Bank WDI data set -- all of them should be for the same year, so that you have a CROSS SECTION data set (not time series). Each series should give data on ONE variable for ONE year for ALL countries. Call the three series X1, X2, X3.

Now make three plots -- X1 vs X2, X2 vs X3, X1 vs X3. Assess intuitively whether the variables in any of the three pairs are related to each other. For each pair, compute the correlation and run the regression. Draw the regression line on the graph with the points. Assess whether there are OUTLIERS, or NONLINEARITIES, or CLUSTERS in the data set. Assess whether the regression line is a good or bad description of the data. Is there a CAUSAL relationship between any of the pairs? That is, would changes in X lead to changes in Y? Answering this question requires some knowledge of the real world mechanisms which link the two variables.
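As a rough guide to the computations involved, the following Python sketch computes the correlation and the least-squares regression line for one pair of series. The function name correlation_and_line and the toy numbers are hypothetical; they are not WDI data. The second data set adds a single made-up outlier to show how strongly one point can change both the correlation and the fitted line.

```python
from math import sqrt

def correlation_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)          # Pearson correlation coefficient
    slope = sxy / sxx                  # OLS slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Toy data (hypothetical): an exact linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(correlation_and_line(x, y))

# The same data with one outlier added.
x_out = x + [10]
y_out = y + [0]
print(correlation_and_line(x_out, y_out))
```

In the second case the sign of the correlation flips, even though five of the six points lie exactly on a rising line. This is why the plot has to be inspected before the regression output is trusted.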

Finally, use the Fisher exact test in a two by two contingency table to assess independence of the pair. Also divide the data into thirds (about 33% in each group) and use a 3 x 3 contingency table to perform the Fisher exact test. THIS needs to be done in ONLY ONE case out of the three, just to check that you understand the concepts of this lecture.
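The steps of this part can be sketched in Python as follows. This is only one possible reading of the exercise: it assumes the two by two table is formed by splitting each variable into a lower and an upper half by rank, and the 3 x 3 table by splitting into thirds by rank. All function names are hypothetical. The p-value is computed by enumerating every table with the observed row and column totals and adding up the probabilities of those tables that are no more likely than the observed one.

```python
from math import factorial

def group_by_rank(values, k):
    """Assign each value to one of k (nearly) equal-sized groups by rank.
    Tied values are split in the order sorted() happens to leave them."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = rank * k // len(values)
    return groups

def cross_table(x, y, k):
    """k x k contingency table of the rank groups of x (rows) and y (columns)."""
    table = [[0] * k for _ in range(k)]
    for i, j in zip(group_by_rank(x, k), group_by_rank(y, k)):
        table[i][j] += 1
    return table

def table_prob(table):
    """Probability of a table given its row and column totals, under independence."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    num = 1
    for m in rows + cols:
        num *= factorial(m)
    den = factorial(sum(rows))
    for r in table:
        for v in r:
            den *= factorial(v)
    return num / den

def tables_with_margins(rows, cols):
    """Yield every 2 x 2 or 3 x 3 table of non-negative counts with the given totals."""
    if len(rows) == 2:
        for a in range(min(rows[0], cols[0]) + 1):
            b, c = rows[0] - a, cols[0] - a
            d = rows[1] - c
            if d >= 0:
                yield [[a, b], [c, d]]
    else:
        for a in range(min(rows[0], cols[0]) + 1):
            for b in range(min(rows[0] - a, cols[1]) + 1):
                for d in range(min(rows[1], cols[0] - a) + 1):
                    for e in range(min(rows[1] - d, cols[1] - b) + 1):
                        c, f = rows[0] - a - b, rows[1] - d - e
                        g, h = cols[0] - a - d, cols[1] - b - e
                        i = rows[2] - g - h
                        if i >= 0:
                            yield [[a, b, c], [d, e, f], [g, h, i]]

def fisher_exact(table):
    """Two-sided exact p-value: total probability of all tables with the same
    margins that are at most as probable as the observed table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    p_obs = table_prob(table)
    return sum(
        p for p in map(table_prob, tables_with_margins(rows, cols))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for floating-point ties
    )

# Hypothetical 2 x 2 table of counts.
print(fisher_exact([[3, 1], [1, 3]]))
```

For a pair of series x and y, fisher_exact(cross_table(x, y, 2)) gives the two by two test and fisher_exact(cross_table(x, y, 3)) the 3 x 3 test. The enumeration is exact but unoptimized, so the 3 x 3 case can be slow when all countries are included.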

BACKGROUND MATERIALS:

Lecture 16 of Econometrics for Muslims: Correlations Spurious and Genuine -- PPT Slides attached below. VIDEO LECTURE (in Urdu) available as lecture 16 on Econometrics for Muslims 2013. Please watch this as an alternative and more elementary presentation of the material in this lecture; it provides useful background for the current lecture, which is more advanced.

Lecture 15 of Introduction to Statistics for Muslims: Central Values and Unexpected Values. Explains how critical values are chosen. This lecture covers in detail what Lecture 6 covers very briefly. Please watch it for more detailed background on an important topic covered in this lecture (L6 of applied econometrics).

Lecture 13 of Intro Stats covers Binomial Probabilities in Detail, for anyone unfamiliar with these basics.

PIDE App Ect L06: Measuring Association - YouTube Video Lecture: how to measure association between two variables, and the use of contingency tables in this context.

E4M L16 Correlations: Spurious & Genuine - Interpret and Evaluate Correlations -- explanation of common confusions regarding correlations, and reasons why correlations can be misleading.

BE L11: Chi-Square and Contingency Tables - Lecture 11 in Bayesian Econometrics covers many aspects of contingency tables -- more advanced and complete.