Module 14

Chi-square and tests for proportions

Introduction

  • Chi-square analysis tests whether two nominal (categorical) variables are associated, for example whether gender (male or female) is associated with favourite colour (e.g., red, yellow, green).

  • If two categorical variables are not associated, the counts for each combination of categories should be spread in proportion to the row and column totals (that is, evenly after weighting by those totals). If they are associated, the observed counts will deviate from this expected pattern. The chi-square test assesses whether that deviation is larger than chance alone would produce.

1. Chi-square

1.1 What is chi-square?

  • The chi-square (χ²) statistic is a measure of the difference between the observed and expected frequencies of the outcomes of a set of events or variables.

  • The data used to calculate the chi-square statistic must be raw counts (not percentages) from a random sample, the categories must be mutually exclusive, the observations must be independent, and the sample must be large enough.

  • There are two types of chi-square tests - the goodness of fit test and the test of independence.
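The definition of the statistic in the first bullet above is conventionally written as the formula below. The symbols are notation assumed here for illustration, not taken from the course material: O_i is the observed count in category (or cell) i, E_i is the expected count under the null hypothesis, and k is the number of categories (or cells).

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

Each term grows as an observed count moves away from its expected count, so a larger χ² indicates a larger overall deviation from what the null hypothesis predicts.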

1.2 Goodness of fit test

The goodness of fit test is applied when you have one categorical variable with two or more categories from a single population.

The null hypothesis (H0) states that the population proportions match the expected proportions, so any difference between the observed and expected frequencies is due to chance. The alternative hypothesis (H1) states that at least one population proportion differs from its expected value.
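As a rough illustration of how the goodness of fit statistic is built, the Python sketch below turns a hypothesised ratio into expected counts and sums the squared deviations. The function name `goodness_of_fit` and the counts are hypothetical (they are not the course data set), and the course itself runs this analysis in jamovi rather than in code.

```python
def goodness_of_fit(observed, ratio):
    """Return (chi-square statistic, degrees of freedom, expected counts).

    observed: list of observed counts, one per category
    ratio:    hypothesised ratio under H0, e.g. [1, 1, 1, 1]
    """
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must have the same length")
    n = sum(observed)
    total_ratio = sum(ratio)
    # Expected count per category = N x hypothesised proportion
    expected = [n * r / total_ratio for r in ratio]
    # Sum of (observed - expected)^2 / expected over all categories
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom = number of categories - 1
    df = len(observed) - 1
    return chi_sq, df, expected


# Hypothetical counts for four categories (NOT the course data set)
observed = [30, 20, 25, 25]
chi_sq, df, expected = goodness_of_fit(observed, ratio=[1, 1, 1, 1])
print(f"expected counts = {expected}")
print(f"chi-square({df}) = {chi_sq:.2f}")
```

With four categories the degrees of freedom are 4 - 1 = 3, which matches the first number inside the brackets of an APA-style result for a four-category test.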

1.3 Example 1: Hostel

Some people claim that students' hostel preferences are split in equal proportions (i.e. a 1:1:1:1 ratio across the four options). We use the goodness of fit test to examine this claim.

Q: Do students' hostel preferences follow the expected proportions?

A: We use “χ² Goodness of fit” under “Frequencies” in jamovi.

Module 4 Example 1.3 Chi-squared Hostel.mp4

1.3.1 Result Interpretation

Conclusion/ Interpretation (APA format):

Students' hostel preferences were not equally distributed in the population, χ²(3, N = 1000) = 266, p < .001.

1.4 Test of independence

The test of independence compares two categorical variables in a contingency table to see whether they are associated. It can assess association only; it cannot support inferences about causation.

The data must meet the following requirements:

  • Large random sample

    • The expected frequency of each cell in the contingency table must be at least 5

  • Two categorical variables

  • Two or more groups for each variable

  • Independence of observations

    • There is no relationship between the subjects in each group.

    • The categorical variables are not "paired" in any way (e.g. pre-test/post-test observations)

The null hypothesis (H0) assumes that the two variables are not associated with each other. The alternative hypothesis (H1) assumes that the two variables are associated with each other.

Expected count for each cell = (row total × column total) / grand total
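The expected-count rule above can be sketched in Python as follows. The function name `independence_test` and the 2 x 3 table are hypothetical values chosen for illustration (they are not the faculty and relationship status data), and the degrees of freedom use the standard (rows - 1) x (columns - 1) rule.

```python
def independence_test(table):
    """Return (chi-square statistic, degrees of freedom, expected counts).

    table: list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell = row total x column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Sum of (observed - expected)^2 / expected over every cell
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Degrees of freedom = (rows - 1) x (columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df, expected


# Hypothetical 2 x 3 contingency table (NOT the course data set)
table = [
    [20, 30, 50],
    [30, 30, 40],
]
chi_sq, df, expected = independence_test(table)

# Assumption check: every expected cell count should be at least 5
all_cells_ok = min(min(row) for row in expected) >= 5

print(f"expected counts = {expected}")
print(f"chi-square({df}) = {chi_sq:.2f}")
print(f"all expected counts >= 5: {all_cells_ok}")
```

Computing the expected counts first also makes it easy to check the minimum expected frequency requirement before interpreting the statistic.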

1.5 Example 2: Faculty and Relationship status

Some students believe that the faculty you belong to can determine your relationship status. To find out whether it is true or not, we can use the test of independence.

Q: Are faculty and relationship status associated with each other?

A: We use “χ² test of association” under “Frequencies” in jamovi to examine this.

Module 4 example 1.5 Chi-squared Faculty & RelStatus.mp4

1.5.1 Results Interpretation

Conclusion/ Interpretation (APA format):

There was no significant association between faculty and relationship status, χ²(2, N = 1000) = 1.01, p = .603.
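The p-values in this module's APA statements come from the upper tail of the chi-square distribution at the reported statistic and degrees of freedom. The sketch below is one way to reproduce them with the Python standard library only. The function name `chi_square_p_value` is hypothetical, and the recurrence it uses is a textbook identity for the regularised upper incomplete gamma function, not necessarily the method jamovi uses internally.

```python
import math


def chi_square_p_value(chi_sq, df):
    """Upper-tail probability P(X >= chi_sq) for a chi-square
    distribution with integer degrees of freedom df >= 1."""
    if df < 1 or chi_sq < 0:
        raise ValueError("df must be >= 1 and chi_sq must be >= 0")
    x = chi_sq / 2.0
    if df % 2 == 0:
        # Even df: start from Q(1, x) = exp(-x)
        a = 1.0
        q = math.exp(-x)
    else:
        # Odd df: start from Q(1/2, x) = erfc(sqrt(x))
        a = 0.5
        q = math.erfc(math.sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        if x > 0:
            q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Statistics and degrees of freedom as reported in Examples 1 and 2
p_hostel = chi_square_p_value(266, 3)
p_faculty = chi_square_p_value(1.01, 2)
print(f"Example 1: p = {p_hostel:.3g}")
print(f"Example 2: p = {p_faculty:.3g}")
```

For Example 1 the result is far below .001, so it is reported as p < .001. For Example 2 the value computed from the rounded statistic 1.01 is about .60, consistent with the reported p = .603; any difference in the third decimal place comes from rounding of the statistic.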

Module Exercise (4% of total course assessment)

Complete the exercise!

    • Now, if you think you're ready for the exercise, you can check your email for the link.

    • Remember to submit your answers before the deadline to earn the credit!