3. Two sample categorical analysis

Mastery Quiz Prep

Introductory video

  • Find and interpret a confidence interval for a difference of two proportions using StatKey.
    • Enter data as a series of (Treatment, Result) pairs.  Generate bootstrap samples and find the middle __%.
    • Interpret: I am _(percent)_% confident that _(bigger group)_ _(has/succeeds)_ _(low end of interval)_% to _(high end of interval)_% more _(response variable units)_ than _(smaller group)_.
    • For example: I am 95% confident that NFC teams win 4.5% to 9.2% more games than AFC teams.
    • OR plus-minus form: I am 95% confident that NFC teams win 6.8% more games than AFC teams with a margin of error of ±2.3%.
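
The bootstrap step described above can be sketched in Python. This is a minimal illustration of a percentile bootstrap for a difference of two proportions, not a description of StatKey's internals. The function name bootstrap_diff_ci, its parameters, and the counts in the usage line are all hypothetical.

```python
import random

def bootstrap_diff_ci(successes_a, n_a, successes_b, n_b,
                      level=0.95, reps=5000, seed=0):
    """Percentile bootstrap interval for p_a - p_b (illustrative sketch)."""
    rng = random.Random(seed)
    # Rebuild each group's raw results as 1 (success) / 0 (failure).
    group_a = [1] * successes_a + [0] * (n_a - successes_a)
    group_b = [1] * successes_b + [0] * (n_b - successes_b)

    diffs = []
    for _ in range(reps):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=n_a)
        resample_b = rng.choices(group_b, k=n_b)
        diffs.append(sum(resample_a) / n_a - sum(resample_b) / n_b)

    # Keep the middle `level` share of the sorted bootstrap differences.
    diffs.sort()
    tail = (1 - level) / 2
    low = diffs[int(tail * reps)]
    high = diffs[int((1 - tail) * reps) - 1]
    return low, high

# Hypothetical counts: 60 of 100 successes in group A, 45 of 100 in group B.
low, high = bootstrap_diff_ci(60, 100, 45, 100)
print(f"95% interval for p_a - p_b: {low:.3f} to {high:.3f}")
```

The endpoints change slightly with the seed and the number of resamples, just as they do from one StatKey run to the next.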

Answer the following for each of the problems below:
a)    What is the explanatory variable in this scenario? What are its options (treatments)?
b)    What is measured for each individual (the response variable in this scenario)? Is it quantitative or categorical?
c)    If it is quantitative, are there matched pairs or two distinct samples?
d)    Based on the last two responses, what type of interval/test will you perform in StatKey?
e)    What is the null hypothesis in this scenario? Use the correct symbols (μ or p) and use subscripts so it is clear what each symbol means.
f)    What is the alternative hypothesis in this scenario?  Re-read the problem to see if there is an intended direction.
g)    What is the p-value of your test?
h)    Is this an observational study or experiment?  Based on this and your p-value, what can you conclude?
i)    What is the estimated difference between the two groups (95% interval)?
j)    Convert this interval to plus-minus form.
k)    Interpret the confidence interval of the difference in a sentence.  Use plus-minus form because it is often far more readable in a sentence.
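
The conversion in step j) is simple arithmetic: the center of the interval is the estimate, and half its width is the margin of error. A minimal Python sketch follows; the helper name to_plus_minus is hypothetical, and the numbers are the endpoints from the NFC/AFC example above.

```python
def to_plus_minus(low, high):
    """Convert an interval (low, high) to (center, margin of error)."""
    center = (low + high) / 2   # midpoint of the interval
    margin = (high - low) / 2   # half the interval's width
    return center, margin

# Endpoints from the NFC/AFC example, in percentage points.
center, margin = to_plus_minus(4.5, 9.2)
print(f"{center:.2f} ± {margin:.2f}")
```

For 4.5% to 9.2% this gives a center of 6.85 and a margin of 2.35, which the example reports as 6.8% ± 2.3%.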

1. While watching the study take place, a couple of members of the local robotics teams decided to set up their own trial.  One team had a frisbee shooter that fired using a straight track.  The other had a circular frisbee shooter.  They each had their robot take 40 shots.  Results: the track robot made 16 and the circular robot made 23.  Neither team was assumed to be better before running the test.

2. In a food testing experiment, students found 50 volunteers.  They randomly selected 25 taste-testers, blinded them, and gave each a piece of toast with margarine.  They asked each participant whether they had just eaten butter on their toast.  16 out of 25 said yes.  Then they gave the other 25 randomly selected taste-testers a piece of toast with butter and asked the same question.  19 out of 25 said yes.

Free Response Prep

Explain how StatKey simulates tests of independence for two categorical variables (same question, all versions of quiz):
Discuss explanation in class.
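
As a starting point for that discussion, here is a minimal Python sketch of one common way to simulate such a test: pool all the results, reshuffle them into two groups of the original sizes, and count how often the reshuffled difference is at least as extreme as the observed one. Whether this matches StatKey's exact procedure is an assumption. The function name randomization_p_value and its parameters are hypothetical; the counts come from the robot frisbee problem above.

```python
import random

def randomization_p_value(successes_a, n_a, successes_b, n_b,
                          reps=5000, seed=0):
    """Two-sided randomization test for p_a = p_b (illustrative sketch)."""
    rng = random.Random(seed)
    observed = successes_a / n_a - successes_b / n_b

    # Under the null, group labels do not matter, so pool every result.
    total_successes = successes_a + successes_b
    total_failures = (n_a + n_b) - total_successes
    pooled = [1] * total_successes + [0] * total_failures

    extreme = 0
    for _ in range(reps):
        # Randomly reassign the pooled results to the two groups.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Count differences at least as far from 0 as the observed one.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return extreme / reps

# Robot frisbee data: straight track made 16 of 40, circular made 23 of 40.
p_value = randomization_p_value(16, 40, 23, 40)
print(f"two-sided p-value: {p_value:.3f}")
```

The p-value varies a little from run to run because it is estimated from random reshuffles.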

Practice solutions

1. Robot Frisbee:
        a) Explanatory Variable: Type of method used by the robot
            Treatment Groups: Circular vs Straight Track
        b) The individual is the shot, not the robot (as it would have been for #1 and #2).  Thus, the variable is whether the shot is made or not (categorical).
        c) N/A
        d) Test for a difference in proportions, because the response is categorical and we are comparing two groups
        e) H0: ps = pc
        f) HA: ps ≠ pc
        g) p = .19
        h) Not a proper experiment because each treatment "group" has one individual (no repetition) and there is no random assignment.  Regardless, you cannot reject the null since p > .05
        i) -.375 to .05
        j)  -.1625 ± .2125 more for straight  (.1625 ± .2125 more for circular)
        k) I am 95% confident that the circular shooter makes 16% ± 21% more shots than the straight track shooter.

2. Food testing:
        a) Explanatory Variable: What is on the toast
            Treatment Groups: Butter or Margarine
        b) Whether or not they think they just ate butter; categorical
        c) n/a
        d) Difference in proportions, because the response is categorical and we are comparing two groups
        e) H0: pm = pb
        f) HA: pm ≠ pb
        g) p = 0.52
        h) Experiment.  The result is not statistically significant (p > .05), so we cannot reject the null.
        i) -.360 to .120
        j) -.12 ± .24 more for margarine  (.12 ± .24 more for butter)
        k) I am 95% confident that testers believe they are eating butter 12% ± 24% more often when they are actually eating butter than when they are secretly given margarine.