Take the following survey. Tomorrow, we will find our movie-watching buddies based on preference correlations: Movie survey
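One way the buddy matching could work is sketched below. This is only an illustration: the names, the ratings, and the helper functions `pearson` and `best_buddy` are hypothetical and are not taken from the actual survey. It assumes each student rates the same list of movies on a 1-5 scale and pairs each student with the classmate whose ratings correlate most strongly with theirs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical 1-5 ratings of the same five movies (made-up data).
ratings = {
    "Ana": [5, 4, 1, 2, 5],
    "Ben": [4, 5, 2, 1, 4],
    "Cy": [1, 2, 5, 4, 1],
}


def best_buddy(name, ratings):
    """Return the other student whose ratings correlate most with `name`."""
    others = [other for other in ratings if other != name]
    return max(others, key=lambda other: pearson(ratings[name], ratings[other]))


for student in ratings:
    print(student, "->", best_buddy(student, ratings))
```

A correlation near +1 means two students tend to like and dislike the same movies, while a correlation near -1 means their tastes are opposite.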
In class, we will watch the Anne Milgram TED Talk (below) on how statistics are used to make predictions in the criminal justice system. The next day, we will watch the Nate Silver TEDx Talk on race and voting habits. While we watch, go to https://todaysmeet.com/statsted and post under your own name so I can give you credit for your posts. Everyone is expected to have at least:
Make your observation during the video. You will have time to think through comments and reply at the end of the video.
When communicating online in a professional or academic forum, expected behaviors are different from those of everyday social media use. This is similar to the difference you notice in language and style between giving a speech and talking to friends. Appropriate behavior depends more on the context than on the tool or medium used.
For problems 1-5, answer the following:
a) List the 2 variables and whether they are categorical or quantitative.
b) Which section would you use in StatKey to create a chart / graph?
c) Which variable is likely the cause and which is likely the response? If neither, what might a lurking variable be that connects these two? Which input leads to which output?
1. Premium gasoline (89 octane) gives cars better gas mileage than regular gasoline (87 octane).
2. The weekly grocery bill is associated with the number of family members.
3. Taking a recently developed pill each day will reduce the number of headaches experienced over the next 3 months compared to another brand.
4. A professional sports team’s winning percentage is associated with the team’s average salary.
5. A classroom poll asked students whether or not they liked math, and the results were compared by the class each student was enrolled in.
6. Explain the difference between independence, dependence, and causation. How can you prove causation?
7. Explain the difference between independent variables, dependent variables, and lurking variables.
The goal of analyzing the relationship between two variables is different than the goal when working with only one variable at a time. Explain.
See first video. In 1-variable analysis, you are searching for a summary of the current state of the situation. In 2-variable analysis, you are trying to find a link between the variables. You want to know if one variable can predict the other.
What is the difference between correlation and causation? What is needed to prove causation?
See last video. Correlation is a link (a dependent relationship) between two variables. Causation means that one of the variables is the reason for the other. The best way to prove a cause is a randomized experiment, because random assignment spreads the lurking variables evenly across the groups so they cannot explain the difference. Outside of the statistics realm, sometimes you can prove causation with a very strong understanding of the mechanism behind how something works.
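The difference between a correlation and a cause can be sketched with a small simulation. Everything here is made up for illustration: the variables (temperature, ice cream sales, sunburns), the numbers, and the function names are assumptions, not data from the videos. In the "observational" data, temperature is a lurking variable that drives both ice cream sales and sunburns. In the "experiment", the ice cream amounts are assigned at random, so temperature can no longer link the two.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def simulate(n=2000, seed=0):
    """Return (observational r, experimental r) for ice cream vs. sunburns."""
    rng = random.Random(seed)
    # Lurking variable: daily temperature (hypothetical range, in Celsius).
    temps = [rng.uniform(15, 35) for _ in range(n)]

    # Observational data: temperature drives BOTH variables.
    ice_observed = [2 * t + rng.gauss(0, 3) for t in temps]
    burns_observed = [0.5 * t + rng.gauss(0, 1) for t in temps]

    # Experiment: ice cream is assigned at random, ignoring temperature.
    # Sunburns still depend only on temperature, never on ice cream.
    ice_assigned = [rng.uniform(30, 70) for _ in range(n)]
    burns_experiment = [0.5 * t + rng.gauss(0, 1) for t in temps]

    return (pearson(ice_observed, burns_observed),
            pearson(ice_assigned, burns_experiment))


r_observed, r_experiment = simulate()
print("Observational correlation:", round(r_observed, 2))
print("Experimental correlation:", round(r_experiment, 2))
```

In this sketch the observational correlation comes out strongly positive even though ice cream never causes sunburns, while the experimental correlation stays close to zero. That is the sense in which an experiment can separate a real cause from a link created by a lurking variable.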
Why is it so incredibly useful in nearly every job to identify dependent relationships in data? Give an example of dependent variables and explain why knowing this relationship helps somebody do their job better.
See in-class TED Talks. If you know an end result is predictable, you can use the predictors to change behavior before the end result happens. Anne Milgram did this for judges by using criminal risk factors. Nate Silver is starting to do this with racism based on city design.