Class Eighteen: Your Graphs part Two, Simpson's Paradox
Medical Helicopters, redux
Median Average Salary in US
Working Phoneathon
Looking at, suggesting, and creation of data.
What worked well, what didn't?
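A minimal sketch of Simpson's Paradox in R. The counts below are made up for illustration and are not the real medical helicopter data; the column names are hypothetical too. Within each severity group the helicopter has the higher survival rate, yet the pooled rate favors the ground ambulance, because helicopters carry mostly serious cases.

```r
# Hypothetical counts, invented only to illustrate the paradox
trips <- data.frame(
  transport = c("helicopter", "helicopter", "ground", "ground"),
  severity  = c("serious", "minor", "serious", "minor"),
  patients  = c(100, 20, 20, 100),
  survived  = c(60, 19, 10, 90)
)

# Survival rate within each severity group
trips$rate <- trips$survived / trips$patients
print(trips)

# Survival rate after pooling the severity groups together
pooled <- aggregate(cbind(patients, survived) ~ transport, data = trips, FUN = sum)
pooled$rate <- pooled$survived / pooled$patients
print(pooled)
```

The grouping variable (severity) is what flips the story, so it is worth asking what is hiding inside any pooled percentage.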
Class Sixteen: On Data, Chi Squared, and How to Get People to SEE
Data Collection Finishing Touches
Check the Survey Monkey Issue
Can You Access Your Answers?
Paper We Read
What Works? What doesn't?
Reading for Idea
Reading for the Stats
Chi Squared Testing
What is it? Old Page I made for it (contains some R code, you can ignore that)
http://www.quantpsy.org/chisq/chisq.htm looks like a good tool; I use R.
Some examples:
Binge Drinking on Campus
Medical Helicopters and Saving Lives
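A minimal sketch of a chi-squared test in R, assuming a two-by-two table of survey counts. The counts and labels are invented for illustration and are not the binge drinking or helicopter data from class.

```r
# Hypothetical survey counts, invented for illustration only
drinking <- matrix(
  c(40, 60,    # on campus:  binge, no binge
    25, 75),   # off campus: binge, no binge
  nrow = 2, byrow = TRUE,
  dimnames = list(housing = c("on campus", "off campus"),
                  binge   = c("yes", "no"))
)

# correct = FALSE turns off the continuity correction so the result
# matches the hand calculation: sum of (observed - expected)^2 / expected
result <- chisq.test(drinking, correct = FALSE)

result$expected   # counts we would expect if housing and drinking were unrelated
result$statistic  # the chi-squared value
result$p.value    # chance of a statistic at least this large if they are unrelated
```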
Class Fifteen: Let's Build Better Surveys
Group work
Select one of the four surveys.
SURVEYOR:
describe briefly what you are attempting to do with the survey
discuss what you plan to do with the results
what hypotheses do you have?
what graphs can you make
what information is necessary
TAKERS OF SURVEY:
take the survey.
outline any issues or questions you have about the survey
address any problems with the data
IF WE HAVE TIME
probabilities of the same birthday
probabilities of rolling a 1
probabilities of rolling boxcars
HW on the page (fix survey, take surveys, read a paper)
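The three "if we have time" probabilities can be sketched in R as follows. This assumes fair six-sided dice and 365 equally likely birthdays; the group size of 23 and the function name are just illustrative choices.

```r
# Chance that at least two people in a group of n share a birthday,
# assuming 365 equally likely birthdays (ignores leap days)
shared_birthday <- function(n) {
  1 - prod((365 - 0:(n - 1)) / 365)
}
shared_birthday(23)   # just over one half

p_one     <- 1 / 6       # rolling a 1 on one fair die
p_boxcars <- (1 / 6)^2   # rolling double sixes on two fair dice

# Chance of seeing at least one 1 in four rolls
1 - (1 - p_one)^4
```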
Class Fourteen: Creating Our Focus of Understanding
List of Information in the Class (my partial one on Class Fourteen)
Creation of Survey by the Class
Clarifications on t-tests vs. z-tests (and why the real world uses t-tests more often)
Rolling Some Dice
Building a Powerball Model
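One possible starting point for the Powerball model, as a rough sketch. It assumes the game format of 5 white balls drawn from 69 plus 1 red ball drawn from 26; check the current rules before relying on it. The draw() helper is a hypothetical name.

```r
# Assumed game format: 5 white balls from 69, plus 1 red ball from 26
white_combos   <- choose(69, 5)
jackpot_combos <- white_combos * 26
jackpot_combos   # number of equally likely tickets

# Simulate one drawing: white balls without replacement, one red ball
draw <- function() {
  list(white = sort(sample(1:69, 5)), red = sample(1:26, 1))
}
draw()
```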
Class Twelve: More Hypothesis Testing
homework on the appropriate page.
Class Eleven: Hypothesis Testing
set null hypothesis
always equal to. always AGAINST what you want to prove.
set alternate hypothesis
one or two tail test?
set alpha level
what percent sure do we want to be that we are correct?
check data (if possible)
outliers with appropriate methods
check shape of data, etc.
bimodal is a big problem.
run analysis (get z-score, p-value)
compare and analyze
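The steps above can be sketched in R like this. The sleep numbers are made up, and the sketch swaps in a t-test for the z-score because the population standard deviation is not known for this invented sample.

```r
# Hypothetical sample: hours of sleep reported by 12 students
sleep_hours <- c(6.5, 7, 5.5, 8, 6, 7.5, 6.5, 7, 5, 6, 7, 6.5)

mu0   <- 8      # null hypothesis: the mean is EQUAL TO 8 hours
alpha <- 0.05   # alpha level, chosen before looking at the result

# Check the data first: any values flagged as outliers?
boxplot.stats(sleep_hours)$out

# Run the analysis: two-tailed, one-sample t-test
test <- t.test(sleep_hours, mu = mu0, alternative = "two.sided")
test$statistic
test$p.value

# Compare and analyze: TRUE means we reject the null hypothesis
test$p.value < alpha
```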
Class Nine: Testing our Hypothesis
Start: How old are our pennies?
Go over our HW
four problems
what did we make with our sleep data?
Newt Salve, 2000
the difference between hypothesis testing and the way that you think about the world
one tailed vs. two tailed ideas
Class Eight: What's in a Question
Start: HW
what does CI mean?
Powerpoint of Questions for Surveys and Issues that are in there
what questions can we ask
where should we ask them?
Class Seven: The Central Limit Theorem
Start: M&Ms have formed groups against us.
blind look
choose ten 'representative of the population'
choose ten at 'real' random
choose twenty at real random
Start: Is this Coin a Fair Coin?
establish a value that we can see as 'fair'
difference between heads and 'fair'
one-tailed vs. two-tailed test
a binomial distribution is different than a normal curve, but the ideas will work the same for us when it comes to being confident.
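A minimal sketch of the fair-coin question in R, assuming a made-up result of 60 heads in 100 flips. It runs the exact binomial test both ways (two-tailed and one-tailed) and then the normal-curve version of the same idea.

```r
# Hypothetical result: 60 heads in 100 flips
heads <- 60
flips <- 100

# Exact binomial test against a fair coin (p = 0.5)
two_tailed <- binom.test(heads, flips, p = 0.5, alternative = "two.sided")
one_tailed <- binom.test(heads, flips, p = 0.5, alternative = "greater")
two_tailed$p.value
one_tailed$p.value

# Normal approximation: z-score of the observed count
# mean = flips * 0.5, sd = sqrt(flips * 0.5 * 0.5)
z <- (heads - flips * 0.5) / sqrt(flips * 0.5 * 0.5)
z
```

The one-tailed p-value is smaller than the two-tailed one, which is why the choice of tails has to be made before looking at the data.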
Next: Sampling a population
blind look
choose ten 'representative of the population'
choose ten at 'real' random
choose twenty at real random
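A rough simulation of the sampling exercise in R. The population here is invented (a skewed set of 10,000 values), not the class data, and sample_means() is a hypothetical helper. It draws many random samples of ten and of twenty and looks at how the sample means behave.

```r
set.seed(1)  # arbitrary seed so the run is repeatable

# Hypothetical skewed population (not the class data)
population <- rexp(10000, rate = 1 / 50)

# Means of many random samples of size n
sample_means <- function(n, reps = 2000) {
  replicate(reps, mean(sample(population, n)))
}
means_10 <- sample_means(10)
means_20 <- sample_means(20)

mean(population)                     # the true mean
c(mean(means_10), mean(means_20))    # both center near the true mean
c(sd(means_10), sd(means_20))        # the spread shrinks as samples grow
```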
Finally, a word about choices, in terms of March Madness.
HW: Naked Statistics: Chapters 6, 7, 8.
Class Five: Some Clean Up, Probability
z-scores for comparison
SAT N(1000, 200)
ACT N(18, 6)
so what does this do for us?
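A short sketch of what z-scores buy us, using the two distributions above. The two student scores (1300 and 28) are hypothetical, and z_score() is an illustrative helper name.

```r
# Distributions from the notes: SAT ~ N(1000, 200), ACT ~ N(18, 6)
z_score <- function(x, mean, sd) (x - mean) / sd

# Hypothetical students
sat_z <- z_score(1300, mean = 1000, sd = 200)   # 1.5
act_z <- z_score(28,   mean = 18,   sd = 6)     # about 1.67

# Share of test takers scoring below each student
pnorm(c(sat_z, act_z))
```

Once both scores are in standard deviations, they sit on the same scale and can be compared directly.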
ok, so imagine we want to make a program that allows us to get better at SAT
get 10 people. have them take the SAT after doing our program
yes, all kinds of problems here
compare what they did to the average:
did they do better?
how much better?
is it significant?
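The three questions above can be sketched as a one-tailed z-test in R. The ten scores are made up, and the sketch assumes the SAT mean and standard deviation from the notes apply to these test takers.

```r
# Hypothetical scores for 10 people after the program
scores <- c(1100, 980, 1250, 1040, 1170, 900, 1320, 1090, 1010, 1140)

pop_mean <- 1000   # SAT mean from the notes
pop_sd   <- 200    # SAT standard deviation from the notes

# Did they do better, and by how much?
improvement <- mean(scores) - pop_mean

# Is it significant? One-tailed z-test of the sample mean
z <- improvement / (pop_sd / sqrt(length(scores)))
p_value <- pnorm(z, lower.tail = FALSE)
c(improvement = improvement, z = z, p_value = p_value)
```

With these made-up scores the group averages 100 points above the mean, yet the one-tailed p-value is still above 0.05: ten people is a small sample.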
What Went Wrong in Michigan?
fivethirtyeight on the issue
Alex dug in and found this pretty awesome piece of information:
Ended calls with anyone who responded "not sure" when asked which party's primary they were voting in.
And this quote from their methodology:
Federal law only permits us to call land lines using automated polling. Because likely Primary voters are older, 52% are 60 or older and 76% are older than 50, we believe there are sufficient land line voters to get an accurate sample. We do not have to make any assumptions of likely voter turnout.
redditor /u/_supernovasky_:
I'm going to break rank as a stats guy here: The polls were tremendously wrong. My model will end up reflecting this result and look poorly on many polls. Demographics > polls is what this race tells me, and this has been an ongoing story of the race for Clinton AND for Sanders.
Kent County is what flipped this entire race for Sanders. Kent county is highly non-white. Black and Hispanic votes are big here. It was hit very hard by the trade deals and lost SO many manufacturing jobs over time. I said that if Sanders criticism on trade deals was compelling, Kent County would show it. She was favored to win by 9%. She is losing by 30% here. I would say they were compelling.
Probability (Naked Stats)
Standard Deviations
coin flipping experiment
four choices
six choices
Big Ideas:
things happen by chance all the time. our job is to try and figure out--is it by chance or by something else?
gambler's fallacy swings both ways, but only exists because we KNOW the true outcomes.
what happens with a die that's weighted? How do we know that a die is weighted?
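# a fair die: each face appears once
real <- c(1, 2, 3, 4, 5, 6)
sample(real, 10, replace = TRUE)
# a weighted die: 2 through 5 are twice as likely as 1, and 6 is three times as likely
weight <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6)
sample(weight, 10, replace = TRUE)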
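One way to approach "how do we know that a die is weighted?" is a chi-squared goodness-of-fit test against six equally likely faces. This is a sketch under that assumption; roll_counts() is a hypothetical helper and 600 rolls is an arbitrary choice.

```r
set.seed(42)  # arbitrary seed so the run is repeatable

real   <- c(1, 2, 3, 4, 5, 6)
weight <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6)

# Count how often each face comes up in n rolls
roll_counts <- function(die, n) {
  table(factor(sample(die, n, replace = TRUE), levels = 1:6))
}

roll_counts(real, 10)     # ten rolls: hard to tell the dice apart
roll_counts(weight, 10)

# With many rolls, test the counts against equally likely faces
fair_600   <- roll_counts(real, 600)
loaded_600 <- roll_counts(weight, 600)
chisq.test(fair_600)$p.value
chisq.test(loaded_600)$p.value
```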
HW:
take the State Data (found on this class day link) and make something interesting that fits what we know about making good graphs
Naked Statistics, Chapter 5 1/2: The Monty Hall Problem (which we will discuss in class)
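For anyone who wants to test the Monty Hall result before we discuss it, here is a minimal simulation sketch in R. It assumes the standard rules: three doors, one prize, and a host who always opens a door that is neither your choice nor the prize. The function name play() is hypothetical.

```r
set.seed(7)  # arbitrary seed so the run is repeatable

play <- function(switch_door) {
  doors  <- 1:3
  prize  <- sample(doors, 1)
  choice <- sample(doors, 1)
  # Host opens a door that is neither the player's choice nor the prize
  can_open <- setdiff(doors, c(choice, prize))
  opened   <- can_open[sample(length(can_open), 1)]
  if (switch_door) {
    choice <- setdiff(doors, c(choice, opened))
  }
  choice == prize
}

switch_rate <- mean(replicate(10000, play(switch_door = TRUE)))
stay_rate   <- mean(replicate(10000, play(switch_door = FALSE)))
c(switch = switch_rate, stay = stay_rate)   # close to 2/3 and 1/3
```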
Argument over the data we collected
overall goal -- what tips and tricks allow us to change the views of others using only the numbers
what are the differences between percentages and actual numbers in terms of large amounts
what can we do with graphs that can change our views of the way that these things work?
When asking a question, how can we frame it to have a useful discussion?
Idea of mean and standard deviation
use R to visualize it a bit.
let's get some examples and some normalized scores
let's talk about what a normal curve looks like
let's talk percentages in those normal curves
correlation and causation?
this may come up, it may not
HW:
Naked Statistics, Chapter 5: Basic Probability
WSJ: Chapter 3: Ready Reference.
familiarize yourself with the information within. make sure you can go back to this if you have a question with some stats numbers later.
Standard Deviations: Introduction and Chapter 1: Patterns, Patterns, Patterns
Class Three: Preparing Ourselves for Real World Data (03.04)
Goals:
Discuss the pie chart
Talk a bit more about univariate data, specifically why the sd matters as much as the mean
68-95-99.7 as a rule
100% of the data underneath the curve
where does it all fit?
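The 68-95-99.7 rule can be checked directly from the normal curve in R. A small sketch; within_sd() is an illustrative helper name.

```r
# Share of a normal curve within k standard deviations of the mean
within_sd <- function(k) pnorm(k) - pnorm(-k)
round(100 * within_sd(1:3), 1)   # 68.3 95.4 99.7

# The whole curve holds 100% of the data: the total area is 1
integrate(dnorm, -Inf, Inf)$value
```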
Pets data: we can see the trend, but how do we explain it?
y = mx + b
differences in the line
Manatee Data
time to boats
time to manatee deaths
boats to manatee deaths
Data to be released tomorrow:
The employment situation: new one should be available 0830, March 4, 2016
find your preferred place to talk about spin on this data
most news sites: CNN, Fox News, MSNBC, Politico
some right leaning sites: Breitbart, townhall.com
some left leaning sites: dailykos.com alternet.org
Your goal: be able to talk to EITHER side of the argument;
the economy got better over the past month;
the economy has gotten worse over the past month
you can use other data, but it should tie back to this.
be on the lookout for people using super slimy ways of doing this.
Homework:
Naked Stats: Chapter 4, Correlation
Work with Data Released on Unemployment Data
Class Two: Describing Our Univariate Data (02.29)
Goals:
Create List of Stats Terms Used in Our Readings
Evaluate Our Two Different Graphs
standard deviation, IQR
What happens as data changes over time
mean and sd -- when to use, when to think
median and IQR -- when to use and when to think
Histogram vs. Bar Plot
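A small sketch of why the choice between mean/sd and median/IQR matters. The salaries are made up (in thousands of dollars), and summarize() is a hypothetical helper. Adding one very high earner moves the mean and sd a lot and the median and IQR very little.

```r
# Hypothetical salaries in thousands of dollars
salaries     <- c(32, 35, 38, 40, 41, 43, 45, 48, 52)
with_outlier <- c(salaries, 400)   # add one very high earner

summarize <- function(x) {
  c(mean = mean(x), sd = sd(x), median = median(x), IQR = IQR(x))
}
round(rbind(original     = summarize(salaries),
            with_outlier = summarize(with_outlier)), 1)
```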
HW:
WSJ: Chapter 2 "Chart Smart"
Naked Statistics: Chapter 3 "Deceptive Description"
Stephen Few: Save the Pies for Dessert
Use the 'pet' data to create some type of graphic for us to look at/use (on class day page)
argument using statistics somewhere in your literature
what are the numbers used
what's the source
what other information can you find on that topic
Class One: Introduction (02.25)
Goals:
What can statistics do for us
How can we Present Statistics
Basics of Univariate Data -- mean, median, mode, normal, skew, spread
Basics of Types of Data -- Quantitative and Categorical
Activities:
Dice
Information
Homework:
WSJ Chapter 1: The Basics
Naked Statistics: Chapters 1 and 2
Using those, and the information collected in class, create a graphic for each of the following:
favorite colors of the class
shoe sizes of class
guess for distance to Jennings
DATA AVAILABLE ON 01. INTRO PAGE
---
Class Three: Preparing Ourselves for Real World Data (03.04)
Goals:
lines of best fit
extrapolation and the dangers thereof
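A minimal sketch of a line of best fit and the danger of extrapolating it, in R. The heights are invented for illustration (a child measured at ages 2 to 10), not class data.

```r
# Hypothetical data: a child's height (cm) at ages 2 to 10
age    <- 2:10
height <- c(86, 95, 102, 109, 115, 121, 127, 133, 138)

fit <- lm(height ~ age)
coef(fit)   # b (intercept) and m (slope) in y = mx + b

# Predicting inside the observed ages is reasonable
predict(fit, data.frame(age = 7))

# Extrapolating far outside them is not: the line keeps climbing
predict(fit, data.frame(age = 40))
```

The line fits the observed ages well, but at age 40 it predicts a height of more than three meters, because nothing in the data tells it that growth stops.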