Today we begin talking about t-tests.
z-tests and t-tests are VERY similar. The only REAL difference is whether or not we know the standard deviation of the population. With a z-test we know it; with a t-test we have to estimate it from the sample.
In all of the examples we've done so far, we had that information -- we would say something along the lines of "assume that the experiment did not change the standard deviation." Well ... we can't always do that.
Speeding on Seminary Avenue:
I sampled 23 cars driving along Seminary Avenue and collected their speeds as they went by:
29 34 34 38 30 29 38 31 29 34 32 31 27 37 29 26 24 34 36 31 34 36 21
What is the average speed of drivers on that road?
Issues:
I will never know the TRUE MEAN of all people driving Seminary Ave.
I will never know the TRUE SD of all people driving on this road.
... yet I still want to be able to make a statement about their speeds.
I cannot use the z-test ... why, and what changes?
Let's look at the normal curve:
https://www.mathsisfun.com/data/images/normal-distrubution-large.gif
Compared to these:
http://www.battaly.com/stat/geogebra/t_curve/
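New term: df, "degrees of freedom". For a one-sample t.test, degrees of freedom are defined as df = (n - 1), so the 23 cars above give df = 22.
We can do all of the tests we did before, but the tails will be heavier (more area far from the center) when we do the t.test as opposed to the z.test, especially when df is small.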
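One way to see the heavier tails is to compare cutoffs, as a minimal sketch in R. qnorm and qt return the value that leaves 2.5% in the upper tail; the df values below are picked only for illustration (22 matches the 23 cars).

```r
# Cutoff that leaves 2.5% in the upper tail of the normal curve
z_cut <- qnorm(0.975)

# Same cutoff for t curves with different degrees of freedom
# (df values chosen only for illustration; 22 = 23 cars - 1)
dfs <- c(2, 5, 22, 100)
t_cut <- qt(0.975, df = dfs)

# Each t cutoff sits farther out than the z cutoff,
# and the gap shrinks as df grows
print(round(z_cut, 3))
print(data.frame(df = dfs, t_cut = round(t_cut, 3)))
```

A bigger cutoff means a wider interval, which is the price we pay for not knowing the true SD.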
Let's do this one ...
https://www2.palomar.edu/users/rmorrissette/Lectures/Stats/ttests/ttests.htm <-- t chart at the bottom ... scroll all the way down!
... and then let's cheat after. (Find out all the things that R can do with this information ... what can we (and should we) change when doing a t.test?)
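Here is a sketch of "by hand, then cheat" for the Seminary Avenue speeds. It assumes we want a 95% confidence interval for the mean speed (the notes do not fix a level); the variable names are just for illustration.

```r
# Speeds of the 23 cars sampled on Seminary Avenue
speeds <- c(29, 34, 34, 38, 30, 29, 38, 31, 29, 34, 32, 31,
            27, 37, 29, 26, 24, 34, 36, 31, 34, 36, 21)

# By hand: sample mean, sample sd, and df = n - 1
n <- length(speeds)
xbar <- mean(speeds)
s <- sd(speeds)
deg_free <- n - 1

# t cutoff for a 95% interval (assumed level), then the margin of error
t_star <- qt(0.975, df = deg_free)
margin <- t_star * s / sqrt(n)
by_hand <- c(xbar - margin, xbar + margin)

# The cheat: t.test builds the same interval (95% is its default)
by_r <- t.test(speeds)$conf.int

print(by_hand)
print(by_r)
```

The two intervals should agree, which is a good check that the chart lookup was done correctly.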
NEW DATA:
Chips Ahoy! ... the claim: 1000 chips in each 18 ounce package.
1219 1214 1087 1200 1419 1121 1325 1345 1244 1258 1356 1132 1191 1270 1295 1135
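One possible way to set this up in R, as a sketch: treat the 1000 chips as the null value and ask whether the true mean is greater. The one-sided direction is my assumption about how we want to test the claim; the notes do not fix it.

```r
# Chip counts from the 16 packages
chips <- c(1219, 1214, 1087, 1200, 1419, 1121, 1325, 1345,
           1244, 1258, 1356, 1132, 1191, 1270, 1295, 1135)

# H0: true mean = 1000   vs   Ha: true mean > 1000 (assumed direction)
result <- t.test(chips, mu = 1000, alternative = "greater")

# Shows the t statistic, df, p-value and a one-sided confidence bound
print(result)
```

With alternative = "greater", the confidence interval R reports is one-sided (a lower bound only), so leave alternative at its default if you want the usual two-sided interval.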
NEW DATA:
yogurt from handout.
---
NEW DATA:
maze, question 545. see pages sent.
What can t.test do for us in R?
get a confidence interval under t.test methods (uses the sample mean and sample sd)
can do hypothesis testing for us (change mu and alternative).
can get sizes of specific tails (the p-value).
use the help in RStudio (type ?t.test) to figure out what you need to change.
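A short sketch of those options using the Seminary Avenue speeds. The null value of 30 is made up purely for illustration (it is not from the notes), and the 99% level is just an example of changing conf.level.

```r
speeds <- c(29, 34, 34, 38, 30, 29, 38, 31, 29, 34, 32, 31,
            27, 37, 29, 26, 24, 34, 36, 31, 34, 36, 21)

# Hypothetical null value, NOT from the notes
mu0 <- 30

# Confidence interval at a level other than the default 95%
ci99 <- t.test(speeds, conf.level = 0.99)$conf.int

# Hypothesis tests: two-sided is the default, "greater" is upper-tail only
two_sided <- t.test(speeds, mu = mu0)
upper <- t.test(speeds, mu = mu0, alternative = "greater")

# Size of a specific tail, straight from the t curve with pt()
t_stat <- unname(two_sided$statistic)
upper_tail <- pt(t_stat, df = length(speeds) - 1, lower.tail = FALSE)

print(ci99)
print(two_sided$p.value)
print(upper$p.value)
print(upper_tail)
```

The upper-tail area from pt() matches the p-value of the "greater" test, so the p-value really is just the size of a tail on the t curve.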
-----
2 Sample t.test
NEW DATA: Baseball
Because this data comes from two separate leagues, the values are not paired, so we cannot subtract one from the other. We must find each league's mean, sd, etc. and set up a "2 sample t.test",
which R does well.
NEW DATA: Summer School
june=c(54, 49, 68, 66, 62, 62)
sep=c(50, 65, 74, 64, 68, 72)
If we do this as a 2 Sample test, we are going to check whether the true means of the two groups differ ....
but we can do this as a matched pairs test, and see whether or not it makes a DIFFERENCE within each pair (sep minus june).
When the pairs are strongly related, this often gives smaller p-values and a more sensitive test, but it is only valid if it truly is matched pairs data.
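Both versions can be run side by side in R, as a sketch. The paired version assumes that the i-th june value and the i-th sep value belong to the same student; the notes do not say so explicitly.

```r
june <- c(54, 49, 68, 66, 62, 62)
sep <- c(50, 65, 74, 64, 68, 72)

# 2 sample test (R uses the Welch version by default):
# compares the mean of sep with the mean of june
two_sample <- t.test(sep, june)

# Matched pairs test: works on the differences sep - june
# (assumes entry i in each vector is the same student)
paired <- t.test(sep, june, paired = TRUE)

# The paired test is the same as a one-sample t.test on the differences
diffs <- sep - june
one_sample <- t.test(diffs)

print(two_sample$p.value)
print(paired$p.value)
print(one_sample$p.value)
```

For this data the matched pairs p-value comes out smaller than the 2 sample one, and it has df = 5 (six differences minus one).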