We're going to cover many different things in this data set. Some are old. Some are new. It is your prerogative to figure out which is which and use the appropriate methods.
1) One set of data is roughly N(20, 2). The other set of data is roughly N(20, 5). Compare and contrast these two sets of data.
2) Data is roughly N(12, 4).
Between what two numbers (roughly) is the middle 95% of the data?
The middle 99.7%?
3) What steps should we take to ensure that our data is roughly normal?
4) IQ scores for adults are roughly N(100,15). What percent of adults have an IQ below 80?
5) Attached is the growth of an icicle (in cm) over time (in minutes). Find the line of best fit and the correlation, and tell me how long the icicle would be after 240 minutes.
***You do not need to make fancy graphs for this.***
6) What does the regression plot tell us about bivariate data, and when should we use it?
7) What changes occur in our hypothesis testing when we use an alpha of 0.10 as opposed to 0.05?
***Please make sure to give me more than "one is a 95% confidence and the other is a 90% confidence."***
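For the normal-distribution questions above (2 and 4), one way to check a hand calculation is Python's standard-library `statistics.NormalDist`. This is only a sketch. It assumes that N(a, b) in this problem set means mean a and standard deviation b (not variance), which matches the usual description of IQ scores as having a standard deviation of 15. The variable names are made up for illustration.

```python
from statistics import NormalDist

# Assumption: N(mean, sd) notation, so the second number is the standard deviation.
data_dist = NormalDist(mu=12, sigma=4)

# Empirical (68-95-99.7) rule: roughly 95% of the data lies within 2 sd of the
# mean and roughly 99.7% lies within 3 sd.
middle_95 = (data_dist.mean - 2 * data_dist.stdev, data_dist.mean + 2 * data_dist.stdev)
middle_997 = (data_dist.mean - 3 * data_dist.stdev, data_dist.mean + 3 * data_dist.stdev)
print("middle 95%:", middle_95)
print("middle 99.7%:", middle_997)

# Question 4: share of adults with IQ below 80 under the N(100, 15) model.
iq = NormalDist(mu=100, sigma=15)
z_80 = (80 - iq.mean) / iq.stdev      # z-score of an IQ of 80
pct_below_80 = 100 * iq.cdf(80)       # area to the left of 80, as a percent
print(f"z = {z_80:.2f}, percent below 80 = {pct_below_80:.1f}%")
```

The z-score is printed alongside the percentage so the result can be compared against a standard normal table.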
*****
Longer form questions.
These questions are going to be phrased a little differently than you are used to. Instead of me asking for all of the information specifically, at this point you should be able to tell ME what information is important. You should be able to pick the important pieces, work with them, and give me results back. So don't think that because there aren't 10 parts that there isn't a lot of work to be done.
...there is.
Question Eight: Null Hypotheses Are Forever (unless they are rejected due to proper statistical use)
Nitrogen in Diamonds.
A sample of diamonds is tested for the amount of nitrogen in them. This helps scientists figure out how the diamonds were created. Below is the data; give me a 95% confidence interval for the average nitrogen content. (The numbers below are in parts per million.)
487 1430 60 244 196 274 41 54 473 30 98 41 273 94 69 262 120 302 75 242 115 65 311 61
You are not given the true population standard deviation, so use the sample standard deviation instead.
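Because the population standard deviation is unknown, one common approach is a t-based interval. The sketch below shows the arithmetic using only the standard library. The critical value 2.069 is an assumption read from a t table (95% confidence, 24 - 1 = 23 degrees of freedom), and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

nitrogen_ppm = [487, 1430, 60, 244, 196, 274, 41, 54, 473, 30, 98, 41,
                273, 94, 69, 262, 120, 302, 75, 242, 115, 65, 311, 61]

n = len(nitrogen_ppm)
x_bar = mean(nitrogen_ppm)
s = stdev(nitrogen_ppm)          # sample standard deviation (n - 1 denominator)

# Assumption: critical value taken from a t table for 95% confidence, df = 23.
T_CRIT_95_DF23 = 2.069

margin = T_CRIT_95_DF23 * s / sqrt(n)   # margin of error
ci_low, ci_high = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.1f} ppm, sd = {s:.1f} ppm")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f}) ppm")
```

One value in the data (1430) sits far from the rest, so it is worth looking at a plot of the sample before deciding how much to trust a t interval here.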
Question Nine: Pining for an Answer
Aleppo Pine needles (cm):
10.2 7.2 7.6 9.3 12.1 10.9 9.4 11.3 8.5 8.5 12.8 8.7 9.0 9.0 9.4
Torrey Pine needles (cm):
33.7 21.2 26.8 29.7 21.6 21.7 33.7 32.5 23.1 23.7 30.2 29.0 24.2 25.5 26.6 28.9 29.7
Summarize the data numerically and make a graphic that shows the difference in needle lengths between the two types of tree. If I handed you five needles of the following lengths, which type of tree do you think they would belong to? Explain the choice you make for each.
12.4
6.1
17.3
33
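For the numerical summary, a minimal sketch with the standard library might look like the following. The helper `five_number_summary` is a made-up name for illustration, and `statistics.quantiles` uses its default "exclusive" method, so the quartiles may differ slightly from a textbook or calculator convention.

```python
from statistics import mean, stdev, quantiles

aleppo_cm = [10.2, 7.2, 7.6, 9.3, 12.1, 10.9, 9.4, 11.3, 8.5, 8.5,
             12.8, 8.7, 9.0, 9.0, 9.4]
torrey_cm = [33.7, 21.2, 26.8, 29.7, 21.6, 21.7, 33.7, 32.5, 23.1, 23.7,
             30.2, 29.0, 24.2, 25.5, 26.6, 28.9, 29.7]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4)   # default method is "exclusive"
    return min(values), q1, q2, q3, max(values)

for name, sample in (("Aleppo", aleppo_cm), ("Torrey", torrey_cm)):
    print(name, "n =", len(sample))
    print("  mean = %.2f cm, sd = %.2f cm" % (mean(sample), stdev(sample)))
    print("  five-number summary:", five_number_summary(sample))
```

The five-number summaries are the numbers a side-by-side boxplot would draw, so they can double as a check on the graphic.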
Question Ten: Credit where Credit is Due
A statistics professor at a large university believes that the students take an average of 15 credit hours per term. He samples 24 students in his class. We do not know the sd of the population, so use the sd of the class. Here are the results:
12 13 14 14 15 15 15 16 16 16 16 16 17 17 17 18 18 18 18 19 19 19 20 21
Analyze the data, and see whether or not the professor's claim is reasonable. Include both a hypothesis test and a confidence interval. At the end of the question, include any possible biases or problems with the data collection.
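The arithmetic for a one-sample t test and the matching confidence interval can be sketched as below. This assumes a two-sided test at alpha = 0.05 with H0: mean = 15, and the critical value 2.069 is read from a t table for 23 degrees of freedom. Those choices are assumptions for illustration, not part of the question, and the sketch says nothing about the bias or data-collection part.

```python
from math import sqrt
from statistics import mean, stdev

credit_hours = [12, 13, 14, 14, 15, 15, 15, 16, 16, 16, 16, 16,
                17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 20, 21]

mu_0 = 15                        # professor's claimed average (null hypothesis)
n = len(credit_hours)
x_bar = mean(credit_hours)
s = stdev(credit_hours)          # sample sd stands in for the unknown population sd
se = s / sqrt(n)                 # standard error of the mean

t_stat = (x_bar - mu_0) / se     # test statistic with n - 1 degrees of freedom

# Assumption: two-sided test at alpha = 0.05, critical value from a t table (df = 23).
T_CRIT = 2.069
reject_null = abs(t_stat) > T_CRIT

ci = (x_bar - T_CRIT * se, x_bar + T_CRIT * se)   # matching 95% confidence interval

print(f"mean = {x_bar:.3f}, sd = {s:.3f}, t = {t_stat:.2f}")
print("reject H0 at alpha = 0.05:", reject_null)
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The test and the interval use the same critical value, so they should agree: the test rejects at the 0.05 level exactly when 15 falls outside the 95% interval.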
Question Eleven: "I'll have a Coke!"
Your friend one day says "I can tell the difference between Coke and Pepsi. What's more, I can drink some of either and tell you which is which simply by the taste."
You think your friend is full of it. So you decide to set up an experiment to see whether or not your friend can tell the difference.
Your job is to completely outline, from first piece to last result, an experiment that will test your friend's claim.
Things I need:
Materials List: What are all of the things you will need in order to run this experiment? I will only buy the things on this list and nothing else.
Cautions: What are some problems you want to make sure to watch for?
Possible examples:
making sure the person doesn't end up drinking 40 oz of soda in 20 minutes
making sure the taster can't see the labels as we pour
flat soda
Procedure: This should outline what each person will be doing and how the experiment will go from the first piece to the last piece
Results: What will your results look like when you are finished? (Obviously you will not have the results yet, but you should know exactly what types of responses and results you will get.)
Conclusion: What will be considered a success or a failure?
We will talk about these experiments on Tuesday, but I need your materials list by Monday.
Question 12:
We are curious about Facebook use on Penn State Campus. Here are some numbers collected about two different places to live on campus. Note that these are different groups: a person who counts for once a day is not included in the once a week data. In addition, we were only interested in people who use Facebook, not non-users.
3) What percent of University Park users say they use Facebook at least once a week?
4) What percent of people who use Facebook once a day are on the Commonwealth campus?
5) Find the expected values for each of these cells.
6) Find the chi-squared values for each of these cells. Which one is the largest? What does that mean? Which one is the smallest? What does that mean?
7) Find the sum of the chi-squared values.
8) Is there a difference between Facebook users on one campus as opposed to the other? Explain how you know.