Mid-term questions:
All of these questions should be finished by Friday, April 20, and completed on your own. Show your work for every question. If you use R for a question, be sure to show what you entered, what graphs you created, and the overall information you are trying to show.
Question One:
Below are calorie counts for three different types of hot dogs: beef, meat, and poultry. Your goal is to use the appropriate methods to examine whether the average calorie count for any one type differs from that of any other type. Give appropriate values, check for outliers, and verify any conditions your method requires.
HOTDOGS! (calories per dog)
Beef
186 181 176 149 184 190 158 139 175 148 152 111 141 153 190 157 131 149 135 132
Meat
173 191 182 190 172 147 146 139 175 136 179 153 107 195 135 140 138
Poultry
129 132 102 106 94 102 87 99 170 113 135 142 86 143 152 146
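If you choose to work in R, the data above could be entered and explored along the following lines. This is only a sketch: the variable names (beef, meat, poultry, hotdogs, fit) are made up for illustration, and the use of a one-way ANOVA is an assumption about which method fits, not a requirement of the question. You still need to justify your method and interpret the output yourself.

```r
# Calorie counts per hot dog, copied from the lists above
beef    <- c(186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
             152, 111, 141, 153, 190, 157, 131, 149, 135, 132)
meat    <- c(173, 191, 182, 190, 172, 147, 146, 139, 175, 136,
             179, 153, 107, 195, 135, 140, 138)
poultry <- c(129, 132, 102, 106, 94, 102, 87, 99, 170, 113,
             135, 142, 86, 143, 152, 146)

# Stack the three samples into one data frame with a grouping factor
hotdogs <- data.frame(
  calories = c(beef, meat, poultry),
  type = factor(rep(c("Beef", "Meat", "Poultry"),
                    times = c(length(beef), length(meat), length(poultry))))
)

# Five-number summary plus the mean for each type
type_summary <- tapply(hotdogs$calories, hotdogs$type, summary)
print(type_summary)

# Side-by-side boxplots; points drawn beyond the whiskers are
# candidates for outliers under the 1.5 * IQR rule
boxplot(calories ~ type, data = hotdogs, ylab = "Calories per hot dog")

# One-way ANOVA comparing the three group means
# (an assumed choice of method; check its conditions before relying on it)
fit <- aov(calories ~ type, data = hotdogs)
print(summary(fit))
```

Keeping all three samples in one data frame with a type column lets the same formula, calories ~ type, drive both the boxplot and the analysis.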
Question Two:
For each of the following, give an example and explain your choice. This may be from a set of data we have used or a more abstract idea or data set you have seen or used. Each example should be different for each piece (in other words, don't repeat).
a. A situation where you would want to use mean and standard deviation.
b. A situation where the five-number summary would be a good description of the data.
c. A situation where looking at the mean and standard deviation would hide the true nature of the pattern or data.
d. A situation where the use of a normal quantile plot would illuminate a situation and help us make better choices as to outliers or data decisions than we would otherwise.
Question Three:
Find Table 1.5 on page 76 (you can read about the data in question 1.167). The CD in the back of your book also has the data in Excel form if you don't want to type it all in. The book states “…they found the length of the cycle is not always 24 hours.” While this may be the case, can we claim that the population mean of the biological clock is different from 24 hours?
a. Set up a null hypothesis, an alternative hypothesis, and a significance level.
b. Run a statistical analysis on the biological clock data. Remove outliers (if any), and say which values you removed and why. Make sure to make appropriate graphs. Explain what this analysis tells you about biological clocks for this plant.
c. Write a brief paragraph explaining what we can claim based upon the data and our hypotheses.
d. What do you think might cause variation in the different plant biological clocks?
Question Four:
A group of radon detectors is exposed to a radiation level of 105 picocuries. Here are the readings given by the radon detectors:
91.9 97.8 111.4 122.3 105.4 95.0 103.8 99.6 96.6 119.3 104.8 101.7
Based on this sample, do we believe that the radon detectors are working accurately? Do an analysis of the experiment and the data, and turn in a concise, clear, and complete report. In this case, be sure to include a 95% confidence interval for the mean reading the radon detectors will give when exposed to 105 picocuries.
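For those working in R, one possible starting point for the radon readings is sketched below. The names readings, true_level, result, and ci are hypothetical, and the sketch assumes a one-sample t procedure is appropriate, which you should check (for example with the normal quantile plot) rather than take for granted. The report and the interpretation are still yours to write.

```r
# Detector readings, copied from the list above
readings <- c(91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
              103.8, 99.6, 96.6, 119.3, 104.8, 101.7)
true_level <- 105  # picocuries, the known exposure level

# Normal quantile plot: with only 12 readings, look for strong skew
# or outliers before trusting a t procedure
qqnorm(readings)
qqline(readings)

# Two-sided one-sample t test of H0: mean reading = 105.
# The same call also reports a 95% confidence interval for the mean reading.
result <- t.test(readings, mu = true_level, conf.level = 0.95)
print(result)

# Pull out the interval endpoints for use in the report
ci <- result$conf.int
print(ci)
```

Note that the interval from t.test() is for the mean reading of detectors like these, not for the reading of a single detector.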
Question Five:
Your final project is going to consist of several pieces:
· Collecting your data, whether through experiment or through other sources.
· Using statistical methods to analyze that data for a specific goal that you are interested in.
· Creating a visual for your data that will allow others to see the information you have worked on or are collecting.
This project is what you are going to be working on in labs for the final three labs (basically the month of May). This means that I’m expecting a good amount of work for it. As such, we need to outline what it is we are going to be doing.
a) Create a goal and hypothesis for your project.
b) Will you need to do an experiment? If so, outline the procedure and the precautions you will need to take.
c) If no to part b, where are you going to get your data? Be specific about what you can get and how. This is sometimes the hardest part of the entire project.
d) Once you have your data, what analyses are you going to run? I know we haven’t talked about bivariate data (one set of data affecting the other). But you know that it is possible and we will be talking about it starting 4/17/2012.
e) Once you have run your analysis and compared it to your goal and your hypothesis, how are you going to show it to other people? What is it that you are going to make that is going to show your data to others and make them interested in it?
Question Six:
Here's the open question:
a. Find a data set from somewhere that can be used to answer a univariate question. Specify where you got the data, how the data was collected, what the relevance of the data is, and any possible bias that might have existed in the collection of the data.
b. Analyze the data in an appropriate manner. This can mean either setting up a null and alternative hypothesis, or developing a confidence interval that is useful for looking at the data or answering a specific question about the data.
c. Having analyzed the data, what conclusions can you reach?
d. Create a graphic of some type that shows the information you analyzed cleanly, clearly, and concisely. Make it something you would want to show others (this also means you should pick a data set that is interesting to others as well; this data on arsenic in toenails, http://lib.stat.cmu.edu/datasets/Arsenic, may not be the best choice =\. Then again . . .).
If you're looking for data sets, I’ve posted several websites at sites.google.com/site/mundtmath/creation-of-statistics. I will be adding more of them as this week continues. If you know of others, please pass them along.