This assignment is due at midnight on November 5. Again, you may work with others, but make sure to turn in something you can call your own. Give appropriate code, information, etc. for each question. Good luck, and if you have questions, please let me know.
Question One:
Your final project is going to consist of several pieces:
· Collecting your data, whether through experiment or through other sources.
· Using statistical methods to analyze that data for a specific goal that you are interested in.
· Creating a visual for your data that will allow others to see the information you have worked on/are collecting.
This project will focus on data that you collect either through an experiment or through a survey. For now, these are the questions that we need answered:
a) What is the data you would like to collect? What is your hypothesis about this information?
b) What is your plan to collect this data? How long will this take you?
c) Make a list of things that you will need to be aware of when collecting your data (for example, avoid sampling in front of the dining hall when talking about food on campus).
d) Once you have your data, how do you plan on presenting it to the world?
Question Two:
On page 188, Table 7.1 compares the brain response of a monkey listening to a "pure tone" with its response when listening to a monkey call. It is suggested that a monkey's response to a monkey call should be higher (stronger) than its response to a "pure tone".
a) Create a 95% confidence interval for both the tone data and the call data and compare them. As we do not have the population standard deviation for either, use the sample standard deviation instead. Make sure to check for outliers in both sets of data, etc.
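For part a, the interval built from a sample standard deviation has the form mean ± t* · s / sqrt(n). The following is only a minimal Python sketch of that arithmetic, not a required method. The function name `t_interval` and the sample values are made up for illustration (they are NOT the Table 7.1 data), and the critical value t* is assumed to be looked up in a t table for n - 1 degrees of freedom.

```python
import math
import statistics


def t_interval(values, t_star):
    """Return (low, high) for mean +/- t_star * s / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample sd (n - 1 in the denominator)
    margin = t_star * s / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical placeholder values, NOT the data from Table 7.1.
placeholder = [10.0, 12.0, 14.0, 16.0, 18.0]

# t* = 2.776 is the 95% critical value for df = 4 (n = 5), read from a t table.
low, high = t_interval(placeholder, t_star=2.776)
print(round(low, 2), round(high, 2))
```

With the real data, t* changes with the sample size, so it needs to be looked up again for the correct degrees of freedom.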
b) It is suggested that a high response in the monkey call should correlate with a high response in the pure tone. Create a graph and run the appropriate tests to either confirm or deny that suspicion.
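For part b, one possible numerical companion to the graph is the sample correlation r and its test statistic t = r · sqrt((n - 2) / (1 - r²)), which is compared against a t table with n - 2 degrees of freedom. The sketch below is an assumption about which test is "appropriate", not a statement of the required one. The function name `pearson_r` and the paired values are hypothetical and are NOT the Table 7.1 data.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired responses, NOT the data from Table 7.1.
tone = [1.0, 2.0, 3.0, 4.0, 5.0]
call = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(tone, call)
n = len(tone)
# Test statistic for H0: no linear correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
print(round(r, 3), round(t_stat, 3))
```

A scatterplot of the paired values is still needed to check that a straight-line relationship is a reasonable description before trusting r.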
Question Three:
A common piece of advice for the SAT is: "if you cannot eliminate an answer, it is better to leave a question blank than to guess". The reasoning is:
If you leave a question blank, you get a score of 0 on it.
If you get a question wrong, you get a score of -0.25.
If you get a question right, you get a score of 1.
There are 5 possible responses to each multiple choice question, only one of which is right.
My question to you:
a) If you were to guess on 20 multiple choice questions in a row, what are your possible scores, and what are the chances of each of those outcomes?
b) Based upon this information, do you think it would be better to leave those questions blank or to guess on all 20? Explain your reasoning using the information you got from part a as well as any other information you would like to pull in. Feel free to make graphs to help you out as well.
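One way to organize part a is to treat the number of correct guesses as binomial with 20 trials and success probability 1/5. This assumes every guess is independent and uniformly random over the five responses, which the question implies but does not state outright. The sketch below only tabulates the outcomes; the names `score`, `probability`, and `table` are illustrative, and interpreting the table is left to part b.

```python
from math import comb

N_QUESTIONS = 20
P_RIGHT = 1 / 5  # assumption: pure guessing among 5 responses


def score(k):
    """Total score with k right answers and the rest wrong."""
    return k * 1 - (N_QUESTIONS - k) * 0.25


def probability(k):
    """Binomial probability of exactly k right answers."""
    return comb(N_QUESTIONS, k) * P_RIGHT ** k * (1 - P_RIGHT) ** (N_QUESTIONS - k)


table = [(k, score(k), probability(k)) for k in range(N_QUESTIONS + 1)]
for k, s, p in table:
    print(f"{k:2d} right  score {s:6.2f}  probability {p:.6f}")

# Long-run average score per 20 guessed questions.
expected_score = sum(s * p for _, s, p in table)
print("expected score:", round(expected_score, 6))
```

The printed table can also be turned into a bar graph of probability against score if a visual helps with the explanation.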
Question Four:
If a random number generator is truly random and is sampling uniformly from 0 to 100, its output should tend towards a mean of 50 and a population sd of 28.87.
Use R or another random number generator to create a list of 50 random numbers. (Feel free to use one from the internet that you think may or may not have a claim to being truly random.)
Use those numbers to determine whether or not you think the generator is truly random. Make sure to follow our hypothesis testing order and be sure to explain each step as you do it.
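As a rough illustration of the mechanics, the sketch below draws 50 numbers with Python's `random` module (standing in for R or an online generator) and runs a two-sided z test of the mean against 50, using the stated population sd of 28.87. The choice of a z test, the fixed seed, and the variable names are assumptions made for this example. The test only checks the mean, so it cannot by itself prove that a generator is random.

```python
import math
import random
from statistics import NormalDist, mean

MU_0 = 50.0     # hypothesized mean for uniform sampling on 0 to 100
SIGMA = 28.87   # population sd stated in the question
N = 50

random.seed(0)  # fixed seed so the example is repeatable; drop it for fresh draws
sample = [random.uniform(0, 100) for _ in range(N)]

x_bar = mean(sample)
# z statistic for H0: mu = 50 versus Ha: mu != 50, with sigma known.
z = (x_bar - MU_0) / (SIGMA / math.sqrt(N))
# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print("sample mean:", round(x_bar, 3))
print("z:", round(z, 3))
print("p-value:", round(p_value, 4))
```

The write-up should still state the hypotheses, check the sample for outliers and shape, and explain what the p-value does and does not say about the generator.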
Question Five:
In the book, complete questions 16.44 and 16.46. You may ignore where it says "follow the four step process for confidence intervals". Instead, answer the questions using the process and methods we have been using (check for outliers, find the shape, use the formulas, etc.).