Questions from the book:
1.90, 1.91
Question 1:
Below are the survival times of 72 guinea pigs after they were injected with infectious bacteria in a medical experiment.
guinea pigs: 43, 45, 53, 56, 56, 57, 58, 66, 67, 73, 74, 79, 80, 80, 81, 81, 81, 82, 83, 83, 84, 84, 88, 89, 91, 91, 92, 92, 97, 99, 99, 100, 101, 102, 102, 102, 103, 104, 107, 108, 109, 113, 114, 118, 121, 123, 126, 128, 137, 138, 139, 144, 145, 147, 156, 162, 174, 178, 179, 184, 191, 198, 211, 214, 243, 249, 329, 380, 403, 511, 522, 598
a) Get the data into R. Describe the shape of the data. What does it tell us about the infected individuals?
b) Decide whether to check for outliers using the IQR method or the SD method. Explain your steps and the reasoning behind your decision, including any needed graphs.
c) Which points have you decided are outliers that should be removed? Did you find any possible outliers that you chose to keep? Give your reasoning, and include any graphs or computations that you used to arrive at your conclusion.
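To get started, here is a minimal sketch of how the data might be entered and how both outlier checks could be computed. The names survival, fence_low, fence_high, flagged_iqr, flagged_sd and k_sd are my own choices, and the cutoff of 2 standard deviations is an assumption; use whatever cutoff we set in class. Deciding which method fits this data, and what to do with the flagged points, is still up to you.

```r
# Survival times of the 72 guinea pigs, copied from the list above.
survival <- c(43, 45, 53, 56, 56, 57, 58, 66, 67, 73, 74, 79, 80, 80, 81, 81,
              81, 82, 83, 83, 84, 84, 88, 89, 91, 91, 92, 92, 97, 99, 99, 100,
              101, 102, 102, 102, 103, 104, 107, 108, 109, 113, 114, 118, 121,
              123, 126, 128, 137, 138, 139, 144, 145, 147, 156, 162, 174, 178,
              179, 184, 191, 198, 211, 214, 243, 249, 329, 380, 403, 511, 522,
              598)
stopifnot(length(survival) == 72)  # guard against typing errors

# Shape: a histogram, plus a summary to compare the mean and the median.
hist(survival, main = "Guinea pig survival times", xlab = "Survival time")
summary(survival)

# IQR method: flag points more than 1.5 * IQR beyond the quartiles.
quartiles  <- quantile(survival, c(0.25, 0.75))
fence_low  <- quartiles[[1]] - 1.5 * IQR(survival)
fence_high <- quartiles[[2]] + 1.5 * IQR(survival)
flagged_iqr <- survival[survival < fence_low | survival > fence_high]

# SD method: flag points more than k_sd standard deviations from the mean.
k_sd <- 2  # assumed cutoff, not taken from the assignment
flagged_sd <- survival[abs(survival - mean(survival)) > k_sd * sd(survival)]

flagged_iqr
flagged_sd
```

Keep in mind that quantile() interpolates between data values by default, so the quartiles it reports may differ slightly from quartiles computed by hand.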
Question 2:
The data for this question can be found on page 49, question 1.86.
a) Search each of the data sets for possible outliers, using the methods we have discussed. Show your steps in the write-up so that I can follow them.
b) Compare and contrast the data sets.
c) Create some type of graphic that allows us to compare all of the data sets at once.
Consider the following:
boxplot(redflower, at=0.2, boxwex=0.2, col="red")
boxplot(yellowflower, at=0.4, boxwex=0.2, col="yellow", add=TRUE)
That's not the only way to do it, but it gives you a few more commands to play with.
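As one alternative, the sketch below passes all of the groups to boxplot() as a named list, so that R places the boxes side by side on a shared axis without at= or add=TRUE. The values here are simulated placeholders so that the code runs on its own; they are not the data from the book, and the name bp is my own choice.

```r
# Simulated placeholder values, NOT the data from the book.
set.seed(1)
redflower    <- rnorm(20, mean = 40, sd = 2)
yellowflower <- rnorm(20, mean = 36, sd = 1)

# A named list gives one labeled box per group on a shared axis.
bp <- boxplot(list(red = redflower, yellow = yellowflower),
              col = c("red", "yellow"),
              ylab = "Measurement")

# Points drawn beyond the whiskers (empty if there are none).
bp$out
```

By default the whiskers extend at most 1.5 times the IQR from the box, so any points plotted individually are the ones the IQR method would flag.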
Question 3:
Collect one piece of (appropriate) quantitative data from each of your classmates in the room. If you are not in the classroom, use 25 people around you.
a) Give me the data set as you typed it into R.
b) Were there any problems with the collection of your data? Any surprises in the data collected? Is there any reason to believe that the data you collected is not accurate?
c) Using the appropriate method, check your data for influential points. If you believe any of them are outliers, remove them from your data set.
d) Create some type of appropriate visual that will allow us to see your data.