Question One: Midterms
I was looking at some of my old midterms and was trying to determine if they were connected to how people did in the class as a whole (I certainly would hope so).
The grades in the class and the grades on the midterm are provided in the .csv file.
1) Find the correlation between the sets of data.
2) Do a full analysis of the data, creating appropriate graphs, lines, and anything else that’s necessary.
3) Are there any possible influential points in this data? Create a graph that has all possible influential points a different color than the ones you do not suspect.
4) Remove the possible influential point and get your new correlation and line. If the correlation is better (and the line looks better in our data), you should consider removing the influential point.
5) If you removed the point, get a new line of best fit to answer questions 6, 7, and 8. If you did not choose to remove the point, use your old line of best fit and explain why you think that was the best course of action.
6) If a student had a 70 in my class, what do I expect them to get on the midterm?
7) If a student received an 80 on the midterm, what do I expect the grade to be in the class?
8) Give the residual for the point (93.345, 81).
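For reference, the correlation, the line of best fit, and a residual can be computed along these lines. This is only a minimal sketch using the Python standard library, not the required method. The function names are hypothetical, and which column plays x and which plays y is an assumption you must make yourself, since the .csv layout is not shown here.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x


def residual(x, y, slope, intercept):
    """Observed y minus predicted y for the point (x, y)."""
    return y - predict(x, slope, intercept)


def drop_point(xs, ys, index):
    """Copies of xs and ys with one suspected influential point removed."""
    keep = [i for i in range(len(xs)) if i != index]
    return [xs[i] for i in keep], [ys[i] for i in keep]
```

Calling pearson_r and least_squares_line once on the full data and once on the output of drop_point gives the before-and-after comparison that parts 4 and 5 ask about.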
Question Two: Hot Diggety Dog!
Below is a list of three different types of hot dogs: meat, beef, and poultry. Your goal is to use the appropriate methods and examine whether the average hot dog of any of the types has a different calorie count compared to any of the other types.
1. Check each data set for outliers using appropriate methods and remove values you see fit. Explain why you did so.
2. Compare the mean, standard deviation, and any other information to describe the spread of calorie count in each type of hot dog.
3. Create a graph that shows the differences between dog types.
HOTDOGS! (calories per dog)
Beef
186 181 176 149 184 190 158 139 175 148 152 111 141 153 190 157 131 149 135 132
Meat
173 191 182 190 172 147 146 139 175 136 179 153 107 195 135 140 138
Poultry
129 132 102 106 94 102 87 99 170 113 135 142 86 143 152 146
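One possible way to start on parts 1 and 2 is sketched below. It assumes the 1.5 × IQR rule as the outlier check, which is only one of several appropriate methods, and it uses the default quartile method of Python's statistics.quantiles; other software may compute quartiles slightly differently. The helper names are hypothetical.

```python
from statistics import mean, quantiles, stdev

# Calories per dog, copied from the lists above.
calories = {
    "Beef": [186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
             152, 111, 141, 153, 190, 157, 131, 149, 135, 132],
    "Meat": [173, 191, 182, 190, 172, 147, 146, 139, 175, 136,
             179, 153, 107, 195, 135, 140, 138],
    "Poultry": [129, 132, 102, 106, 94, 102, 87, 99, 170, 113,
                135, 142, 86, 143, 152, 146],
}


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def summarize(values):
    """Sample size, mean, sample standard deviation, and five-number summary."""
    q1, med, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "min": min(values),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(values),
    }


for kind, values in calories.items():
    print(kind, summarize(values), "flagged:", iqr_outliers(values))
```

Whatever the rule flags, the decision to remove a value still needs the written explanation that part 1 asks for.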
Question Three: z-scores <--if you did your homework, you already did this...
A set of data (call it DATA A) is N(100, 10).
1) Give the z-score if a data point is at 131.
2) Give the data point value if the z-score is -2.15.
3) Give the data point value if 22% of the data is lower than our data point.
4) Give the data point value if 13% of the data is higher than our data point.
5) Between what values is the middle 50% of our data?
6) What percent of the data is between 70 and 107?
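The kinds of calculation in this list can be checked with a short script like the following. It is a sketch that assumes N(100, 10) means a mean of 100 and a standard deviation of 10 (not a variance of 10); the helper names are hypothetical, and the values in the closing note are illustrative rather than the ones asked about.

```python
from statistics import NormalDist

# Assumption: N(100, 10) is read as mean 100, standard deviation 10.
data_a = NormalDist(mu=100, sigma=10)


def z_score(x, dist):
    """How many standard deviations x lies above the mean."""
    return (x - dist.mean) / dist.stdev


def value_from_z(z, dist):
    """Data value that corresponds to a given z-score."""
    return dist.mean + z * dist.stdev


def value_with_proportion_below(p, dist):
    """Data value that has proportion p of the data below it."""
    return dist.inv_cdf(p)


def value_with_proportion_above(p, dist):
    """Data value that has proportion p of the data above it."""
    return dist.inv_cdf(1 - p)


def middle_interval(p, dist):
    """Lower and upper bounds of the central proportion p of the data."""
    tail = (1 - p) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def proportion_between(low, high, dist):
    """Proportion of the data lying between low and high."""
    return dist.cdf(high) - dist.cdf(low)
```

As a sanity check with illustrative values, z_score(110, data_a) is 1.0 and proportion_between(90, 110, data_a) is about 0.68, matching the usual one-standard-deviation rule of thumb.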
A second set of data (DATA B) is N(95, 15).
1) What score is better: 115 in DATA A or 120 in DATA B?
2) A data point has value 119 in DATA A. What value is equivalent in DATA B?
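Comparing scores across the two data sets comes down to putting both on the z-score scale. A minimal sketch, again assuming the second number in N(·, ·) is a standard deviation and using hypothetical helper names:

```python
from statistics import NormalDist

# Assumption: the second parameter is a standard deviation in both cases.
data_a = NormalDist(mu=100, sigma=10)
data_b = NormalDist(mu=95, sigma=15)


def z_score(x, dist):
    """How many standard deviations x lies above the mean."""
    return (x - dist.mean) / dist.stdev


def equivalent_value(x, source, target):
    """Value in target that has the same z-score as x has in source."""
    return target.mean + z_score(x, source) * target.stdev


def better_score(x_a, x_b):
    """Name of the data set in which the score sits higher relative to its peers."""
    z_a, z_b = z_score(x_a, data_a), z_score(x_b, data_b)
    if z_a == z_b:
        return "tie"
    return "DATA A" if z_a > z_b else "DATA B"
```

For instance, a value one standard deviation above the mean in DATA A (110) corresponds to 95 + 15 = 110 in DATA B, so equivalent_value(110, data_a, data_b) returns 110.0.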