Question 1:
Your job, in the groups I've assigned, is to select seven questions from the surveys you all created, polish them until they are perfect, and create a one-page survey. The survey should focus on this class, specifically on whether learning is happening, whether the division of class into labs and lectures is useful, and how to improve the class for the future.
You may use questions that are not on your surveys, but these will get you started.
Once the survey's questions and typography are perfected, print a copy. We will take these a little later in class, and you will use them to collect results.
Question 2:
Below is a .csv file for all MLB batters who had at least 200 plate appearances in 2009. Your goal is to do the following:
Select one set (column) of data that appears normal when considering all of the players. Give a summary, run all necessary tests for outliers, and make a good-looking graph that you would want to show other people.
Select one set (column) of data that is skewed in some way. Give a summary, run all necessary tests for outliers, and make a good-looking graph that you would want to show other people.
If there were any outliers, identify the person who represents that number using the .csv file.
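If you are not sure where to start, here is a rough sketch in R of a summary, a 1.5 × IQR outlier check, and a histogram. The file name batting2009.csv and the column names Name and HR are assumptions made for illustration, and the small data frame is made-up stand-in data so the sketch runs on its own. Swap in the real file and whichever column you chose.

```r
# Stand-in data, made up for illustration only. With the real file you would
# instead use something like: batters <- read.csv("batting2009.csv")
# (the file name and the column names Name and HR are assumptions).
batters <- data.frame(
  Name = paste("Player", LETTERS[1:10]),
  HR   = c(5, 8, 10, 11, 12, 13, 14, 15, 18, 47)
)

# Minimum, quartiles, median, mean, and maximum
summary(batters$HR)

# 1.5 * IQR rule: values beyond either fence are flagged as possible outliers
q      <- quantile(batters$HR, c(0.25, 0.75))
fence  <- 1.5 * IQR(batters$HR)
is_out <- batters$HR < q[1] - fence | batters$HR > q[2] + fence

# Look up which rows (players) the flagged values belong to
outliers <- batters[is_out, ]
outliers

# A labeled histogram
hist(batters$HR, main = "Home runs (stand-in data)",
     xlab = "Home runs", col = "gray")
```

The 1.5 × IQR rule is only one common convention for flagging outliers; use whichever tests we covered in class.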
Question 3:
Select two players at random from the data from question two.
Explain how you randomly selected these two baseball players.
To get a random selection in R, try: sample(data, 1). This selects one item at random from whatever list or set of numbers you have in data.
Using the two sets of data you used above, give me both the z-score and the p-value for each of these two players in the two categories. Show me your work.
Comparing these two players you selected at random, convince me using the numbers available which player is better. You may use all of the data from those players or just those that you used in question 2. This part is fairly open-ended. If you do not know baseball, find someone who does.
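One way the random pick and the z-score arithmetic could look in R is sketched below. The data frame is made-up stand-in data, and the column names Name and HR are assumptions. The sketch also assumes the p-value asked for is the lower-tail probability from the normal curve, which only makes sense for a column that looks roughly normal.

```r
# Stand-in data, made up for illustration; Name and HR are assumed column names
batters <- data.frame(
  Name = paste("Player", LETTERS[1:10]),
  HR   = c(5, 8, 10, 11, 12, 13, 14, 15, 18, 21)
)

set.seed(1)  # only so the draw can be repeated; leave out for a fresh pick

# Draw two different row numbers at random, then pull those players
picked_rows <- sample(nrow(batters), 2)
picked      <- batters[picked_rows, ]

# z-score: how many standard deviations each player sits from the column mean
z <- (picked$HR - mean(batters$HR)) / sd(batters$HR)

# Lower-tail probability under the normal model for each z-score
p <- pnorm(z)

data.frame(Name = picked$Name, HR = picked$HR, z = z, p = p)
```

Sampling row numbers rather than values keeps each player's whole row together, so the same two players can be looked up in every column.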
Question 4:
Is there anything you would change about your surveys now that you have given them and received the data back?
Run an analysis on the questions and the results you have gotten. This is open ended for a reason. Use everything you know to generate your results. Check the extension for what you will be using this analysis for.
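As one possible first step for the analysis, the sketch below tabulates the answers to a single question in R. The responses are made up, and the 1 to 5 agreement scale is an assumption; your own questions may need a different treatment.

```r
# Made-up responses to one hypothetical question on a 1-5 agreement scale
responses <- c(4, 5, 3, 4, 2, 5, 4, 4, 3, 5, 1, 4)

# Count each answer, keeping all five levels even if one was never chosen
counts <- table(factor(responses, levels = 1:5))
counts

# Convert the counts to proportions
props <- prop.table(counts)
round(props, 2)

# Two measures of the center of the responses
median(responses)
mean(responses)

# A simple bar chart of the counts
barplot(counts, main = "Responses to one question (made-up data)",
        xlab = "Rating", ylab = "Number of responses")
```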
Question 5:
I went through and took each of the boxes we used in class Monday and turned them into a list in R for you:
boxes<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,12,4,5,12,8,4,4,16,4,5,10,4,5,4,10,16,16,8,6,4,5,9,10,3,12,4, 10,12,10,6,16,16,8,4,5,18,4,3,9,12,16,3,6,8,4,2,5,18,4,12,4,12,8,3,16,5,9,6,10,3,18,8,10,16,6,15,8,4, 18,10,4,2,5,8,16,6,9,12,4,9,18,8,8,8)
We want to be able to take a sample of these boxes, like we did, but we want to use it in an easier way:
sample(boxes,10)
This will give you a random sampling of 10 boxes from the data set above.
mean(sample(boxes,10)). Describe what this code does.
I want to collect a lot of different means and compare them. In order to do this, I wrote a little bit of code for you:
samplebox<-c()
for(i in 1:1000){samplebox<-c(samplebox,mean(sample(boxes,10)))}
samplebox
This introduces a new type of code: a for loop. It runs whatever is inside the braces once for each value listed in the parentheses. In other words, this for loop will find the mean of 10 boxes, put that into the new list, and then do that again and again, 1000 times.
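If the loop still feels mysterious, a smaller example may help. This sketch uses made-up names (squares, rolls) and follows the same pattern: start with an empty list and add one value each time through. The second half shows replicate(), a built-in R function that does the same "repeat and collect" job in one line.

```r
# Start with an empty vector, just as samplebox starts empty
squares <- c()

# The body in braces runs once for each value of i in 1:5
for (i in 1:5) {
  squares <- c(squares, i^2)  # add the new value to the end
}

squares  # 1 4 9 16 25

# replicate() runs an expression n times and collects the results
set.seed(42)  # only so the result can be repeated
rolls <- replicate(1000, mean(sample(1:6, 2)))
length(rolls)  # 1000
```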
Find the summary for the list samplebox. What is this a summary of, and what does it tell us?
Create a histogram for samplebox, and describe the shape.
Look for any outliers using appropriate methods.
Compare the following:
mean(boxes)
mean(samplebox)
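The commands for these steps can be sketched as follows. To keep the sketch self-contained it uses a small made-up population called pop and samples of size 5. For the assignment, use boxes and samplebox instead. The interpretation is still up to you.

```r
# Stand-in population, made up for illustration; use boxes in the assignment
pop <- c(1, 1, 2, 3, 4, 4, 5, 6, 8, 8, 9, 10, 12, 12, 16, 16, 18, 18, 20, 25)

set.seed(2009)  # only so the result can be repeated
means <- c()
for (i in 1:1000) {
  means <- c(means, mean(sample(pop, 5)))
}

# Summary and histogram of the 1000 sample means
summary(means)
hist(means, main = "Means of 1000 samples of size 5", xlab = "Sample mean")

# Values that boxplot() would draw as separate points
# (more than 1.5 times the box length beyond the box)
flagged <- boxplot.stats(means)$out
flagged

# Centers and spreads side by side
c(population_mean = mean(pop), mean_of_sample_means = mean(means))
c(population_sd = sd(pop), sd_of_sample_means = sd(means))
```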
Extension:
Create an executive summary for the data from your seven questions. In other words, imagine you have a fairly stats-illiterate boss and you want to explain your findings. Simple graphs and outlines of numbers are a great place to start. Give a recommendation for the class based on each of the questions you asked.
All of the information you want to give SHOULD fit onto one page.