1. Single Question Statistics

      • [3 pts] In groups of three, use Google Forms to create a 10-question survey.  Use at least one of each of the following question types: text response (say anything), text response (respond with a number of something), scale, multiple choice, and true/false.
        • After surveying at least 12 people, discuss the spreadsheet of results with teammates.  Identify each row as a person who took the survey (a subject/individual) and each column as a categorical variable (limited options), a quantitative variable (numbers), or a qualitative variable (such as a text response where everyone answers differently).
        • Look over the Google Forms summary page.  Note how different types of data are displayed.  Scale questions use a bar graph to show the results in order from 1-5 (or whatever numbers your scale uses), multiple choice and true/false questions use a pie chart to show what percent of responses each option received, and open text responses have no graphs.  Note that numerical responses can be turned into graphs, but Google Forms does not yet support this.
        • The more open the response, the larger the variety of possible answers, but the harder it is to summarize the data.  The more closed the response, the easier it is to analyze the data, but the more likely it is that some people will not agree with any of the options or will agree with a couple of options equally.
      • DISCUSSION: 
        • Why different types of questions have different purposes
        • How data is organized in a spreadsheet
        • Why each type of data is visualized with a different graph (and why some can't be visualized)
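
      For groups that want to see the row/column idea outside of a spreadsheet, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the answers are invented, the names `responses`, `column`, and `classify` are hypothetical, and the classification rule is only a rough guess based on the values in a column, not a formal definition.

```python
# Hypothetical survey results: every answer below is invented for illustration.
# Each dict is one ROW (an individual); each key is one COLUMN (a variable).
responses = [
    {"hours_of_sleep": 7, "favorite_season": "summer", "has_pet": "true", "comment": "Fun survey!"},
    {"hours_of_sleep": 6.5, "favorite_season": "fall", "has_pet": "false", "comment": "Too many questions"},
    {"hours_of_sleep": 8, "favorite_season": "summer", "has_pet": "true", "comment": "I liked question 3"},
    {"hours_of_sleep": 7, "favorite_season": "winter", "has_pet": "true", "comment": "No opinion"},
]


def column(rows, variable):
    """Read down one column: one variable's value for every individual."""
    return [row[variable] for row in rows]


def classify(values):
    """Rough guess at the variable type from the values in one column."""
    if all(isinstance(v, (int, float)) for v in values):
        return "quantitative"   # numbers
    if len(set(values)) < len(values):
        return "categorical"    # a limited set of options, so answers repeat
    return "qualitative"        # everyone answered differently


for variable in responses[0]:
    print(variable, "->", classify(column(responses, variable)))
```

      With only a handful of responses the repeat-based guess can be wrong (a multiple choice question where nobody happens to pick the same option would look qualitative), which is one reason to survey at least 12 people.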

      Free Response Prep
            Describe the pros and cons of open-ended survey responses and when this response type should be used.

            Consider the types of answers that people may give to an open response (blank text box) vs. closed response (scale, multiple choice) question.  Are the responses what you expect?  How do you analyze the data that comes in?

            A spreadsheet is a powerful way to organize survey results.  Explain how it is set up using the terms "individual" and "variable" in your response.  Why is this a useful tool?

            A spreadsheet is a two-dimensional array for organizing data.  What goes across in each row?  What goes down each column?  What could you do with survey results collected into a single document like this?

            See the video below for more on individuals and variables and their different forms.  Try the practice problem to make sure you understand it.

            1.  Label each of the following as a categorical variable, quantitative variable, individual, or subject:
            • a. A dog
            • b. Weight
            • c. Number of friends
            • d. Your friend Sam
            • e. Preferred political party
            • f. You

            What kind of data is represented in a bar graph or pie chart?  Give an example.  What distinguishes bar and pie charts from each other (use "relative frequency" in your explanation)?

            See the video below for more on bar and pie charts, their differences, and relative frequency.  Complete the practice problems to solidify understanding.

            2.  A new club has 3 freshmen, 5 sophomores, 11 juniors, and 4 seniors.
            • a. Create a frequency and relative frequency chart of these results
            • b. Create a bar graph of the data using frequency as the vertical axis label
            • c. Create a pie chart of the data
            • d. What do you think each graph best emphasizes?  Explain.
            • e. Imagine that the bar graph started with 2 as the lowest frequency instead of zero.  How would this change the perception of the graph?
            3.  I took a survey on people’s favorite type of chocolate.  Here is the raw data I received back:
            Dark, milk, milk, milk, dark, caramel, caramel, almonds, milk, dark, dark, almonds, dark, milk, caramel, dark, dark, caramel, dark, milk, caramel
            • a. Create a frequency chart of this data
            • b. Add a column for relative frequency
            • c. Create a bar graph of the data using relative frequency as the vertical axis label
            • d. Explain the difference between a histogram and a bar graph (they are NOT the same)
            • e. Create a pie chart of the data
            • f. If you want to persuade others that dark is more popular than milk chocolate, which graph is more convincing?  Explain. 
            Practice solutions
                1. Label each:
                • a) individual (a dog is a noun)
                • b) quantitative variable (describes an individual, is a number)
                • c) quantitative variable (describes an individual, is a number)
                • d) subject (a noun, specifically a human, thus a subject)
                • e) categorical variable (describes a subject, is a set of options)
                • f) subject (a noun, specifically a human, thus a subject)

                2. A new club:
                • a) Note that I did NOT put these categories in order from most to least frequent.  This is because the categories have an inherent order (9th grade, then 10th grade, and so forth).  We logically expect to see graphs with these kinds of ordered categories in their normal order.
                   Class        Frequency   Relative frequency
                   Freshmen     3           3/23 = .13 or 13%
                   Sophomores   5           5/23 = .22 or 22%
                   Juniors      11          11/23 = .48 or 48%
                   Seniors      4           4/23 = .17 or 17%

                • b) Be aware of the required titles/labels, as a graph without meaning is quite useless.  Also notice the order of the categories -- this is important because there is an inherent order in classes (grades 9-12).

                • c) Again, be aware of the required titles/labels.  Also notice that for the pie chart, I did put the categories in order of greatest to least.

                • d) Bar graph: juniors are the largest group by a wide margin, then sophomores, then seniors, then freshmen -- the specific order is more clearly noticeable and the number of students in each class is easy to read.
                  Pie chart: nearly half of the club is juniors, and the other classes make up roughly equal parts of the remaining space.
                • e) If you made the bottom of the bar graph start at 2, it would look like there are twice as many seniors as freshmen, 3 times as many sophomores, and 9 times as many juniors.  This deceives the reader.  Unless the goal is to zoom into a graph or present a biased picture, graphs should start at zero.
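
                The arithmetic in parts a) and e) can be double-checked with a short script.  This is only a sketch, assuming Python; the names `relative_frequencies` and `apparent_heights` are hypothetical helpers written for this example.  It divides each count by the total of 23, then subtracts a baseline of 2 from each count to get the bar heights a reader would actually see on the truncated graph.

```python
# Counts from practice problem 2, kept in grade order (9th through 12th).
club = {"Freshmen": 3, "Sophomores": 5, "Juniors": 11, "Seniors": 4}


def relative_frequencies(counts):
    """Divide each count by the total number of individuals."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def apparent_heights(counts, baseline):
    """Bar heights as drawn when the vertical axis starts at `baseline`."""
    return {category: count - baseline for category, count in counts.items()}


rel = relative_frequencies(club)
for category, count in club.items():
    # Columns: class name, frequency, relative frequency as a whole percent.
    print(f"{category:<11}{count:>3}{rel[category]:>7.0%}")

# Compare every class to freshmen: first honestly, then on the truncated axis.
true_ratios = {c: n / club["Freshmen"] for c, n in club.items()}
drawn = apparent_heights(club, baseline=2)
drawn_ratios = {c: h / drawn["Freshmen"] for c, h in drawn.items()}
for category in club:
    print(f"{category:<11}true {true_ratios[category]:.2f}x   drawn {drawn_ratios[category]:.0f}x")
```

                The second loop puts the honest ratio next to the drawn one: the club really has 11/3 (about 3.67) times as many juniors as freshmen, but the truncated bars make it look like 9 times as many.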

                3. Mmmmm chocolate:
                • a/b) 
                   Type of chocolate   Frequency   Relative frequency
                   Dark                8           8/21 = .38 or 38%
                   Milk                6           6/21 = .29 or 29%
                   Caramel             5           5/21 = .24 or 24%
                   Almonds             2           2/21 = .10 or 10%
                • c) Notice the label "relative frequency" on the left, the label "chocolate type" on the bottom, the option title below each bar, and the large graph title at the top.  All are necessary.

                • d) A bar graph is for categorical data, while a histogram is for quantitative data.  In one of the videos, I describe the ordered bar graph for the data from a 1-5 scale question as "almost" a histogram because it is almost quantitative data.  Visually, a bar graph has spaces between bars and a histogram does not.  Finally, something you will learn in the next section is that there are many possible histograms for a set of data, depending on how you choose to group the values.
                • e) Notice the labels of each option inside the circle -- you may either do this or create a legend on the side.  It is also recommended to mark the percentage in the circle.

                • f) Though not the only acceptable answer, I think the bar graph makes a stronger case than the pie chart for highlighting differences.  In a pie chart, it is harder to see small differences.
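
                The tally in parts a/b) can also be done by machine instead of by hand.  The sketch below assumes Python and its standard `collections.Counter`; the variable names are my own.  It splits the raw data from problem 3 on commas, lowercases each answer so "Dark" and "dark" count as the same category, and then computes each relative frequency out of the 21 responses.

```python
from collections import Counter

# Raw responses exactly as given in practice problem 3.
raw = (
    "Dark, milk, milk, milk, dark, caramel, caramel, almonds, milk, dark, dark, "
    "almonds, dark, milk, caramel, dark, dark, caramel, dark, milk, caramel"
)

# Clean each answer so capitalization and spaces do not create extra categories.
answers = [answer.strip().lower() for answer in raw.split(",")]

counts = Counter(answers)      # frequency of each category
total = sum(counts.values())   # number of individuals surveyed

# most_common() lists categories from most to least frequent.
for kind, count in counts.most_common():
    print(f"{kind:<9}{count:>3}{count / total:>7.0%}")

percents = {kind: round(100 * count / total) for kind, count in counts.items()}
```

                The rounded percentages add up to 101% rather than 100%.  That is a rounding effect, not a counting mistake: the unrounded relative frequencies still sum to exactly 1.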