
### 2. Graphical Association

In class, we will look at a TV special on hurricanes that has a few words omitted.  We will try to predict the topic of discussion.  Then, you will be assigned to a group to run the study on our own class.  Once assigned, open your PDF: Group 1 or Group 2.  Read the paragraph and answer the question on a piece of paper without showing a neighbor.

After our study is complete, we will re-watch the video with all parts included.  Then we will use the historical hurricane data (Excel file) to run our own study and check whether it supports or contradicts the results of the TV clip.  You will work with neighbors to write your justification in a Google Doc.  Each group should make one copy of this doc and share it with your teacher and each other.  Points breakdown and expectations are embedded in the template document.

Mastery Quiz Prep

#### Using StatKey to graph two categorical variables (raw data)

For problems 1-7, answer the following:
a) List the 2 variables and whether they are categorical or quantitative
b) Using the appropriate chart/graph in StatKey, enter the data and observe the patterns.
c) Does there appear to be a relationship between the variables (are they dependent on each other) based on the graph/chart?  In the case of one categorical and one quantitative variable, does the average change based on the category?  EXPLAIN.
d) Take your best guess at the explanatory variable (the cause), and if not present, a possible lurking variable.

1. A server who suggests the most popular appetizer to customers at Restaurant A will make more appetizer sales than a server who asks “can I start you out with an appetizer?”  They tested this and here is the data:
Appetizer sales for servers who suggested the most popular appetizer over a weekend shift: 3, 3, 0, 5, 2, 1, 2, 0, 2
Appetizer sales for servers who asked “can I start you out with an appetizer?”: 4, 1, 0, 0, 2, 3, 0, 0, 2

2. Does gender affect grades?  A teacher had 6 boys with A's, 3 boys with B's, 10 girls with A's, and 5 girls with B's.

3. Does the age at which a person ceased smoking impact his or her cumulative risk of lung cancer?  A study was performed to explore the link.  Data is listed as (age, risk of lung cancer), one person per ordered pair:
(0, 0.2)  (30, 1.1)  (40, 2.6)  (50, 5.6)  (60, 11.1)  (75, 15.7)

4. Does your favorite subject relate to whether or not you eat meat?
|  | Math | English | Social Studies | Total |
| --- | --- | --- | --- | --- |
| Meat | 32 | 41 | 33 | 106 |
| No meat | 8 | 19 | 17 | 44 |
| Total | 40 | 60 | 50 | 150 |

5. A student group decided to compare how well players did using two different strategies of building a card tower.  The subjects were first taught a specific method for building their tower and told they were required to use that strategy.  Since the group didn’t want players to mix strategies, they tested two completely separate groups of people.  People volunteered to play and were randomly assigned to a strategy on the day of the experiment.  The results:
Strategy 1 (seconds required to build the tower): 33, 42, 59, 68, 73, 91, 33, 45
Strategy 2 (seconds required to build the tower): 73, 33, 49, 62, 65, 48, 47, 66

6. Does the annual average price of milk relate to the US GDP?  In 2003, milk cost \$2.76 and GDP was 2.3. In 2006, milk was \$2.56 and GDP was 2.2.  In 2009, milk was \$3.78 and GDP was 5.6.  In 2012, milk was \$3.70 and GDP was 5.7.

7. Some people think that the owner's gender is a good predictor of whether or not a car has new speakers or the standard (stock) speakers.  Some sample data is below:
|  | New speakers | Stock speakers | Total |
| --- | --- | --- | --- |
| Male | 18 | 34 | 52 |
| Female | 8 | 30 | 38 |
| Total | 26 | 64 | 90 |

Practice solutions
1. a) what the server says (categorical) vs. number of appetizer sales per server (quantitative)
b) use "one quantitative and one categorical"
c) yes -- the average changes quite a bit from the group with a suggestion to the group without
d) the suggestion is probably causing the sales to increase (but a possible lurking variable is the quality of the server -- it may both cause more sales and cause the person to suggest an appetizer)
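
If you want to double-check the averages behind part c) outside of StatKey, here is a minimal sketch in Python. Python is not part of this assignment, and the variable names are just illustrative; the data is copied from problem 1.

```python
from statistics import mean

# Appetizer sales per server, copied from problem 1.
suggested = [3, 3, 0, 5, 2, 1, 2, 0, 2]   # suggested the most popular appetizer
asked = [4, 1, 0, 0, 2, 3, 0, 0, 2]       # asked the generic question

mean_suggested = mean(suggested)
mean_asked = mean(asked)

print(f"suggested the popular appetizer: {mean_suggested:.2f} sales per server")
print(f"asked the generic question:      {mean_asked:.2f} sales per server")
print(f"difference in means:             {mean_suggested - mean_asked:.2f}")
```

The two group averages come out to 2.00 and about 1.33, which is the change in center that part c) refers to.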

2. a) gender (categorical) vs. grade (categorical)
b) use "two categorical variables", and create your table like so:
|  | Male | Female |
| --- | --- | --- |
| A | 6 | 10 |
| B | 3 | 5 |
c) no -- there is a .667 chance of getting an A in general, a .667 chance of getting an A if you are a boy, and a .667 chance of getting an A if you are a girl, so gender has no effect.
d) not relevant -- no linkage exists
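
The three .667 values in part c) can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic (the helper name `share_of_as` is made up for this example); StatKey's segmented bar chart shows the same thing visually.

```python
# Grade counts from problem 2, split by gender.
counts = {
    "boys": {"A": 6, "B": 3},
    "girls": {"A": 10, "B": 5},
}

def share_of_as(group):
    """Return the proportion of A's within one group's grade counts."""
    return group["A"] / (group["A"] + group["B"])

# Combine both groups to get the overall grade counts.
overall = {
    "A": sum(group["A"] for group in counts.values()),
    "B": sum(group["B"] for group in counts.values()),
}

p_a_boys = share_of_as(counts["boys"])
p_a_girls = share_of_as(counts["girls"])
p_a_overall = share_of_as(overall)

print(f"P(A | boy)  = {p_a_boys:.3f}")
print(f"P(A | girl) = {p_a_girls:.3f}")
print(f"P(A)        = {p_a_overall:.3f}")
```

When the conditional proportions match the overall proportion like this, knowing the gender tells you nothing extra about the grade, which is what "independent" means here.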

3. a) when a person stops smoking (quantitative) vs. cumulative risk of lung cancer (quantitative)
b) use "two quantitative variables" -- note the pattern in the data
c) yes!  There is a clear pattern formed in the scatter plot (upwards)
d) age of quitting explains risk of cancer, and the sooner you quit, the lower your risk of lung cancer
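
One rough way to put numbers on the "upwards" pattern from part c) is to look at how much the risk changes between neighboring points. The sketch below is an optional check in Python, not a StatKey feature, and the variable names are only for illustration.

```python
# (age when the person quit smoking, cumulative risk of lung cancer), from problem 3.
points = [(0, 0.2), (30, 1.1), (40, 2.6), (50, 5.6), (60, 11.1), (75, 15.7)]

# Sort by age so consecutive pairs move left to right across the scatterplot.
points.sort()

# Change in risk per additional year of age between neighboring points.
slopes = [
    (risk2 - risk1) / (age2 - age1)
    for (age1, risk1), (age2, risk2) in zip(points, points[1:])
]

for (age1, _), (age2, _), slope in zip(points, points[1:], slopes):
    print(f"age {age1} to {age2}: risk changes by {slope:+.2f} per year")

# True only if the risk goes up at every single step.
always_rising = all(slope > 0 for slope in slopes)
print("risk rises at every step:", always_rising)
```

Every step is positive, so the scatterplot never dips as the quitting age increases.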

4. a) favorite subject (categorical) vs. if you eat meat (categorical)
b) use "two categorical variables"
c) sort of -- it is not exactly the same chances when broken down one subject at a time, but at least visually it doesn't seem convincing that the differences are more than randomness.
d) there really isn't any strong reason for a linkage, but maybe people who are more logical than feeling will both like math and not be opposed to eating meat
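
To see why the answer to part c) is only "sort of", it helps to compare the proportion of meat eaters within each subject. Here is a small Python sketch of that comparison, assuming the counts from the problem 4 table; the dictionary layout is just one convenient way to store them.

```python
# Two-way table from problem 4: counts[subject] = (meat, no meat).
counts = {
    "Math": (32, 8),
    "English": (41, 19),
    "Social Studies": (33, 17),
}

# Proportion of meat eaters within each favorite subject.
meat_share = {}
for subject, (meat, no_meat) in counts.items():
    meat_share[subject] = meat / (meat + no_meat)
    print(f"{subject:15s} {meat_share[subject]:.3f} eat meat")

# Overall proportion of meat eaters, ignoring favorite subject.
total_meat = sum(meat for meat, _ in counts.values())
total_people = sum(meat + no_meat for meat, no_meat in counts.values())
overall_share = total_meat / total_people
print(f"{'Overall':15s} {overall_share:.3f} eat meat")
```

The per-subject proportions are not identical, but they all sit fairly close to the overall proportion, which matches the bars of similar height you see in the chart.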

5. a) which strategy was used (categorical) vs. time to build the tower (quantitative)
b) use "one quantitative and one categorical"
c) the means are almost identical, so the center doesn't really move, so we would usually say "no relationship". If you dig deeper, the two have a huge difference in spread -- strategy 2's standard deviation is much lower and its box plot is much more squished.
d) the strategy causes the differences in time -- we can conclusively say this because it was a carefully designed experiment (which you will learn about in a future module).
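
The center and spread mentioned in part c) can be checked with Python's built-in `statistics` module. This is an optional sketch, not part of the StatKey workflow; note that `stdev` computes the sample standard deviation.

```python
from statistics import mean, stdev

# Seconds required to build the tower, copied from problem 5.
strategy_1 = [33, 42, 59, 68, 73, 91, 33, 45]
strategy_2 = [73, 33, 49, 62, 65, 48, 47, 66]

# Store (mean, sample standard deviation) for each strategy.
summary = {}
for name, times in (("Strategy 1", strategy_1), ("Strategy 2", strategy_2)):
    summary[name] = (mean(times), stdev(times))
    center, spread = summary[name]
    print(f"{name}: mean = {center:.2f} s, standard deviation = {spread:.2f} s")
```

The means are 55.50 and 55.38 seconds, so the center barely moves, while the standard deviation for strategy 2 is noticeably smaller than for strategy 1.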

6. a) price of milk (quantitative) vs. GDP (quantitative)
b) use "two quantitative variables"
c) yes -- there appears to be a relationship between the price of milk and the GDP based on the pattern in the scatterplot
d) looking at the graph, it seems that the GDP causes the price of milk to fluctuate. However, there could be a lurking variable such as inflation or scarcity of milk, so we cannot be 100% sure that the GDP explains the price of milk.

7. a) owner's gender (categorical) vs. type of speakers (categorical)
b) use "two categorical variables"
c) yes -- there appears to be a weak relationship between gender and type of speaker
d) it seems that your gender could determine what kind of speakers you have in your car. The relationship isn't strong enough to say that for sure, so we would have to do more research.

Practice quiz (we will go over answers in class as a group)
1. What graph/chart do you use to visualize two categorical variables?  What will it look like when two categorical variables are independent?  Dependent?
2. What graph/chart do you use to visualize two quantitative variables?  What will it look like when two quantitative variables are independent?  Dependent?
3. What graph/chart do you use to visualize the relationship between a categorical variable and a quantitative variable?  What will it look like when they are independent?  Dependent?

For problems 4-6, answer the following:
a) List the 2 variables and whether they are categorical or quantitative
b) Using the appropriate chart/graph in StatKey, enter the data and observe the patterns.
c) Does there appear to be a relationship between the variables (are they dependent on each other) based on the graph/chart?  EXPLAIN.
d) Take your best guess at the explanatory variable (the cause), and if not present, a possible lurking variable.

4. A study of runs scored in Major League Baseball looked at the number of runs each team scored in the NL vs. the AL.  Results are below (2014 regular season, stats from ESPN).  Is there a link between runs scored and league?
AL: 773, 757, 729, 723, 715, 705, 669, 660, 651, 637, 634, 634, 633, 629, 612
NL: 755, 718, 686, 682, 665, 650, 645, 629, 619, 619, 615, 614, 595, 573, 535

5. Two baseball teams compared their number of home runs before and after the all-star break.  The Brewers had 94 before and 56 after, the Twins had 70 before and 58 after.  Is there a link between the team and the half of the season for home runs?

6. A group of students were asked their favorite single digit number (0 to 9) and their current math grade in percent.  The results are below.  Is there a link between favorite digit and math score?
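| Student # | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Fav digit | 3 | 3 | 0 | 1 | 9 | 7 | 7 |
| Math % | 89 | 75 | 55 | 96 | 91 | 81 | 88 |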

Notes

Future transition to Tuva: https://tuvalabs.com/k12/
Andy Pethan,
Oct 3, 2015, 1:58 PM