Learning Targets
I can describe ways to get a random sample from a population.
I know that selecting a sample at random is usually a good way to get a representative sample.
A sample is selected at random from a population if it has an equal chance of being selected as every other sample of the same size. For example, if there are 25 students in a class, then we can write each of the students' names on a slip of paper and select 5 papers from a bag to get a sample of 5 students selected at random from the class.
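The slips-of-paper method can also be imitated on a computer. What follows is a minimal Python sketch, not part of the original activity. The roster of placeholder names and the function name `draw_names` are assumptions made for illustration. It relies on `random.sample`, which picks items without replacement, much like pulling slips from a bag without putting them back.

```python
import random

# Hypothetical roster: 25 placeholder names standing in for the real class list.
students = [f"Student {i}" for i in range(1, 26)]


def draw_names(names, k, seed=None):
    """Return k names chosen at random without replacement.

    `seed` is optional; passing one makes the draw repeatable for checking.
    """
    rng = random.Random(seed)
    return rng.sample(names, k)


# Draw 5 of the 25 "slips"; the result changes from run to run.
sample = draw_names(students, 5)
print(sample)
```

Running the sketch several times gives different groups of 5, and no student can appear twice in the same group.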
Other methods of selecting a sample from a population are likely to be biased. This means that it is less likely that the sample will be representative of the population as a whole. For example, if we select the first 5 students who walk in the door, that will not give us a random sample because students who typically come late are not likely to be selected. A sample that is selected at random may not always be a representative sample, but it is more likely to be representative than a sample selected by other methods.
It is not always possible to select a sample at random. For example, if we want to know the average length of wild salmon, it is not possible to identify each one individually, select a few at random from the list, and then capture and measure those exact fish. When a sample cannot be selected at random, it is important to try to reduce bias as much as possible when selecting the sample.
For each situation, discuss:
Would the different methods for selecting a sample lead to different conclusions about the population?
What are the benefits of each method?
What might each method overlook?
Which of the methods listed would be the most likely to produce samples that are representative of the population being studied?
Can you think of a better way to select a sample for this situation?
Lin is running in an election to be president of the seventh grade. She wants to predict her chances of winning. She has the following ideas for surveying a sample of the students who will be voting:
Ask everyone on her basketball team who they are voting for.
Ask every third girl waiting in the lunch line who they are voting for.
Ask the first 15 students to arrive at school one morning who they are voting for.
A nutritionist wants to collect data on how much caffeine the average American drinks per day. She has the following ideas for how she could obtain a sample:
Ask the first 20 adults who arrive at a grocery store after 10:00 a.m. about the average amount of caffeine they consume each day.
Every 30 minutes, ask the first adult who comes into a coffee shop about the average amount of caffeine they consume each day.
What are some important things to consider when getting a sample?
People often have biases that may lead them to over- or under-represent some groups in their sample, whether the biases are obvious or not. Example: if you send a survey by email, you will not reach people who do not have email addresses!
Because of these (sometimes hidden) biases, the best method for selecting samples is to remove as much personal choice from the selection as possible.
In the rest of this lesson, we will explore methods for generating samples that avoid biases.
There were a total of 35 straws in the bag. Suppose we put the straws in order from shortest to longest and then assigned each straw a number from 1 to 35. For each of these methods, decide whether it would be a fair way to select a sample of 5 straws. Explain your reasoning.
Select the straws numbered 1 through 5.
Write the numbers 1 through 35 on pieces of paper that are all the same size. Put the papers into a bag. Without looking, select five papers from the bag. Use the straws with those numbers for your sample.
Using the same bag as the previous question, select one paper from the bag. Use the number on that paper to select the first straw for your sample. Then use the next 4 numbers in order to complete your sample. (For example, if you select number 17, then you also use straws 18, 19, 20, and 21 for your sample.)
Create a spinner with 35 sections that are all the same size, and number them 1 through 35. Spin the spinner 5 times and use the straws with those numbers for your sample.
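To see what kinds of samples each of the four methods can produce, the methods can be sketched in Python. This is an illustration under stated assumptions, not part of the original activity: the function names are hypothetical, and because the text does not say what to do in the third method when the "next 4 numbers" run past 35, this sketch simply stops at 35.

```python
import random

STRAWS = list(range(1, 36))  # straw numbers 1 to 35, shortest to longest


def first_five():
    """Method 1: always the straws numbered 1 through 5."""
    return STRAWS[:5]


def papers_from_bag(rng):
    """Method 2: five different papers drawn from the bag without looking."""
    return rng.sample(STRAWS, 5)


def one_paper_then_next_four(rng):
    """Method 3: one paper from the bag, then the next 4 numbers in order.

    Assumption: numbers above 35 are dropped, since no such straws exist.
    """
    start = rng.choice(STRAWS)
    return [n for n in range(start, start + 5) if n <= 35]


def spinner_five_times(rng):
    """Method 4: five independent spins, so a number can come up twice."""
    return [rng.randint(1, 35) for _ in range(5)]


rng = random.Random()  # unseeded, so each run gives different samples
print("Method 1:", first_five())
print("Method 2:", papers_from_bag(rng))
print("Method 3:", one_paper_then_next_four(rng))
print("Method 4:", spinner_five_times(rng))
```

Running the sketch a few times and comparing the printed samples may help when explaining your reasoning: look at which straws can end up together in a sample and whether any straw can be picked more than once.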
A representative sample would have more of the common lengths, and there is also a higher probability of selecting these lengths, so a random selection should be a good way to select a representative sample.
A random sample does not guarantee a representative sample, but it avoids methods that might over- or under-represent items of the population. Since we do not know the data for the population, a random sample usually provides the best opportunity to get a representative sample.
While a random sample is ideal, it is not always possible to generate one. For example, if you wanted to know the average size of salmon in the wild, it is impossible to know how many there are, to identify them individually, to select a few randomly, and then to capture and measure those fish. In these cases, it is important to try to intentionally reduce bias as much as possible when selecting a sample.
A public health expert is worried that a recent outbreak of a disease may be related to a batch of spinach from a certain farm. She wants to test the plants at the farm, but it will ruin the crop if she tests all of them.
If the farm has 5,000 spinach plants, describe a method that would produce a random sample of 10 plants.
Why would a random sample be useful in this situation?
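One possible approach to the spinach question, offered as a sketch and not as the only acceptable answer, is to give every plant a number and let a computer pick 10 of the numbers. The numbering of plants from 1 to 5,000 and the function name `choose_plants` are assumptions made for illustration; the text does not say how the plants are labeled.

```python
import random

NUM_PLANTS = 5000   # plants on the farm, as stated in the problem
SAMPLE_SIZE = 10    # plants to test


def choose_plants(num_plants, sample_size, seed=None):
    """Return `sample_size` different plant numbers between 1 and `num_plants`.

    Assumes each plant has been assigned a unique number ahead of time.
    """
    rng = random.Random(seed)
    picked = rng.sample(range(1, num_plants + 1), sample_size)
    return sorted(picked)  # sorted only to make the plants easier to find


plants_to_test = choose_plants(NUM_PLANTS, SAMPLE_SIZE)
print(plants_to_test)
```

Because the numbers are picked by the program and not by a person, the expert's own habits, such as testing only the plants nearest the road, do not influence which plants are chosen.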
What makes a sample selected at random the best way to select individuals for a sample?
It avoids biases that might be introduced using other methods.