S4 Specifying Confidence Levels and Interval Sizes

By increasing the sample size, we can reduce the size of the confidence intervals, but only slowly: the size goes down in proportion to 1/sqrt(N), so quadrupling the sample size reduces the interval by 50%. How much precision should we seek in a survey? How large a Standard Error is acceptable?
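The 1/sqrt(N) scaling can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes the survey estimates a proportion, uses the worst case p = 0.5, and takes 1.96 as the 95% multiplier. The function names are hypothetical and do not come from the course material.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Smallest n whose 95% margin of error (z * standard error) is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Each quadrupling of n halves the standard error and the margin of error.
for n in (100, 400, 1600):
    se = standard_error(0.5, n)
    print(f"n={n:5d}  standard error={se:.4f}  95% margin={1.96 * se:.4f}")

# Halving the acceptable margin roughly quadruples the required sample size.
print(sample_size_for_margin(0.05), sample_size_for_margin(0.025))
```

Under these assumptions a margin of 5% needs about 385 respondents, while a margin of 2.5% needs about 1537, which is why extra precision quickly becomes expensive.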

The answer depends heavily on pragmatic real-world considerations. The first and most important question is: WHY are we carrying out the survey? What do we hope to gain from the knowledge that we acquire in the survey? The sample size must be chosen so as to fulfill the real-world objectives. Below we discuss issues that must be taken into consideration in making such choices.

FIRST BEST, IDEAL SOLUTION: The first best, fully ideal, outcome is to have perfect precision: Exact knowledge of the true value with 0 error and 100% confidence. This first best CAN be achieved by a COMPLETE survey -- getting exact knowledge of every member of the population. When this is feasible, then this is the best thing to do.

IMPORTANT NOTE: Students are told so many good things about random samples that they sometimes want to take a random sample even when the total population is quite small. For example, if there are 30 banks, then one SHOULD NOT take a random sample of 15 -- one should carry out a survey of the ENTIRE population. This is always first best, and should be done when possible. When the population is small, then random sampling is not very useful.

COMPROMISES: When the population is large, and sampling requires effort, it will usually not be practically possible to carry out a complete survey of all members. Then random samples can be used to obtain approximate results with relatively high levels of confidence. Since such compromises are forced on us by pragmatic consideration of the impossibility (or the extremely high expense) of carrying out a full survey, what we choose as a compromise also depends on pragmatic considerations like the cost of carrying out a survey.

SURPRISING FACT: The accuracy of a random sample depends on the size N of the random sample, but it DOES NOT depend on the size of the population. A sample of 400 will, with 95% confidence, produce estimates within 5% of the true value whether the population has 1000 people or 1 Million people. This seems intuitively surprising; one would think that a larger population would need a larger sample for equal accuracy. As David Freedman explains it, one can learn the composition of a liquid mixture of chemicals from one drop of the solution, provided that the solution is WELL MIXED so that every drop is representative of the whole. In sampling from a large population, the difficult part is to ensure that EVERY member is included with equal probability -- this means that the difficulty DOES INDEED increase with the population size -- but the difficulty is associated with the drawing of a random sample: HOW to ensure equal probabilities for all members. Once the sample is drawn, the accuracy with which the sample represents the population does not depend on the population size, but only on the sample size (and on the accuracy of the random sampling process in ensuring equal probability for all, thereby creating a well mixed solution).
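The claim about a sample of 400 can be verified numerically. The sketch below again assumes an estimated proportion with worst case p = 0.5 and a 95% multiplier of 1.96. The optional finite population correction, sqrt((P - n) / (P - 1)) for a population of size P, is a standard textbook refinement that is not part of the discussion above; it is included only to show how little the population size matters. The function name is hypothetical.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """95% margin of error for a proportion from a simple random sample of size n.

    If `population` is given, the finite population correction is applied
    (an addition for illustration, not part of the original text).
    """
    margin = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        margin *= math.sqrt((population - n) / (population - 1))
    return margin

print(f"no correction:         {margin_of_error(400):.4f}")
print(f"population 1,000:      {margin_of_error(400, 1_000):.4f}")
print(f"population 1,000,000:  {margin_of_error(400, 1_000_000):.4f}")
```

In every case the margin stays below 5%. The correction only makes the estimate slightly better when the sample is a large share of a small population; for a population of a million it is negligible, so the sample size alone determines the accuracy.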

Some practical advice from Tools 4 Dev: See link at bottom of page for MORE DETAILS and REFERENCES:

The minimum sample size is 100

Most statisticians agree that the minimum sample size to get any kind of meaningful result is 100. If your population is less than 100 then you really need to survey all of them.

A good maximum sample size is usually 10% of the population, as long as it does not exceed 1000

A good maximum sample size is usually around 10% of the population, as long as this does not exceed 1000. For example, in a population of 5000, 10% would be 500. In a population of 200,000, 10% would be 20,000. This exceeds 1000, so in this case the maximum would be 1000.

Even in a population of 200,000, sampling 1000 people will normally give a fairly accurate result. Sampling more than 1000 people won’t add much to the accuracy given the extra time and money it would cost.

Choose a number between the minimum and maximum depending on the situation

Suppose that you want to survey students at a school which has 6000 pupils enrolled. The minimum sample would be 100. This would give you a rough, but still useful, idea about their opinions. The maximum sample would be 600, which would give you a fairly accurate idea about their opinions.
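The rules of thumb above can be written down as a small helper. This is a minimal sketch rather than an official Tools 4 Dev calculator: the function name and parameters are hypothetical, and the handling of populations between 100 and 1000 (where 10% would fall below the minimum of 100) is an assumption, since the advice above does not cover that case.

```python
def sample_size_range(population, minimum=100, cap=1000, fraction=0.10):
    """Return (smallest, largest) suggested sample size under the rules of thumb."""
    # A population no larger than the minimum should be surveyed completely.
    if population <= minimum:
        return population, population
    # Largest useful sample: 10% of the population, but never more than the cap.
    largest = min(round(population * fraction), cap)
    # Assumption: never let the upper bound drop below the minimum sample size.
    largest = max(largest, minimum)
    return minimum, largest

for population in (30, 5_000, 6_000, 200_000):
    print(population, sample_size_range(population))
```

For the school with 6000 pupils this gives a range of 100 to 600, and for a population of 200,000 it gives 100 to 1000, matching the examples above. Where to land inside the range remains a judgment about cost and the purpose of the survey.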

How to choose a Sample Size? — Pragmatic rules of thumb, without analytics and formulae -- references to heavier stuff provided

Sample Sizes: A Rough Guide — More detailed analytic description of how to choose a sample size for a survey.

Survey Research Handbook — A Complete Guide to Survey Research (reviewed)