... Like people with rare conditions (e.g. diseases) or attributes (e.g. nomads, homeless persons, etc.) that constitute less than 1 percent of the population.
Samples of individuals with rare conditions are usually small, and these individuals often cannot be identified from any comprehensive sampling frame. Thus, a sampling frame may have inadequate coverage of individuals with rare conditions, and direct estimation of the characteristics of that domain may not be possible. Conventional survey sampling techniques generally cannot handle situations like this; they can be used only if there is a separate sampling frame for the rare population with up-to-date information and complete coverage. Even special probability sampling methods may not be able to handle these situations.
If we know of a sampling frame in which these individuals are comprehensively covered, standard screening methods may be used, provided it is not too costly to ascertain whether an individual has a rare condition. Face-to-face, telephone and mail surveys are some of the cost-effective screening methods. Data collection is usually implemented in two or more phases, and multistage cluster sampling can be used. This approach begins with an imperfect classification based on an initial screening questionnaire, followed by a more accurate identification of individuals with rare conditions. To keep costs down, screening by proxy (focused enumeration) can also be used in areas with low concentrations of individuals with rare conditions: sampled respondents are asked about the presence of individuals with rare conditions in their vicinity (n addresses to the left and right).
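The cost of screening follows from simple arithmetic. The sketch below is only an illustration under stated assumptions: the function names, the response-rate adjustment, and the sensitivity and specificity of the first-phase screener are hypothetical, and the figures are made up.

```python
import math

def screening_sample_size(target_cases, prevalence, response_rate=1.0):
    """Approximate number of units to screen to expect `target_cases` rare cases."""
    if not 0 < prevalence <= 1 or not 0 < response_rate <= 1:
        raise ValueError("prevalence and response_rate must be in (0, 1]")
    # Small tolerance so floating-point noise does not push the result up by one.
    return math.ceil(target_cases / (prevalence * response_rate) - 1e-9)

def two_phase_workload(n_screened, prevalence, sensitivity, specificity):
    """Expected phase-2 follow-ups and confirmed cases after an imperfect phase-1 screen."""
    true_pos = n_screened * prevalence * sensitivity            # rare cases flagged
    false_pos = n_screened * (1 - prevalence) * (1 - specificity)  # non-cases flagged
    return {"followups": true_pos + false_pos, "confirmed": true_pos}

# Assumed inputs: 1% prevalence, 80% response, 200 rare cases wanted.
n_screen = screening_sample_size(target_cases=200, prevalence=0.01, response_rate=0.8)

# Assumed phase-1 screener: 90% sensitivity, 95% specificity, 10,000 units screened.
workload = two_phase_workload(10_000, prevalence=0.01, sensitivity=0.9, specificity=0.95)
print(n_screen, workload)
```

With these assumed inputs, most of the second-phase follow-ups are false positives from the first phase, which is why the cost of each screening contact matters so much for rare populations.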
Multiple frames can be used to improve the coverage of individuals with rare conditions. Under this approach, a supplementary sampling frame that covers a sizable proportion of individuals with rare conditions is added to the original sampling frame. In this situation, the original sampling frame has good coverage but a low proportion of individuals with rare conditions, whereas the supplementary sampling frame may have incomplete coverage but a high proportion of individuals with rare conditions.
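One common way to combine samples from two such frames is a Hartley-type composite estimator, which is not named in the text above and is shown here only as a sketch. It assumes the supplementary frame B is fully contained in the original frame A, and that each unit sampled from A can be checked for membership in B. The function name, the tuple layout, and the mixing parameter `theta` are hypothetical.

```python
def dual_frame_total(sample_a, sample_b, theta=0.5):
    """Estimate a population total from two overlapping frames.

    sample_a: iterable of (y, weight, on_list) drawn from the full-coverage frame A,
              where on_list says whether the unit also appears on frame B.
    sample_b: iterable of (y, weight) drawn from the supplementary frame B.
    theta:    share of the overlap estimate taken from sample A (0 <= theta <= 1).
    """
    if not 0 <= theta <= 1:
        raise ValueError("theta must be in [0, 1]")
    # Units reachable only through frame A.
    total_a_only = sum(w * y for y, w, on_list in sample_a if not on_list)
    # The overlap domain is estimated twice, once from each sample, then blended.
    total_overlap_a = sum(w * y for y, w, on_list in sample_a if on_list)
    total_overlap_b = sum(w * y for y, w in sample_b)
    return total_a_only + theta * total_overlap_a + (1 - theta) * total_overlap_b

# Made-up data: y = 1 if the unit has the rare condition, else 0.
sample_a = [(1, 100, False), (0, 100, False), (1, 100, True)]
sample_b = [(1, 10), (1, 10), (0, 10)]
estimate = dual_frame_total(sample_a, sample_b, theta=0.5)
print(estimate)
```

In practice `theta` would be chosen to reduce variance; the value 0.5 here is arbitrary.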
Disproportionate sampling and network / multiplicity sampling can be used if individuals with the rare conditions can be easily identified within the sampling frame. Because the sampling fraction for individuals with rare conditions may be very small and very different from that for individuals in other groups, disproportionate sampling is appropriate: it can accommodate the errors and approximations that are inherent in complex, multistage designs. As such, it can be used to oversample individuals with rare conditions in order to ensure a sufficient number of cases in each sampled subpopulation and to see how the results from each subgroup relate to the general population being sampled. Individuals with rare conditions can be added to the original sampling frame through network / multiplicity sampling. Under this approach, a counting rule is used to link individuals to a specific group (e.g. household, organization, etc.). Proxy respondents are used to supply the screening information for individuals with rare conditions who are linked to them in one way or another (e.g. spouse, parents, siblings, grandparents, grandchildren, etc.).
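Both ideas come down to weighting. The sketch below assumes simple random sampling within each stratum for the disproportionate design, and assumes that the multiplicity of each reported case (the number of people eligible to report it under the counting rule) is known. All names and numbers are hypothetical.

```python
def stratified_total(strata):
    """Estimated number of rare cases under disproportionate stratified sampling.

    strata: iterable of (population_size, sample_size, cases_found).
    Each sampled unit is weighted by population_size / sample_size, so an
    oversampled stratum is scaled back to its true share of the population.
    """
    return sum(N / n * cases for N, n, cases in strata)

def multiplicity_total(reports):
    """Estimated number of rare cases under network / multiplicity sampling.

    reports: iterable of (respondent_weight, multiplicities), where
    multiplicities lists, for each case the respondent reports, how many
    people could have reported that same case under the counting rule.
    Dividing by the multiplicity avoids counting a case once per reporter.
    """
    return sum(w * sum(1 / m for m in mults) for w, mults in reports)

# Made-up strata: a small high-density stratum sampled at 10%,
# and a large low-density stratum sampled at about 1.1%.
strata = [(10_000, 1_000, 80), (90_000, 1_000, 5)]
cases_est = stratified_total(strata)
prevalence_est = cases_est / sum(N for N, _, _ in strata)

# Made-up reports: the first respondent reports two cases, shared with 2 and 4 reporters.
reports = [(50, [2, 4]), (50, []), (50, [1])]
network_est = multiplicity_total(reports)
print(cases_est, prevalence_est, network_est)
```

Without the weights, the oversampled stratum would dominate and the prevalence estimate would be biased upward.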
Location sampling can also be used if individuals having the rare conditions are concentrated in designated locations or if the individuals are highly mobile and have no fixed dwellings (e.g. nomads, homeless individuals, air passengers, etc.).
Panel data and repeated surveys can be used to examine how certain events (e.g. births, abortions, divorces, diagnoses, medical procedures) among individuals with rare conditions change over time. Individuals with various kinds of rare attributes or conditions can also be identified among those who are recruited into the panel.